%GetTweet - A New SAS Macro to Fetch and Summarize Tweets


%GetTweet - A New SAS Macro
To Fetch and Summarize Tweets
Satish Garla
Goutam Chakraborty
Oklahoma State University
Agenda
• Background
• Process
• %GetTweet Macro
• Examples of macro usage
• Tweets vs Retweets
• Tweet Report
• Understanding Networks
• Conclusion
Background
• Growth of Social Media Analytics
• Rise of Twitter
• Average number of new users per day: 460,000*
• Average number of Tweets per day: 140 million*
• 1 billion searches per day**
• Challenge: Getting the right data (Tweets)
• Twitter Search API and SAS
* During Feb-Mar 2011. Source: blog.twitter.com
** During Oct 2010. Source: blog.twitter.com
Process
[Diagram: GetTweet → SAS Data set → Summary Reports, Text Mining, Sentiment Analysis]
• HTTP Procedure communicates with Twitter via the Search API
• Twitter returns the data in XML format
• SAS XML Engine is used to convert XML files to SAS Data sets
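The three steps above might look roughly like the following in SAS. This is a sketch under stated assumptions, not the macro's actual code: the macro variables `search_url` and `path`, the file names, and the XMLMap are hypothetical placeholders, and the request would need to point at a real endpoint that returns XML (the authorization header is not shown).

```sas
/* Hypothetical placeholders: replace with a real endpoint and folder. */
%let search_url = http://example.com/search.atom?q=Cancer;
%let path       = C:\tweets;

/* Step 1: PROC HTTP sends the request and writes the response to a file. */
filename resp "&path\tweets.xml";

proc http
    url="&search_url"
    method="GET"
    out=resp;
run;

/* Steps 2-3: the response is XML; the XML engine reads it through an
   XMLMap (for example, one built with SAS XML Mapper).
   The libref matches the fileref, so the engine reads that file. */
filename twmap "&path\tweets.map";
libname  resp xml xmlmap=twmap access=readonly;

/* List the tables that the XMLMap exposes. */
proc datasets library=resp;
quit;
```

Keeping the raw XML on disk before mapping it makes it possible to inspect what was returned when the mapped data set looks wrong.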
%GetTweet Macro
• USP (unique selling point): Collecting customized tweets
• %GetTweet(WORDS=,PHRASE=,ANY=,NONE=,HASH=,FROM=,TO=,SINCE=,UNTIL=,QUESTION=,CODE=,PATH=);
• CODE=
  Base64-encoded Twitter username and password
  Ex: dXhsdgfhsdefhJ….
• PATH=
  Folder directory
• Other requirements:
  XML Mapper code
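How the keyword parameters turn into a search request is not shown on the slide. One plausible reading, assuming the macro relies on the standard Twitter search operators (a leading minus sign for exclusion, `from:`, `since:`, `until:`), is sketched below in a DATA step. The variable names and the mapping itself are assumptions for illustration, not the macro's code.

```sas
data _null_;
  length q $500 encoded $1500 w $50;

  /* Hypothetical inputs mirroring some of the macro's keyword parameters */
  words     = 'Cancer';
  none      = 'Zodiac Scorpio Pisces Astrology';
  from_user = '';
  since     = '2010-10-03';
  until     = '2010-10-03';

  q = strip(words);

  /* NONE=: prefix each excluded word with a minus sign */
  do i = 1 to countw(none, ' ');
    w = scan(none, i, ' ');
    q = catx(' ', q, cats('-', w));
  end;

  /* Optional operators are appended only when a value was supplied */
  if from_user ne '' then q = catx(' ', q, cats('from:',  from_user));
  if since     ne '' then q = catx(' ', q, cats('since:', since));
  if until     ne '' then q = catx(' ', q, cats('until:', until));

  /* URL-encode the finished query before appending it to the request */
  encoded = urlencode(strip(q));
  put q= / encoded=;
run;
```

With these inputs the unencoded query written to the log is `Cancer -Zodiac -Scorpio -Pisces -Astrology since:2010-10-03 until:2010-10-03`.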
Examples
• Collecting Tweets that Mention the Word “Cancer”
%GetTweet (WORDS=Cancer, CODE=&authorization, PATH=&path);
• “Cancer” may refer to:
  Disease
  Sun sign / Astrology
Examples
• Collecting Tweets that Exclude a Specific Set of Words
%GetTweet (WORDS=Cancer, NONE=Zodiac Scorpio Pisces Astrology,..);
• “Cancer” may refer to:
  Disease
  Sun sign
Examples
• Collecting Tweets on a specific day
%GetTweet (WORDS=Cancer, SINCE=2010-10-03, UNTIL=2010-10-03,..);
• Collecting Tweets with the words “SAS” or “Analytics”
%GetTweet (ANY=SAS Analytics, CODE=&authorization, PATH=&path);
• Other keyword parameters:
  FROM=
  TO=
  HASH=
  QUESTION=
Tweets Vs. Retweets
• Retweets
Tweet: Monday Keynote Update ~ Bret Michaels: Music Icon, Actor, 'Celebrity Apprentice' Winner Keynotes #DMA2010 - http://bit.ly/bswBgX
Retweet: RT@DMA_USA Monday Keynote Update ~ Bret Michaels: …
Re-Retweet: RT @neolane: RT@DMA_USA Monday Keynote Update ~ Bret Michaels: …
• Strength of Networks and Influencers
• Challenges in Text Mining Retweets
Tweet: Rigorous exercise would help in curing breast cancer
• Data sets of Tweets and Retweets are created
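Splitting tweets from retweets could be done with a pattern match on the leading "RT @user" marker seen in the examples above. The sketch below assumes the tweet text sits in a variable named `text`; the regular expression, the variable names, and the shortened sample rows are illustrative assumptions rather than the macro's own logic.

```sas
data tweets retweets;
  infile datalines truncover;
  input text $char200.;
  length rt_user $20;
  retain rx;

  /* Compile the pattern once: optional blanks, RT, optional blanks, @user */
  if _n_ = 1 then rx = prxparse('/^\s*RT\s*@(\w+)/i');

  if prxmatch(rx, text) then do;
    /* Capture group 1 holds the account named in the outermost RT marker */
    rt_user = prxposn(rx, 1, text);
    output retweets;
  end;
  else output tweets;

  drop rx;
/* Sample rows are shortened versions of the slide's examples */
datalines;
Monday Keynote Update ~ Bret Michaels #DMA2010 - http://bit.ly/bswBgX
RT@DMA_USA Monday Keynote Update ~ Bret Michaels
RT @neolane: RT@DMA_USA Monday Keynote Update ~ Bret Michaels
;
run;
```

Here the first row goes to `tweets` and the other two go to `retweets`; counting rows per `rt_user` is one simple way to start looking at influencers.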
Tweet Report
• Basic data cleaning using PERL
• Generate PDF report
• Report for #DMA2010
• Page 1 (Tweet Report)
• Page 2 (Top Retweets)
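The cleaning step is not detailed on the slide. Assuming "PERL" refers to the Perl regular expression (PRX) functions in SAS, a minimal sketch of the kind of cleaning that usually precedes a summary report is shown below. The patterns, variable names, and shortened sample rows are assumptions.

```sas
data cleaned;
  infile datalines truncover;
  input text $char200.;
  length clean $200;

  /* Drop leading retweet markers such as "RT @user:" (repeated for re-retweets) */
  clean = prxchange('s/^\s*(RT\s*@\w+:?\s*)+//i', 1, text);

  /* Drop every URL */
  clean = prxchange('s/https?:\/\/\S+//i', -1, clean);

  /* Trim and collapse runs of blanks */
  clean = compbl(strip(clean));
datalines;
RT @neolane: RT@DMA_USA Monday Keynote Update #DMA2010 - http://bit.ly/bswBgX
Monday Keynote Update ~ Bret Michaels #DMA2010 - http://bit.ly/bswBgX
;
run;

proc print data=cleaned;
run;
```

Stripping the retweet prefix first means a retweet and its original reduce to the same text, which helps when ranking the top retweets.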
Tweet Report (#SASGF11)
Understanding Networks
• Insights on influencers in Twitter networks
• SAS/GRAPH® Network Visualization (NV) Workshop
Conclusion
• Capture data from Twitter using PROC HTTP and the Twitter API
• Use the %GetTweet macro
• SAS data set of customized Tweets and Retweets
• Generate summary reports
• Explore Twitter networks using SAS/GRAPH® NV Workshop
• Customize the macro to meet research requirements
*Full Macro Program is available in the paper