Page 12 - MSDN Magazine, May 2018
of this article, the Web site address does not matter, so enter any valid URL. Click the checkbox to agree to the terms of the Twitter Developer Agreement and click the Create your Twitter application button.
On the following screen, look for Consumer Key (API Key) under the Application Settings section. Click on the “manage keys and access tokens” link. On the page that follows, click the Create my access token button, as shown in Figure 3, to create an access token. Make note of the following four values shown on this page: Consumer Key (API Key), Consumer Secret (API Secret), Access Token and Access Token Secret.
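One way to keep these four values out of the notebook itself is to read them from environment variables. The following is a minimal sketch and not part of the original walkthrough; the variable names (TWITTER_CONSUMER_KEY and so on) and the load_twitter_credentials helper are hypothetical, chosen only for illustration.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=None):
    """Return the four Twitter credentials as a dict keyed by variable name.

    Raises KeyError naming every missing variable, so the problem is
    reported once rather than one variable at a time.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {name: "placeholder" for name in REQUIRED_VARS}
demo_credentials = load_twitter_credentials(demo_env)
print(sorted(demo_credentials))
```

In a notebook, the returned values could then be passed to Tweepy in place of the hard-coded placeholder strings.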
Using Tweepy to Read Tweets
Tweepy is a Python library that simplifies the interaction between Python code and the Twitter API. More information about Tweepy can be found at docs.tweepy.org/en/v3.5.0. Return to the Jupyter notebook and enter the following code to install Tweepy. The exclamation mark instructs Jupyter to execute the command in the shell:
!pip install tweepy
Once the code executes successfully, the response text in the cell will read: “Successfully installed tweepy-3.6.0,” although the specific version number may change. In the following cell, enter the code in Figure 4 and execute it.
The results that come back should look similar to the following:
#ElonMusk deletes own, #SpaceX and #Tesla Facebook pages after #deletefacebook https://t.co/zKGg4ZM2pi https://t.co/d9YFboUAUj
Sentiment(polarity=0.0, subjectivity=0.0)
RT @loislane28: Wow. did @elonmusk just delete #SpaceX and #Tesla from Facebook? https://t.co/iN4N4zknca
Sentiment(polarity=0.0, subjectivity=0.0)
Keep in mind that as the code executes a search on live Twitter data, your results will certainly vary. The formatting is a little confusing to read. Modify the for loop in the cell to the following and then re-execute the code:
for tweet in spacex_tweets:
    analysis = TextBlob(tweet.text)
    print('{0} | {1} | {2}'.format(tweet.text, analysis.sentiment.polarity,
                                   analysis.sentiment.subjectivity))
Adding the pipe characters to the output should make it easier to read. Also note that the sentiment property’s two fields, polarity and subjectivity, can be displayed individually.
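The same loop can also round the scores and trim long tweets so that each row stays on one line. This is a small sketch using only built-in string formatting; the sample rows and the format_row helper are made up for illustration and stand in for tweet.text and the TextBlob sentiment values.

```python
def format_row(text, polarity, subjectivity, width=40):
    """Format one tweet as 'text | polarity | subjectivity'."""
    # Collapse newlines and repeated whitespace so a tweet stays on one line.
    flat = " ".join(text.split())
    # Trim long text and mark the cut with an ellipsis.
    if len(flat) > width:
        flat = flat[: width - 3] + "..."
    # Pad the text to a fixed width and show both scores with two decimals.
    return "{0:<{w}} | {1:+.2f} | {2:.2f}".format(flat, polarity, subjectivity, w=width)


# Hypothetical rows standing in for (tweet.text, polarity, subjectivity).
sample_rows = [
    ("Great launch today!\nWhat a view.", 0.8, 0.75),
    ("Launch scheduled for 10:00.", 0.0, 0.0),
]

for text, polarity, subjectivity in sample_rows:
    print(format_row(text, polarity, subjectivity))
```

Padding the text column to a fixed width keeps the pipe characters aligned, which makes the scores easier to scan down the page.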
Load Twitter Sentiment Data Into a DataFrame
The previous code created a pipe-delimited list of tweet content and sentiment scores. A more useful structure for further analysis would be a DataFrame. A DataFrame is a two-dimensional labeled data structure whose columns may contain different value types. Similar to a spreadsheet or SQL table, DataFrames provide a familiar and simple mechanism to work with datasets.
Figure 4 Use Tweepy to Access the Twitter API
DataFrames are part of the Pandas library. As such, you will need to import the Pandas library along with NumPy. Insert a blank cell below the current cell, enter the following code and execute:
import pandas as pd
import numpy as np

tweet_list = []
for tweet in spacex_tweets:
    analysis = TextBlob(tweet.text)
    tweet_list.append({'Text': tweet.text, 'Polarity': analysis.sentiment.polarity,
                       'Subjectivity': analysis.sentiment.subjectivity})
tweet_df = pd.DataFrame(tweet_list)
tweet_df
The results now display in an easier-to-read tabular format. However, that’s not all that DataFrames can do. Insert a blank cell below the current cell, enter the following code, and execute:
print("Polarity Stats")
print("Avg", tweet_df["Polarity"].mean())
print("Max", tweet_df["Polarity"].max())
print("Min", tweet_df["Polarity"].min())
print("Subjectivity Stats")
print("Avg", tweet_df["Subjectivity"].mean())
print("Max", tweet_df["Subjectivity"].max())
print("Min", tweet_df["Subjectivity"].min())
By loading the tweet sentiment analysis data into a DataFrame, it’s easier to query and analyze the data at scale. However, these descriptive statistics just scratch the surface of the power that DataFrames provide. For a more complete exploration of Pandas DataFrames in Python, please watch the webcast, “Data Analysis in Python with Pandas,” by Jonathan Wood at bit.ly/2urCxQX.
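As a brief illustration of what else a DataFrame offers, the sketch below summarizes both score columns in one call and then sorts and filters the rows. It assumes Pandas is installed and uses a few made-up rows in place of live tweets; only the column names (Text, Polarity, Subjectivity) are taken from the earlier cell.

```python
import pandas as pd

# Made-up rows standing in for the tweet_list built from live tweets.
sample_df = pd.DataFrame([
    {"Text": "Love the new rocket", "Polarity": 0.5, "Subjectivity": 0.6},
    {"Text": "Launch delayed again", "Polarity": -0.25, "Subjectivity": 0.4},
    {"Text": "Launch is at 10:00", "Polarity": 0.0, "Subjectivity": 0.0},
])

# One call returns the mean, max and min for both score columns.
summary = sample_df[["Polarity", "Subjectivity"]].agg(["mean", "max", "min"])
print(summary)

# Sort from most positive to most negative.
ranked = sample_df.sort_values("Polarity", ascending=False)
print(ranked[["Text", "Polarity"]])

# Keep only the rows with a negative polarity score.
negative = sample_df[sample_df["Polarity"] < 0]
print(negative["Text"].tolist())
```

The same three operations should work unchanged on tweet_df, since it has the same columns.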
Wrapping Up
With the velocity and volume of data continuing to rise, businesses large and small must find ways to leverage machine learning to make sense of the data and turn it into actionable insight. Natural Language Processing, or NLP, is a class of algorithms that can analyze unstructured text and parse it into machine-readable structures, giving access to one of the key attributes of any body of text: sentiment. Not too long ago, this was out of reach of the average developer, but now the TextBlob Python library brings this technology to the Python ecosystem. While the algorithms can sometimes struggle with the subtleties and nuances of human language, they provide an excellent foundation for making sense of unstructured data.
As demonstrated in this article, the effort to analyze a given block of text for sentiment in terms of polarity and subjectivity is now trivial. Thanks to a vibrant Python ecosystem of third-party open source libraries, it’s also easy to source data from live social media sites, such as Twitter, and pull in users’ tweets in real time. Another Python library, Pandas, simplifies the process of performing advanced analytics on this data. With thoughtful analysis, businesses can monitor social media feeds and obtain awareness of what customers are saying and sharing about them.
Frank La Vigne leads the Data & Analytics practice at Wintellect and co-hosts the DataDriven podcast. He blogs regularly at FranksWorld.com and you can watch him on his YouTube channel, “Frank’s World TV” (FranksWorld.TV).
Thanks to the following technical expert for reviewing this article: Andy Leonard
import tweepy
from textblob import TextBlob  # needed for the sentiment analysis below
consumer_key = "[Insert Consumer Key value]"
consumer_secret = "[Insert Consumer Secret value]"
access_token = "[Insert Access Token value]"
access_token_secret = "[Insert Access Token Secret value]"
authentication_info = tweepy.OAuthHandler(consumer_key, consumer_secret)
authentication_info.set_access_token(access_token, access_token_secret)
twitter_api = tweepy.API(authentication_info)
# In Tweepy 4.0 and later, this method is named search_tweets
spacex_tweets = twitter_api.search("#spacex")
for tweet in spacex_tweets:
    print(tweet.text)
    analysis = TextBlob(tweet.text)
    print(analysis.sentiment)
Artificially Intelligent