Page 10 - MSDN Magazine, May 2018

Artificially Intelligent
Text Sentiment Analysis
FRANK LA VIGNE
One of the truisms of the modern data-driven world is that the velocity and volume of data keep increasing. We’re seeing more data generated each day than ever before in human history. And nowhere is this rapid growth more evident than in the world of social media, where users generate content at a scale previously unimaginable. Twitter users, for example, collectively send out approximately 6,000 tweets every second, according to tracking site Internet Live Stats (internetlivestats.com/twitter-statistics). At that rate, there are about 350,000 tweets sent per minute, 500 million tweets per day, and about 200 billion tweets per year. Keeping up with this data stream to evaluate content would be impossible even for the largest teams; you just couldn’t hire enough people to scan Twitter to evaluate the sentiment of its user base at any given moment.
Fortunately, the use case for analyzing every tweet would be an extreme edge case. There are, however, valid business motives for tracking sentiment, be it against a specific topic, search term or hashtag. While this narrows the number of tweets to analyze significantly, the sheer volume of the data still makes it impractical to analyze the sentiments of the tweets manually in any meaningful way.
Thankfully, analyzing the overall sentiment of text is a process that can easily be automated through sentiment analysis. Sentiment analysis is the process of computationally classifying and categorizing opinions expressed in text to determine whether the attitude expressed within demonstrates a positive, negative or neutral tone. In short, the process can be automated and distilled to a mathematical score indicating tone and subjectivity.
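To make the idea of distilling text to a score concrete, here is a minimal sketch of a word-list scorer in plain Python. It is an illustration only and not the method used by any particular library; the TOY_LEXICON words, their weights and the toy_polarity function are all made up for this example.

```python
import re

# Hypothetical toy lexicon: word -> polarity in [-1.0, 1.0].
# The words and weights are invented for illustration only.
TOY_LEXICON = {
    "good": 0.7,
    "great": 0.8,
    "love": 0.5,
    "bad": -0.7,
    "hate": -0.8,
    "terrible": -1.0,
}


def toy_polarity(text):
    """Average the polarity of every word found in the toy lexicon.

    Returns 0.0 (neutral) when no word in the text is in the lexicon.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[word] for word in words if word in TOY_LEXICON]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for phrase in ("Today is a good day", "I hate traffic", "The sky is blue"):
        print(phrase, "->", toy_polarity(phrase))
```

A scorer this simple ignores negation, sarcasm and context, which is part of why a purpose-built library is the more practical choice.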
Setting Up an Azure Notebook
In February (msdn.com/magazine/mt829269), I covered in detail Jupyter notebooks and the environments in which they can run. While any Python 3 environment can run the code in this article, for the sake of simplicity, I’ll use Azure Notebooks. Browse over to the Azure Notebooks service Web site at notebooks.azure.com and sign in with your Microsoft ID credentials. Create a new Library with the name Artificially Intelligent. Under the Library ID field enter “ArtificiallyIntelligent” and click Create. On the following page, click on New to create a new notebook. Enter a name in the Item Name textbox, choose Python 3.6 Notebook from the Item type dropdown list and click New (Figure 1).
Click on the newly created notebook and wait for the service to connect to a kernel.
Figure 1 Creating a New Notebook with a Python 3.6 Kernel
Sentiment Analysis in Python
Once the notebook is ready, enter the following code in the empty cell and run the code in the cell.
from textblob import TextBlob
simple_text = TextBlob("Today is a good day for a picnic.")
print(simple_text.sentiment)
The results that appear will resemble the following:
Sentiment(polarity=0.7, subjectivity=0.6000000000000001)
Polarity rates the tone of the input text on a scale from -1 to +1, with -1 being the most negative and +1 being the most positive. Subjectivity rates how subjective the statement is on a scale from 0 to 1, with 1 being highly subjective. With just three lines of code, I could analyze not just the sentiment of a fragment of text, but also its subjectivity. How did something like sentiment analysis, once considered complicated, become so seemingly simple?
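One way to read these two numbers is to map them onto plain labels. The sketch below does that with a stand-in Sentiment tuple, so it runs without TextBlob installed. The describe function and the NEUTRAL_BAND and SUBJECTIVE_CUTOFF thresholds are assumptions chosen for illustration, not values defined by TextBlob.

```python
from collections import namedtuple

# Stand-in for a (polarity, subjectivity) result, so this sketch runs
# without TextBlob installed.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

# Assumed cutoffs, chosen only for illustration.
NEUTRAL_BAND = 0.1        # polarity within +/- this value counts as neutral
SUBJECTIVE_CUTOFF = 0.5   # subjectivity at or above this counts as subjective


def describe(sentiment):
    """Turn a polarity/subjectivity pair into a short readable label."""
    if not -1.0 <= sentiment.polarity <= 1.0:
        raise ValueError("polarity must be between -1 and +1")
    if not 0.0 <= sentiment.subjectivity <= 1.0:
        raise ValueError("subjectivity must be between 0 and 1")

    if sentiment.polarity > NEUTRAL_BAND:
        tone = "positive"
    elif sentiment.polarity < -NEUTRAL_BAND:
        tone = "negative"
    else:
        tone = "neutral"

    if sentiment.subjectivity >= SUBJECTIVE_CUTOFF:
        leaning = "subjective"
    else:
        leaning = "objective"
    return f"{tone}, {leaning}"


if __name__ == "__main__":
    # The scores reported above for the picnic sentence.
    print(describe(Sentiment(polarity=0.7, subjectivity=0.6000000000000001)))
```

Because describe only reads the polarity and subjectivity attributes, it should also accept the value of a TextBlob object's sentiment property directly.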
Python enjoys a thriving ecosystem, particularly in regard to machine learning and natural language processing (NLP). The code snippet above relies on the TextBlob library (textblob.readthedocs.io/en/dev). TextBlob is an open source library for processing textual data, providing a simple API for diving into common NLP tasks. These tasks include sentiment analysis and much more.
In the blank cell below the results, enter the following code and execute it:
simple_text = TextBlob("the sky is blue.")
print(simple_text.sentiment)
The results state that the phrase “the sky is blue” has a polarity of 0.0 and a subjectivity of 0.1. This means that the text is neutral in tone and scores low in subjectivity. In the blank cell immediately underneath the results, enter the following code and execute the cell:
simple_text1 = TextBlob("I hate snowstorms.")
print(simple_text1.sentiment)
simple_text2 = TextBlob("Bacon is my favorite!")
print(simple_text2.sentiment)
Code download available at bit.ly/2pPFIMM.