Page 28 - MSDN Magazine, November 2017
As you can see, the key phrases endpoint has detected appropriate phrases from the example review.
Now let’s see where this review falls on a scale of positivity. I’ll run the same text through the sentiment endpoint and see what comes back. Here’s my curl command:
curl -v --silent -X POST "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment" \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: my-trial-key" --data-ascii '{
"documents": [
{ "language": "en",
"id": "1",
"text": "This phone has a great battery. The display is sharp and bright. But the store does not have the apps I need."
}
] }'
And the result this time is:
{
  "documents": [
    {
      "score": 0.770478801630976,
      "id": "1"
    }
  ],
  "errors": []
}
The outcome is simpler this time: the sentiment score for the review is 0.77, which on a scale of 0 to 1 corresponds to 77 percent. This denotes a mostly positive sentiment, which you can also infer from reading the text.
Now that I’ve run this text review through all the available operations, I’ll combine them to see the overall result:
• Text: This phone has a great battery. The display is sharp and bright. But the store does not have the apps I need.
• Language: English
• Key Phrases: phone, great battery, display, store, apps
• Sentiment: 77 percent
This experiment demonstrates what the Text Analytics Cognitive Service can do for text reviews, survey responses and customer input. However, combined with the Linguistic Analysis services, I can distill even deeper insights.
A Look at Linguistic Analysis: What’s the Difference?
The Text Analytics API uses a pre-trained model to analyze text and detect information such as language, key phrases and sentiment. In contrast, the Linguistic Analysis API uses advanced linguistic analysis tools to process text inputs and lets you understand the structure of a sentence. This understanding can then be used to mine information from text records, interpret user commands and process free-form text from any source.
I’m not a linguist, so I’ll leave the job of explaining the ins and outs of this service to someone who is. In this article, I’ll just cover the basics of this service and then come back to the original intent of this article, which is to generate deeper, more meaningful insights from text inputs.
The Linguistic Analysis API uses the concept of analyzers to understand the structure of text records. Currently, three kinds are supported:
• Tokens
• POS Tags
• Constituency Tree
These analyzers come from the Penn Treebank, which is the annotated, parsed text corpus that allows the Linguistic Analysis API to understand whether a given word in a text input is a noun or a verb. For example, “I love Bing!” and “Let me bing this for you” use the word Bing in different capacities: as a noun in the first and as a verb in the second.
Let’s use this example to understand how Linguistic Analysis works its magic. Just as with Text Analytics, you’ll need a trial key if you don’t have an Azure subscription.
Once you have the key, just fire up your tool of choice to send a request to the Linguistic Analysis API. There are two operations available with this API:
• List analyzers returns the list of analyzers available to parse the text into Tokens, POS Tags and a Constituency Tree.
• Analyze text parses the text inputs you provide using the analyzers you supply in the request.
Figure 3 shows what a simple GET request to the List Analyzer endpoint returns.
I’ll use the previously mentioned analyzers to parse this text: “I Love Bing! Let me Bing this for you,” formatting the request body as follows:
Figure 3 The Result of a Simple GET Request to the List Analyzer Endpoint
[
  {
    "id": "4fa79af1-f22c-408d-98bb-b7d7aeef7f04",
    "languages": [ "en" ],
    "kind": "POS_Tags",
    "specification": "PennTreebank3",
    "implementation": "cmm"
  },
  {
    "id": "22a6b758-420f-4745-8a3c-46835a67c0d2",
    "languages": [ "en" ],
    "kind": "Constituency_Tree",
    "specification": "PennTreebank3",
    "implementation": "SplitMerge"
  },
  {
    "id": "08ea174b-bfdb-4e64-987e-602f85da7f72",
    "languages": [ "en" ],
    "kind": "Tokens",
    "specification": "PennTreebank3",
    "implementation": "regexes"
  }
]
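The List analyzers response in Figure 3 supplies the IDs that an Analyze text request has to name. The following is a minimal Python sketch of that step, not the article's own listing: it reads the Figure 3 JSON, maps each analyzer kind to its ID, and assembles a request body. The field names language, analyzerIds and text are assumptions about the request format, and the helper names analyzer_ids_by_kind and build_analyze_body are hypothetical.

```python
import json

# Copy of the List analyzers response shown in Figure 3.
FIGURE_3_RESPONSE = """
[
  {"id": "4fa79af1-f22c-408d-98bb-b7d7aeef7f04", "languages": ["en"],
   "kind": "POS_Tags", "specification": "PennTreebank3", "implementation": "cmm"},
  {"id": "22a6b758-420f-4745-8a3c-46835a67c0d2", "languages": ["en"],
   "kind": "Constituency_Tree", "specification": "PennTreebank3", "implementation": "SplitMerge"},
  {"id": "08ea174b-bfdb-4e64-987e-602f85da7f72", "languages": ["en"],
   "kind": "Tokens", "specification": "PennTreebank3", "implementation": "regexes"}
]
"""


def analyzer_ids_by_kind(response_text):
    """Map each analyzer kind (for example "Tokens") to its ID."""
    return {entry["kind"]: entry["id"] for entry in json.loads(response_text)}


def build_analyze_body(text, analyzer_ids, language="en"):
    """Assemble a JSON request body.

    The field names used here are assumed, not taken from the article.
    """
    return json.dumps({
        "language": language,
        "analyzerIds": list(analyzer_ids),
        "text": text,
    })


ids = analyzer_ids_by_kind(FIGURE_3_RESPONSE)
body = build_analyze_body("I love Bing! Let me bing this for you", ids.values())
print(body)
```

Looking the IDs up by kind, rather than pasting them in by hand, keeps the request valid if the service ever issues new IDs for the same analyzers.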