Page 30 - MSDN Magazine, November 2017
As you can see, the POS Tags analyzer returns just a tag for each word in the text input, while the Constituency Tree analyzer returns the tree structure of the input, marked up with both tags and words. The Tokens analyzer returns the most readable result: it lists each word in the input along with its position in the record.
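To make the Tokens output concrete, here is a minimal Python sketch that uses the Offset and Len fields to slice the input text back into sentences. The tokens_result literal is abridged by hand from the Tokens entry in Figure 4 (the per-word Tokens lists are omitted), and sentence_spans is a hypothetical helper name, not part of the API.

```python
def sentence_spans(text, tokens_result):
    """Return one substring per sentence record in a Tokens analyzer result."""
    return [
        text[record["Offset"]: record["Offset"] + record["Len"]]
        for record in tokens_result
    ]


text = "I love Bing! Let me bing this for you"

# Sentence-level records abridged from Figure 4; word-level tokens omitted.
tokens_result = [
    {"Len": 12, "Offset": 0},
    {"Len": 24, "Offset": 13},
]

for sentence in sentence_spans(text, tokens_result):
    print(sentence)
# I love Bing!
# Let me bing this for you
```

Because the offsets index into the original string, the same slicing works for the word-level entries inside each record's Tokens list.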
For this article, I’ll be using the Constituency Tree and Tokens analyzers to break text reviews into separate records, based on sentence-separation information and conjunction words.
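One way to picture a conjunction-based split is sketched below in Python. This is not the code used in this article: tree_leaves and split_on_conjunctions are hypothetical helpers, and the sketch assumes conjunctions carry the Penn Treebank CC tag. The tree string is taken from Figure 4; the tagged word list is hand-written for illustration and is not real API output.

```python
import re

# Matches a leaf such as "(PRP I)": a tag followed by a single word.
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")


def tree_leaves(tree):
    """Return (tag, word) pairs for every leaf in a bracketed parse tree."""
    return LEAF.findall(tree)


def split_on_conjunctions(leaves, conjunction_tag="CC"):
    """Split (tag, word) pairs into word groups at each conjunction tag."""
    groups, current = [], []
    for tag, word in leaves:
        if tag == conjunction_tag:
            # A conjunction closes the current group and is itself dropped.
            if current:
                groups.append(current)
            current = []
        else:
            current.append(word)
    if current:
        groups.append(current)
    return groups


# Tree string copied from Figure 4.
tree = "(TOP (S (NP (PRP I)) (VP (VBP love) (NNP Bing)) (. !)))"
print(tree_leaves(tree))

# Hand-written (tag, word) pairs, for illustration only.
sample = [("NN", "display"), ("VBZ", "is"), ("JJ", "sharp"),
          ("CC", "but"), ("NN", "store"), ("VBZ", "lacks"), ("NNS", "apps")]
print(split_on_conjunctions(sample))
```

Splitting on every CC tag is deliberately naive: a conjunction that joins two adjectives would also trigger a split, so a real implementation would need to look at where the conjunction sits in the tree.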
If you’d like to read more about the Linguistic Analysis API and related concepts, I encourage you to read the complete API documentation available at bit.ly/2eTc2Nj.
Now let’s go back to the original example used for the Text Analytics API: “This phone has a great battery. The display is sharp and bright but the store does not have the apps I need.”
There’s a subtle change in the example this time, as I’ve removed the period at the end of the second sentence of the original example.
Here’s my curl command:
curl -v --silent -X POST "https://westus.api.cognitive.microsoft.com/linguistics/v1.0/analyze" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: my-trial-key" \
  --data-ascii '{
    "language" : "en",
    "analyzerIds" : ["4fa79af1-f22c-408d-98bb-b7d7aeef7f04",
                     "22a6b758-420f-4745-8a3c-46835a67c0d2",
                     "08ea174b-bfdb-4e64-987e-602f85da7f72"],
    "text" : "I love Bing! Let me bing this for you"
  }'
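If you’d rather issue the call from code, the same request can be sketched in Python using only the standard library. Treat this as a rough equivalent rather than an official client: build_analyze_request is a hypothetical helper name, my-trial-key is a placeholder, and the endpoint, headers and analyzer IDs are simply copied from the curl command.

```python
import json
import urllib.request

# Endpoint and analyzer IDs copied from the curl command in the article.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/linguistics/v1.0/analyze"
ANALYZER_IDS = [
    "4fa79af1-f22c-408d-98bb-b7d7aeef7f04",
    "22a6b758-420f-4745-8a3c-46835a67c0d2",
    "08ea174b-bfdb-4e64-987e-602f85da7f72",
]


def build_analyze_request(text, subscription_key):
    """Hypothetical helper: build (but do not send) the POST request."""
    body = json.dumps(
        {"language": "en", "analyzerIds": ANALYZER_IDS, "text": text}
    ).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


request = build_analyze_request(
    "I love Bing! Let me bing this for you", "my-trial-key"
)
# Sending it would look like this (needs a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     analysis = json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending a call against the trial key.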
The response to this request is shown in Figure 4.
The result is grouped by analyzer, with one entry for each analyzer ID sent in the request: POS Tags, Constituency Tree and Tokens in this example.
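Because each entry in the response carries the analyzerId it belongs to, the array can be turned into a lookup table. The Python sketch below assumes the response has already been decoded from JSON. ANALYZER_NAMES and results_by_analyzer are hypothetical names, the pairing of IDs to analyzer names is inferred from the shape of each result in Figure 4, and the sample response is abridged by hand from that figure.

```python
# Friendly names for the three analyzer IDs used in the request
# (pairing inferred from the shape of each result in Figure 4).
ANALYZER_NAMES = {
    "4fa79af1-f22c-408d-98bb-b7d7aeef7f04": "POS Tags",
    "22a6b758-420f-4745-8a3c-46835a67c0d2": "Constituency Tree",
    "08ea174b-bfdb-4e64-987e-602f85da7f72": "Tokens",
}


def results_by_analyzer(response):
    """Map each analyzer's friendly name (or raw ID) to its result."""
    return {
        ANALYZER_NAMES.get(entry["analyzerId"], entry["analyzerId"]): entry["result"]
        for entry in response
    }


# Abridged from Figure 4: only the first sentence of each result is kept.
response = [
    {"analyzerId": "4fa79af1-f22c-408d-98bb-b7d7aeef7f04",
     "result": [["PRP", "VBP", "VBG", "."]]},
    {"analyzerId": "22a6b758-420f-4745-8a3c-46835a67c0d2",
     "result": ["(TOP (S (NP (PRP I)) (VP (VBP love) (NNP Bing)) (. !)))"]},
    {"analyzerId": "08ea174b-bfdb-4e64-987e-602f85da7f72",
     "result": [{"Len": 12, "Offset": 0}]},
]

grouped = results_by_analyzer(response)
print(grouped["POS Tags"][0])
```

An unknown analyzer ID falls back to the raw ID as its key, so nothing in the response is silently dropped.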
Figure 4 Parsing the Example Text
[
  {
    "analyzerId": "4fa79af1-f22c-408d-98bb-b7d7aeef7f04",
    "result": [
      ["PRP", "VBP", "VBG", "."],
      ["VB", "PRP", "JJ", "DT", "IN", "PRP"]
    ]
  },
  {
    "analyzerId": "22a6b758-420f-4745-8a3c-46835a67c0d2",
    "result": [
      "(TOP (S (NP (PRP I)) (VP (VBP love) (NNP Bing)) (. !)))",
      "(VP (VBD Let) (S (NP (PRP me)) (VP (VBG bing) (NP (DT this)) (PP (IN for) (NP (PRP you))))))"
    ]
  },
  {
    "analyzerId": "08ea174b-bfdb-4e64-987e-602f85da7f72",
    "result": [
      {
        "Len": 12, "Offset": 0, "Tokens": [
          { "Len": 1, "NormalizedToken": "I", "Offset": 0, "RawToken": "I" },
          { "Len": 4, "NormalizedToken": "love", "Offset": 2, "RawToken": "love" },
          { "Len": 4, "NormalizedToken": "Bing", "Offset": 7, "RawToken": "Bing" },
          { "Len": 1, "NormalizedToken": "!", "Offset": 11, "RawToken": "!" }
        ]
      },
      {
        "Len": 24, "Offset": 13, "Tokens": [
          { "Len": 3, "NormalizedToken": "Let", "Offset": 13, "RawToken": "Let" },
          { "Len": 2, "NormalizedToken": "me", "Offset": 17, "RawToken": "me" },
          { "Len": 4, "NormalizedToken": "bing", "Offset": 20, "RawToken": "bing" },
          { "Len": 4, "NormalizedToken": "this", "Offset": 25, "RawToken": "this" },
          { "Len": 3, "NormalizedToken": "for", "Offset": 30, "RawToken": "for" },
          { "Len": 3, "NormalizedToken": "you", "Offset": 34, "RawToken": "you" }
        ]
      }
    ]
  }
]