Eduction Sentiment Analysis allows you to find whether text has positive, negative, or neutral sentiment. For example, you can use it to determine whether users of a particular product or service are satisfied, based on an automated analysis of reviews.
The following table lists the languages that support sentiment analysis, with the names of the standard sentiment grammar and the user modification file for each. Each of these languages also supports component extraction and user modification.
Language | Sentiment Grammar | User Modification File |
---|---|---|
Arabic | sentiment_ara.ecr | sentiment_user_ara.xml |
Chinese | sentiment_chi.ecr | sentiment_user_chi.xml |
Czech | sentiment_cze.ecr | sentiment_user_cze.xml |
Dutch | sentiment_dut.ecr | sentiment_user_dutch.xml |
English | sentiment_eng.ecr | sentiment_user_eng.xml |
French | sentiment_fre.ecr | sentiment_user_fre.xml |
German | sentiment_ger.ecr | sentiment_user_ger.xml |
Italian | sentiment_ita.ecr | sentiment_user_ita.xml |
Polish | sentiment_pol.ecr | sentiment_user_pol.xml |
Portuguese | sentiment_por.ecr | sentiment_user_por.xml |
Russian | sentiment_rus.ecr | sentiment_user_rus.xml |
Spanish | sentiment_spa.ecr | sentiment_user_spa.xml |
Turkish | sentiment_tur.ecr | sentiment_user_tur.xml |
Eduction matches input data to patterns defined with regular expressions (grammars).
The sentiment analysis grammar first defines dictionaries with the parts of speech. There are different dictionaries for positive and negative words, and other categories that describe different effects on the sentiment, where appropriate.
These dictionaries are combined to form simple phrases that convey positive or negative sentiments. Finally, these phrases are padded, usually with other phrases, to form various patterns for the final entities, which match strings from the text that express positive or negative sentiment.
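The layering described above (dictionaries, then simple phrases, then padded entities) can be illustrated with a small Python sketch. This is a toy built on ordinary regular expressions, not Eduction grammar syntax: the word lists, the `TOPIC` padding pattern, and the `find_sentiment` helper are all hypothetical and far simpler than the shipped grammars.

```python
import re

# Step 1: dictionaries (hypothetical toy word lists, not the shipped ones).
POSITIVE_WORDS = ["good", "great", "excellent"]
NEGATIVE_WORDS = ["bad", "poor", "terrible"]
INTENSIFIERS = ["very", "really", "extremely"]
NEGATORS = ["not", "never"]


def alternation(words):
    """Turn a word list into a non-capturing regex alternation."""
    return "(?:" + "|".join(re.escape(word) for word in words) + ")"


POS, NEG = alternation(POSITIVE_WORDS), alternation(NEGATIVE_WORDS)
INT, NOT = alternation(INTENSIFIERS), alternation(NEGATORS)

# Step 2: simple phrases. An optional intensifier precedes a sentiment
# word, and a negator flips the polarity of the word that follows it.
POSITIVE_PHRASE = rf"(?:(?:{INT}\s+)?{POS}|{NOT}\s+(?:{INT}\s+)?{NEG})"
NEGATIVE_PHRASE = rf"(?:(?:{INT}\s+)?{NEG}|{NOT}\s+(?:{INT}\s+)?{POS})"

# Step 3: pad the phrases to form the final entities. Here the padding is
# an optional "the <noun> is/was" lead-in (an assumption for illustration).
TOPIC = r"(?:the\s+\w+\s+(?:is|was)\s+)?"
ENTITIES = re.compile(
    rf"\b(?:(?P<negative>{TOPIC}{NEGATIVE_PHRASE})"
    rf"|(?P<positive>{TOPIC}{POSITIVE_PHRASE}))\b",
    re.IGNORECASE,
)


def find_sentiment(text):
    """Return (entity name, matched text) pairs in document order."""
    return [(m.lastgroup, m.group(0)) for m in ENTITIES.finditer(text)]


print(find_sentiment("The screen is very good but the battery was not good."))
# [('positive', 'The screen is very good'), ('negative', 'the battery was not good')]
```

The negative entity is tried first so that a negated positive word such as "not good" is not picked up as a bare positive match.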
The grammar files are designed to be used out of the box. You just need to load the appropriate grammar file, and optionally choose the entities (usually positive or negative) to match.
The sentiment grammar files have ‘lite’ counterparts. These can process data up to twice as fast compared to the full versions, depending on language. The ‘lite’ versions are identical to the full versions in most respects, but they do not support components or user modification. Micro Focus recommends that you use the ‘lite’ versions except in cases where you want to enable components or modify the built-in dictionaries.
The ‘lite’ versions are distinguished from the full versions by the addition of lite to the file name, preceded by an underscore. For example, the file name of the Chinese sentiment grammar file is sentiment_chi.ecr, and the file name of the ‘lite’ version is sentiment_chi_lite.ecr.
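As a minimal sketch of the naming convention and the recommendation above, the following Python helper picks a grammar file name from a language code. The function name and its parameters are hypothetical; only the file name pattern comes from this page.

```python
def sentiment_grammar_file(language_code,
                           needs_components=False,
                           needs_user_modification=False):
    """Pick the sentiment grammar file name for a three-letter language code.

    Hypothetical helper: the 'lite' grammar is chosen unless components or
    user modification are required, because only the full grammar supports them.
    """
    base = f"sentiment_{language_code}"
    if needs_components or needs_user_modification:
        return f"{base}.ecr"
    return f"{base}_lite.ecr"


print(sentiment_grammar_file("chi"))                         # sentiment_chi_lite.ecr
print(sentiment_grammar_file("chi", needs_components=True))  # sentiment_chi.ecr
```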
The grammar files generally contain sufficient information to work with a wide range of data, from formal reports to user reviews and social media feeds. However, the recall (the percentage of the matches that should be returned in theory that are actually returned) can be low for some input data. Also, the same text might convey a different sentiment depending on your viewpoint.
For example, the phrase Company A is much better than Company B might convey a positive or negative sentiment depending on whether you are with Company A or Company B.
In these situations, you can improve the recall or adjust the sentiment analysis by extending the grammar.
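To judge whether an extension is worthwhile, you can measure recall on a small hand-labeled sample. The following is a minimal sketch, assuming you have collected the expected matches and the matches actually returned as plain strings; the `recall` function and the sample data are hypothetical and not part of Eduction.

```python
def recall(returned_matches, expected_matches):
    """Fraction of the expected matches that were actually returned."""
    expected = set(expected_matches)
    if not expected:
        raise ValueError("expected_matches must not be empty")
    found = expected & set(returned_matches)
    return len(found) / len(expected)


# Hypothetical sample: four phrases should match, three of them are returned.
# The extra returned phrase does not affect recall.
expected = ["great service", "very slow", "not helpful", "love it"]
returned = ["great service", "very slow", "love it", "fine"]
print(recall(returned, expected))  # 0.75
```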
You can extend the grammar by adding to the appropriate dictionaries in the sentiment grammar file. For example, if you are on the side of Company A, you can add Company A to the positive list (for some of the languages).
NOTE: There are slight variations in the grammar files of different languages, so this does not apply to all languages.
For more information, see Extend Grammars.