Sentiment analysis

Sentiment analysis extracts information from written text in order to determine the attitude of the writer.

Short overview

Traditionally, texts are categorized either by analyzing them at document level or by digging deeper with aspect-based analysis. The former works well on large datasets of individual documents that each concentrate on a single subject, but becomes less accurate, or even useless, on texts that cover several subjects. The latter is better in many respects, and Pickmybrain's sentiment analysis plugin is partially based on a similar technology.

Why sentiment analysis works best with a search engine

Luckily for you, Pickmybrain combines the two techniques discussed above. When a user searches and chooses to sort results by positivity or negativity, each document is examined individually and given a numeric ratio that decides the weighting between the document's document-level sentiment score and its aspect-based sentiment score. In other words, being able to sort documents by relevance comes in handy even when we are actually sorting by positivity or negativity.
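One way to picture this weighting is as a weighted average of the two scores. The sketch below is illustrative only: the function name `blendSentiment`, its parameters, and the assumptions that both scores share one scale and that the ratio lies between 0 and 1 are hypothetical, not taken from Pickmybrain itself.

```php
<?php
/**
 * Hypothetical helper: blend a document-level score with an aspect-based score.
 * A $ratio of 1.0 uses only the document-level score, 0.0 only the aspect-based one.
 */
function blendSentiment(float $documentScore, float $aspectScore, float $ratio): float
{
    if ($ratio < 0.0 || $ratio > 1.0) {
        throw new InvalidArgumentException('ratio must be between 0 and 1');
    }
    // Weighted average of the two scores.
    return $ratio * $documentScore + (1.0 - $ratio) * $aspectScore;
}

// Made-up scores: a mildly positive document with a strongly positive aspect match.
$blended = blendSentiment(0.2, 0.8, 0.5);
printf("blended score: %.2f\n", $blended);
```

With a ratio of 0.5 both scores count equally. Moving the ratio towards 0 lets the aspect-based score dominate, and moving it towards 1 favours the document-level score.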

... But it is not set in stone

Like many of the other features, the sentiment search mode can be fine-tuned. Users can choose to use only document-level scoring, only aspect-based scoring, or a mixture of the two with a predefined weighting ratio. See the API documentation for details.

Frequently asked questions

Q: How fast is the sentiment analysis plugin?

A: In our own test scenario, the plugin analysed a group of 100,000 messages (~30.4 MB) in 32 seconds, which is about 3,125 messages per second. This was measured with PHP 7 and a single process. The Pickmybrain search engine supports multiprocessing, and performance scales up pretty well with additional processes!
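To get a comparable figure for your own data, you could time the scoring loop yourself. The sketch below is a rough, hypothetical benchmark helper: `messagesPerSecond` and the stand-in scorer are not part of the library, and the stand-in only exists so that the example runs without the plugin installed. In practice you would pass a callable that wraps the real scoring call instead.

```php
<?php
/**
 * Hypothetical benchmark helper: returns the number of messages scored per second.
 * $scorer is any callable that accepts a string, e.g. [$pmbsentiment, 'score'].
 */
function messagesPerSecond(callable $scorer, array $messages): float
{
    $start = microtime(true);
    foreach ($messages as $message) {
        $scorer($message);
    }
    $elapsed = microtime(true) - $start;
    // Guard against a zero reading on very small inputs.
    return count($messages) / max($elapsed, 1e-9);
}

// Stand-in scorer (an assumption) that returns fixed, made-up scores.
$dummyScorer = function (string $text): array {
    return ['pos' => 0.0, 'neu' => 1.0, 'neg' => 0.0];
};

$messages = array_fill(0, 1000, 'Broccoli is delicious and good for your health.');
printf("%.0f messages per second\n", messagesPerSecond($dummyScorer, $messages));
```

Results depend heavily on message length, hardware and PHP version, so treat any single measurement as indicative only.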

Q: Do you ever update the sentiment analysis plugin?

A: Yes! We will provide updates for each language pack from time to time. You don't pay anything for these updates.

Q: Do I have to use the sentiment analysis library with the Pickmybrain search engine?

A: Absolutely not! You can also use it as a stand-alone library for document scoring and classification.

# include the pmbsentiment library
# (the file name below is an assumption: adjust the path to match your installation)
require_once "PMBSentiment.php";

# choose a language pack
# $lang = "fi"; # Finnish language
$lang = "en"; # English language
$pmbsentiment = new PMBSentiment($lang);

# provide some text for analysis
$scores = $pmbsentiment->score("Broccoli is delicious and good for your health.");

# the $scores variable now holds an associative array containing
# the sentiment scores for each class (pos, neu, neg)
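
For classification, one simple approach is to pick the class with the highest score from that array. The sketch below assumes the array maps the class names (pos, neu, neg) to numeric scores where a larger value means a stronger match. The helper name `dominantClass` and the sample values are hypothetical, not library output.

```php
<?php
/**
 * Hypothetical helper: return the key of the highest-scoring class.
 * Assumes $scores maps class names to numeric scores (larger = stronger).
 */
function dominantClass(array $scores): string
{
    if ($scores === []) {
        throw new InvalidArgumentException('scores must not be empty');
    }
    // array_keys() with a search value returns every key holding the maximum;
    // on a tie, the first such key wins.
    return (string) array_keys($scores, max($scores))[0];
}

// Made-up values for illustration only.
$scores = ['pos' => 0.71, 'neu' => 0.22, 'neg' => 0.07];
echo dominantClass($scores), PHP_EOL; // prints "pos"
```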