Sentiment Analysis

Connexun | news api
7 min read · Jan 12, 2022

Knowing public sentiment on various topics can help us make informed and profitable business decisions. It helps us gauge consumers’ satisfaction with, or resentment towards, company policies. Usually, sentiment analysis is performed on an entire document. But a document as a whole usually covers multiple topics or “entities”. It is the sentiment towards these individual entities that helps us set up a feedback system, allowing the business to work towards improving the user experience.

Social media is the primary source of user feedback about a service. Now, consider a company such as Google or Amazon that has a very large user base. In such a case, collecting user reviews manually, understanding them, and then categorizing them is a very challenging task. Instead, specialized applications collect entity-related opinions from social media and feed them to a sentiment language model that outputs the probability of each review being positive, neutral, or negative.

Even product-based companies can use sentiment analysis in a similar way to find out how well their products or services sit with users, especially after major policy changes.

Simple classification of text into emotional categories is largely solved thanks to recent progress in ML. However, classical sentiment analysis only reports the average tonality of a text rather than a detailed picture at the level of entities and words. For instance, how would you classify a sentence such as “I hate coffee on this sunny morning”? Positive because the morning is sunny, negative because I hate coffee, or maybe neutral on average? In fact, the sentiment very much depends on the aspect, that is, the thing or person for which we define a sentiment polarity.

Usage Example

Consider PC reviews on a website such as Amazon. In this case, the seller would like to know not only the overall opinion but also opinions on specific features such as battery life, sound quality, storage space, etc. This also helps prospective buyers learn about product specifics in depth. In such a case, we need to extract the topic-specific parts of user reviews. These topics are known as ‘entities’ or ‘aspects’. So, given a review, we first try to identify the aspects or entities and then classify the aspect-specific opinion as positive, negative, or neutral.

A single review often mentions many entities. To perform entity sentiment analysis, we first need to identify the entities and then determine the sentiment associated with each of them.
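The two steps described above (find the aspects, then classify the sentiment attached to each) can be illustrated with a deliberately simple toy. This is not Connexun's model: the aspect keywords, the tiny polarity lexicon, and the clause-splitting rule are all hypothetical and chosen only to show the shape of the pipeline.

```python
import re

# Hypothetical aspect vocabulary and polarity lexicon, for illustration only.
ASPECTS = {"battery", "sound", "storage", "screen"}
POLARITY = {"great": 1, "excellent": 1, "good": 1,
            "poor": -1, "terrible": -1, "bad": -1}


def aspect_sentiments(review):
    """Return {aspect: label} using a naive clause-level lexicon lookup."""
    results = {}
    # Split into clauses so that each opinion word stays close to its aspect.
    for clause in re.split(r"[.,;!?]|\bbut\b|\band\b", review.lower()):
        words = re.findall(r"[a-z']+", clause)
        # Step 1: identify the aspects mentioned in this clause.
        found = [w for w in words if w in ASPECTS]
        # Step 2: score the clause and attach the label to each aspect found.
        score = sum(POLARITY.get(w, 0) for w in words)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for aspect in found:
            results[aspect] = label
    return results


review = "The battery life is great, but the sound quality is poor. Storage is 512 GB."
print(aspect_sentiments(review))
# {'battery': 'positive', 'sound': 'negative', 'storage': 'neutral'}
```

A whole-document classifier would have to settle on one label for this review, while the per-aspect view keeps the praise for the battery and the complaint about the sound separate.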

Our Sentiment Analysis Models

We provide the following three models for sentiment analysis, which differ in the type of word vectorization and classifier used.

  1. Our model for evaluating the sentiment of English-language news is based on TF-IDF vectorization of text. To deal with ambiguous word combinations like “not terrible” or “not bad”, we split the text into n-grams. A C-Support Vector Classifier is applied on top of these features. To train the model, we used a human-labeled dataset of news title sentiment created by the Connexun team.
  2. A multilingual sentiment model based on the pre-trained joeddav/xlm-roberta-large-xnli model. This model is the result of fine-tuning xlm-roberta-large on a combination of Natural Language Inference (NLI) data in many languages. It is widely used for zero-shot text classification. The NLI task determines whether two sentences are in entailment, in contradiction, or neutral with respect to each other.
  3. Sentiment analysis of entities with respect to their context relies on Aspect Based Sentiment Analysis. On top of this model, we trained a logistic regression classifier that provides the probabilities that a given named entity has positive or negative sentiment in a given sentence. The logistic regression was trained on a human-labeled training dataset created at Connexun.
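To see why n-grams help the first model with phrases like “not bad”, here is a minimal sketch of word n-gram extraction in plain Python. The function name `word_ngrams` and the choice of unigrams plus bigrams are assumptions for illustration; the n-gram range Connexun actually uses is not stated.

```python
def word_ngrams(text, n_min=1, n_max=2):
    """Return all word n-grams of length n_min..n_max, shortest first."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of n tokens across the sentence.
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


features = word_ngrams("The results were not bad")
print(features)
# ['the', 'results', 'were', 'not', 'bad',
#  'the results', 'results were', 'were not', 'not bad']
```

With unigrams alone, the classifier sees “not” and “bad” as separate negative-looking features. With bigrams, “not bad” becomes a feature of its own that can receive its own TF-IDF weight. In scikit-learn, the same idea is typically expressed with the `ngram_range` parameter of `TfidfVectorizer`, with `SVC` as the C-Support Vector Classifier.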

Application in various industries

Recommender systems for e-commerce

Recommender systems predict user preferences and recommend new products based on user history or on the interests of similar users. Suppose we are working with a retail store site and our user is looking for a new book. Our entity sentiment model assigns a sentiment score to each of the entities in a given review of a book. Our recommender system will recommend this book to the user if the entities have a good sentiment score and they match the user’s preferences, as inferred from their search history and previously purchased products.
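The rule described above (good entity sentiment, plus a match with the user's interests) can be sketched as a small function. Everything here is hypothetical: the function name, the 0.3 threshold, the averaging of matched scores, and the sample data are assumptions made for illustration, not a description of a production recommender.

```python
def should_recommend(entity_scores, user_interests, min_score=0.3):
    """Recommend when the entities matching the user's interests score well on average."""
    matched = [score for entity, score in entity_scores.items()
               if entity in user_interests]
    if not matched:
        return False  # nothing in the review relates to this user's interests
    return sum(matched) / len(matched) >= min_score


# Illustrative entity sentiment scores extracted from reviews of one book.
book_entities = {"plot": 0.8, "characters": 0.5, "ending": -0.4}

print(should_recommend(book_entities, {"plot", "characters"}))  # True
print(should_recommend(book_entities, {"ending"}))              # False
print(should_recommend(book_entities, {"cooking"}))             # False
```

Scoring only the entities the user cares about means the same book can be recommended to one reader and withheld from another, which is the benefit of entity-level sentiment over a single overall score.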

Bank Performance and News Sentiment

The movement of a firm’s stock price depends in part on overall market sentiment towards the firm. Thus, people who trade stocks also need to know the news sentiment about that particular firm at a given point in time. This can be done by monitoring public opinion about the company’s performance or its stock in real time. News articles and social media sites such as Twitter and Facebook can be a very good source for monitoring public views. A dataset of firm-related sentiments can be built periodically and used to compute the daily average news sentiment for the company. We can plot these values against the closing index prices to identify trends in sentiment that may help anticipate stock price movements.
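As a minimal sketch of the daily averaging step described above, the snippet below groups per-article sentiment scores by date and takes the mean for each day. The function name and the sample records are made up for illustration and are not real data.

```python
from collections import defaultdict
from datetime import date


def daily_average_sentiment(records):
    """Map each date to the mean sentiment score of that day's articles."""
    by_day = defaultdict(list)
    for day, score in records:
        by_day[day].append(score)
    # Sort by date so the result can be plotted directly as a time series.
    return {day: sum(scores) / len(scores)
            for day, scores in sorted(by_day.items())}


# Illustrative (date, sentiment score) pairs, one per news article.
articles = [
    (date(2022, 1, 10), 0.5),
    (date(2022, 1, 10), -0.25),
    (date(2022, 1, 11), -0.75),
    (date(2022, 1, 11), -0.25),
    (date(2022, 1, 11), 0.25),
]

for day, avg in daily_average_sentiment(articles).items():
    print(day, avg)
# 2022-01-10 0.125
# 2022-01-11 -0.25
```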

In our example, we plotted the change in daily closing index prices for UNICREDIT bank against the average daily sentiment. The average daily sentiment, which is calculated from news reports mentioning the bank, is a value between -1 and +1, where -1 is the maximum negative sentiment and +1 is the maximum positive sentiment. The Pearson correlation coefficient, which measures the strength of the linear relationship between two variables, is 0.575 in this case, suggesting a moderately strong positive relationship between index price changes and firm-news sentiment.
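For readers who want to reproduce this kind of analysis on their own data, here is a small sketch of the Pearson correlation coefficient in plain Python. The two series are invented for illustration; they are not the UNICREDIT data and will not reproduce the 0.575 figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: daily average sentiment in [-1, +1]
# and the corresponding daily change in the closing index price.
sentiment = [-0.4, -0.1, 0.2, 0.5, 0.1]
price_change = [-1.2, 0.3, 0.4, 1.5, -0.2]

print(round(pearson(sentiment, price_change), 3))
```

The result always lies between -1 and +1. Values near +1 mean the two series tend to rise and fall together, values near -1 mean they move in opposite directions, and values near 0 mean there is little linear relationship.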

Our Services

Connexun offers a range of API services that perform varied tasks. Our Text Analysis APIs perform text summarization, language detection, sentiment evaluation, etc. Our Sentiment Evaluation API takes as input a paragraph of 15 to 500 words and assigns it an overall sentiment label (positive, negative, or neutral) together with a sentiment score, a finite floating-point number that gives the confidence score for the sentiment label.
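Because the Sentiment Evaluation API expects between 15 and 500 words, it can be useful to check the length of a paragraph before sending it. The helper below is a hypothetical client-side check, not part of the Connexun API, and it assumes that words are counted by splitting on whitespace, which may differ from how the service counts them.

```python
def check_word_count(paragraph, min_words=15, max_words=500):
    """Return True when the paragraph length is within the stated input limits."""
    # Assumption: a "word" is any whitespace-separated token.
    return min_words <= len(paragraph.split()) <= max_words


print(check_word_count("Too short to analyse."))   # False
print(check_word_count("word " * 120))             # True
```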

In our example, the API assigns a negative label to the paragraph with a sentiment score of -0.43125, which tells us that it conveys a moderately negative emotion.

The Short Text Geoparser API takes as input a short sentence and returns a list of countries ranked according to their proximity within a semantic space constructed from the millions of world news articles in our archive.

For an input mentioning the ‘Leaning Tower of Pisa’, the API returns a list of countries with Italy having the highest score, because the tower is a monument in Italy. This can be of particular use when selecting potential markets for the expansion of a new product.

Check out our NEWS APIs, which give you not only the top news from all over the world but also local news, inter-country news, and country-specific news. One of them is the Topic Research API. Given a keyword or a set of keywords, this API returns articles related to those keywords. This is helpful not only for school and college students but also for research practitioners.

The InterCountry API lets you retrieve news relating to countries, along with the overall sentiment of each piece of news. Key entities in each piece of news are also listed for your reference. It also groups together articles covering the same piece of news.

Conclusion

Market sentiment is a powerful tool which, when harnessed in a timely manner, can help turn the tide for new and existing products alike. So now you know the what, why, and how of entity-based sentiment analysis. We hope you’ll be able to use our APIs to the fullest and gain an edge over competitors by tapping into public sentiment.

About Connexun

Connexun is an innovative tech startup based in Milano, Italy.

Connexun crawls news content from tens of thousands of open web sources worldwide, turning unstructured web content into machine-readable news data APIs. Its AI-powered news engine B.I.R.B.AL. empowers organizations to transform the world’s news into real-time business insight.

To learn more about Connexun, subscribe to our Medium blog, follow us on LinkedIn, Twitter, and Facebook, and visit our demos.
