Sentiment Analysis
In the previous article, we saw how a judge LLM can be used to evaluate the behaviour of an LLM. By reasoning from first principles, we concluded that the delta in user satisfaction between the beginning and the end of a conversation is the best metric to evaluate the quality of our agent. But how do we extract this metric in the first place? This is where sentiment analysis comes into play. Luckily for us, this is not the first time that customers’ degree of satisfaction has had to be extracted from text. Marketplaces, for one, have gathered a tremendous number of comments from their customers and have long been developing tools to extract a quantitative metric from these complex qualitative inputs. Apart from this, sentiment analysis is leveraged for stock price prediction, spam detection and more. Nowadays, sentiment analysis is ubiquitous, and the tools are mature and readily available for all kinds of usage. We need to understand a few concepts in order to choose which tool to use in our specific use case of agent evaluation.
Polarity, Scales, Neutrality and Objectivity
The earliest sentiment analysis systems simply focused on figuring out the polarity of a text: negative or positive. They did so, for instance, by filtering out the neutral or objective statements (“my product is broken” is negative, but it is also an objective fact and doesn’t necessarily indicate a negative feeling from the customer) and focusing on words or expressions that displayed a polarised sentiment. More modern approaches are more refined and place the text input on continuous scales (0 to 1, -10 to +10…). These scales can also be adapted to target precise aspects, such as the detection of a specific emotion. In the previous article, we identified a strategic feature, customer satisfaction. Even though other aspects could be extracted from the conversation, such as the speed of resolution of the problems, the goal is to keep our tool as simple and focused as possible, so that a single metric is enough to condense all the relevant information. In the case of a customer support agent, we want to filter out the objective and neutral statements before extracting the customers’ feelings.
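To make the idea concrete, here is a minimal sketch of how a satisfaction delta could be computed once each customer message has been scored. It assumes an upstream model already provides a polarity on a -1 to +1 scale and a subjectivity on a 0 to 1 scale; the names `ScoredMessage`, `subjective_only` and `satisfaction_delta`, the 0.5 threshold and the example scores are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredMessage:
    text: str
    polarity: float      # assumed scale: -1.0 (very negative) to +1.0 (very positive)
    subjectivity: float  # assumed scale: 0.0 (objective fact) to 1.0 (pure opinion)


def subjective_only(messages: List[ScoredMessage],
                    min_subjectivity: float = 0.5) -> List[ScoredMessage]:
    """Drop objective or neutral statements, keep the ones expressing a feeling."""
    return [m for m in messages if m.subjectivity >= min_subjectivity]


def satisfaction_delta(messages: List[ScoredMessage],
                       min_subjectivity: float = 0.5) -> Optional[float]:
    """Polarity of the last subjective message minus that of the first one.

    Returns None when fewer than two subjective messages are available,
    since no change can be measured in that case.
    """
    kept = subjective_only(messages, min_subjectivity)
    if len(kept) < 2:
        return None
    return kept[-1].polarity - kept[0].polarity


# Hand-written scores, for illustration only (not produced by a real model).
conversation = [
    ScoredMessage("My product is broken.", -0.6, 0.1),            # objective, filtered out
    ScoredMessage("I am really annoyed about this.", -0.5, 0.9),
    ScoredMessage("The replacement ships on Monday.", 0.0, 0.1),  # objective, filtered out
    ScoredMessage("Great, thank you so much!", 0.75, 0.9),
]

delta = satisfaction_delta(conversation)
print(delta)
```

A positive delta suggests the customer left the conversation more satisfied than they entered it. Comparing only the first and last subjective messages is the simplest possible choice; averaging over the first and last few messages would be a reasonable alternative when conversations are long or noisy.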
Extraction Process
Historically, feature extraction was done by searching for keywords and syntactic structures that were either hand-picked or automatically extracted through unsupervised learning. Such a technique is limited by the coverage of its training dataset. If the dataset did not contain Ukrainian data, then no matter how performant the algorithm, the model cannot be expected to succeed on Ukrainian input. On the other hand, using an LLM directly could be overkill and costly, depending on the size of the dataset to analyze. A solution could be to use an English model able to extract subjective statements in English, on top of a translation layer that turns the statements from Ukrainian into English. Both tasks, translation and sentiment analysis proper, can be handled with the Transformers library and carefully selected models from Hugging Face, provided that enough RAM is available.
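The two-stage approach described above could be sketched as follows. This is an illustration under stated assumptions, not a recommended configuration: the model names `Helsinki-NLP/opus-mt-uk-en` and `distilbert-base-uncased-finetuned-sst-2-english` are example choices that are not named in this article, and the helper names (`load_hf_components`, `signed_score`, `analyze`) are hypothetical. The sentiment model assumed here is binary (POSITIVE or NEGATIVE), so it does not filter neutral statements by itself.

```python
from typing import Callable, List, Tuple

Translator = Callable[[str], str]
Classifier = Callable[[str], Tuple[str, float]]  # returns (label, confidence)


def load_hf_components(
    translation_model: str = "Helsinki-NLP/opus-mt-uk-en",  # assumed model choice
    sentiment_model: str = "distilbert-base-uncased-finetuned-sst-2-english",  # assumed
) -> Tuple[Translator, Classifier]:
    """Build the translation and sentiment steps with Transformers pipelines.

    Models are downloaded on first use, which needs disk space and RAM.
    """
    from transformers import pipeline  # imported lazily so the rest runs without it

    translator = pipeline("translation", model=translation_model)
    classifier = pipeline("sentiment-analysis", model=sentiment_model)

    def translate(text: str) -> str:
        return translator(text)[0]["translation_text"]

    def classify(text: str) -> Tuple[str, float]:
        result = classifier(text)[0]
        return result["label"], float(result["score"])

    return translate, classify


def signed_score(label: str, confidence: float) -> float:
    """Map a binary label and its confidence to a signed score in [-1, 1]."""
    return confidence if label.upper() == "POSITIVE" else -confidence


def analyze(texts: List[str], translate: Translator,
            classify: Classifier) -> List[Tuple[str, float]]:
    """Translate each text to English, then score the English version."""
    results = []
    for text in texts:
        english = translate(text)
        label, confidence = classify(english)
        results.append((english, signed_score(label, confidence)))
    return results


# Usage (downloads the models on first call):
# translate, classify = load_hf_components()
# print(analyze(["Дякую, все працює!"], translate, classify))
```

Passing the translation and classification steps in as plain functions keeps the two models swappable, and lets the pipeline be exercised with lightweight stand-ins before committing to a download.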