Sentiment analysis is the art of training an algorithm to classify text as positive or negative. Follow along to build a basic sentiment analyser trained on Twitter data.
We need the textblob Python package for this, which can be installed by executing:
pip install textblob
Then run the following to download the NLTK corpora that textblob needs:
python -m textblob.download_corpora
Let’s import the packages we need:
We will use the STS Gold tweet dataset for training the Naive Bayes classifier. Each tweet in this dataset is already annotated as positive or negative. We picked STS Gold from the several datasets available, based on suggestions which can be found in this paper.
Once you have downloaded the STS Gold dataset, unzip it and place the “sts_gold_tweet.csv” file in the current folder. Then:
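One way to turn the CSV into training data is sketched below. It assumes the file is semicolon-separated with the columns "id", "polarity" and "tweet", that polarity is 0 for negative and 4 for positive, and that the file is UTF-8 encoded; check these against your copy. The function name load_sts_gold is ours, not part of textblob.

```python
import csv

# Assumed layout of sts_gold_tweet.csv: semicolon-separated columns
# "id", "polarity", "tweet", with polarity 0 = negative and 4 = positive.
POLARITY_LABELS = {"0": "neg", "4": "pos"}


def load_sts_gold(path, delimiter=";"):
    """Return a list of (tweet_text, label) tuples ready for training."""
    rows = []
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle, delimiter=delimiter):
            label = POLARITY_LABELS.get(record["polarity"].strip())
            if label is not None:  # skip rows with an unexpected polarity value
                rows.append((record["tweet"].strip(), label))
    return rows
```

Calling load_sts_gold("sts_gold_tweet.csv") should then give a list of (text, label) tuples, which is the shape textblob's classifiers accept as training data.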
Now, let’s train the Naive Bayes classifier on this training data. (P.S.: on a laptop with 8 GB of RAM, it took a good portion of an hour to train!)
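A minimal sketch of the training step, assuming the training data is a list of (text, label) tuples. NaiveBayesClassifier comes from textblob.classifiers; sample_rows and train_naive_bayes are helper names we made up, and the optional sampling is only a suggestion for shortening a first trial run.

```python
import random


def sample_rows(rows, limit, seed=0):
    """Return up to `limit` rows chosen at random (reproducibly) to shorten training."""
    if limit >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, limit)


def train_naive_bayes(rows):
    """Train textblob's NaiveBayesClassifier on (text, label) tuples."""
    # Imported here so sample_rows stays usable without textblob installed.
    from textblob.classifiers import NaiveBayesClassifier

    return NaiveBayesClassifier(rows)
```

For a quick experiment, something like cl = train_naive_bayes(sample_rows(train, 500)) trains on a subset; cl.classify("some tweet") then returns a label, and cl.accuracy(held_out_rows) reports accuracy on tuples the classifier has not seen.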
To find the sentiment of a text, pass it to TextBlob and use its sentiment property to read its positivity and negativity.
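The sketch below shows what that can look like, assuming the analyser is textblob's NaiveBayesAnalyzer, whose sentiment result carries the fields classification, p_pos and p_neg. Note that, as far as we know, this analyser trains itself on NLTK's movie review corpus. The helper names naive_bayes_sentiment and describe are ours.

```python
def naive_bayes_sentiment(text):
    """Return the Sentiment(classification, p_pos, p_neg) tuple for `text`."""
    # Imported here so describe() below can be used without textblob installed.
    from textblob import TextBlob
    from textblob.sentiments import NaiveBayesAnalyzer

    return TextBlob(text, analyzer=NaiveBayesAnalyzer()).sentiment


def describe(sentiment):
    """Format a sentiment result as 'label (p_pos=..., p_neg=...)'."""
    return "{} (p_pos={:.2f}, p_neg={:.2f})".format(
        sentiment.classification, sentiment.p_pos, sentiment.p_neg
    )
```

With the corpora downloaded, print(describe(naive_bayes_sentiment("I love this phone"))) prints the predicted label along with both probabilities.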
Great! So looks like our classifier is ready.
Now, let’s pickle it and save ourselves some training time later.
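Pickling can be sketched with the standard library alone. The function name save_classifier and the file name in the usage note are placeholders of ours.

```python
import pickle


def save_classifier(classifier, path):
    """Serialise a trained classifier (or any picklable object) to `path`."""
    with open(path, "wb") as handle:
        pickle.dump(classifier, handle, protocol=pickle.HIGHEST_PROTOCOL)
```

For example, save_classifier(cl, "sentiment_classifier.pickle") writes the trained classifier cl to disk.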
To load a classifier from a pickled object, use pickle.load:
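A matching sketch for loading, again with a function name of our own choosing. Unpickling executes code stored in the file, so only load pickles you created yourself.

```python
import pickle


def load_classifier(path):
    """Load a previously pickled classifier from `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

The returned object behaves like the one that was saved, so a loaded textblob classifier can be used for classification straight away without retraining.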
Now, let’s put this classifier to use against a real-world data set. The MAGAtweets.csv data set, which can be downloaded from here, contains tweets about the Women’s March and the #MAGA hashtag. Feel free to use any other tweet data that you are comfortable with.
The TextBlob class trains the classifier each time it is executed. On this machine, it took 5 seconds per execution. Hence, we will use a Blobber, which shares a single analyser across all the texts it processes:
tb = Blobber(analyzer=NaiveBayesAnalyzer())
Loading the tweet data to be classified -
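A rough sketch of the loading step. We do not know the exact layout of MAGAtweets.csv, so this assumes a comma-separated, UTF-8 file with a header row and a column named "text" holding the tweet; pass a different text_column if your file uses another name. load_tweets is a helper name of ours.

```python
import csv


def load_tweets(path, text_column="text"):
    """Return the non-empty values of `text_column` from a CSV with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if text_column not in (reader.fieldnames or []):
            raise KeyError(
                "column {!r} not found; available: {}".format(
                    text_column, reader.fieldnames
                )
            )
        # Skip rows where the tweet column is missing or blank.
        return [
            row[text_column].strip()
            for row in reader
            if row[text_column] and row[text_column].strip()
        ]
```

Calling load_tweets("MAGAtweets.csv") then returns a plain list of tweet strings.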
Classifying the tweets now -
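The classification loop might look like the sketch below. It assumes the tweets are a list of strings and uses one shared Blobber so the analyser is trained only once. The helper names make_naive_bayes_labeller and classify_tweets are ours.

```python
from collections import Counter


def make_naive_bayes_labeller():
    """Return a function mapping text to 'pos' or 'neg' using one shared analyser."""
    # Imported here so classify_tweets can be used without textblob installed.
    from textblob import Blobber
    from textblob.sentiments import NaiveBayesAnalyzer

    tb = Blobber(analyzer=NaiveBayesAnalyzer())  # one analyser reused for every tweet
    return lambda text: tb(text).sentiment.classification


def classify_tweets(tweets, label_of):
    """Label every tweet and tally how many fall under each label."""
    labelled = [(tweet, label_of(tweet)) for tweet in tweets]
    counts = Counter(label for _, label in labelled)
    return labelled, counts
```

Usage would be along the lines of labelled, counts = classify_tweets(tweets, make_naive_bayes_labeller()), after which counts["pos"] and counts["neg"] give a quick overview. Passing the labelling function in as an argument makes it easy to swap in a different analyser later.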
Let’s try doing the same with textblob’s built-in Pattern analyser -
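A sketch of the Pattern variant. TextBlob uses the Pattern analyser by default, and its sentiment result has a polarity between -1 and 1 rather than a pos/neg label, so this maps the score to a label with a cut-off. The threshold and the "neutral" label are assumptions of ours, as are the helper names.

```python
def polarity_to_label(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to 'pos', 'neg' or 'neutral'."""
    if polarity > threshold:
        return "pos"
    if polarity < -threshold:
        return "neg"
    return "neutral"


def pattern_label(text, threshold=0.0):
    """Label `text` using textblob's default (Pattern) analyser."""
    # Imported here so polarity_to_label can be used without textblob installed.
    from textblob import TextBlob

    return polarity_to_label(TextBlob(text).sentiment.polarity, threshold)
```

Raising the threshold widens the band of scores treated as neutral, which can help with tweets that carry no clear opinion.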
Observe the difference in the results -
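One simple way to quantify the difference is to count how often the two analysers agree. The sketch assumes you hold two lists of labels in the same order as the tweets; compare_labels is a helper name of ours.

```python
def compare_labels(tweets, labels_a, labels_b):
    """Return (agreement_rate, disagreements) for two labellings of the same tweets."""
    if not (len(tweets) == len(labels_a) == len(labels_b)):
        raise ValueError("tweets and both label lists must have the same length")
    disagreements = [
        (tweet, a, b) for tweet, a, b in zip(tweets, labels_a, labels_b) if a != b
    ]
    agreement = 1 - len(disagreements) / len(tweets) if tweets else 0.0
    return agreement, disagreements
```

Reading through the disagreements list is usually more informative than the agreement rate alone, since it shows the kinds of tweets on which the analysers differ.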
textblob is just one of the Python packages that can be used to perform sentiment analysis. Other capable packages that we know of are -