Each subgroup had four contraceptives as predictive classes, and the data was randomly split into an 80% training set and a 20% test set using scikit-learn's model_selection module.
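The split can be sketched with scikit-learn's `train_test_split`; the feature matrix `X` and label vector `y` below are random placeholders standing in for the actual survey features and four contraceptive classes:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples, 5 features, 4 contraceptive classes (0-3)
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = rng.integers(0, 4, size=100)

# 80% training / 20% test split, as described above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```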
For both subgroups, the Gaussian Naive Bayes (GNB) model's accuracy (0.60) was only slightly higher than the baseline of predicting the majority class every time (0.57).
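This kind of comparison can be sketched by fitting a GNB model alongside scikit-learn's `DummyClassifier` with the `most_frequent` strategy, which reproduces the majority-class baseline (the data here is synthetic, so the accuracies will not match the 0.60/0.57 figures above):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with 4 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 4, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# GNB accuracy on the held-out test set
gnb_acc = accuracy_score(y_te, GaussianNB().fit(X_tr, y_tr).predict(X_te))

# Majority-class baseline: always predict the most frequent training label
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
baseline_acc = accuracy_score(y_te, baseline.predict(X_te))
```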
To mitigate this class imbalance, I used the Synthetic Minority Over-Sampling Technique (SMOTE, from the imbalanced-learn package imblearn; image below) to generate synthetic data points in the minority classes and give the model more balanced training data.
You can think of this as a customized “bag of words” model, counting the number of times a side effect was mentioned in a particular sentence about a contraceptive.
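The counting idea can be sketched as below; the side-effect vocabulary and example sentence are invented for illustration, not taken from the actual dataset:

```python
import re
from collections import Counter

# Hypothetical side-effect vocabulary; the real list would come from the data
SIDE_EFFECTS = ["cramping", "nausea", "headache", "spotting"]

def count_side_effects(sentence):
    """Count how many times each known side effect appears in one sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    counts = Counter(tokens)
    return {effect: counts[effect] for effect in SIDE_EFFECTS}

result = count_side_effects(
    "The IUD gave me cramping and spotting, then more cramping."
)
print(result)
# → {'cramping': 2, 'nausea': 0, 'headache': 0, 'spotting': 1}
```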
Clearly, VADER does slightly better than chance at classifying a sentence as positive, negative, or neutral. Even that modest accuracy is valuable for women who previously had no information on how other women feel about each contraceptive and its side effects.