How do I deal with an imbalanced dataset?
I am trying to build a classification model on an imbalanced dataset of messages. Could I take the set of keywords from the minority-labeled records and boost the TF-IDF values of those keywords? I'm looking for a way to improve recall and precision for the minority-labeled records. Instead of trying to manipulate …
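
To make the idea concrete, here is a minimal, dependency-free sketch of what boosting the TF-IDF weights of minority-class keywords might look like. It illustrates the question and is not a recommended method. The toy messages, the labels, the helper names `tfidf` and `boost_keywords`, the rule for choosing keywords (terms that appear only in minority messages), and the boost factor of 2.0 are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, so a term present in every document keeps a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency is the count normalized by document length.
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors


def boost_keywords(vectors, keywords, factor=2.0):
    """Multiply the weight of every term in `keywords` by `factor`."""
    return [
        {t: w * factor if t in keywords else w for t, w in vec.items()}
        for vec in vectors
    ]


# Hypothetical toy data: label 1 is the minority class.
messages = [
    "meeting at noon tomorrow",
    "lunch at noon today",
    "win a free prize now",
    "see you at the meeting",
]
labels = [0, 0, 1, 0]

docs = [m.split() for m in messages]
vectors = tfidf(docs)

# Assumed keyword rule: terms seen in minority messages but never in majority ones.
minority_terms = {t for d, y in zip(docs, labels) if y == 1 for t in d}
majority_terms = {t for d, y in zip(docs, labels) if y == 0 for t in d}
keywords = minority_terms - majority_terms

boosted = boost_keywords(vectors, keywords, factor=2.0)
```

In a sketch like this, the keyword set would have to be derived from the training split only, since it is built from the labels. Applying it to held-out data computed the same way would otherwise leak label information into the features.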