Technical

Build Your First Text Classifier in Python with Logistic Regression

Text classification is the automatic process of predicting one or more categories given a piece of text. For example, predicting if an email is legit or spammy. Thanks to Gmail’s spam classifier, I don’t see or hear from spammy emails! Other than spam detection, text classifiers can be used to determine sentiment in social media …

Easily Access Pre-trained Word Embeddings with Gensim

What are pre-trained embeddings, and why use them? Pre-trained word embeddings are vector representations of words trained on a large dataset. With pre-trained embeddings, you are essentially using the weights and vocabulary that result from a training process done by... someone else! (It could also be you.) One benefit of using pre-trained embeddings is that …

Text Preprocessing for Machine Learning & NLP

Learn what text preprocessing is, the different techniques for text preprocessing, and a way to estimate how much preprocessing you may need. For those interested, I’ve also made some text preprocessing code snippets in Python for you to try. Now, let’s get started!
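
To give a flavor of what preprocessing can look like, here is a minimal sketch using only the Python standard library. It is not taken from the post's own snippets: the choice of steps (lowercasing, punctuation removal, stop-word removal), the `preprocess` function name, and the tiny `STOP_WORDS` set are all assumptions made for illustration.

```python
import string

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "and", "of", "to", "in"}


def preprocess(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, tokenize, and drop stop words."""
    # Lowercasing makes "The" and "the" count as the same token.
    text = text.lower()
    # Remove every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Naive whitespace tokenization (an assumption; real tokenizers do more).
    tokens = text.split()
    # Drop very common words that usually carry little signal.
    return [token for token in tokens if token not in STOP_WORDS]


tokens = preprocess("The quick brown fox, and the lazy dog!")
print(tokens)
```

How many of these steps you actually need depends on the task, so treat each one as optional rather than as a fixed pipeline.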

What is Term-Frequency?

Term frequency (TF), often used in text mining, NLP, and information retrieval, tells you how frequently a term occurs in a document. In the context of natural language, terms correspond to words or phrases. Since every document is different in length, it is possible that a term would appear more often in longer …
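
As a rough illustration of the idea, the sketch below computes a length-normalized term frequency: each term's count divided by the total number of terms in the document. Dividing by document length is one common way to make documents of different lengths comparable, though not the only variant. The `term_frequency` function name and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def term_frequency(document: str) -> dict[str, float]:
    """Return each term's count divided by the document's total term count."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokens = document.lower().split()
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    # Normalizing by length keeps long documents from dominating raw counts.
    return {term: count / total for term, count in counts.items()}


tf = term_frequency("the cat sat on the mat")
print(tf["the"], tf["cat"])
```

Because the values are normalized, the frequencies for a single document always sum to 1.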
