Easily Access Pre-trained Word Embeddings with Gensim
What are pre-trained embeddings and why? Pre-trained word embeddings are vector representations of words trained on a large dataset. With pre-trained embeddings, you are essentially using the weights and vocabulary that result from a training process done by someone else (it could also be you). One benefit of using pre-trained embeddings is that …
How to Use Tfidftransformer & Tfidfvectorizer?
Scikit-learn’s TfidfTransformer and TfidfVectorizer aim to do the same thing: convert a collection of raw documents to a matrix of TF-IDF features. The differences between the two can be quite confusing, and it’s hard to know when to use which. This article shows you how to correctly use each module, the …
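As a quick illustration of how the two relate, here is a minimal sketch (not taken from the article itself) that computes TF-IDF both ways with default settings and checks that the results match. The toy corpus and variable names are made up for this example.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical toy corpus, made up for illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]

# Route 1: count terms first, then reweight the counts with TF-IDF.
counts = CountVectorizer().fit_transform(docs)
tfidf_two_step = TfidfTransformer().fit_transform(counts)

# Route 2: TfidfVectorizer goes from raw text to TF-IDF in one step.
tfidf_one_step = TfidfVectorizer().fit_transform(docs)

# With default parameters on both routes, the two matrices should agree.
same = np.allclose(tfidf_two_step.toarray(), tfidf_one_step.toarray())
print(same)
```

The two-step route is handy when you also need the raw term counts; the one-step route is shorter when you only need the TF-IDF scores.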
All you need to know about Text Preprocessing for Machine Learning & NLP
Learn what text preprocessing is, the different techniques for text preprocessing and a way to estimate how much preprocessing you may need. For those interested, I’ve also made some text preprocessing code snippets in Python for you to try. Now, let’s get started!
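To give a flavor of what such a snippet can look like, here is a small sketch using only the standard library. The choice of steps (lowercasing, stripping punctuation, dropping stop words) and the tiny stop-word list are assumptions made for this example, not necessarily the techniques covered in the article.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess(text):
    """Lowercase, replace non-alphanumeric characters, and drop stop words."""
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("The QUICK brown fox, in a hurry, jumps!"))
# → ['quick', 'brown', 'fox', 'hurry', 'jumps']
```

How much of this you actually need depends on the task; some tasks may do better with lighter preprocessing.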
Industrial Strength Natural Language Processing
Having spent a big part of my career as a graduate student researcher and now as a Data Scientist in the industry, I have come to realize that a vast majority of solutions proposed both in academic research papers and in the workplace are just not meant to ship: they just don’t scale! And …