Text Classification

Classify news articles with logistic regression and Python

Build Your First Text Classifier in Python with Logistic Regression

Text classification is the automatic process of predicting one or more categories for a given piece of text, for example, predicting whether an email is legitimate or spam. Thanks to Gmail’s spam classifier, I don’t see spammy emails! Other than spam detection, text classifiers can be used to determine sentiment in social media …
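To make the spam example concrete, here is a minimal sketch of a bag-of-words logistic regression classifier written in plain Python. It is not the code from the linked post: the tiny training set, the function names (`tokenize`, `train`, `predict_proba`), and the hyperparameters are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Toy training data (made up for illustration): 1 = spam, 0 = legitimate.
TEXTS = [
    "win free money now",
    "free prize claim now",
    "cheap pills free offer",
    "meeting agenda for monday",
    "lunch with the team tomorrow",
    "project report attached for review",
]
LABELS = [1, 1, 1, 0, 0, 0]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(texts, labels, epochs=200, lr=0.5):
    """Fit per-word weights and a bias with stochastic gradient descent."""
    weights = {tok: 0.0 for text in texts for tok in tokenize(text)}
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            counts = Counter(tokenize(text))
            z = bias + sum(weights[tok] * c for tok, c in counts.items())
            err = sigmoid(z) - y  # gradient of the log loss w.r.t. z
            bias -= lr * err
            for tok, c in counts.items():
                weights[tok] -= lr * err * c
    return weights, bias


def predict_proba(text, weights, bias):
    """Return the estimated probability that `text` is spam."""
    # Words never seen during training contribute nothing to the score.
    z = bias + sum(weights.get(tok, 0.0) for tok in tokenize(text))
    return sigmoid(z)


weights, bias = train(TEXTS, LABELS)
print(predict_proba("claim your free money", weights, bias))   # above 0.5
print(predict_proba("team meeting on monday", weights, bias))  # below 0.5
```

In practice a library implementation with proper tokenization, regularization, and a held-out test set would be the more robust choice; the sketch only shows the mechanics.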


A General Supervised Approach for Segmentation of Clinical Texts

Segmentation of clinical texts into logical groups is critical for many tasks, such as medical coding for billing, automatic drafting of discharge summaries, patient problem list generation, and population studies on allergies. While previous studies have applied supervised approaches to the segmentation of clinical texts, these approaches were trained and tested on fairly limited data sets and adapted poorly to new, unseen documents. We propose a highly generalized model for segmenting clinical texts, based on a set of line-wise predictions by a classifier, with constraints imposing their coherence. Evaluation results on five independent test sets show that the proposed approach works across a wide range of note types and performs consistently across different organizations (i.e., hospitals).
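The idea of combining line-wise predictions with a coherence constraint can be illustrated with a small sketch. The abstract does not say which classifier or constraints the authors use, so everything here is an assumption: the per-line label scores are made up, the constraint is modeled as a fixed penalty for switching section labels between adjacent lines, and the decoding uses standard Viterbi dynamic programming. The function name `decode_sections` and the labels are hypothetical.

```python
def decode_sections(line_scores, switch_penalty=1.0):
    """Pick one label per line, maximizing the sum of per-line scores
    minus `switch_penalty` for every change of label between lines.

    line_scores: list of dicts mapping label -> score (higher is better).
    """
    if not line_scores:
        return []
    labels = list(line_scores[0])
    best = {lab: line_scores[0][lab] for lab in labels}
    backpointers = []

    def cost(prev, cur):
        # Staying in the same section is free; switching is penalized.
        return 0.0 if prev == cur else switch_penalty

    for scores in line_scores[1:]:
        new_best, pointers = {}, {}
        for lab in labels:
            prev = max(labels, key=lambda p: best[p] - cost(p, lab))
            new_best[lab] = best[prev] - cost(prev, lab) + scores[lab]
            pointers[lab] = prev
        backpointers.append(pointers)
        best = new_best

    # Trace the best path back from the highest-scoring final label.
    path = [max(labels, key=lambda lab: best[lab])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical classifier scores for six lines of a note.
SCORES = [
    {"history": 2.0, "medications": 0.0},
    {"history": 1.5, "medications": 0.2},
    {"history": 0.4, "medications": 0.6},  # noisy line
    {"history": 1.8, "medications": 0.1},
    {"history": 0.1, "medications": 2.0},
    {"history": 0.3, "medications": 1.7},
]

print(decode_sections(SCORES, switch_penalty=0.0))  # independent per-line choice
print(decode_sections(SCORES, switch_penalty=1.0))  # coherent segmentation
```

With no penalty, the noisy third line is labeled on its own and splits the first section; with the penalty, it is absorbed into the surrounding "history" block, which is the kind of coherence the abstract describes.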

Automated story capture from conversational speech

Abstract: While storytelling has long been recognized as an important part of effective knowledge management in organizations, knowledge management technologies have generally not distinguished between stories and other types of discourse. In this paper we describe a new type of technological support for storytelling that involves automatically capturing the stories that people tell to each …
