Segmenting clinical texts into logical groups is critical for tasks such as medical coding for billing, automatic drafting of discharge summaries, patient problem list generation, and population studies of allergies. Previous studies have applied supervised approaches to the segmentation of clinical texts, but these approaches were trained and tested on fairly limited data sets and showed low adaptability to new, unseen documents. We propose a highly generalized model for segmenting clinical texts, based on a set of line-wise predictions by a classifier, with constraints that enforce the coherence of those predictions. Evaluation results on 5 independent test sets show that the proposed approach works across a wide range of note types and performs consistently across different organizations (i.e., hospitals).
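To make the idea of line-wise predictions with coherence constraints concrete, the following is a minimal sketch and not the model evaluated here. It assumes that a classifier has already produced a log-probability for every label on every line, that labels follow a hypothetical `B-`/`I-` scheme (begin or continue a section), and that the only constraint is that a continuation label must follow a line of the same section. Viterbi decoding is used here as one common way to enforce such constraints; the function names, labels, and scores are all illustrative.

```python
import math

def allowed_transitions(labels):
    """Build the set of coherent (previous, current) label pairs.

    Assumed constraint: an "I-X" line may only follow "B-X" or "I-X";
    a "B-X" line may follow anything.
    """
    allowed = set()
    for prev in labels:
        for cur in labels:
            if cur.startswith("I-") and prev[2:] != cur[2:]:
                continue  # a continuation must stay in the same section
            allowed.add((prev, cur))
    return allowed

def decode_sections(line_scores, labels, allowed):
    """Viterbi decoding over per-line log-probabilities.

    Returns the label sequence with the highest total score among
    sequences that use only allowed transitions.
    """
    if not line_scores:
        return []
    neg_inf = -math.inf
    best = [{lab: neg_inf for lab in labels} for _ in line_scores]
    back = [{} for _ in line_scores]
    for lab in labels:
        best[0][lab] = line_scores[0].get(lab, neg_inf)
    for i in range(1, len(line_scores)):
        for cur in labels:
            emit = line_scores[i].get(cur, neg_inf)
            for prev in labels:
                if (prev, cur) not in allowed:
                    continue
                cand = best[i - 1][prev] + emit
                if cand > best[i][cur]:
                    best[i][cur] = cand
                    back[i][cur] = prev
    last = max(labels, key=lambda lab: best[-1][lab])
    if best[-1][last] == neg_inf:
        raise ValueError("no label sequence satisfies the constraints")
    path = [last]
    for i in range(len(line_scores) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path
```

In this sketch, a line that the classifier alone would assign to the wrong section can be pulled back into the surrounding section when that yields a higher-scoring coherent sequence overall.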
In this work, we explore an unsupervised graphical approach to mining related concepts that leverages large volumes of clinical notes. Our evaluation shows that this data-driven approach discovers highly related concepts for a variety of search terms, including medications, symptoms, and diseases.
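As a rough illustration of what a graph-based, unsupervised notion of relatedness can look like, the sketch below builds a concept co-occurrence graph. The abstract does not specify how the graph is constructed or scored, so everything here is an assumption: each note is taken to be already reduced to a list of concept strings, an edge weight counts the notes in which two concepts appear together, and related concepts are ranked by that raw count. The function names and example data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence_graph(notes):
    """Build an undirected weighted graph from per-note concept lists.

    The weight of edge (a, b) is the number of notes that mention both
    a and b (an assumed, deliberately simple relatedness signal).
    """
    graph = defaultdict(Counter)
    for concepts in notes:
        # Deduplicate so each note contributes at most 1 to any edge.
        for a, b in combinations(sorted(set(concepts)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def related_concepts(graph, term, top_k=3):
    """Return up to top_k neighbors of term, strongest edges first."""
    neighbors = graph.get(term, {})
    # Sort by descending weight, then alphabetically for stable ties.
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [concept for concept, _ in ranked[:top_k]]
```

Raw counts favor concepts that are frequent everywhere; a normalized score such as pointwise mutual information is a common alternative if that bias matters.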