publication

OpinoFetch: A Practical and Efficient Approach to Collecting Opinions on Arbitrary Entities

The abundance of opinions on the Web is becoming a critical source of information in application areas such as business intelligence, market research and online shopping. Unfortunately, due to the rapid growth of online content, there is no single source from which to obtain a comprehensive set of opinions about a specific …

A General Supervised Approach for Segmentation of Clinical Texts

Segmentation of clinical texts into logical groups is critical for many tasks, such as medical coding for billing, automatic drafting of discharge summaries, patient problem list generation, and population studies on allergies. While previous studies have applied supervised approaches to the segmentation of clinical texts, these approaches were trained and tested on fairly limited data sets and adapt poorly to new, unseen documents. We propose a highly generalized model for segmenting clinical texts, based on a set of line-wise predictions by a classifier, with constraints imposing their coherence. Evaluation results on five independent test sets show that the proposed approach works across note types and performs consistently across different organizations (i.e., hospitals).

Opinion-Based Entity Ranking

In this paper, we propose a different way of leveraging opinionated content, by directly ranking entities based on a user’s preferences. Our idea is to represent each entity with the text of all the reviews of that entity. Given a user’s keyword query that expresses the desired features of an entity, we can then rank all the candidate entities based on how well opinions on these entities match the user’s preferences. We study several methods for solving this problem, including both standard text retrieval models and some extensions of these models. Experimental results on ranking entities based on opinions in two different domains (hotels and cars) show that the proposed extensions are effective and improve ranking accuracy over the standard text retrieval models for this task.
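The core idea above (one "document" per entity built from all of its reviews, ranked against a keyword query) can be illustrated with a minimal sketch. It uses BM25 as one example of a standard text retrieval model; the choice of BM25, the parameter values, the function name `bm25_rank`, and the toy hotel data are all assumptions made for illustration, and the extensions studied in the paper are not shown.

```python
import math
from collections import Counter

def bm25_rank(entity_reviews, query, k1=1.2, b=0.75):
    """Rank entity names by the BM25 score of their concatenated review text.

    entity_reviews: dict mapping entity name -> list of review strings.
    query: keyword string describing the desired features.
    """
    # One "document" per entity: all reviews joined, lowercased, split on whitespace.
    docs = {name: " ".join(reviews).lower().split()
            for name, reviews in entity_reviews.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs

    # Document frequency: in how many entity documents each term appears.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[name] = score
    # Highest-scoring entities first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data, not taken from the paper's hotel or car collections.
hotels = {
    "Hotel A": ["clean rooms and friendly staff", "very clean"],
    "Hotel B": ["noisy street, rude staff"],
    "Hotel C": ["great pool"],
}
print(bm25_rank(hotels, "clean friendly staff"))
```

In this toy run, the entity whose reviews mention all three query terms is ranked first and the entity matching none of them is ranked last. A realistic setup would also need proper tokenization and some handling of sentiment, since "rude staff" still matches the keyword "staff".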

Comprehensive Review of Opinion Summarization

This survey focuses on recent research in opinion summarization, which is concerned with generating effective summaries of opinions so that users can get a quick understanding of the underlying sentiments. Since summaries come in various formats, the survey divides the approaches into the commonly studied aspect-based summarization and non-aspect-based approaches (which include visualization, contrastive summarization and text summarization). The survey also lists opinion-related datasets and available demos.

Micropinion Generation: An Unsupervised Approach to Generating Ultra-Concise Summaries of Opinions

This paper presents a new unsupervised approach to generating ultra-concise summaries of opinions. We formulate the problem of generating such a micropinion summary as an optimization problem, where we seek a set of concise and non-redundant phrases that are readable and represent key opinions in text. We measure representativeness based on a modified mutual information function and model readability with an n-gram language model.
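To make the notion of mutual-information-based representativeness more concrete, here is a minimal sketch of plain pointwise mutual information (PMI) computed from sentence-level co-occurrence. This is the standard, unmodified PMI, not the modified function used in the paper, and the function name `pmi_table` and the toy sentences are assumptions for illustration only.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(sentences):
    """Return PMI (log base 2) for every unordered word pair that co-occurs.

    Probabilities are estimated as the fraction of sentences containing
    a word (or both words of a pair).
    """
    n = len(sentences)
    word_df = Counter()
    pair_df = Counter()
    for sentence in sentences:
        # Unique words per sentence, sorted so each pair has one canonical order.
        words = sorted(set(sentence.lower().split()))
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    return {
        (w1, w2): math.log2((count / n) / ((word_df[w1] / n) * (word_df[w2] / n)))
        for (w1, w2), count in pair_df.items()
    }

# Hypothetical toy review sentences.
reviews = [
    "battery life is short",
    "short battery life",
    "screen is bright",
    "bright screen",
]
pmi = pmi_table(reviews)
print(pmi[("battery", "life")])  # words that always appear together score high
print(pmi[("is", "short")])      # words that co-occur only at chance level score 0
```

Word pairs with high PMI co-occur more often than chance would predict, which is one intuition for why a phrase built from them might represent a recurring opinion. Readability would be scored separately, for example with an n-gram language model as the abstract describes.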

Linguistic Understanding of Complaints and Praises in User Reviews

This is a short study paper that categorizes positive and negative review sentences into four categories: positive only, praise, negative only and complaint. The intuition is that praise and complaint sentences tend to be more informative than plain positive-only or negative-only sentences. This paper thus tries to understand the properties of the text that we consider to be complaints and praises.

Discovering Related Clinical Concepts Using Large Amounts of Clinical Notes

In this work, we explore an unsupervised graphical approach to mining related concepts by leveraging the volume of large collections of clinical notes. Our evaluation shows that a data-driven approach can discover highly related concepts for various search terms, including medications, symptoms and diseases.

Mining tag clouds and emoticons behind community feedback

In this paper, we describe our mining system, which automatically mines tags from feedback text in an eCommerce scenario and renders them in a visually appealing manner. Emoticons are also attached to the mined tags to add sentiment to the visual aspect.