Publication

Opinion-Driven Decision Support System (ODSS)

Opinion-Driven Decision Support is a term that I coined as part of my Ph.D. thesis, referring to the use of large amounts of online opinions to facilitate business and consumer decision making. The idea in my thesis is to combine the strengths of search technologies with opinion mining and analysis tools to provide a powerful decision-making platform. This platform encompasses research problems related to opinion acquisition, opinion-based search and opinion summarization. Opinions in this case can be aggregations of user reviews, blog comments, Facebook status updates and so on. Essentially, any opinion-containing text on specific topics or entities qualifies as a candidate for building an Opinion-Driven Decision Support System. The building blocks for developing an Opinion-Driven Decision Support System are as follows:

Components of an ODSS

1. Search Capabilities Based on Opinions

The goal of opinion-based search is to help users find entities of interest based on their key requirements. Since a user is often interested in choosing an entity based on opinions about that entity, a system that ranks entities based on a user’s personal preferences would provide more direct support for the user’s decision-making task. For example, when finding hotels at a destination, a user may only want to consider hotels that other people thought were clean. Finding and ranking hotels based on how well they satisfy such a requirement would significantly reduce the number of entities in consideration, facilitating decision making. Unlike traditional search, the query in this case is a set of preferences and the result is a set of entities that match these preferences. The challenge is to accurately match the user’s preferences with existing opinions in order to recommend the best entities. Existing opinion mining techniques can be used for this purpose, as can information retrieval based techniques such as the one I explored in Opinion-Based Entity Ranking.

2. Opinion Summarization

Opinion summaries play a critical role in helping users better analyze entities in consideration (e.g. products, physicians, cars, politicians). Users are often looking for major concerns or advantages when selecting a specific entity. Thus, a summary that quickly highlights the key opinions about an entity would significantly help the exploration of entities and aid decision making. Opinion summarization has long been explored, with most techniques focused on generating structured summaries on a fixed set of topics. In the last few years, textual summaries of opinions have been gaining popularity. In my thesis, I focus on several textual summarization approaches.

3. Opinion Acquisition

To support accurate search and analysis based on opinions, opinionated content is imperative. Relying on opinions from just one specific source not only makes the information unreliable, but also incomplete due to variations in opinions as well as potential bias present in a specific source. Although many applications rely on large amounts of opinions, there has been very limited work on collecting and integrating a complete set of opinions.


Download

More Info

Software Related to ODSS

Slides from Thesis Defense Presentation

OpinoFetch: A Practical and Efficient Approach to Collecting Opinions on Arbitrary Entities

Abstract

The abundance of opinions on the Web is now becoming a critical source of information in a variety of application areas such as business intelligence, market research and online shopping. Unfortunately, due to the rapid growth of online content, there is no one source to obtain a comprehensive set of opinions about a specific entity or a topic, making access to such content severely limited. While previous works have been focused on mining and summarizing online opinions, there is limited work on exploring the automatic collection of opinion content on the Web. In this paper, we propose a lightweight and practical approach to collecting opinion containing pages, namely review pages on the Web for arbitrary entities. We leverage existing Web search engines and use a novel information network called the FetchGraph to efficiently obtain review pages for entities of interest. Our experiments in three different domains show that our method is more effective than plain search engine results and we are able to collect entity specific review pages efficiently with reasonable precision and accuracy.

Links

The Idea

The goal of this paper is to discover review content from arbitrary sources. The intuition is that reviews are often scattered, so looking into just a few sources would often result in data sparsity problems. The OpinoFetch approach makes no assumption about the type of entity that it can gather reviews on or the sources that should show up for each entity, which makes it a very general approach. In one run, you could be looking for all reviews related to cars; in the next run, it could be all review content related to restaurants. In the OpinoFetch paper, we looked into gathering review pages in three distinct domains, namely electronics, hotels and attractions. This is an example of sites discovered for various entities:
Figure: site distribution

Citation

A General Supervised Approach for Segmentation of Clinical Texts

Abstract

Segmentation of clinical texts into logical groups is critical for all sorts of tasks such as medical coding for billing, auto drafting of discharge summaries, patient problem list generation, population study on allergies, etc. While there have been previous studies on using supervised approaches to segmentation of clinical texts, these existing approaches were trained and tested on a fairly limited data set showing low adaptability to new unseen documents. We propose a highly generalized model for segmenting clinical texts, based on a set of line-wise predictions by a classifier with constraints imposing their coherence. Evaluation results on 5 independent test sets show that the proposed approach can work on all sorts of note types and performs consistently across different organizations (i.e. hospitals).
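
The idea of line-wise predictions made coherent by constraints can be illustrated with a minimal sketch. It assumes a classifier has already produced per-line label probabilities, and it uses a Viterbi-style dynamic program with a fixed penalty for switching labels between adjacent lines. The label names, the `switch_penalty` parameter and the function name are hypothetical, and this is not necessarily the constraint formulation used in the paper.

```python
import math

def smooth_line_labels(line_scores, switch_penalty=1.0):
    """Pick one label per line, discouraging label changes between lines.

    line_scores: list of dicts mapping label -> probability (all > 0),
    one dict per line, as a hypothetical line classifier might emit.
    """
    labels = sorted(line_scores[0])
    # Best log-score of any path ending in each label at the first line.
    best = {lab: math.log(line_scores[0][lab]) for lab in labels}
    back = []  # back[i][lab] = best previous label for line i+1
    for scores in line_scores[1:]:
        new_best, pointers = {}, {}
        for lab in labels:
            def cost(prev, lab=lab):
                # Staying on the same label is free; switching is penalized.
                return best[prev] - (switch_penalty if prev != lab else 0.0)
            prev = max(labels, key=cost)
            new_best[lab] = cost(prev) + math.log(scores[lab])
            pointers[lab] = prev
        best = new_best
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(labels, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy example: line 2 is weakly predicted as MEDICATIONS in isolation.
scores = [
    {"HISTORY": 0.9, "MEDICATIONS": 0.1},
    {"HISTORY": 0.45, "MEDICATIONS": 0.55},
    {"HISTORY": 0.8, "MEDICATIONS": 0.2},
    {"HISTORY": 0.1, "MEDICATIONS": 0.9},
    {"HISTORY": 0.2, "MEDICATIONS": 0.8},
]
print(smooth_line_labels(scores))
```

Taking the most probable label for each line independently would split the first three lines into three segments. With the switch penalty, the weak prediction on line 2 is absorbed into the surrounding HISTORY segment.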

Downloads

Example segmented document:

Presentation Slides

Opinion-Based Entity Ranking

Ganesan, Kavita, and Chengxiang Zhai. “Opinion-based entity ranking.” Information retrieval 15.2 (2012): 116-150.

Abstract

The deployment of Web 2.0 technologies has led to rapid growth of various opinions and reviews on the web, such as reviews on products and opinions about people. Such content can be very useful to help people find interesting entities like products, businesses and people based on their individual preferences or tradeoffs. Most existing work on leveraging opinionated content has focused on integrating and summarizing opinions on entities to help users better digest all the opinions. In this paper, we propose a different way of leveraging opinionated content, by directly ranking entities based on a user’s preferences. Our idea is to represent each entity with the text of all the reviews of that entity. Given a user’s keyword query that expresses the desired features of an entity, we can then rank all the candidate entities based on how well opinions on these entities match the user’s preferences. We study several methods for solving this problem, including both standard text retrieval models and some extensions of these models. Experiment results on ranking entities based on opinions in two different domains (hotels and cars) show that the proposed extensions are effective and lead to improvement of ranking accuracy over the standard text retrieval models for this task.
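
As a rough illustration of the core idea (one document per entity, built from all of its reviews, then ranked against a keyword query), here is a minimal sketch. It uses a plain BM25 scorer as an example of a standard text retrieval model. The parameter values, the tokenizer and the toy hotel reviews are assumptions for illustration, and the extensions proposed in the paper are not implemented here.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercased alphanumeric tokens; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_entities(reviews_by_entity, query, k1=1.2, b=0.75):
    """Rank entities by the BM25 score of their concatenated reviews."""
    # Each entity is represented by the text of all of its reviews.
    docs = {e: tokenize(" ".join(r)) for e, r in reviews_by_entity.items()}
    n = len(docs)
    avg_len = sum(len(d) for d in docs.values()) / n
    df = Counter()  # number of entities whose reviews mention each term
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for entity, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[entity] = score
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (made up for illustration).
hotels = {
    "Hotel A": ["clean rooms, very clean lobby"],
    "Hotel B": ["dirty rooms and rude staff"],
    "Hotel C": ["clean enough, good location"],
}
print(rank_entities(hotels, "clean"))  # ['Hotel A', 'Hotel C', 'Hotel B']
```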

Links

Slides

Citation

Findilike: Preference Driven Entity Search

Traditional web search engines enable users to find documents based on topics. However, in finding entities such as restaurants, hotels and products, traditional search engines fail to suffice as users are often interested in finding entities based on structured attributes such as price and brand and unstructured information such as opinions of other web users. In this paper, we showcase a preference driven search system, that enables users to find entities of interest based on a set of structured preferences as well as unstructured opinion preferences. We demonstrate our system in the context of hotel search.

Links

Bib

Comprehensive Review of Opinion Summarization

Kim, Hyun Duk, Kavita A. Ganesan, Parikshit Sondhi, and ChengXiang Zhai. “Comprehensive Review of Opinion Summarization” (Opinion Mining Survey). 2011.

This survey zooms into recent research on opinion summarization, which is concerned with generating effective summaries of opinions so that users can get a quick understanding of the underlying sentiments. Since summaries come in various formats, the survey breaks the approaches down into the commonly studied aspect-based summarization and non-aspect-based approaches (which include visualization, contrastive summarization and text summarization). The survey also lists opinion-related datasets and available demos.

Links and Downloads

Citation

Micropinion Generation: An Unsupervised Approach to Generating Ultra-Concise Summaries of Opinions

Ganesan, Kavita, ChengXiang Zhai, and Evelyne Viegas. “Micropinion generation: an unsupervised approach to generating ultra-concise summaries of opinions.” Proceedings of the 21st international conference on World Wide Web. ACM, 2012.

Abstract

This paper presents a new unsupervised approach to generating ultra-concise summaries of opinions. We formulate the problem of generating such a micropinion summary as an optimization problem, where we seek a set of concise and non-redundant phrases that are readable and represent key opinions in text. We measure representativeness based on a modified mutual information function and model readability with an n-gram language model. We propose some heuristic algorithms to efficiently solve this optimization problem. Evaluation results show that our unsupervised approach outperforms other state of the art summarization methods and the generated summaries are informative and readable.
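
To make the shape of this optimization concrete, here is a minimal greedy sketch. It assumes each candidate phrase already comes with a representativeness score and a readability score, and it picks high-scoring phrases subject to a word budget and a redundancy threshold. The function names, the Jaccard similarity measure, the thresholds and the toy scores are all assumptions for illustration. The paper's modified mutual information function, its n-gram language model and its own heuristic algorithms are not reproduced here.

```python
def jaccard(a, b):
    # Word-overlap similarity between two phrases, in [0, 1].
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def select_micropinions(candidates, max_words=8, max_sim=0.5):
    """Greedily pick concise, non-redundant phrases.

    candidates: list of (phrase, representativeness, readability) tuples
    with hypothetical, precomputed scores.
    """
    # Consider the phrases with the highest combined score first.
    ranked = sorted(candidates, key=lambda c: c[1] + c[2], reverse=True)
    summary, used = [], 0
    for phrase, _rep, _read in ranked:
        length = len(phrase.split())
        if used + length > max_words:
            continue  # would exceed the summary length budget
        if any(jaccard(phrase, chosen) > max_sim for chosen in summary):
            continue  # too similar to a phrase already selected
        summary.append(phrase)
        used += length
    return summary

# Toy candidates with made-up scores.
candidates = [
    ("very clean rooms", 0.9, 0.8),
    ("clean rooms", 0.85, 0.8),
    ("friendly staff", 0.7, 0.9),
    ("great location near downtown", 0.6, 0.7),
]
print(select_micropinions(candidates))  # ['very clean rooms', 'friendly staff']
```

In this toy run, "clean rooms" is dropped as redundant with "very clean rooms", and the last phrase is dropped because it would push the summary past eight words.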

Links

Related Articles

Micropinion Generation Presentation Slides

Citation

Linguistic Understanding of Complaints and Praises in User Reviews

Ganesan, Kavita, and Guangyu Zhou. “Linguistic Understanding of Complaints and Praises in User Reviews.” Proceedings of NAACL-HLT. 2016.

Gist of paper

This is a short study paper that categorizes positive and negative review sentences into 4 categories: positive only, praise, negative only and complaint. The intuition is that praise and complaint sentences tend to be more informative than plain positive only or negative only sentences. The paper therefore tries to understand the properties of text that we consider to be complaints and praises. Our analysis shows several interesting findings, including:

  •  complaints tend to have more past tense than the other 3 categories
  •  complaints and praises are generally longer and contain more nouns than positive only or negative only sentences
  •  praise sentences tend to use more adjectives than other types of sentences

Downloads

  • Paper
  • Dataset – coming soon!

Citation

Discovering Related Clinical Concepts Using Large Amounts of Clinical Notes

Abstract

The ability to find highly related clinical concepts is essential for many applications such as for hypothesis generation, query expansion for medical literature search, search results filtering, ICD-10 code filtering and many other applications. While manually constructed medical terminologies such as SNOMED CT can surface certain related concepts, these terminologies are inadequate as they depend on expertise of several subject matter experts making the terminology curation process open to geographic and language bias. In addition, these terminologies also provide no quantifiable evidence on how related the concepts are. In this work, we explore an unsupervised graphical approach to mine related concepts by leveraging the volume within large amounts of clinical notes. Our evaluation shows that we are able to use a data driven approach to discovering highly related concepts for various search terms including medications, symptoms and diseases.

Mining Related Concepts

The Concept-Graph is used to mine related clinical terminology. For example, if the query term is advair, related concepts can be singulair, combivent, inhaler, nebs, etc. The Concept-Graph is an undirected graph in which each node represents a concept and a link between two nodes indicates the presence of a relationship between the two concepts. The results of this work were evaluated by experts in the medical field.
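
The graph structure described above can be sketched as follows. This is a minimal illustration that assumes concepts have already been extracted from each note and that an edge is weighted by the number of notes in which two concepts co-occur. The co-occurrence weighting, the function names and the toy notes are assumptions, not necessarily how the Concept-Graph in the paper is built or scored.

```python
from collections import defaultdict
from itertools import combinations

def build_concept_graph(notes):
    """Build an undirected, weighted graph from per-note concept lists."""
    graph = defaultdict(lambda: defaultdict(int))
    for concepts in notes:
        # Link every pair of distinct concepts mentioned in the same note.
        for a, b in combinations(sorted(set(concepts)), 2):
            graph[a][b] += 1
            graph[b][a] += 1  # undirected: store the edge in both directions
    return graph

def related_concepts(graph, query, top_k=3):
    """Return the neighbors of `query` with the heaviest edges."""
    neighbors = graph.get(query, {})
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(neighbors, key=lambda c: (-neighbors[c], c))[:top_k]

# Toy notes (made up for illustration).
notes = [
    ["advair", "inhaler", "asthma"],
    ["advair", "inhaler"],
    ["advair", "singulair"],
    ["chest pain", "nitroglycerin"],
]
graph = build_concept_graph(notes)
print(related_concepts(graph, "advair"))  # ['inhaler', 'asthma', 'singulair']
```

Edge weights give a simple, quantifiable signal of how related two concepts are, which manually curated terminologies do not provide.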

A similar graph data structure has been used for text summarization tasks.

Links

  • Paper
  • Journal link

Example Related Concepts

Figure: concepts related to chest pain

Figure: concepts related to advair

Citation

Mining tag clouds and emoticons behind community feedback

Ganesan, K. A., N. Sundaresan, and H. Deo, “Mining tag clouds and emoticons behind community feedback”, WWW ’08: Proceeding of the 17th international conference on World Wide Web, Beijing, China, ACM, pp. 1181–1182, 2008.

Abstract

In this paper we describe our mining system which automatically mines tags from feedback text in an eCommerce scenario. It renders these tags in a visually appealing manner. Further, emoticons are attached to mined tags to add sentiment to the visual aspect.

Download Paper

Related Articles