Opinosis Dataset – Topic-related review sentences


This dataset contains sentences extracted from user reviews on a given topic. Example topics are “performance of Toyota Camry” and “sound quality of ipod nano”. In total there are 51 such topics, with each topic having approximately 100 sentences on average. The reviews were obtained from various sources: Tripadvisor (hotels), Edmunds.com (cars) and Amazon.com (various electronics). This dataset was used for the following automatic text summarization project.

The dataset file also comes with gold standard summaries used for the summarization paper listed above. I have also provided some scripts to help with the summarization/evaluation tasks using ROUGE. Detailed information about the dataset and the list of scripts is provided in the documentation.
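
To give a rough idea of what a ROUGE-style evaluation against the gold standard summaries measures, here is a minimal Python sketch of ROUGE-1 (unigram overlap). It is not one of the scripts shipped with the dataset and not the official ROUGE toolkit: the function names, the whitespace tokenization, and the choice to average scores over several gold summaries are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. Real ROUGE tooling
    # applies more normalization (stemming, stopword options, etc.).
    return text.lower().split()


def rouge1(candidate, reference):
    """Return (recall, precision, f1) based on clipped unigram overlap."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return recall, precision, f1


def rouge1_multi(candidate, references):
    """Average ROUGE-1 scores over several gold summaries (hypothetical helper)."""
    if not references:
        raise ValueError("at least one reference summary is required")
    scores = [rouge1(candidate, ref) for ref in references]
    return tuple(sum(column) / len(scores) for column in zip(*scores))
```

For a given topic, `rouge1_multi(system_summary, gold_summaries)` would take one generated summary and the list of human-written summaries for that topic. For reportable results, use the ROUGE setup described in the dataset documentation rather than this sketch.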

Citing Dataset

If you use this dataset for your own research, please cite the following paper: