Micropinion Generation Dataset

Dataset Description

This dataset is based on user reviews from CNET. The reviews cover products from various categories, such as TVs, cell phones, and GPS devices. The dataset comes in two versions: “raw” and “pre-processed”. The “raw” folder contains the original reviews from CNET without any pre-processing (each review is delimited by “$$;”). The “pre-processed” folder contains sentences from the full review section of the reviews. All pros and cons from the original reviews are omitted in this version, and this is the version that was used for summarization (see Section 5.1 of the paper). In addition, in the pre-processed version a simple sentence splitter was used to split the review texts into individual sentences.
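
Because each review in the “raw” files is delimited by “$$;”, a file can be split back into individual reviews with a few lines of code. The following is a minimal sketch in Python, not part of the dataset itself. The function names, the file path in the usage note, and the UTF-8 encoding are assumptions made for illustration; adjust them to match the files you actually have.

```python
from pathlib import Path

# Delimiter between reviews in the "raw" files, as stated in the description.
REVIEW_DELIMITER = "$$;"


def split_reviews(raw_text):
    """Split the contents of one raw file into a list of review strings.

    Empty fragments (for example, after a trailing delimiter) are dropped,
    and surrounding whitespace is stripped from each review.
    """
    parts = raw_text.split(REVIEW_DELIMITER)
    return [part.strip() for part in parts if part.strip()]


def load_raw_reviews(path, encoding="utf-8"):
    """Read one raw file and return its reviews.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising an error.
    """
    text = Path(path).read_text(encoding=encoding, errors="replace")
    return split_reviews(text)
```

For example, `load_raw_reviews("raw/some_product.txt")` (a hypothetical path) would return one string per review in that file.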


Citing Dataset

If you use this dataset for your own research, please cite the following paper:

Ganesan, K. A., C. X. Zhai, and Evelyne Viegas, “Micropinion Generation: An Unsupervised Approach To Generating Ultra-Concise Summaries Of Opinions”, Proceedings of the 21st International Conference on World Wide Web 2012 (WWW ’12).