This dataset contains 20,000 Stack Overflow questions in JSON format, each with 19 attributes including the post body, title, and tags.
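As a rough illustration of how question records like these might be explored, here is a minimal sketch that parses a JSON array and counts tag frequencies. The three inline records and the field names `title`, `body`, and `tags` are assumptions made for the example; the real file has 19 attributes and its exact schema may differ.

```python
import json
from collections import Counter

# Hypothetical sample standing in for the real file.
# The field names ("title", "body", "tags") are assumed, not taken from the dataset.
sample = """[
  {"title": "How do I parse JSON in Python?", "body": "<p>Example body</p>", "tags": ["python", "json"]},
  {"title": "What is a NullPointerException?", "body": "<p>Example body</p>", "tags": ["java"]},
  {"title": "Read a JSON file with pandas", "body": "<p>Example body</p>", "tags": ["python", "pandas", "json"]}
]"""


def tag_counts(records):
    """Count how often each tag appears across all question records."""
    counts = Counter()
    for record in records:
        # Records without a "tags" field contribute nothing.
        counts.update(record.get("tags", []))
    return counts


questions = json.loads(sample)
counts = tag_counts(questions)

# List the attributes present on the first record.
attribute_names = sorted(questions[0].keys())
print(attribute_names)
print(counts.most_common(2))
```

With the actual dataset, the inline string would be replaced by the contents of the downloaded file, and the attribute listing is a quick way to check which of the 19 fields are worth keeping for a given task.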
Dataset for Text Mining and NLP tasks
If you are looking for user review datasets for opinion analysis or sentiment analysis tasks, there are quite a few available. The datasets below contain reviews from Rotten Tomatoes, Amazon, TripAdvisor, Yelp, Edmunds.com, and other sources. Here are some of the many datasets available: Dataset Domain Description Courtesy Of Movie Reviews Data …
This dataset contains sentences extracted from user reviews on a given topic. Example topics include “performance of Toyota Camry” and “sound quality of ipod nano”. In total there are 51 such topics, each with about 100 sentences on average. The reviews were obtained from several sources: Tripadvisor (hotels), Edmunds.com (cars), and Amazon.com (various electronics). This dataset was used for the following automatic text summarization project.
This dataset contains full reviews for hotels and cars collected from Tripadvisor (~259,000 reviews) and Edmunds (~42,230 reviews), respectively. This dataset was used for the following paper: Opinion-Based Entity Ranking