ROUGE 2.0 – Overview

ROUGE 2.0 is a Java package for evaluating summarization tasks. It builds on the original Perl implementation of ROUGE.

ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It is a set of metrics for evaluating automatic text summarization as well as machine translation. It works by comparing an automatically produced summary or translation against a set of reference summaries or translations, which are typically produced by humans.
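As a rough illustration of the comparison described above, here is a minimal Python sketch of an n-gram overlap score in the style of ROUGE-N. It is not the ROUGE 2.0 code. The function names (`ngrams`, `rouge_n`) are hypothetical, and the sketch assumes lowercased whitespace tokenization, a single reference, and no stemming or stop-word removal.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f_score) for one candidate/reference pair.

    Assumption: plain whitespace tokenization on lowercased text.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f_score


if __name__ == "__main__":
    system = "the cat sat on the mat"
    human = "the cat was on the mat"
    print(rouge_n(system, human, n=1))
    print(rouge_n(system, human, n=2))
```

Recall divides the overlap by the number of n-grams in the reference, which is why the metric is called recall-oriented. Precision divides the same overlap by the number of n-grams in the candidate.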

ROUGE 2.0 is a lightweight open-source tool that makes it easy to evaluate summaries or translations by limiting the amount of formatting required for both reference summaries and system summaries. It also supports evaluation of Unicode texts, which is known to be an issue with other implementations of ROUGE. You can also add new evaluation metrics to the existing code base or improve existing ones.

More info can be found on GitHub.

rouge2csv – Script to Interpret ROUGE Scores

This is a Perl script that helps in interpreting ROUGE scores generated by the original Perl implementation of ROUGE. If you need instructions on how to set up ROUGE for evaluating your summarization tasks, go here.

Assuming you have piped all your ROUGE results to a file, this tool collects all ROUGE scores into separate CSV files, one per n-gram setting. For example, all ROUGE-1 scores are collected into a ROUGE-1.csv file, and all ROUGE-2 scores into a ROUGE-2.csv file. The precision, recall and F-scores are comma-separated, which lets you easily visualize your results in Excel or OpenOffice. If you have several ROUGE scores for the same run (this usually happens when you use Jackknifing), the scores are averaged.
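The collection and averaging step can be sketched in Python as follows. This is not the rouge2csv script itself, only an illustration of the idea. It assumes each score line looks like "<run id> ROUGE-N Average_R: <score> ..." (with Average_P and Average_F for precision and F-score). The function names, the CSV column names and the column order are hypothetical.

```python
import csv
import os
import re
from collections import defaultdict

# Assumed line layout: "<run_id> <metric> Average_<R|P|F>: <score> ..."
LINE_RE = re.compile(r"^(\S+)\s+(ROUGE-\S+)\s+Average_([RPF]):\s+([0-9.]+)")


def collect_scores(lines):
    """Group scores by metric and run id, averaging repeated entries.

    Returns {metric: [(run_id, precision, recall, f_score), ...]}.
    A score that never appears for a run is reported as None.
    """
    # values[metric][run_id][kind] -> list of scores seen for that kind
    values = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # separator lines and other text are skipped
            run_id, metric, kind, score = match.groups()
            values[metric][run_id][kind].append(float(score))

    table = {}
    for metric, runs in values.items():
        rows = []
        for run_id in sorted(runs):
            avg = {k: sum(v) / len(v) for k, v in runs[run_id].items()}
            rows.append((run_id, avg.get("P"), avg.get("R"), avg.get("F")))
        table[metric] = rows
    return table


def write_csv_files(table, out_dir="."):
    """Write one CSV file per metric, e.g. ROUGE-1.csv and ROUGE-2.csv."""
    for metric, rows in table.items():
        path = os.path.join(out_dir, metric + ".csv")
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["run_id", "precision", "recall", "f_score"])
            writer.writerows(rows)
```

To use the sketch, pass the lines of the piped results file to `collect_scores` and hand the result to `write_csv_files`. Repeated run id and ROUGE-N combinations collapse into a single averaged row.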

Here is a sample input and corresponding output file: [ Input | Output ]. You will notice multiple results with the same run id + ROUGE-N combination in the input file. This is due to the Jackknifing procedure that I used. In the output, however, you will see only one instance, because the scores have been averaged. If you do not use Jackknifing, you will most likely have a single ROUGE score for each run, so you need not worry about this.