The following benchmarks are available:

  • MateCat post-edits: documents including source text, post-edited suggestions and final post-edits
  • BinQE: collection of multilingual source texts and MT outputs with binary quality estimation labels
  • BitterCorpus: collection of parallel English-Italian documents in the IT domain, with domain-specific terms manually marked and aligned
  • Word-alignment Gold Reference: collection of human-checked word alignments for English-Italian sentence pairs in the Legal domain

A detailed description of each resource, along with download information, is available via the corresponding link.