cdec translation inputs can be annotated with additional information via SGML(-like) markup. Each input sentence to be translated should be wrapped in <seg> tags. The following attributes are supported, and all of them are optional:

  • id - indicates the sentence number. This must be a non-negative integer.
  • grammar - supplemental grammar file for this sentence (read about per-sentence grammars).
  • src_tree - Penn Treebank parse of the source sentence.
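
As an illustration of the markup described above, the following is a minimal sketch of a helper that wraps one sentence in a <seg> tag. The function name make_seg, the file name grammar.0.gz, and the choice of double-quoted attribute values are assumptions made for this example and are not part of cdec itself. The sketch also simply rejects values containing a double quote, since how cdec treats escaped characters inside attributes is not covered here.

```python
def make_seg(sentence, sent_id=None, grammar=None, src_tree=None):
    """Wrap one source sentence in a <seg> tag (hypothetical helper).

    All three attributes are optional, mirroring the list above.
    """
    attrs = []
    if sent_id is not None:
        # id must be a non-negative integer (bool is excluded explicitly,
        # because Python treats True/False as integers).
        if isinstance(sent_id, bool) or not isinstance(sent_id, int) or sent_id < 0:
            raise ValueError("id must be a non-negative integer")
        attrs.append(("id", str(sent_id)))
    if grammar is not None:
        attrs.append(("grammar", grammar))
    if src_tree is not None:
        attrs.append(("src_tree", src_tree))

    for name, value in attrs:
        # Assumption: values are written inside double quotes, so a literal
        # double quote in a value is rejected rather than escaped.
        if '"' in value:
            raise ValueError(f"attribute {name!r} must not contain a double quote")

    attr_text = "".join(f' {name}="{value}"' for name, value in attrs)
    return f"<seg{attr_text}>{sentence}</seg>"


if __name__ == "__main__":
    # grammar.0.gz is a placeholder file name used only for illustration.
    print(make_seg("das ist ein haus", sent_id=0, grammar="grammar.0.gz"))
    # prints: <seg id="0" grammar="grammar.0.gz">das ist ein haus</seg>
```

Because every attribute is optional, calling make_seg("das ist ein haus") with no other arguments yields a bare <seg>das ist ein haus</seg> line.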
