Cdec sample grammar and test set
If you just want to see cdec run, this page describes how to download a (smallish) translation model, language model, weights file, and test set, and how to decode the test set with cdec. To run this demo, you will need to have cdec downloaded, installed, and built (see the build instructions). You will also need at least 1.5GB of free memory on your system.
This page assumes that $CDEC refers to the root path of your cdec software tree.
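Both prerequisites can be checked before starting. The following is a minimal sketch, not part of cdec: the variable names are hypothetical, and the memory test assumes a Linux system that exposes a MemAvailable field in /proc/meminfo. On other systems the memory status is reported as unknown.

```shell
# Hypothetical pre-flight check; the variable names are illustrative only.
# Assumption: Linux with a MemAvailable field in /proc/meminfo.
need_kb=$((1536 * 1024))   # 1.5GB expressed in kB

# Is $CDEC set, and does it point at a directory?
if [ -d "${CDEC:-}" ]; then
    cdec_status="found"
else
    cdec_status="missing"
fi

# Read available memory in kB; stays empty if the field cannot be read.
avail_kb=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo 2>/dev/null || true)
if [ -z "$avail_kb" ]; then
    mem_status="unknown"
elif [ "$avail_kb" -ge "$need_kb" ]; then
    mem_status="ok"
else
    mem_status="low"
fi

echo "cdec tree: $cdec_status"
echo "memory: $mem_status (need at least $need_kb kB available)"
```

A "missing" result means $CDEC is unset or does not name a directory; a "low" result suggests freeing memory before decoding.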
Download the data
You can download the data with the following command on most Linux-like systems:
wget http://ling.umd.edu/~redpony/cdec-demo.tar.bz2
Or you can get it here. The download size is 164MB.
Unzip the data and run the decoder
tar xjf cdec-demo.tar.bz2
cd cdec-demo/
$CDEC -c cdec-mt03.ini -w weights.tuned -i mt03.src.txt > mt03.trans
The decoding process will take a couple of minutes on most modern systems with enough (i.e., > 1.5GB) memory.
Score the output
cdec includes a script to score a translation using a number of different metrics. By default, the Papineni et al. (2002) definition of BLEU is used:
$CDEC/vest/fast_score -r mt03.ref.0 -r mt03.ref.1 -r mt03.ref.2 -r mt03.ref.3 -i mt03.trans
The following output is expected:
Loading references (4 files)
Loaded reference translations for 919 sentences.
Loaded 919 references for scoring with ibm_bleu
BLEU = 32.20, 76.2|42.8|24.2|13.8 (brev=0.997)
0.321954
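The figures on the BLEU line can be sanity-checked by hand. In the Papineni et al. definition, BLEU is the brevity penalty multiplied by the geometric mean of the 1- to 4-gram precisions. The sketch below assumes that the four values separated by `|` are those precisions in percent and that `brev` is the brevity penalty; this is a reading of the log format, not something the scorer's output states explicitly.

```shell
# Recompute BLEU from the rounded figures printed in the log.
# Assumption: 76.2|42.8|24.2|13.8 are the 1- to 4-gram precisions (percent)
# and brev=0.997 is the brevity penalty.
bleu=$(awk 'BEGIN {
    n = split("76.2 42.8 24.2 13.8", p, " ")
    brev = 0.997
    # Geometric mean = exp of the average log precision.
    for (i = 1; i <= n; i++) sum += log(p[i] / 100)
    printf "%.4f", brev * exp(sum / n)
}')
echo "BLEU recomputed from rounded precisions: $bleu"
```

The result comes out at roughly 0.322, in line with the reported 0.321954. The small gap is expected because the precisions and the brevity penalty in the log are rounded.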
