Multi-pass scoring

cdec supports multi-pass scoring of a translation forest. That is, a large search space can first be scored with coarse but inexpensive features to identify its least promising portions, which are then pruned. Progressively more fine-grained (and more expensive to compute) features can then be applied to the smaller, pruned space.
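
The idea can be illustrated with a minimal Python sketch. This is not cdec's code or API: cdec scores and prunes a hypergraph forest, whereas this toy works on a flat list of candidate strings and prunes with a simple keep-the-top-k rule. The names `multi_pass`, `cheap_features`, `expensive_features`, and `keep` are hypothetical, as are the toy feature definitions.

```python
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]
Scorer = Callable[[str], Features]
# One pass = (feature extractor, weights over all features so far, how many to keep)
Pass = Tuple[Scorer, Features, Optional[int]]


def dot(weights: Features, feats: Features) -> float:
    """Linear model score: sum of weight * feature value."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def multi_pass(candidates: List[str], passes: List[Pass]) -> List[str]:
    """Score, sort, and prune the candidates once per pass, in order."""
    survivors: List[Tuple[str, Features]] = [(c, {}) for c in candidates]
    for scorer, weights, keep in passes:
        # Features from earlier passes are kept; this pass adds its own.
        rescored = [(c, {**feats, **scorer(c)}) for c, feats in survivors]
        rescored.sort(key=lambda item: dot(weights, item[1]), reverse=True)
        # keep=None means "no pruning after this pass".
        survivors = rescored if keep is None else rescored[:keep]
    return [c for c, _ in survivors]


def cheap_features(candidate: str) -> Features:
    """Toy first-pass feature: a word penalty (negative length)."""
    return {"WordPenalty": -float(len(candidate.split()))}


expensive_calls: List[str] = []


def expensive_features(candidate: str) -> Features:
    """Toy stand-in for a costly model: penalize adjacent repeated words."""
    expensive_calls.append(candidate)  # record how often the costly scorer runs
    words = candidate.split()
    repeats = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return {"LM2": -float(repeats)}


candidates = [
    "the cat sat",
    "the the cat sat",
    "the cat cat sat sat",
    "a cat sat on the mat mat",
]
ranked = multi_pass(
    candidates,
    [
        (cheap_features, {"WordPenalty": 1.0}, 2),  # first pass: keep 2
        (expensive_features, {"WordPenalty": 1.0, "LM2": 2.0}, None),
    ],
)
print(ranked)
print(len(expensive_calls), "calls to the expensive scorer")
```

The point of the design is that the expensive scorer only ever sees the candidates that survived the cheap pass, so its cost scales with the pruned space rather than the full one.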


Example cdec.ini:

formalism=scfg
add_pass_through_rules=true
# first pass features -- LM is a bigram LM
feature_function=NonLatinCount
feature_function=WordPenalty
feature_function=KLanguageModel /fs/clip-dissertation/twophase/a2e.2gm.klm
# first pass weights
weights=/fs/clip-dissertation/twophase/second-pass-mert/weights.first
# the amount of the search space to prune
density_prune=240

# second pass features
feature_function2=LanguageModel lm://dsub01.umiacs.umd.edu:6668 -o 5 -n LM2

Using this functionality, cdec could implement the approach described in "Coarse-to-Fine Syntactic Machine Translation using Language Projections" (Slav Petrov, Aria Haghighi, and Dan Klein, EMNLP 2008).