# Cdec k-best framework

### From cdec Decoder

In the semiring literature, it is customary to view Viterbi and *k*-best derivation algorithms (I will refer to these as top-*k* algorithms) as inside algorithms where the as semiring used has an addition operation that is max(a,b). Since cdec supports a generic semiring framework, this is one implementation option for these algorithms. However, since one typically wants to compute a variety properties (e.g., the string yield, the length, the number of rules) over the top-*k* derivations as ordered by a single quantity (derivation probability, for example), cdec provides a set of top-*k* algorithms templatized on a semiring that provides the ordering (using `operator<`) and a traversal function, which is essentially a multiplication operation in another domain (such as "concatenate string").

Thus, to get the E-side yield of the best derivation in terms of probability, one can write:

vector<WordID> yield; prob_t best = Viterbi<vector<WordID>, ESentenceTraversal, prob_t, EdgeProb>(hg, &yield);

To get the top 25 E-side yields, you can use the same traversal function with the `KBestDerivations` class.

KBest::KBestDerivations<vector<WordID>, ESentenceTraversal, KBest::NoFilter, prob_t, EdgeProb> kbest(forest, 25); for (int i = 0; i < 25; ++i) { const KBest::KBestDerivations<vector<WordID>, ESentenceTraversal, KBest::NoFilter, prob_t, EdgeProb>::Derivation* d = kbest.LazyKthBest(i); if (!d) break; // there were less than 25 derivations in the forest! cout << TD::GetString(d->yield) << " ||| " << d->feature_values << endl; }

The *k*-best extractor uses the lazy extraction algorithm of Huang and Chiang (IWPT 2005). It supports filtering of derivations during extraction which can be used, for example, to efficiently extract a unique *k*-best list.