• Type: Improvement
• Status: Open
• Priority: Major
• Resolution: Unresolved
• Affects Version/s: None
• Fix Version/s: 6.2
• Component/s: None
• Labels:


      This is a sub-ticket of JOSHUA-273.

      Joshua output formatting is a mess. The StructuredTranslation piece is a good step in the right direction, but many problems remain. Here is a list of problems, followed by a list of proposed corrections.

      • Four variables currently define separate code paths for formatting the output: the run mode (regular mode or one of two server modes), whether use_structured_translations is set, whether topN == 0 (i.e., whether we are outputting k-best lists or just the quick Viterbi best), and whether we are projecting case or denormalizing the output.
      • In TCP mode, the server iterates over the Translation objects returned by Translations and calls Translation.toString() on each. %S formatting and recasing are applied.
      • In HTTP mode, the server builds a JSONMessage, which in turn calls translation.getStructuredTranslations.get(0).getTranslationString(). Neither recasing nor %S formatting is applied.
      • In regular mode, we call Translation.toString(). The output is actually formatted in the Translation constructor, in a complicated way, using different methods depending on whether (a) use_structured_translations is set and (b) topN == 0. The result is a mess of nested, redundant output formatting. Some of these methods in turn rely on separate formatting applied in KBestExtractor's constructor.


      • Get rid of topN==0. Viterbi extraction should be quicker than k-best and should be used automatically whenever possible. The same output formatting should apply in either case.
      • We should always use structured outputs, even collapsing StructuredTranslation into Translation.
      • Move all output formatting out of KBestExtractor, which should just return k-best items.
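
      The corrections above could look roughly like the following. This is only a minimal sketch, not Joshua's actual code: OutputFormatSketch, Hypothesis, extract, format, and render are hypothetical names, the "id ||| text ||| score" layout is an assumed output format, and capitalizing the first letter stands in for real recasing. The point it illustrates is that extraction returns plain items, and a single formatting routine is applied to them no matter how many outputs were requested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch; none of these names are taken from Joshua's API. */
final class OutputFormatSketch {

    /** A bare extraction result: no output formatting has been applied to it. */
    static final class Hypothesis {
        final String words;
        final double score;

        Hypothesis(String words, double score) {
            this.words = words;
            this.score = score;
        }
    }

    /**
     * Stand-in for the extractor: returns the best items, best first, and nothing else.
     * Whether a real decoder gets them by Viterbi or k-best extraction is an internal
     * choice; here a request for fewer than one output is simply treated as one.
     */
    static List<Hypothesis> extract(List<Hypothesis> candidates, int topN) {
        List<Hypothesis> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.score, a.score));
        int n = Math.min(Math.max(1, topN), sorted.size());
        return new ArrayList<>(sorted.subList(0, n));
    }

    /** The single formatting path (assumed layout: "id ||| text ||| score"). */
    static String format(int sentenceId, Hypothesis h, boolean recase) {
        String text = h.words;
        if (recase && !text.isEmpty()) {
            // Placeholder for real recasing: upper-case the first character only.
            text = Character.toUpperCase(text.charAt(0)) + text.substring(1);
        }
        return sentenceId + " ||| " + text + " ||| "
                + String.format(Locale.ROOT, "%.3f", h.score);
    }

    /** Every caller (regular mode or either server mode) would go through this. */
    static List<String> render(int sentenceId, List<Hypothesis> candidates,
                               int topN, boolean recase) {
        List<String> lines = new ArrayList<>();
        for (Hypothesis h : extract(candidates, topN)) {
            lines.add(format(sentenceId, h, recase));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Hypothesis> candidates = Arrays.asList(
                new Hypothesis("a worse one", -3.0),
                new Hypothesis("the best one", -1.5));
        for (String line : render(0, candidates, 2, true)) {
            System.out.println(line);
        }
    }
}
```

      With this shape, asking for one output or for several changes only the number of lines returned. The first line is formatted identically in both cases, which is the behavior the first correction asks for.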




            • Assignee:
              post Matt Post


              • Created: