Solr / SOLR-4412

LanguageIdentifier lcmap for language field

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 4.4, 6.0
    • Component/s: contrib - LangId
    • Labels:
      None

      Description

      For some languages, the detector reports sub-language codes: LangDetect, for example, returns zh-tw or zh-cn for Chinese, whereas the Tika detector returns only zh. Today you can use lcmap to fold these two into one code, e.g. langid.map.lcmap=zh-cn:zh zh-tw:zh, but that mapping applies only to the mapped field names. The value written to langField is still the unmapped code.

      We need an equivalent option that applies to langField as well.
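
      To make the gap concrete, here is a minimal sketch of the behaviour described above. It is not Solr's implementation: the helper names parse_lcmap and annotate, the field name "text", and the lang_lcmap argument (standing in for the requested option) are all hypothetical and used only for illustration.

```python
def parse_lcmap(spec):
    """Parse a space-separated list of 'from:to' pairs, e.g. 'zh-cn:zh zh-tw:zh'."""
    mapping = {}
    for pair in spec.split():
        src, sep, dst = pair.partition(":")
        if not sep or not src or not dst:
            raise ValueError(f"malformed lcmap entry: {pair!r}")
        mapping[src] = dst
    return mapping


def annotate(detected, map_lcmap, lang_lcmap=None, field="text"):
    """Return the mapped field name and the value that would go into langField.

    map_lcmap  -- mapping applied to the field-name suffix (today's behaviour)
    lang_lcmap -- hypothetical mapping applied to the langField value (the request)
    """
    lang_lcmap = lang_lcmap or {}
    return {
        "mapped_field": f"{field}_{map_lcmap.get(detected, detected)}",
        "lang_field_value": lang_lcmap.get(detected, detected),
    }


lcmap = parse_lcmap("zh-cn:zh zh-tw:zh")

# Today: the field name is normalised, the language field is not.
today = annotate("zh-tw", lcmap)

# Requested: the same mapping is applied to the language field too.
wanted = annotate("zh-tw", lcmap, lang_lcmap=lcmap)
```

      In this sketch, today maps the field to text_zh while leaving the language value as zh-tw; wanted yields zh for both.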

            People

            • Assignee:
              janhoy Jan Høydahl
              Reporter:
              janhoy Jan Høydahl
            • Votes:
              0
              Watchers:
              4