TIKA-491

Add language identification support for Norwegian Bokmål and Norwegian Nynorsk

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7
    • Fix Version/s: None
    • Component/s: languageidentifier
    • Labels:
      None

      Description

      Currently there is only one Norwegian language profile in Tika, "no". We need to distinguish between the two official written forms of Norwegian, Bokmål and Nynorsk, which have the ISO 639-1 codes "nb" and "nn". These codes are recommended over the generic "no" tag.

      The proposed solution is to remove the current language profile no.ngp and replace it with two new profiles, one for nb and one for nn.

      We must also add tests for Norwegian.
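      To illustrate why two separate profiles can tell nb and nn apart, here is a minimal, self-contained sketch of trigram-based matching. It is not Tika's implementation and does not read .ngp files: the class name NorwegianProfileSketch, the one-sentence training samples, and the overlap score are all assumptions made for illustration. Real profiles would be built from much larger corpora.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: NOT Tika's LanguageIdentifier and NOT the .ngp format.
class NorwegianProfileSketch {

    // Toy "profiles": the set of character trigrams seen in one sample sentence
    // per language. The sample sentences are assumptions for illustration only.
    static final Map<String, Set<String>> PROFILES = new LinkedHashMap<>();
    static {
        PROFILES.put("nb", trigrams("jeg vet ikke hva det er"));
        PROFILES.put("nn", trigrams("eg veit ikkje kva det er"));
    }

    // Collect the distinct character trigrams of the lowercased text,
    // padded with one space on each side so word boundaries count.
    static Set<String> trigrams(String text) {
        String padded = " " + text.toLowerCase().trim() + " ";
        Set<String> result = new HashSet<>();
        for (int i = 0; i + 3 <= padded.length(); i++) {
            result.add(padded.substring(i, i + 3));
        }
        return result;
    }

    // Return the code of the profile sharing the most trigrams with the input.
    // On a tie, the profile registered first wins.
    static String identify(String text) {
        Set<String> input = trigrams(text);
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, Set<String>> entry : PROFILES.entrySet()) {
            Set<String> shared = new HashSet<>(input);
            shared.retainAll(entry.getValue());
            if (shared.size() > bestScore) {
                bestScore = shared.size();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(identify("jeg vet ikke"));   // Bokmål sample
        System.out.println(identify("eg veit ikkje"));  // Nynorsk sample
    }
}
```

      With a single merged "no" profile both inputs would map to the same tag. The tests requested above could follow the same shape: feed known Bokmål and Nynorsk text to the identifier and assert that "nb" and "nn" come back respectively.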

          Activity

          Jan Høydahl created issue -
          Jan Høydahl made changes -
          Field Original Value New Value
          Link This issue is cloned as TIKA-492 [ TIKA-492 ]
          Ken Krugler made changes -
          Assignee Ken Krugler [ kkrugler ]
          Ken Krugler made changes -
          Link This issue relates to TIKA-1723 [ TIKA-1723 ]

            People

            • Assignee:
              Ken Krugler
              Reporter:
              Jan Høydahl
            • Votes:
              1
              Watchers:
              3