Tika / TIKA-491

Add language identification support for Norwegian Bokmål and Norwegian Nynorsk

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7
    • Fix Version/s: None
    • Component/s: languageidentifier
    • Labels:
      None

      Description

      Tika currently has a single Norwegian language profile, "no". We need to distinguish between the two official Norwegian languages, Bokmål and Nynorsk, identified by the ISO 639-1 codes "nb" and "nn". These codes are recommended over the generic "no" tag.

      Proposed solution: remove the current language profile no.ngp and replace it with two new profiles, one for nb and one for nn.

      We must also add tests for Norwegian.
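
      The kind of test asked for above could look roughly like the sketch below. It is a toy, self-contained trigram classifier, not Tika's languageidentifier code and not the .ngp profile format; the class name NorwegianProfileSketch, its methods, and the two sample sentences are all assumptions made for illustration. It only shows the shape of a check that text in each language maps to its own code, "nb" or "nn".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical toy classifier for illustration only; NOT Tika's implementation or API.
class NorwegianProfileSketch {

    // Count character trigrams of the lower-cased text, using '_' as the word boundary.
    static Map<String, Integer> trigrams(String text) {
        String padded = "_" + text.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", "_") + "_";
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 3 <= padded.length(); i++) {
            counts.merge(padded.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    // Sum of all trigram counts in a profile.
    private static double total(Map<String, Integer> profile) {
        double sum = 0.0;
        for (int count : profile.values()) {
            sum += count;
        }
        return sum;
    }

    // Euclidean distance between the relative trigram frequencies of two profiles.
    static double distance(Map<String, Integer> a, Map<String, Integer> b) {
        double totalA = total(a);
        double totalB = total(b);
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        double sum = 0.0;
        for (String key : keys) {
            double fa = totalA == 0 ? 0.0 : a.getOrDefault(key, 0) / totalA;
            double fb = totalB == 0 ? 0.0 : b.getOrDefault(key, 0) / totalB;
            sum += (fa - fb) * (fa - fb);
        }
        return Math.sqrt(sum);
    }

    // Return the code of the nearest profile, or null if no profiles are given.
    static String identify(String text, Map<String, Map<String, Integer>> profiles) {
        Map<String, Integer> query = trigrams(text);
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, Map<String, Integer>> entry : profiles.entrySet()) {
            double d = distance(query, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Tiny illustrative "training" sentences; a real profile needs a large corpus.
        Map<String, Map<String, Integer>> profiles = new LinkedHashMap<>();
        profiles.put("nb", trigrams("Jeg vet ikke hva hun heter"));
        profiles.put("nn", trigrams("Eg veit ikkje kva ho heiter"));
        System.out.println(identify("Jeg vet ikke", profiles));
    }
}
```

      A real test for this issue would load the new nb and nn profiles through Tika itself and use longer sample texts; the two languages share much of their vocabulary, so short inputs are likely to be ambiguous.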

              People

              • Assignee:
                kkrugler Ken Krugler
                Reporter:
                janhoy Jan Høydahl
              • Votes:
                1
                Watchers:
                3
