  1. OpenNLP
  2. OPENNLP-1190

CONLL02 format

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: tools-1.5.3
    • Fix Version/s: None
    • Component/s: Formats
    • Labels: None
    • Flags: Patch, Important

    Description

      According to the documentation, the following should work:

       bin/opennlp TokenNameFinderConverter conll02 -data esp.train -lang es -types per > es_corpus_train_persons.txt

      However, the command currently fails with an error message because the converter expects 3 columns per line, whereas the dataset contains only 2.

      This is a bug, introduced at line 130 of opennlp.tools.formats.Conll02NameSampleStream.java, where a length of 3 is imposed.
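
      One way to check the column claim independently of OpenNLP is to count the whitespace-separated fields on each non-blank line of the training file. The following is a minimal sketch, not part of OpenNLP: the function name `column_counts` and the sample lines are hypothetical and chosen only for illustration.

```python
from collections import Counter


def column_counts(lines):
    """Count how many whitespace-separated fields each non-blank line has."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # blank lines separate sentences, so they are skipped
            counts[len(fields)] += 1
    return counts


# Hypothetical two-column sample (token, entity tag); not copied from esp.train.
sample = [
    "Juan B-PER",
    "vive O",
    "en O",
    "Madrid B-LOC",
    "",
    "Ana B-PER",
]

# Maps each observed field count to the number of lines that have it.
print(dict(column_counts(sample)))
```

      Passing an open file handle for esp.train in place of `sample` would show which field counts actually occur in the dataset, which can then be compared with the count the converter expects.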

            People

              Assignee: Martin Wiesner (mawiesne)
              Reporter: Luca (lucatoldo)
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 1h
                  Remaining: 1h
                  Logged: Not Specified