  1. OpenNLP
  2. OPENNLP-1346

The Training API code for Tokenization is outdated in manual (1/2)


Details

    Description

      The Training API example code at https://opennlp.apache.org/docs/1.9.4/manual/opennlp.html, in the section dealing with Tokenizer training, is incorrect. The current code sample is:

      ObjectStream<String> lineStream = new PlainTextByLineStream(new FileInputStream("en-sent.train"),
          StandardCharsets.UTF_8);

      But PlainTextByLineStream no longer takes an InputStream as the first argument to its constructor. It now requires an InputStreamFactory.
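
To illustrate why a factory is a better fit than a single InputStream, here is a minimal, self-contained sketch. It uses only the JDK: `StreamFactory` and `LineStream` are hypothetical stand-ins written for this example, not the OpenNLP classes. The point is that a line stream built on a factory can start over by asking for a fresh stream, which a one-shot `FileInputStream` cannot do. The comment at the top shows what the analogous OpenNLP call would presumably look like, assuming `MarkableFileInputStreamFactory` from `opennlp.tools.util` is used; that part was not compiled here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// With opennlp-tools on the classpath, the analogous call would presumably be
// (assumption, not compiled as part of this sketch):
//   InputStreamFactory in = new MarkableFileInputStreamFactory(new File("en-sent.train"));
//   ObjectStream<String> lineStream = new PlainTextByLineStream(in, StandardCharsets.UTF_8);
public class FactoryLineStreamSketch {

    /** Hypothetical stand-in for a stream factory: each call opens a fresh stream. */
    interface StreamFactory {
        InputStream createInputStream() throws IOException;
    }

    /** Hypothetical line reader built on a factory rather than on a single stream. */
    static final class LineStream implements AutoCloseable {
        private final StreamFactory factory;
        private final Charset charset;
        private BufferedReader reader;

        LineStream(StreamFactory factory, Charset charset) throws IOException {
            this.factory = factory;
            this.charset = charset;
            reset();
        }

        /** Returns the next line, or null at end of input. */
        String read() throws IOException {
            return reader.readLine();
        }

        /** Starts over: closes the current stream and asks the factory for a new one. */
        void reset() throws IOException {
            if (reader != null) {
                reader.close();
            }
            reader = new BufferedReader(
                    new InputStreamReader(factory.createInputStream(), charset));
        }

        @Override
        public void close() throws IOException {
            reader.close();
        }
    }

    /** Reads the first line, resets the stream, and reads the first line again. */
    static String[] firstLineTwice(byte[] data) {
        StreamFactory factory = () -> new ByteArrayInputStream(data);
        try (LineStream lines = new LineStream(factory, StandardCharsets.UTF_8)) {
            String first = lines.read();
            lines.reset();
            String again = lines.read();
            return new String[] {first, again};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "first sentence\nsecond sentence\n".getBytes(StandardCharsets.UTF_8);
        String[] result = firstLineTwice(data);
        System.out.println(result[0] + " | " + result[1]);
    }
}
```

After `reset()`, the next read returns the first line again, because the factory hands out a new stream each time. Multi-pass training code tends to rely on this behaviour.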

      NOTE: this same pattern reappears in multiple places in the current manual. See also OPENNLP-1319 and OPENNLP-1345.

       


            People

              mawiesne Martin Wiesner
              sprhodes Phillip Rhodes