  1. Lucene - Core
  2. LUCENE-4590

WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • None
    • None
    • Component/s: modules/benchmark
    • None

    Description

      It may be convenient to split Wikipedia's line file into two separate files: one for category pages and one for all other pages.
      The original line file can be split afterwards with grep or a similar tool,
      but it is more efficient to write the two files separately when the line file is first created.
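
      For comparison, the after-the-fact split (the "grep or such" alternative) might look roughly like the sketch below. This is not the code in the attached patch. It assumes that each line of the line file is tab-separated with the page title as the first field, and that category page titles start with "Category:". The names is_category_line and split_line_file are hypothetical.

```python
CATEGORY_PREFIX = "Category:"  # assumption: category page titles start with this
SEP = "\t"                     # assumption: fields are tab-separated, title first


def is_category_line(line: str) -> bool:
    """Return True if the first field (assumed to be the title) names a category page."""
    title = line.split(SEP, 1)[0]
    return title.startswith(CATEGORY_PREFIX)


def split_line_file(src, categories_out, others_out):
    """Copy every line of src into one of two files.

    Returns (number of category lines, number of other lines).
    Any header line in src does not match the prefix, so it lands in others_out.
    """
    n_cat = n_other = 0
    with open(src, encoding="utf-8") as fin, \
            open(categories_out, "w", encoding="utf-8") as fcat, \
            open(others_out, "w", encoding="utf-8") as foth:
        for line in fin:
            if is_category_line(line):
                fcat.write(line)
                n_cat += 1
            else:
                foth.write(line)
                n_other += 1
    return n_cat, n_other
```

      Such a post-processing pass has to read the whole line file a second time, which is the extra cost that writing the two files up front avoids.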

      Attachments

        1. LUCENE-4590.patch
          13 kB
          Doron Cohen

          People

            Assignee: Doron Cohen (doronc)
            Reporter: Doron Cohen (doronc)
            Votes: 0
            Watchers: 2
