SOLR-1423

Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      Because of backwards-compatibility problems (LUCENE-1906), we changed the CharStream/CharFilter API slightly. Tokenizer now has only an input field of type java.io.Reader (as it did before the CharStream code was added). To correct offsets, a Tokenizer must now call Tokenizer.correctOffset(int), which delegates to the CharStream if input is a subclass of CharStream and otherwise returns the offset uncorrected. Normally it is enough to change all occurrences of input.correctOffset() to this.correctOffset() in Tokenizers. We should also check whether the custom Tokenizers in Solr correct their offsets.
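
      The delegation described above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene classes: SketchCharStream, SkipPrefixStream and SketchTokenizer are hypothetical stand-ins invented for illustration, and only the shape of correctOffset(int) (delegate if input is a CharStream, otherwise return the offset unchanged) is taken from the description.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class CorrectOffsetSketch {

    /** Hypothetical stand-in for CharStream: a Reader that can map offsets back to the original text. */
    static abstract class SketchCharStream extends Reader {
        abstract int correctOffset(int currentOff);
    }

    /** Hypothetical filter that drops a fixed-length prefix, so every offset shifts by that length. */
    static class SkipPrefixStream extends SketchCharStream {
        private final Reader in;
        private final int skipped;

        SkipPrefixStream(String text, int skip) {
            this.in = new StringReader(text.substring(skip));
            this.skipped = skip;
        }

        @Override
        int correctOffset(int currentOff) {
            return currentOff + skipped;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            return in.read(buf, off, len);
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    /** Hypothetical stand-in for Tokenizer: it only holds a plain Reader, as in the changed API. */
    static class SketchTokenizer {
        protected final Reader input;

        SketchTokenizer(Reader input) {
            this.input = input;
        }

        /** Delegates to the char stream if input is one; otherwise returns the offset uncorrected. */
        protected final int correctOffset(int currentOff) {
            return (input instanceof SketchCharStream)
                    ? ((SketchCharStream) input).correctOffset(currentOff)
                    : currentOff;
        }
    }

    /** Convenience helper for the demo: correct one offset against a given input. */
    static int corrected(Reader input, int offset) {
        return new SketchTokenizer(input).correctOffset(offset);
    }

    public static void main(String[] args) {
        // Plain Reader: nothing to delegate to, so the offset comes back unchanged.
        System.out.println(corrected(new StringReader("hello"), 3));
        // Char stream that skipped the 3-character prefix "<b>": offsets shift by 3.
        System.out.println(corrected(new SkipPrefixStream("<b>hello", 3), 0));
    }
}
```

      In this sketch the tokenizer never calls input.correctOffset() directly, which mirrors the change from input.correctOffset() to this.correctOffset() requested for the Solr Tokenizers.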

        Attachments

        1. SOLR-1423.patch
          7 kB
          Koji Sekiguchi
        2. SOLR-1423-FieldType.patch
          0.6 kB
          Uwe Schindler
        3. SOLR-1423.patch
          8 kB
          Uwe Schindler
        4. SOLR-1423.patch
          8 kB
          Koji Sekiguchi
        5. SOLR-1423-with-empty-tokens.patch
          11 kB
          Uwe Schindler
        6. SOLR-1423-fix-empty-tokens.patch
          13 kB
          Uwe Schindler
        7. SOLR-1423-fix-empty-tokens.patch
          14 kB
          Uwe Schindler

              People

              • Assignee: Koji Sekiguchi (koji)
              • Reporter: Uwe Schindler (thetaphi)