Solr / SOLR-2259

Improve analyzer/version handling in Solr

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: None
    • Labels: None

    Description

      We added Version to Lucene for backwards compatibility support.
      We use it to trigger deprecated code paths that emulate an old version, which keeps existing indexes backwards compatible.
      Related: we deprecate old analysis components and eventually remove them.
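
      A minimal sketch of how a match version can gate a deprecated code path. The MatchVersion enum and the LowercasingSketch class are hypothetical stand-ins written for illustration, not Lucene classes, and the "old" versus "new" behavior shown is an assumption chosen only to make the branch visible.

```java
import java.util.Locale;

// Hypothetical stand-in for Lucene's Version constants; not the real class.
enum MatchVersion {
    LUCENE_24, LUCENE_30, LUCENE_31;

    // True if this version is the same as or newer than `other`.
    boolean onOrAfter(MatchVersion other) {
        return compareTo(other) >= 0;
    }
}

// Hypothetical analysis component whose behavior is assumed to have changed in 3.1.
class LowercasingSketch {
    private final MatchVersion matchVersion;

    LowercasingSketch(MatchVersion matchVersion) {
        this.matchVersion = matchVersion;
    }

    String normalize(String term) {
        if (matchVersion.onOrAfter(MatchVersion.LUCENE_31)) {
            // Current behavior (assumed for illustration): trim, then lowercase.
            return term.trim().toLowerCase(Locale.ROOT);
        }
        // Deprecated path: emulate the older behavior so terms still match
        // what an existing index already contains.
        return term.toLowerCase(Locale.ROOT);
    }
}
```

      An index built under the older setting keeps matching as long as the same constant is passed at query time; switching the constant changes the emitted terms, which is why a reindex is needed.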

      When this was first hooked into Solr, it defaulted to Version 2.4 emulation everywhere, with only the example config set to the latest version.
      So if you don't specify a version in your solrconfig, it defaults to 2.4.

      However, as of LUCENE-2781, 2.4 is removed. Users with old configs that don't specify a version should not be silently "upgraded" to Version 3.0 emulation; that would be bad.

      Additionally, users who rely on deprecated emulation or deprecated factories might not know it, and it might come as a surprise when they upgrade, especially if they aren't looking at Java APIs or Java code.

      I propose:

      1. In trunk: we make luceneMatchVersion mandatory in solrconfig.
        This is simple: Uwe already has a method that will error out if it's not present; we just use that.
      2. In 3.x: we warn if you don't specify luceneMatchVersion in solrconfig, telling you that it's going to be required in 4.0 and that you are defaulting to 2.4 emulation.
        For example: Warning: luceneMatchVersion is not specified in solrconfig.xml. Defaulting to 2.4 emulation. You should at some point declare and reindex to at least 3.0, because 2.4 emulation is deprecated in 3.x and will be removed in 4.0. This parameter will be mandatory in 4.0.
      3. In 3.x and trunk: we warn if you are using a deprecated matchVersion constant anywhere, even for a single tokenizer, telling you that at some point you need to reindex with a current version before you can move to the next release.
        For example: Warning: you are using 2.4 emulation. At some point you need to bump and reindex to at least 3.0, because 2.4 emulation is deprecated in 3.x and will be removed in 4.0.
      4. In 3.x and trunk: we warn if you are using a deprecated TokenStreamFactory, so that you know it's going to be removed.
        For example: Warning: the ISOLatin1FilterFactory is deprecated and will be removed in the next release. You should migrate to ASCIIFoldingFilterFactory.
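
      The four points above can be sketched roughly as follows. This is not the attached patch: MatchVersionResolver, its V enum, the OLDEST_SUPPORTED cutoff, and OldFilterFactory are hypothetical names introduced for illustration, and warnings are collected in a list rather than logged so the behavior is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed checks; none of these names come from Solr.
class MatchVersionResolver {
    // Stand-in for the real version constants.
    enum V { LUCENE_24, LUCENE_29, LUCENE_30, LUCENE_31 }

    // Assumption: anything older than 3.0 counts as deprecated emulation.
    static final V OLDEST_SUPPORTED = V.LUCENE_30;

    // Hypothetical deprecated factory used to demonstrate point 4.
    @Deprecated
    static class OldFilterFactory {}

    final List<String> warnings = new ArrayList<>();
    private final boolean mandatory; // true models trunk, false models 3.x

    MatchVersionResolver(boolean mandatory) {
        this.mandatory = mandatory;
    }

    // Points 1-3: resolve the configured value, or null if the config omits it.
    V resolve(String configured) {
        if (configured == null) {
            if (mandatory) {
                throw new IllegalArgumentException(
                    "luceneMatchVersion must be specified in solrconfig.xml");
            }
            warnings.add("luceneMatchVersion is not specified in solrconfig.xml. "
                + "Defaulting to 2.4 emulation.");
            configured = "LUCENE_24";
        }
        V version = V.valueOf(configured);
        if (version.compareTo(OLDEST_SUPPORTED) < 0) {
            warnings.add("you are using " + version
                + " emulation, which is deprecated; reindex with a current version.");
        }
        return version;
    }

    // Point 4: warn when a factory class carries @Deprecated (retained at runtime).
    void checkFactory(Class<?> factoryClass) {
        if (factoryClass.isAnnotationPresent(Deprecated.class)) {
            warnings.add("the " + factoryClass.getSimpleName()
                + " is deprecated and will be removed in the next release.");
        }
    }
}
```

      In this sketch a missing value in lenient mode produces two warnings (missing setting, then deprecated emulation); a real implementation might fold them into the single message proposed in point 2.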

      Attachments

        1. SOLR-2259_part3.patch
          1 kB
          Robert Muir
        2. SOLR-2259.patch
          25 kB
          Robert Muir
        3. SOLR-2259.patch
          22 kB
          Robert Muir
        4. SOLR-2259part2.patch
          1 kB
          Robert Muir
        5. SOLR-2259part4.patch
          4 kB
          Robert Muir

          People

            Assignee: Robert Muir (rcmuir)
            Reporter: Robert Muir (rcmuir)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:
