Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      Create a TypeTokenFilterFactory to make the TypeTokenFilter (filtering tokens depending on token types, see LUCENE-3671) available in Solr too.

      1. SOLR-3054_2.patch
        9 kB
        Tommaso Teofili
      2. SOLR-3054_3.patch
        9 kB
        Tommaso Teofili
      3. SOLR-3054.patch
        10 kB
        Tommaso Teofili

        Issue Links

          Activity

          Hide
          Tommaso Teofili added a comment -

          attached factory with unit test patch

          Show
          Tommaso Teofili added a comment - attached factory with unit test patch
          Hide
          Uwe Schindler added a comment -

          Hi Tommaso,

          patch look good, a few things:

          • I think the default should be to enable positionIncrements. I have no idea what the default for the other FilteringTFs is, but Solr has the correct posIncr handling in its default configs enabled.
          • The MockResourceLoader is funny, but why do you need it? For the creation test, simply also use the supplied stoptype-files.
          Show
          Uwe Schindler added a comment - Hi Tommaso, patch look good, a few things: I think the default should be to enable positionIncrements. I have no idea what the default for the other FilteringTFs is, but Solr has the correct posIncr handling in its default configs enabled. The MockResourceLoader is funny, but why do you need it? For the creation test, simply also use the supplied stoptype-files.
          Hide
          Tommaso Teofili added a comment -

          Thanks Uwe, I've fixed the patch as per your comments.
          Also I've added that TypeTokenFF implements the ResourceLoaderAware interface.

          Show
          Tommaso Teofili added a comment - Thanks Uwe, I've fixed the patch as per your comments. Also I've added that TypeTokenFF implements the ResourceLoaderAware interface.
          Hide
          Uwe Schindler added a comment -

          I looked up the other TokenFilters that filter tokens, unfortunately all of then default to enablePosIncr=false. I am not sure what the right solution is here? Consistency or correctness? Robert, whats your opinion?

          The rest of the patch looks fine, I would only remove the try-catch blocks in the test methods and let the test method declare the exception. It then gets reported by JUnit with a failure automatically.

          The question is, the wordset is initialized to be empty if missing. Does it make sense? I would maybe make the types file mandatory, as without the filter makes no sense.

          Show
          Uwe Schindler added a comment - I looked up the other TokenFilters that filter tokens, unfortunately all of then default to enablePosIncr=false. I am not sure what the right solution is here? Consistency or correctness? Robert, whats your opinion? The rest of the patch looks fine, I would only remove the try-catch blocks in the test methods and let the test method declare the exception. It then gets reported by JUnit with a failure automatically. The question is, the wordset is initialized to be empty if missing. Does it make sense? I would maybe make the types file mandatory, as without the filter makes no sense.
          Hide
          Tommaso Teofili added a comment -

          I looked up the other TokenFilters that filter tokens, unfortunately all of then default to enablePosIncr=false. I am not sure what the right solution is here? Consistency or correctness?

          in the first patch I went for consistency but then your comment made me realize the enablePosIncr should be true by default. I mean, as a user I'd expect it to be true by default.

          I would only remove the try-catch blocks in the test methods and let the test method declare the exception. It then gets reported by JUnit with a failure automatically.

          ok

          The question is, the wordset is initialized to be empty if missing. Does it make sense? I would maybe make the types file mandatory, as without the filter makes no sense.

          right, need to fix that

          Show
          Tommaso Teofili added a comment - I looked up the other TokenFilters that filter tokens, unfortunately all of then default to enablePosIncr=false. I am not sure what the right solution is here? Consistency or correctness? in the first patch I went for consistency but then your comment made me realize the enablePosIncr should be true by default. I mean, as a user I'd expect it to be true by default. I would only remove the try-catch blocks in the test methods and let the test method declare the exception. It then gets reported by JUnit with a failure automatically. ok The question is, the wordset is initialized to be empty if missing. Does it make sense? I would maybe make the types file mandatory, as without the filter makes no sense. right, need to fix that
          Hide
          Robert Muir added a comment -

          in the first patch I went for consistency but then your comment made me realize the enablePosIncr should be true by default. I mean, as a user I'd expect it to be true by default.

          I think we should open an issue to fix this in 4.0... I don't care if everything
          in the defaults^H^H^H^Hexample has it set to true, the factories should also default to true.

          Show
          Robert Muir added a comment - in the first patch I went for consistency but then your comment made me realize the enablePosIncr should be true by default. I mean, as a user I'd expect it to be true by default. I think we should open an issue to fix this in 4.0... I don't care if everything in the defaults^H^H^H^Hexample has it set to true, the factories should also default to true.
          Hide
          Uwe Schindler added a comment -

          OK, I think we should commit them with "false" as default for consistency (especially in 3.x), and upgrade 4.0 in a separate issue.

          Show
          Uwe Schindler added a comment - OK, I think we should commit them with "false" as default for consistency (especially in 3.x), and upgrade 4.0 in a separate issue.
          Hide
          Tommaso Teofili added a comment -

          updated patch:

          • enablePosIncr set to false
          • fixed unit tests for throwing Exceptions rather than try/catching
          • 'types' TypeTokenFF parameter is now mandatory (SolrException raised if not supplied)
          Show
          Tommaso Teofili added a comment - updated patch: enablePosIncr set to false fixed unit tests for throwing Exceptions rather than try/catching 'types' TypeTokenFF parameter is now mandatory (SolrException raised if not supplied)
          Hide
          Robert Muir added a comment -

          +1

          Show
          Robert Muir added a comment - +1
          Hide
          Uwe Schindler added a comment -

          +1, I will commit this later!

          Show
          Uwe Schindler added a comment - +1, I will commit this later!
          Hide
          Uwe Schindler added a comment -

          Committed trunk revision: 1234573 (main), 1234579 (removal of unneeded assureMatchVersion check)
          Committed 3.x revision: 1234580

          As noted, I removed the not-needed assureMatchVersionCheck(), as this filter does not use the matchVersion at all.

          Show
          Uwe Schindler added a comment - Committed trunk revision: 1234573 (main), 1234579 (removal of unneeded assureMatchVersion check) Committed 3.x revision: 1234580 As noted, I removed the not-needed assureMatchVersionCheck(), as this filter does not use the matchVersion at all.
          Hide
          Tommaso Teofili added a comment -

          Thanks Uwe

          Show
          Tommaso Teofili added a comment - Thanks Uwe

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Tommaso Teofili
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development