SOLR-6163: special chars and ManagedSynonymFilterFactory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.8
    • Fix Version/s: 4.10, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Hey,

      I was playing with the ManagedSynonymFilterFactory to create a synonym list through the API, but I have difficulty deleting keys that contain special characters (or spaces).

      I added a key ééé that matches with some other words. It's saved in the synonym file as ééé.

      When I try to delete it, I do:

      curl -X DELETE "http://localhost/solr/mycore/schema/analysis/synonyms/english/ééé"

      error message: %C3%A9%C3%A9%C3%A9%C2%B5 not found in /schema/analysis/synonyms/english

      My wild guess is that %C3%A9%C3%A9%C3%A9 isn't decoded back to ééé, and that's why it can't find the keyword?

      1. SOLR-6163.patch
        5 kB
        Vitaliy Zhovtyuk
      2. SOLR-6163-v2.patch
        12 kB
        Timo Hund
      3. SOLR-6163-v3.patch
        6 kB
        Timo Hund
      4. SOLR-6163-v4.patch
        8 kB
        Timo Hund

          Activity

          Timothy Potter added a comment -

          I'll take a look, but I think the fix should be upstream from the managed resource implementations; it seems like Restlet should have already done the decoding?

          ASF GitHub Bot added a comment -

          GitHub user timoschmidt opened a pull request:

          https://github.com/apache/lucene-solr/pull/73

          SOLR-6163: special chars and ManagedSynonymFilterFactory

          Special characters could not be used for update or deletion because the URL was not decoded before the resource was used.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/timoschmidt/lucene-solr origin/branch_4x

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/lucene-solr/pull/73.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #73


          commit 0168e160e4a9236b047b2e24909d1f59dfd3eb7b
          Author: timo.schmidt <timo-schmidt@gmx.net>
          Date: 2014-07-25T12:44:26Z

          SOLR-6163: special chars and ManagedSynonymFilterFactory


          Hoss Man added a comment -

          A quick glance at Timo's patch and the javadocs for the associated Restlet classes seems to suggest that this is the correct general course of action...

          http://restlet.com/learn/javadocs/2.1/jse/api/org/restlet/data/Reference.html#getPath%28%29
          "Note that no URI decoding is done by this method. "

          A cleaner fix is probably to use this alternative Restlet method that can decode for us ...

          http://restlet.com/learn/javadocs/2.1/jse/api/org/restlet/data/Reference.html#getPath%28boolean%29

          There are lots of similar "Note that no URI decoding is done by this method." and "Returns the optionnally decoded ______" combinations in the Request class – we should probably audit all of our usages of this class.
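
The raw-versus-decoded distinction described above is not specific to Restlet. As a rough illustration using only the JDK (java.net.URI is a stand-in here, not the Restlet Reference class that RestManager uses), getRawPath() leaves percent-escapes alone while getPath() decodes them. The class and method names below are made up for the example:

```java
import java.net.URI;

// Hypothetical demo class; it only illustrates raw vs. decoded path accessors.
public class RawVsDecodedPath {

    // Path exactly as it appears in the request line (percent-escapes kept).
    public static String rawPath(String url) {
        return URI.create(url).getRawPath();
    }

    // Path with percent-escaped octets decoded (interpreted as UTF-8 by java.net.URI).
    public static String decodedPath(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        // URL taken from the report, with the key already percent-encoded.
        String url = "http://localhost/solr/mycore/schema/analysis/synonyms/english/%C3%A9%C3%A9%C3%A9";
        System.out.println("raw:     " + rawPath(url));
        System.out.println("decoded: " + decodedPath(url));
    }
}
```

A lookup keyed on the raw form would search for the literal escape sequence, which matches the "not found" error in the description; the decoded form yields the key as it was stored.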

          Vitaliy Zhovtyuk added a comment -

          Added a change that uses decode=true.
          Checked the usage of org.restlet.data.Reference methods; they are used only in org.apache.solr.rest.RestManager.

          Timothy Potter added a comment -

          Hi Vitaliy, thanks for posting a patch ... looks good, except I think we should add some specific tests for this issue in TestManagedStopFilterFactory and TestManagedSynonymFilterFactory, for two reasons: 1) to guard against regression in case something in the RestManager layer changes, and 2) to serve as an example to remind developers to test with data requiring decoding when developing test cases for new managed resources.
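
The kind of regression check suggested above might look roughly like the following. This is only a sketch under stated assumptions: it uses a hypothetical in-memory map rather than the Solr test harness or the real managed resource classes, and java.net.URI stands in for Restlet. It exercises keys that need decoding (accented characters and a space) and contrasts a delete that uses the raw path with one that uses the decoded path:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a managed synonym resource; not Solr code.
public class ManagedKeyRoundTripSketch {
    private final Map<String, List<String>> synonyms = new HashMap<>();

    public void put(String key, List<String> values) {
        synonyms.put(key, values);
    }

    // Removes the mapping named by the last path segment of the request URI.
    // With decode=false the segment is used as sent (still percent-encoded),
    // which mimics the behaviour reported in this issue.
    public boolean delete(URI requestUri, boolean decode) {
        String path = decode ? requestUri.getPath() : requestUri.getRawPath();
        String key = path.substring(path.lastIndexOf('/') + 1);
        return synonyms.remove(key) != null;
    }

    public static void main(String[] args) {
        ManagedKeyRoundTripSketch store = new ManagedKeyRoundTripSketch();
        store.put("\u00e9\u00e9\u00e9", Arrays.asList("eee"));
        store.put("foo bar", Arrays.asList("baz"));

        URI accented = URI.create("/schema/analysis/synonyms/english/%C3%A9%C3%A9%C3%A9");
        URI spaced = URI.create("/schema/analysis/synonyms/english/foo%20bar");

        System.out.println("raw delete found key:     " + store.delete(accented, false));
        System.out.println("decoded delete found key: " + store.delete(accented, true));
        System.out.println("decoded delete (space):   " + store.delete(spaced, true));
    }
}
```

One caveat of this simplification: a key containing an encoded slash (%2F) would be split incorrectly after decoding, so a real test would also want to cover that case.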

          Timo Hund added a comment -

          Hello all,

          I've added a specific test for special characters, attached a new patch, and committed the changes to the pull request.
          If you need further changes, please let me know.

          Timo Hund added a comment -

          Grouped the commits of the last patch into one single patch

          Timothy Potter added a comment -

          Thanks Timo! Will get this committed today.

          ASF subversion and git services added a comment -

          Commit 1616361 from Timothy Potter in branch 'dev/trunk'
          [ https://svn.apache.org/r1616361 ]

          SOLR-6163: Correctly decode special characters in managed stopwords and synonym endpoints.

          ASF subversion and git services added a comment -

          Commit 1616366 from Timothy Potter in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1616366 ]

          SOLR-6163: Correctly decode special characters in managed stopwords and synonym endpoints.

          Steve Rowe added a comment -

          Timothy Potter, can this issue be resolved? (Fix version 4.10 & 5.0, I think?)


            People

            • Assignee: Timothy Potter
            • Reporter: Wim Kumpen
            • Votes: 1
            • Watchers: 6
