Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-9972

SpellCheckComponent to return collations and suggestions as a JSON object rather than a list

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.5, 6.6
    • Component/s: Response Writers
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      original title: JSON-Specific Parameters arrntv causing some error for spellcheck component

      So I tried using the new array named list arrntv from solr 6.5 jenkins build
      but the json returned was broken when it returned response for spellcheck with word break.
      for example :

       {"name":"collation",{
              "type":"str","value":"collationQuery":"indomie kuing",
              "hits":81,
              "misspellingsAndCorrections":
              [
                {"name":"indomee","type":"str","value":"indomie"},
                {"name":"kuih","type":"str","value":"kuing"}
              ]}
       }
      

      as you may see that "collationQuery":"indomie kuing" was considered as value thus causing the json to fail.
      i think the correct json was :

      {"name":"collation",
              "type":"object",
              "value":{
              "collationQuery":"indomie kuing",
              "hits":81,
              "misspellingsAndCorrections":
              [
                {"name":"indomee","type":"str","value":"indomie"},
                {"name":"kuih","type":"str","value":"kuing"}
              ]}
       }
      

      sorry for bad grammar english was not my first language and i know that object was not supported by current arrntv options.

      1. SOLR-9972.patch
        13 kB
        Christine Poerschke
      2. SOLR-9972-hunch-no-test.patch
        1 kB
        Christine Poerschke
      3. SOLR-9972-impact
        3 kB
        Christine Poerschke
      4. SOLR-9972-impact.out
        18 kB
        Christine Poerschke
      5. SOLR-9972-with-test-after-SOLR-9975.patch
        4 kB
        Christine Poerschke

        Issue Links

          Activity

          Hide
          cpoerschke Christine Poerschke added a comment -

          Hello Ricky Oktavianus Lazuardy - thanks for opening this ticket.

          If you have it available, could you also share example responses for the other json.nl choices for comparison e.g. arrmap, arrarr, flat, map - thank you.

          Show
          cpoerschke Christine Poerschke added a comment - Hello Ricky Oktavianus Lazuardy - thanks for opening this ticket. If you have it available, could you also share example responses for the other json.nl choices for comparison e.g. arrmap, arrarr, flat, map - thank you.
          Hide
          cpoerschke Christine Poerschke added a comment -

          From a quick look around, this might be another case of SimpleOrderedMap vs. NamedList similar to SOLR-6064 though that is only a hunch at this point.

          Show
          cpoerschke Christine Poerschke added a comment - From a quick look around, this might be another case of SimpleOrderedMap vs. NamedList similar to SOLR-6064 though that is only a hunch at this point.
          Hide
          cpoerschke Christine Poerschke added a comment -

          Attached patch is my hunch as code (without tests).

          I have used the bin/solr -e techproducts example and the http://localhost:8983/solr/techproducts/spell?df=text&spellcheck.q=delll+ultra+sharp&spellcheck=true&spellcheck.collateParam.q.op=AND query mentioned in the Solr Reference Guide's Spell Checking to obtain a response with a collationQuery element, similar to the response above.

          The patch would also need a test e.g. in SpellCheckComponentTest class.


          An alternative, wider reaching solution might be to try and remove the "collation" elements within the "collations" element. Links into the code:

          Show
          cpoerschke Christine Poerschke added a comment - Attached patch is my hunch as code (without tests). I have used the bin/solr -e techproducts example and the http://localhost:8983/solr/techproducts/spell?df=text&spellcheck.q=delll+ultra+sharp&spellcheck=true&spellcheck.collateParam.q.op=AND query mentioned in the Solr Reference Guide's Spell Checking to obtain a response with a collationQuery element, similar to the response above. The patch would also need a test e.g. in SpellCheckComponentTest class. An alternative, wider reaching solution might be to try and remove the "collation" elements within the "collations" element. Links into the code: https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java#L303-L322 https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java#L439-L456
          Hide
          cpoerschke Christine Poerschke added a comment -

          Attaching proposed change with test (the test extends the test proposed to be added in SOLR-9975).

          Show
          cpoerschke Christine Poerschke added a comment - Attaching proposed change with test (the test extends the test proposed to be added in SOLR-9975 ).
          Hide
          dragonyui Ricky Oktavianus Lazuardy added a comment - - edited

          Christine Poerschke : Sorry for late response and thanks for quick response
          this one is arrmap one example (i think you don't need it anymore):

          {
                  "collation": {
                    "collationQuery": "indomie kuah",
                    "hits": 98,
                    "misspellingsAndCorrections": [
                      {
                        "indomee": "indomie"
                      },
                      {
                        "kuih": "kuah"
                      }
                    ]
                  }
                },
          

          and yes i think removing collation object may fix this problems.

          Show
          dragonyui Ricky Oktavianus Lazuardy added a comment - - edited Christine Poerschke : Sorry for late response and thanks for quick response this one is arrmap one example (i think you don't need it anymore): { "collation": { "collationQuery": "indomie kuah", "hits": 98, "misspellingsAndCorrections": [ { "indomee": "indomie" }, { "kuih": "kuah" } ] } }, and yes i think removing collation object may fix this problems.
          Hide
          cpoerschke Christine Poerschke added a comment -

          Attaching proposed fix. James Dyer, would you have a chance to review perhaps? Thank you.

          Show
          cpoerschke Christine Poerschke added a comment - Attaching proposed fix. James Dyer , would you have a chance to review perhaps? Thank you.
          Hide
          jdyer James Dyer added a comment -

          Christine Poerschke This looks straightforward, and I like your clean-up on existing tests!

          Just to be clear, this fixes the case of invalid json with "extended collate". It does not break backwards-compatibility in any other way, right? Is the xml rendering the same either way? Does solrj handle the change? (I'm pretty sure if we broke backwards in either case, either oas.spelling.SpellCheckCollatorTest#testExtendedCollate or oas.client.solrj.response.TestSpellCheckResponse#testSpellCheckResponse_Extended would fail, but its good to keep these things in mind.)

          But besides being doubly sure we don't break backwards on the response format (except for fixing the invalid json), this looks great, +1 .

          Show
          jdyer James Dyer added a comment - Christine Poerschke This looks straightforward, and I like your clean-up on existing tests! Just to be clear, this fixes the case of invalid json with "extended collate". It does not break backwards-compatibility in any other way, right? Is the xml rendering the same either way? Does solrj handle the change? (I'm pretty sure if we broke backwards in either case, either oas.spelling.SpellCheckCollatorTest#testExtendedCollate or oas.client.solrj.response.TestSpellCheckResponse#testSpellCheckResponse_Extended would fail, but its good to keep these things in mind.) But besides being doubly sure we don't break backwards on the response format (except for fixing the invalid json), this looks great, +1 .
          Hide
          cpoerschke Christine Poerschke added a comment -

          James Dyer, thanks for the code review. I am attaching a utility shell script and its output to try and answer your backward compat questions.

          As can be seen from the output file, the impact of the "SpellCheckComponent collations and suggestions returned as a JSON object rather than a list" change here goes beyond use of collateExtendedResults=true and changes not only some wt=json output but also some wt=ruby and wt=python output.


          Now to the 'elephant in the room' question, if there were to be an RC2 for the 6.4.0 Solr release, would it be worth putting forward the inclusion of the change here? I don't know enough about use of the SpellCheckComponent to have an informed opinion on this.

          Hoss Man and Jonny Marks (json.nl=arrntv stakeholders from SOLR-9442 and SOLR-9787, the json.nl=arrntv addition flagged up NamedList vs. SimpleOrderedMap in the existing SpellCheckComponent code, similar to SOLR-6064, the proposed fix impacts not just those using json.nl=arrntv) - if you have any thoughts on this, please let us know. Thank you.

          Show
          cpoerschke Christine Poerschke added a comment - James Dyer , thanks for the code review. I am attaching a utility shell script and its output to try and answer your backward compat questions. As can be seen from the output file, the impact of the "SpellCheckComponent collations and suggestions returned as a JSON object rather than a list" change here goes beyond use of collateExtendedResults=true and changes not only some wt=json output but also some wt=ruby and wt=python output. Now to the 'elephant in the room' question, if there were to be an RC2 for the 6.4.0 Solr release, would it be worth putting forward the inclusion of the change here? I don't know enough about use of the SpellCheckComponent to have an informed opinion on this. Hoss Man and Jonny Marks (json.nl=arrntv stakeholders from SOLR-9442 and SOLR-9787 , the json.nl=arrntv addition flagged up NamedList vs. SimpleOrderedMap in the existing SpellCheckComponent code, similar to SOLR-6064 , the proposed fix impacts not just those using json.nl=arrntv) - if you have any thoughts on this, please let us know. Thank you.
          Hide
          jdyer James Dyer added a comment -

          Christine Poerschke As I understand it, the python/ruby output are specialized json responses and so I think users will like it better if the json is well-formed. So no problem altering these, imo. I appreciate the painstaking steps you've taken to check this.

          If it were me, I'd just wait for the next release rather than putting this into the next release-candidate. This feature has been around since Solr 1.4, I believe, and no doubt the bug is a longstanding one, this being the first time anyone has raised the issue in this forum.

          Show
          jdyer James Dyer added a comment - Christine Poerschke As I understand it, the python/ruby output are specialized json responses and so I think users will like it better if the json is well-formed. So no problem altering these, imo. I appreciate the painstaking steps you've taken to check this. If it were me, I'd just wait for the next release rather than putting this into the next release-candidate. This feature has been around since Solr 1.4, I believe, and no doubt the bug is a longstanding one, this being the first time anyone has raised the issue in this forum.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 01878380226c5be6bfedc45b8fb6174de4181a7c in lucene-solr's branch refs/heads/master from Christine Poerschke
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0187838 ]

          SOLR-9972: SpellCheckComponent collations and suggestions returned as a JSON object rather than a list
          (Christine Poerschke in response to bug report from Ricky Oktavianus Lazuardy)

          Show
          jira-bot ASF subversion and git services added a comment - Commit 01878380226c5be6bfedc45b8fb6174de4181a7c in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0187838 ] SOLR-9972 : SpellCheckComponent collations and suggestions returned as a JSON object rather than a list (Christine Poerschke in response to bug report from Ricky Oktavianus Lazuardy)
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 4cd3d15da8ef77ef50e2bda91ed6d3c6e87b5426 in lucene-solr's branch refs/heads/branch_6x from Christine Poerschke
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4cd3d15 ]

          SOLR-9972: SpellCheckComponent collations and suggestions returned as a JSON object rather than a list
          (Christine Poerschke in response to bug report from Ricky Oktavianus Lazuardy)

          Show
          jira-bot ASF subversion and git services added a comment - Commit 4cd3d15da8ef77ef50e2bda91ed6d3c6e87b5426 in lucene-solr's branch refs/heads/branch_6x from Christine Poerschke [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4cd3d15 ] SOLR-9972 : SpellCheckComponent collations and suggestions returned as a JSON object rather than a list (Christine Poerschke in response to bug report from Ricky Oktavianus Lazuardy)
          Hide
          cpoerschke Christine Poerschke added a comment -

          Ricky Oktavianus Lazuardy - thanks for creating this issue.

          James Dyer - thanks for code reviewing.

          Show
          cpoerschke Christine Poerschke added a comment - Ricky Oktavianus Lazuardy - thanks for creating this issue. James Dyer - thanks for code reviewing.
          Hide
          jdyer James Dyer added a comment -

          Re-open to revert for Solr 6.6, as this fix breaks the "original" json response format. See SOLR-10522.

          Show
          jdyer James Dyer added a comment - Re-open to revert for Solr 6.6, as this fix breaks the "original" json response format. See SOLR-10522 .
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 571264bcc007a3d853f678dd7ac5b529644e938d in lucene-solr's branch refs/heads/master from jdyer1
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=571264b ]

          SOLR-10522: Revert "SOLR-9972: SpellCheckComponent collations and suggestions returned as a JSON object rather than a list"

          This reverts commit 4cd3d15da8ef77ef50e2bda91ed6d3c6e87b5426.

          Show
          jira-bot ASF subversion and git services added a comment - Commit 571264bcc007a3d853f678dd7ac5b529644e938d in lucene-solr's branch refs/heads/master from jdyer1 [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=571264b ] SOLR-10522 : Revert " SOLR-9972 : SpellCheckComponent collations and suggestions returned as a JSON object rather than a list" This reverts commit 4cd3d15da8ef77ef50e2bda91ed6d3c6e87b5426.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 82b78bd95e95b95b9e7527d9396be24a0dc221df in lucene-solr's branch refs/heads/master from jdyer1
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=82b78bd ]

          SOLR-10522: Revert "SOLR-9972: SpellCheckComponent collations and suggestions returned as a JSON object rather than a list"

          Show
          jira-bot ASF subversion and git services added a comment - Commit 82b78bd95e95b95b9e7527d9396be24a0dc221df in lucene-solr's branch refs/heads/master from jdyer1 [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=82b78bd ] SOLR-10522 : Revert " SOLR-9972 : SpellCheckComponent collations and suggestions returned as a JSON object rather than a list"
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 68f51e6a128e0d56ae37b070834bb36e00ceca50 in lucene-solr's branch refs/heads/branch_6x from jdyer1
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=68f51e6 ]

          SOLR-10522: Revert "SOLR-9972: SpellCheckComponent collations and suggestions returned as a JSON object rather than a list"

          This reverts commit 4cd3d15da8ef77ef50e2bda91ed6d3c6e87b5426.

          Show
          jira-bot ASF subversion and git services added a comment - Commit 68f51e6a128e0d56ae37b070834bb36e00ceca50 in lucene-solr's branch refs/heads/branch_6x from jdyer1 [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=68f51e6 ] SOLR-10522 : Revert " SOLR-9972 : SpellCheckComponent collations and suggestions returned as a JSON object rather than a list" This reverts commit 4cd3d15da8ef77ef50e2bda91ed6d3c6e87b5426.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 854347d5e09ea92f1d738458a05fc2224ca3d7aa in lucene-solr's branch refs/heads/branch_6x from jdyer1
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=854347d ]

          SOLR-10522: Revert "SOLR-9972: SpellCheckComponent collations and suggestions returned as a JSON object rather than a list"

          Show
          jira-bot ASF subversion and git services added a comment - Commit 854347d5e09ea92f1d738458a05fc2224ca3d7aa in lucene-solr's branch refs/heads/branch_6x from jdyer1 [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=854347d ] SOLR-10522 : Revert " SOLR-9972 : SpellCheckComponent collations and suggestions returned as a JSON object rather than a list"
          Hide
          jdyer James Dyer added a comment -

          Revert complete. This fixes SOLR-10522. I opened SOLR-10635 to investigate fixing the response output issue on the response writer end.

          My apologies Christine Poerschke for missing this when you asked me to look at the issue in the first place!

          Show
          jdyer James Dyer added a comment - Revert complete. This fixes SOLR-10522 . I opened SOLR-10635 to investigate fixing the response output issue on the response writer end. My apologies Christine Poerschke for missing this when you asked me to look at the issue in the first place!

            People

            • Assignee:
              jdyer James Dyer
              Reporter:
              dragonyui Ricky Oktavianus Lazuardy
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development