Solr / SOLR-2627

Solr's JSON request format isn't valid JSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 3.2
    • Fix Version/s: None
    • Component/s: update
    • Labels:

      Description

      I've been working with Solr's JSON request and response to get it up and running in my application and it looks like the JSON request format is not properly formatted JSON. Here's an example of a request with multiple documents (from the Wiki):

      {
       "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
       "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
      }
      

      Unfortunately, this is not valid JSON because according to RFC-4627 section 2.2, "The names within an object SHOULD be unique." This means that defining the name "add" twice is not allowed. Instead, the JSON should use an array for multiple documents like this:

      {
       "add": [{"doc": {"id" : "TestDoc1", "title" : "test1"}},
               {"doc": {"id" : "TestDoc2", "title" : "another test"}}]
      }
      

      An alternate form that simplifies this entire thing is to remove the "doc" identifier as it doesn't appear to provide useful information. That form would be:

      {
       "add": [{"id" : "TestDoc1", "title" : "test1"},
               {"id" : "TestDoc2", "title" : "another test"}]
      }
      

      It looks like Noggit uses a stream-based parser that doesn't put these values into a Map or JavaBean; otherwise this would have revealed itself much sooner. I ran into the issue when attempting to create a Map that I could pass to a JSON binder such as Jackson or Google-GSON. Given the current format, none of those tools will work with Solr.
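
To illustrate why a Map-based binder cannot emit the repeated-"add" form, here is a minimal Python sketch. It uses Python's standard json module as a stand-in for Jackson or GSON (their behavior is not asserted here), and the helper name solr_add_commands is hypothetical. Because a dict can hold only one "add" key, the outer object has to be assembled by hand:

```python
import json


def solr_add_commands(docs):
    """Serialize docs as one JSON object whose "add" name repeats per document.

    Hypothetical helper: a dict cannot hold two "add" keys, so each member is
    serialized on its own and the outer object is joined as text.
    """
    members = ['"add": ' + json.dumps({"doc": doc}) for doc in docs]
    return "{" + ", ".join(members) + "}"


body = solr_add_commands([
    {"id": "TestDoc1", "title": "test1"},
    {"id": "TestDoc2", "title": "another test"},
])
print(body)
```

Each document is still serialized by the library; only the outer object is built as a string, which is the part a Map cannot represent.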

      It also looks like Noggit is not really moving out of labs. It would be nice to use a better-known and more active project for the JSON handling, as JSON is quickly becoming the de facto standard. I can open a ticket for that separately if needed and help out with the code.

        Activity

        Brian Pontarelli added a comment -

        Got it. Thanks for the clarification.

        I read RFC2119 for the definition of SHOULD:

        "3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course."

        Based on this definition, I contend that it is indeed invalid JSON in this situation. However, the interpretation of these kinds of rules is always debatable.

        Yonik Seeley added a comment -

        RFC4627
        "The names within an object SHOULD be unique."
        "JSON parser MUST accept all texts that conform to the JSON grammar."

        "SHOULD" is very specific (and a synonym for RECOMMENDED) and is not equivalent to "MUST" in RFCs.
        So I agree it's best practice to avoid repeated names in a JSON object, but a parser that could not deal with repeated names is clearly not a conformant JSON parser. Valid JSON may have repeated names in an object.
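
As one illustration of a parser that accepts repeated names, the sketch below uses Python's standard json module (an assumption chosen for the example; it is not the parser Solr uses). Passing object_pairs_hook=list makes the parser hand back each object's members as an ordered list of (name, value) pairs, so both "add" members stay visible:

```python
import json

REQUEST = """
{
 "add": {"doc": {"id": "TestDoc1", "title": "test1"}},
 "add": {"doc": {"id": "TestDoc2", "title": "another test"}}
}
"""

# object_pairs_hook receives every object's members as an ordered list of
# (name, value) pairs; returning that list unchanged keeps repeated names.
pairs = json.loads(REQUEST, object_pairs_hook=list)

names = [name for name, _ in pairs]
print(names)  # ['add', 'add']
```

The text parses without error, and the order of the two commands is preserved.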

        Mark Miller added a comment -

        Heh - now I have to read this spec. A 'SHOULD' rule leading to a 'not properly formatted json' sounds a little fishy

        Brian Pontarelli added a comment -

        Actually, I'll bite on your comment because it appears you are incorrect. I tested this in a couple of browsers and also some parsers, and it doesn't work. Most browsers will end up clobbering the first object with the second one of the same name. You can test this easily in Chrome with a console.log statement.

        Additionally, most JSON parsers return objects as Maps, which don't allow duplicate keys. You would need to use a MultiMap to accomplish that.

        Glad this is fixed, but please do update the docs.
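
The clobbering and the MultiMap workaround can both be sketched in Python, again with the standard json module standing in for the browsers and binders mentioned (whose exact behavior is not asserted here). The helper name group_repeated_names is hypothetical:

```python
import json

REQUEST = """
{
 "add": {"doc": {"id": "TestDoc1", "title": "test1"}},
 "add": {"doc": {"id": "TestDoc2", "title": "another test"}}
}
"""


def group_repeated_names(pairs):
    """Multimap-style hook: collect the values of a repeated name into a list."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    # Names that occur once keep their plain value; repeated names become lists.
    return {n: vals if len(vals) > 1 else vals[0] for n, vals in grouped.items()}


# Default dict-based parsing keeps only the last "add" member.
last_wins = json.loads(REQUEST)
# With the hook, both "add" members survive as a list.
grouped = json.loads(REQUEST, object_pairs_hook=group_repeated_names)

print(last_wins["add"]["doc"]["id"])                 # TestDoc2
print([cmd["doc"]["id"] for cmd in grouped["add"]])  # ['TestDoc1', 'TestDoc2']
```

One caveat of this sketch: after grouping, a repeated name is indistinguishable from a single member whose value was already an array.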

        Brian Pontarelli added a comment -

        Well, according to the spec it isn't valid, but that's not important. The most important thing is to update the docs. In general the JSON docs are really lacking and I spent about 30 minutes searching in JIRA without success, so I opened the issue. Avoid duplicate issues by updating the docs.

        Yonik Seeley added a comment -

        This is a duplicate of SOLR-2496 (and a nicer alternate syntax is already possible).

        Repeated tags are valid JSON, and a parser that didn't allow that would be broken.


          People

          • Assignee: Unassigned
          • Reporter: Brian Pontarelli