Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.8, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      We should be able to index block documents in JSON format

      1. SOLR-5183.patch
        9 kB
        Varun Thacker
      2. SOLR-5183.patch
        8 kB
        Hoss Man
      3. SOLR-5183.patch
        5 kB
        Varun Thacker
      4. SOLR-5183.patch
        5 kB
        Varun Thacker
      5. SOLR-5183.patch
        2 kB
        Varun Thacker

        Activity

        Varun Thacker added a comment -

        Example Json:

        {
          "add": {
            "doc" : {
              "id" : "1",
              "parent" : "true",
              "doc" : {
                "id" : "2",
                "subject" : "black"
              },
              "doc" : {
                "id" : "3",
                "subject" : "blue"
              }      
            }
          },
          "add": {
            "doc" : {
              "id" : "4",
              "parent" : "true",
              "doc" : {
                "id" : "5",
                "subject" : "black"
              },
              "doc" : {
                "id" : "6",
                "subject" : "red"
              }      
            }
          }
        } 
        
        Varun Thacker added a comment -

        Patch which can parse the above mentioned format. If this is okay I'll add tests in AddBlockUpdateTest.java

        Mikhail Khludnev added a comment -

        Varun,

        I'm not experienced in JSON, but wouldn't it be better to put them in an array?

           childrenDocs:[
              {
                "id" : "5",
                "subject" : "black"
              },
              {
                "id" : "6",
                "subject" : "red"
              }      
           ]
        

        wdyt?

        Varun Thacker added a comment -

        Hi Mikhail,

        Ideally that would be the best way to represent the child docs.

        I thought of this format because of the way we currently do single-doc updates in JSON. We use

        {
          "add": {
            "doc" : {
              "id" : "1"
            }
          },
          "add": {
            "doc" : {
              "id" : "2"
            }
           }
        }
        

        Instead of...

        {
          "add": {
            "docs" : [
              { "id" : "1" },
              { "id" : "2" }
            ]
          }
        }
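
        One practical difference between the two shapes is how generic JSON libraries treat repeated keys. The sketch below is only an illustration with Python's standard json module, not a description of how Solr's own parser reads the request; the helper name added_ids is made up for this example. It shows that a plain parse keeps just the last "add", while asking for raw key/value pairs preserves both commands.

        ```python
        import json

        # Two "add" commands expressed with a repeated key, as in the first example.
        payload = '{"add": {"doc": {"id": "1"}}, "add": {"doc": {"id": "2"}}}'

        # A plain dict-based parse keeps only the last value for a repeated key.
        collapsed = json.loads(payload)

        # Asking for the raw key/value pairs preserves every command, in order.
        pairs = json.loads(payload, object_pairs_hook=list)


        def added_ids(top_level_pairs):
            """Return the document id of each top-level "add" command (hypothetical helper)."""
            ids = []
            for key, value in top_level_pairs:
                if key == "add":
                    # With object_pairs_hook=list, nested objects are lists of pairs too.
                    doc = dict(value)["doc"]
                    ids.append(dict(doc)["id"])
            return ids
        ```

        A client that builds or rewrites requests through ordinary dicts would therefore drop one of the documents silently, which is one argument for the array form.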
        
        Varun Thacker added a comment -

        Can we finalize the format? Personally, I am okay with Mikhail Khludnev's suggestion.

        Varun Thacker added a comment -
        • Takes nested documents as an array of childDocs
        • There is a nocommit in the patch about what happens when the key "childDocs" is used to add normal data rather than nested documents.
          (How do we validate that the user does not put a field called _root_ in the document?)
           childDocs:[
                {
                  "id" : "5",
                  "subject" : "black"
                },
                {
                  "id" : "6",
                  "subject" : "red"
                }      
             ]
          
        Peng Cheng added a comment -

        +1, this is amazing. I've been waiting for this for a fairly long time, about a month.

        Yonik Seeley added a comment -

        Seems like at a minimum we should use something like
        _childDocuments_ (the underscores generally indicating a namespace reserved for internal use or otherwise special... like _root_, _docid_, _version_, etc)

        Varun Thacker added a comment -

        Changes in this patch

        • Key for child documents is "_childDocuments_"
        • Fixes a parsing issue

        Thus this is what an add command would look like

        {
            "add": {
                "doc": {
                    "id": "1",
                    "_childDocuments_": [
                        {
                            "id": "2"
                        },
                        {
                            "id": "3"
                        }
                    ]
                }
            }
        }
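
        For anyone generating such requests from a script, here is a minimal Python sketch of building the same structure. The function name build_add_command is hypothetical and only the "_childDocuments_" key comes from the patch; sending the body to Solr is left out.

        ```python
        import json


        def build_add_command(parent, children):
            """Wrap a parent document and its children in a single "add" command."""
            doc = dict(parent)  # copy so the caller's dict is not modified
            if children:
                # Children go into an array under the reserved key from the patch.
                doc["_childDocuments_"] = [dict(child) for child in children]
            return {"add": {"doc": doc}}


        command = build_add_command({"id": "1"}, [{"id": "2"}, {"id": "3"}])
        body = json.dumps(command, indent=4)  # JSON text matching the example above
        ```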
        
        Hoss Man added a comment -

        Varun: your patch looks pretty good to me,

        I beefed up the test a bit to convince myself that it would correctly handle:

        • grand child docs
        • _childDocuments_ and regular fields in various orders
        • duplicate _childDocuments_ keys

        ...and in the process discovered what appears to be a pre-existing bug regarding field value ordering when the fieldName key is duplicated in the JSON. It looks like it should be fairly trivial to fix, so I'm going to open a new issue for that and get it fixed & backported to 4x, and then I'll come back and revisit this patch.
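
        To make the first two cases concrete, here is a small Python sketch of a document with a grandchild and with "_childDocuments_" placed before the regular fields. It is not the test code from the patch; flatten_block is a hypothetical helper, and the descendants-first order it produces assumes the usual block-join convention that a parent is the last document of its block.

        ```python
        def flatten_block(doc, key="_childDocuments_"):
            """Return document ids with every descendant listed before its parent."""
            ids = []
            for child in doc.get(key, []):
                ids.extend(flatten_block(child, key))  # recurse into grandchildren
            ids.append(doc["id"])  # the parent closes its own block
            return ids


        nested = {
            # The child array appears before the parent's regular fields on purpose.
            "_childDocuments_": [
                {"id": "2", "_childDocuments_": [{"id": "3"}]},
                {"subject": "blue", "id": "4"},
            ],
            "id": "1",
        }

        block_order = flatten_block(nested)
        ```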

        Varun Thacker added a comment -

        New patch which takes into account changes made on SOLR-5777

        ASF subversion and git services added a comment -

        Commit 1572797 from hossman@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1572797 ]

        SOLR-5183: JSON updates now support nested child documents using a "_childDocuments_" object key

        ASF subversion and git services added a comment -

        Commit 1572802 from hossman@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1572802 ]

        SOLR-5183: JSON updates now support nested child documents using a "_childDocuments_" object key (merge r1572797)

        Hoss Man added a comment -

        Awesome,

        Thanks Varun!

        Varun Thacker added a comment -

        Thanks Hoss for reviewing.
        I have added a comment on the ref guide containing an example of adding nested documents in JSON - https://cwiki.apache.org/confluence/display/solr/Other+Parsers?focusedCommentId=39621617#comment-39621617

        Uwe Schindler added a comment -

        Close issue after release of 4.8.0


          People

          • Assignee:
            Hoss Man
            Reporter:
            Varun Thacker
          • Votes:
            5
            Watchers:
            8
