Hive / HIVE-26958

JsonSerDe data corruption when scalar type is a json object


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: File Formats
    • Labels: None

    Description

       

      JsonSerDe uses Jackson's JsonParser.getText to decode scalar values from JSON strings. The problem is that this method converts any token to text, including START_OBJECT ('{'). As a result, when a field declared as a scalar actually contains a JSON object, JsonSerDe processes the opening curly bracket as the value for BOOLEAN, DECIMAL, CHAR, VARCHAR, and VARBINARY. It then continues processing the fields inside the nested JSON object as if they were part of the outer JSON object. When the closing curly bracket is encountered, it pops a level, which can end parsing early. This bug results in corrupted data for the following JSON:

       

      { "boolean_field" : {}, "other_field" : 99 }
        => [boolean_field=false, other_field=null]

      { "boolean_field" : { "other_field" : 42 }, "other_field" : 99 }
        => [boolean_field=false, other_field=42]
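      The mechanism behind both results can be reproduced with a small stand-alone model. This is only a sketch, not the JsonSerDe source: the class FlatParseSketch, its Token and Kind types, and the hand-built token lists are all hypothetical, and Boolean.parseBoolean is assumed here as a stand-in for however the SerDe turns the text into a BOOLEAN. The loop takes the text of whatever token follows a field name and stops at the first END_OBJECT, which is enough to show the inner value replacing the outer one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the failure mode. This is NOT the JsonSerDe source.
class FlatParseSketch {
    // Hypothetical token kinds, loosely named after Jackson's JsonToken values.
    enum Kind { START_OBJECT, END_OBJECT, FIELD_NAME, VALUE }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Reads name/value pairs until the first END_OBJECT. The value token's
    // text is taken without checking its kind, as getText would allow.
    static Map<String, String> parseRow(List<Token> tokens) {
        Map<String, String> row = new LinkedHashMap<>();
        int i = 1; // skip the outer START_OBJECT
        while (i < tokens.size() && tokens.get(i).kind != Kind.END_OBJECT) {
            String name = tokens.get(i++).text;  // FIELD_NAME
            String value = tokens.get(i++).text; // assumed scalar, never verified
            row.put(name, value);
        }
        return row;
    }

    // Tokens for: { "boolean_field" : {}, "other_field" : 99 }
    static List<Token> emptyExample() {
        List<Token> t = new ArrayList<>();
        t.add(new Token(Kind.START_OBJECT, "{"));
        t.add(new Token(Kind.FIELD_NAME, "boolean_field"));
        t.add(new Token(Kind.START_OBJECT, "{"));
        t.add(new Token(Kind.END_OBJECT, "}"));
        t.add(new Token(Kind.FIELD_NAME, "other_field"));
        t.add(new Token(Kind.VALUE, "99"));
        t.add(new Token(Kind.END_OBJECT, "}"));
        return t;
    }

    // Tokens for: { "boolean_field" : { "other_field" : 42 }, "other_field" : 99 }
    static List<Token> nestedExample() {
        List<Token> t = new ArrayList<>();
        t.add(new Token(Kind.START_OBJECT, "{"));
        t.add(new Token(Kind.FIELD_NAME, "boolean_field"));
        t.add(new Token(Kind.START_OBJECT, "{"));
        t.add(new Token(Kind.FIELD_NAME, "other_field"));
        t.add(new Token(Kind.VALUE, "42"));
        t.add(new Token(Kind.END_OBJECT, "}"));
        t.add(new Token(Kind.FIELD_NAME, "other_field"));
        t.add(new Token(Kind.VALUE, "99"));
        t.add(new Token(Kind.END_OBJECT, "}"));
        return t;
    }

    public static void main(String[] args) {
        Map<String, String> row = parseRow(nestedExample());
        // The brace text is not "true", so a lenient boolean parse gives false.
        System.out.println("boolean_field: " + Boolean.parseBoolean(row.get("boolean_field")));
        // The inner value was read; the outer 99 is never reached.
        System.out.println("other_field: " + row.get("other_field"));
    }
}
```

      In this model the nested object's closing bracket ends the loop, so the outer other_field is never read. With the empty object, the loop ends immediately after boolean_field and other_field is simply absent.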

       

      BTW, when a JSON array is passed instead of an object, you get an error, because the code checks for fields and an array does not contain any.

      I think this case should result in an error, like the one you get when a JSON array is the field value for a scalar. If so, the fix is to make sure the value token is a scalar for non-complex types in extractCurrentField, with something like this:

      if (!hcatFieldSchema.isComplex() && !valueToken.isScalarValue()) {
          throw new IOException(type + " value must be a scalar json value");
      } 
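      To see how such a guard would behave, here is a stand-alone sketch that does not depend on Hive or Jackson. The class ScalarGuardSketch, the TokenKind enum, and the requireScalar helper are hypothetical names for illustration; TokenKind only mimics the part of Jackson's JsonToken.isScalarValue() contract that matters here (VALUE_* tokens are scalar, structural tokens and field names are not).

```java
import java.io.IOException;

// Stand-alone illustration of the proposed check. Not Hive code.
class ScalarGuardSketch {
    // Hypothetical stand-in for Jackson's JsonToken, reduced to a few kinds.
    enum TokenKind {
        START_OBJECT(false), START_ARRAY(false), FIELD_NAME(false),
        VALUE_STRING(true), VALUE_NUMBER_INT(true),
        VALUE_TRUE(true), VALUE_FALSE(true), VALUE_NULL(true);

        private final boolean scalar;
        TokenKind(boolean scalar) { this.scalar = scalar; }
        boolean isScalarValue() { return scalar; }
    }

    // Rejects a structural token where a non-complex (scalar) field is expected.
    static void requireScalar(boolean isComplexField, TokenKind valueToken, String typeName)
            throws IOException {
        if (!isComplexField && !valueToken.isScalarValue()) {
            throw new IOException(typeName + " value must be a scalar json value");
        }
    }

    public static void main(String[] args) throws IOException {
        requireScalar(false, TokenKind.VALUE_TRUE, "boolean");  // accepted
        requireScalar(true, TokenKind.START_OBJECT, "struct");  // accepted: complex type
        try {
            requireScalar(false, TokenKind.START_OBJECT, "boolean");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      With a check of this shape, both JSON examples above would fail with an error instead of producing a silently corrupted row.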

       

       

          People

            Assignee: Unassigned
            Reporter: Dain Sundstrom
