Solr
  1. Solr
  2. SOLR-4891

JsonLoader should preserve field value types from the JSON content stream

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4
    • Component/s: update
    • Labels:
      None

      Description

      JSON content streams carry some basic type information for their field values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN. JsonLoader should set field value object types in the SolrInputDocument according to the content stream's data types.

      Currently JsonLoader converts all non-String-typed field values to String-s.

      There is a comment in JsonLoader.parseSingleFieldValue(), where the convert-everything-to-string logic happens, that says "for legacy reasons, single values s are expected to be strings", but other content streams' type information is not flattened like this, e.g. JavabinLoader.

      1. SOLR-4891.patch
        9 kB
        Steve Rowe
      2. SOLR-4891-BigInteger-bugfix.patch
        6 kB
        Steve Rowe

        Issue Links

          Activity

          Hide
          Steve Rowe added a comment -

          Simple patch preserving JSON types in SolrInputDocument, with some tests.

          Show
          Steve Rowe added a comment - Simple patch preserving JSON types in SolrInputDocument , with some tests.
          Hide
          Steve Rowe added a comment -

          If there are no objections I'll commit this later today.

          Show
          Steve Rowe added a comment - If there are no objections I'll commit this later today.
          Hide
          Steve Rowe added a comment -

          Committed:

          Show
          Steve Rowe added a comment - Committed: trunk: r1489676 branch_4x: r1489677
          Hide
          Steve Rowe added a comment -

          At Hoss's suggestion on #solr IRC last night, I tested whether JsonLoader behavior has changed around BigInteger and BigDecimal values as a result of the changes committed under this issue.

          I'm reopening to address an issue with adding JSON BIGNUMBER-s (returned by the Noggit parser when a number won't fit in either a long or a double) to trie integer or long fields: a NumberFormatException is no longer triggered, and the values are silently corrupted.

          Before committing the patch on this issue, BigInteger-typed values were not created for BIGNUMBER-s in SolrInputDocument; instead, they (along with every other JSON value) were converted to String-s, and then adding such a value to an integer or long field would cause a NumberFormatException to be thrown from Integer.parseInt() or Long.parseLong(). This was proper and good.

          But now, BigInteger-typed values are converted (in TrieField.createField() to int/long using BigInteger's intValue() and longValue() methods, which return only the low-order 32 and 64 bits, respectively. These values are always corrupted: the truncated high-order bits are guaranteed to be non-zero, since BigInteger typing only happens when values won't fit into 64 bits.

          Reverting back to String-typed BIGNUMBER values fixes the problem.

          By contrast, BigDecimal's doubleValue() and floatValue() methods truncate the low-order bits, resulting in loss of precision rather than corruption. This is the same behavior used by Double.parseDouble() and Float.parseFloat(). Reverting back to String-typing for decimal BIGNUMBER-s in addition to integral BIGNUMBER-s won't be a problem.

          Patch forthcoming.

          Show
          Steve Rowe added a comment - At Hoss's suggestion on #solr IRC last night, I tested whether JsonLoader behavior has changed around BigInteger and BigDecimal values as a result of the changes committed under this issue. I'm reopening to address an issue with adding JSON BIGNUMBER -s (returned by the Noggit parser when a number won't fit in either a long or a double) to trie integer or long fields: a NumberFormatException is no longer triggered, and the values are silently corrupted. Before committing the patch on this issue, BigInteger -typed values were not created for BIGNUMBER -s in SolrInputDocument ; instead, they (along with every other JSON value) were converted to String -s, and then adding such a value to an integer or long field would cause a NumberFormatException to be thrown from Integer.parseInt() or Long.parseLong() . This was proper and good. But now, BigInteger -typed values are converted (in TrieField.createField() to int/long using BigInteger 's intValue() and longValue() methods, which return only the low-order 32 and 64 bits, respectively. These values are always corrupted: the truncated high-order bits are guaranteed to be non-zero, since BigInteger typing only happens when values won't fit into 64 bits. Reverting back to String -typed BIGNUMBER values fixes the problem. By contrast, BigDecimal 's doubleValue() and floatValue() methods truncate the low-order bits, resulting in loss of precision rather than corruption. This is the same behavior used by Double.parseDouble() and Float.parseFloat() . Reverting back to String -typing for decimal BIGNUMBER -s in addition to integral BIGNUMBER -s won't be a problem. Patch forthcoming.
          Hide
          Steve Rowe added a comment -

          patch - committing shortly

          Show
          Steve Rowe added a comment - patch - committing shortly
          Hide
          Steve Rowe added a comment - - edited

          Committed:

          Show
          Steve Rowe added a comment - - edited Committed: trunk: r1489914 branch_4x: r1489915
          Hide
          Steve Rowe added a comment -

          Bulk close resolved 4.4 issues

          Show
          Steve Rowe added a comment - Bulk close resolved 4.4 issues

            People

            • Assignee:
              Steve Rowe
              Reporter:
              Steve Rowe
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development