Uploaded image for project: 'Apache Arrow'
  1. Apache Arrow
  2. ARROW-16427

[Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable decimal precision / scale

    XMLWordPrintableJSON

Details

    Description

      When a JDBC driver returns a Numeric type that doesn't exactly align with what is in the JDBC metadata, jdbcToArrowVectors / sqlToArrowVectorIterator fails to process the result (failing on serializing the the value into the BigDecimalVector). 

      It appears as though this is because JDBC drivers can return BigDecimal / Numeric values that are different between the metadata and not consistent between each of the rows. 

      Is there a recommended course of action to represent a variable precision / scale decimal vector? In any case it does not seem possible to convert JDBC data with the built in utilities that uses these numeric types when they come in this form. 

      It seems like both the Oracle and the Postgres JDBC driver also returns metadata with a 0,0 precision / scale when values in the result set have different (and varied) precision / scale. 

      An example: 

      Against postgres, running a simple SQL query that produces numeric types can lead to a JDBC result set with BigDecimal values with variable decimal precision/scale. 

      SELECT value FROM (
        SELECT 1000000000000000.01 AS "value" 
        UNION SELECT 1000000000300.0000001
      ) a 

       

      The postgres JDBC adapter produces a result set that looks like the following: 

       

        value precision scale
      metadata N/A 0 0
      row 1 1000000000000000.01 18 2
      row 2 1000000000300.0000001 20 7

       

      Even a result set that returns a single value may Numeric values with precision / scale that do not match the precision / scale in the ResultSetMetadata. 

       

      SELECT AVG(one) from (
        SELECT 1000000000000000.01 as "one" 
        UNION select 1000000000300.0000001
      ) a 

      produces a result set that looks like this

       

        value precision scale
      metadata N/A 0 0
      row 1 500500000000150.0050001 22 7

       

      When processing the result set using the simple jdbcToArrowVectors (or sqlToArrowVectorIterator) this fails to set the values extracted from the result set into the the DecimalVector

       

      val calendar = JdbcToArrowUtils.getUtcCalendar()
      val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
      val root = VectorSchemaRoot.create(schema, RootAllocator())
      val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) 

      Error:

       

      Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
          at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
          at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
          at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
          at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
          at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
          at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
          at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
          at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
          at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321) 

       

      using `sqlToArrowVectorIterator` also fails with an error trying to set data into the vector: (requires a little bit of trickery to force creation of the package private configuration)

       

      Exception in thread "main" java.lang.RuntimeException: Error occurred while getting next schema root.
          at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
          at com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
          at com.acme.AppKt.main(App.kt:54)
          at com.acme.AppKt.main(App.kt)
      Caused by: java.lang.RuntimeException: Error occurred while consuming data.
          at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
          at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
          at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
          ... 3 more
      Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must equal that in the Arrow vector: 7 != 0
          at org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
          at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
          at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
          at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
          at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
          ... 5 more 

       

       

      Attachments

        Issue Links

          Activity

            People

              toddfarmer Todd Farmer
              jswenson Jonathan Swenson
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h 10m
                  1h 10m