Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
7.0.0
Description
When a JDBC driver returns a Numeric type that doesn't exactly align with what is in the JDBC metadata, jdbcToArrowVectors / sqlToArrowVectorIterator fails to process the result (failing on serializing the the value into the BigDecimalVector).
It appears as though this is because JDBC drivers can return BigDecimal / Numeric values that are different between the metadata and not consistent between each of the rows.
Is there a recommended course of action to represent a variable precision / scale decimal vector? In any case it does not seem possible to convert JDBC data with the built in utilities that uses these numeric types when they come in this form.
It seems like both the Oracle and the Postgres JDBC driver also returns metadata with a 0,0 precision / scale when values in the result set have different (and varied) precision / scale.
An example:
Against postgres, running a simple SQL query that produces numeric types can lead to a JDBC result set with BigDecimal values with variable decimal precision/scale.
SELECT value FROM (
SELECT 1000000000000000.01 AS "value"
UNION SELECT 1000000000300.0000001
) a
The postgres JDBC adapter produces a result set that looks like the following:
value | precision | scale | |
---|---|---|---|
metadata | N/A | 0 | 0 |
row 1 | 1000000000000000.01 | 18 | 2 |
row 2 | 1000000000300.0000001 | 20 | 7 |
Even a result set that returns a single value may Numeric values with precision / scale that do not match the precision / scale in the ResultSetMetadata.
SELECT AVG(one) from (
SELECT 1000000000000000.01 as "one"
UNION select 1000000000300.0000001
) a
produces a result set that looks like this
value | precision | scale | |
---|---|---|---|
metadata | N/A | 0 | 0 |
row 1 | 500500000000150.0050001 | 22 | 7 |
When processing the result set using the simple jdbcToArrowVectors (or sqlToArrowVectorIterator) this fails to set the values extracted from the result set into the the DecimalVector
val calendar = JdbcToArrowUtils.getUtcCalendar() val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar) val root = VectorSchemaRoot.create(schema, RootAllocator()) val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar)
Error:
Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
using `sqlToArrowVectorIterator` also fails with an error trying to set data into the vector: (requires a little bit of trickery to force creation of the package private configuration)
Exception in thread "main" java.lang.RuntimeException: Error occurred while getting next schema root. at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179) at com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31) at com.acme.AppKt.main(App.kt:54) at com.acme.AppKt.main(App.kt) Caused by: java.lang.RuntimeException: Error occurred while consuming data. at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121) at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153) at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175) ... 3 more Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must equal that in the Arrow vector: 7 != 0 at org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95) at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355) at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61) at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46) at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113) ... 5 more
Attachments
Issue Links
- relates to
-
ARROW-16600 [Java] Enable configurable scale coercion of BigDecimal
-
- Resolved
-
-
ARROW-16538 [Java] Refactor FakeResultSet to support arbitrary tests
-
- Resolved
-
- links to