Apache Hudi / HUDI-1658 [UMBRELLA] Spark Sql Support For Hudi / HUDI-2250

[SQL] Bulk insert support for tables w/ primary key


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels:

      Description

      We want to support bulk insert for any table. Right now, there is a constraint that only tables without a primary key can be bulk-inserted; attempting it on a table that declares a primaryKey fails, as shown in the session below.
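
      For contrast, the path that works today is bulk insert into a table that declares no primaryKey. A minimal sketch (table and source names here are illustrative, not taken from this issue):

      ```sql
      set hoodie.sql.bulk.insert.enable = true;
      set hoodie.datasource.write.row.writer.enable = true;

      -- No primaryKey option, so the bulk insert constraint is not hit.
      -- 'hudi_no_pk', the location, and 'source_table' are hypothetical.
      create table hudi_no_pk using hudi
      location 's3a://some-bucket/hudi_no_pk/'
      as select * from source_table;
      ```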

       

      spark-sql> set hoodie.sql.bulk.insert.enable = true;
      hoodie.sql.bulk.insert.enable  true
      Time taken: 2.019 seconds, Fetched 1 row(s)
      spark-sql> set hoodie.datasource.write.row.writer.enable = true;
      hoodie.datasource.write.row.writer.enable  true
      Time taken: 0.026 seconds, Fetched 1 row(s)
      spark-sql> create table hudi_17Gb_ext1 using hudi location 's3a://siva-test-bucket-june-16/hudi_testing/gh_arch_dump/hudi_5/' options (
               >   type = 'cow',
               >   primaryKey = 'randomId',
               >   preCombineField = 'date_col'
               > )
               > partitioned by (type) as select * from gh_17Gb_date_col;
      21/07/29 04:26:15 ERROR SparkSQLDriver: Failed in [create table hudi_17Gb_ext1 using hudi location 's3a://siva-test-bucket-june-16/hudi_testing/gh_arch_dump/hudi_5/' options (
        type = 'cow',
        primaryKey = 'randomId',
        preCombineField = 'date_col'
      )
      partitioned by (type) as select * from gh_17Gb_date_col]
      java.lang.IllegalArgumentException: Table with primaryKey can not use bulk insert.
          at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.buildHoodieInsertConfig(InsertIntoHoodieTableCommand.scala:219)
          at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:78)
          at org.apache.spark.sql.hudi.command.CreateHoodieTableAsSelectCommand.run(CreateHoodieTableAsSelectCommand.scala:86)
          at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:108)
          at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:106)
          at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:120)
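
      Once this sub-task lands, the expectation (a sketch of the intended behavior, not a committed contract) is that the same session completes via the bulk insert path instead of throwing IllegalArgumentException:

      ```sql
      set hoodie.sql.bulk.insert.enable = true;
      set hoodie.datasource.write.row.writer.enable = true;

      -- Same CTAS as in the report; with primary-key tables supported,
      -- this should write through bulk insert rather than fail.
      create table hudi_17Gb_ext1 using hudi
      location 's3a://siva-test-bucket-june-16/hudi_testing/gh_arch_dump/hudi_5/'
      options (
        type = 'cow',
        primaryKey = 'randomId',
        preCombineField = 'date_col'
      )
      partitioned by (type) as select * from gh_17Gb_date_col;
      ```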

            People

            • Assignee: pengzhiwei (pzw2018)
            • Reporter: sivabalan narayanan (shivnarayan)
            • Votes: 0
            • Watchers: 1
