Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
Description
Code to reproduce -
Github Issue - https://github.com/apache/hudi/issues/9967
```
from pyspark.sql import Row
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# TABLE_NAME and PATH are defined elsewhere by the reporter.
schema = StructType(
    [
        StructField("id", IntegerType(), True),
        StructField("name", StringType(), True),
    ]
)
data = [
    Row(1, "a"),
    Row(2, "a"),
    Row(3, "c"),
]
hudi_configs = {
    "hoodie.table.name": TABLE_NAME,
    "hoodie.datasource.write.recordkey.field": "name",
    "hoodie.datasource.write.precombine.field": "id",
    "hoodie.datasource.write.operation": "insert_overwrite_table",
    "hoodie.table.keygenerator.class": "org.apache.hudi.keygen.NonpartitionedKeyGenerator",
}
df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
df.write.format("org.apache.hudi").options(**hudi_configs).mode("append").save(PATH)
spark.read.format("hudi").load(PATH).show()
# Shows no records
```
```
df.write.format("org.apache.hudi").options(**hudi_configs).option("hoodie.datasource.write.insert.drop.duplicates", "true").mode("append").save(PATH)
spark.read.format("hudi").load(PATH).show()
```
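The behavior the reproduction exercises can be sketched in plain Python. This is a simplified model of Hudi's default precombine semantics (not Hudi's actual implementation): among rows that share a record key, the row with the largest precombine field value is kept. With `name` as the record key and `id` as the precombine field, the three rows above should collapse to two, rather than disappearing entirely as the bug report describes. The function name `precombine` here is illustrative, not a Hudi API.

```python
def precombine(rows, record_key, precombine_field):
    """Keep one row per record key: the row with the max precombine value.

    Simplified sketch of Hudi's default dedup behavior, for illustration only.
    """
    best = {}
    for row in rows:
        key = row[record_key]
        if key not in best or row[precombine_field] > best[key][precombine_field]:
            best[key] = row
    return list(best.values())


# Mirrors the reproduction data above: ids 1 and 2 share record key "a".
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "a"},
    {"id": 3, "name": "c"},
]
print(precombine(rows, record_key="name", precombine_field="id"))
# → [{'id': 2, 'name': 'a'}, {'id': 3, 'name': 'c'}]
```

Under this model the expected result of the write with `hoodie.datasource.write.insert.drop.duplicates=true` is two rows, so an empty table indicates the dedup path is dropping all records instead of only the losing duplicates.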