Spark / SPARK-24766

CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats in parquet


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL

    Description

      How to reproduce:

      INSERT OVERWRITE LOCAL DIRECTORY '/tmp/spark/parquet/dir' STORED AS parquet select cast(1 as decimal) as decimal1;
      
      create table test_parquet stored as parquet as select cast(1 as decimal) as decimal1;
      
      $ java -jar ./parquet-tools/target/parquet-tools-1.10.1-SNAPSHOT.jar meta file:/tmp/spark/parquet/dir/part-00000-cb96a617-4759-4b21-a222-2153ca0e8951-c000
      
      file:        file:/tmp/spark/parquet/dir/part-00000-cb96a617-4759-4b21-a222-2153ca0e8951-c000
      creator:     parquet-mr version 1.6.0 (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb)
      
      file schema: hive_schema
      --------------------------------------------------------------------------------
      decimal1:    OPTIONAL FIXED_LEN_BYTE_ARRAY O:DECIMAL R:0 D:1
      
      row group 1: RC:1 TS:46 OFFSET:4
      --------------------------------------------------------------------------------
      decimal1:     FIXED_LEN_BYTE_ARRAY SNAPPY DO:0 FPO:4 SZ:48/46/0.96 VC:1 ENC:BIT_PACKED,PLAIN,RLE ST:[no stats for this column]

      This happens because Spark still uses com.twitter:parquet-hadoop-bundle:1.6.0 for these Hive write paths, and that old parquet-mr version does not generate statistics for decimal columns.

      Maybe we should refactor CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand, or upgrade the built-in Hive.
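
      As a point of comparison (not part of the original report): writing the same data through Spark's native Parquet datasource should produce column statistics, because that path uses the parquet-mr version bundled with Spark rather than the Hive 1.6.0 writer. A minimal sketch, assuming a plain spark-sql session; the table name is illustrative:

      -- Hedged workaround sketch: CREATE TABLE ... USING parquet routes the write
      -- through Spark's native datasource writer (bundled parquet-mr) instead of
      -- the Hive SerDe path backed by parquet-hadoop-bundle 1.6.0.
      CREATE TABLE test_parquet_native USING parquet
      AS SELECT CAST(1 AS DECIMAL) AS decimal1;

      Re-running the parquet-tools meta command above against the resulting files should then show min/max values in the ST field for decimal1 instead of [no stats for this column].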

Attachments

Issue Links

Activity

People

    Assignee: Unassigned
    Reporter: Yuming Wang (yumwang)
    Votes: 0
    Watchers: 5

Dates

    Created:
    Updated:
    Resolved: