  1. Hive
  2. HIVE-8120 Umbrella JIRA tracking Parquet improvements
  3. HIVE-10016

Remove duplicated Hive table schema parsing in DataWritableReadSupport

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      In DataWritableReadSupport.init(), the table schema is created and its string form is set in conf. When the ParquetRecordReaderWrapper is constructed, the schema is fetched from conf and parsed several times.

      We could remove this redundant schema parsing and slightly improve the speed of getRecordReader.
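
      The idea can be illustrated with a minimal, self-contained sketch. It is not the Hive code or the attached patch: ParsedSchema, parse(), the "table.schema" key, and the comma-separated schema string are hypothetical stand-ins, and a plain Map plays the role of conf. It only contrasts re-parsing the schema string on every use with parsing it once and passing the parsed object along.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaReuseSketch {
    // Hypothetical stand-in for a parsed table schema (not Parquet's real schema type).
    static final class ParsedSchema {
        final String[] columns;
        ParsedSchema(String[] columns) { this.columns = columns; }
    }

    // Counts parse calls so the two approaches can be compared.
    static int parseCount = 0;

    // Hypothetical parser: splits a comma-separated column list.
    static ParsedSchema parse(String schemaString) {
        parseCount++;
        return new ParsedSchema(schemaString.split(","));
    }

    // Before: every consumer fetches the string from conf and parses it again.
    static int readerWithReparsing(Map<String, String> conf, int consumers) {
        int totalColumns = 0;
        for (int i = 0; i < consumers; i++) {
            totalColumns += parse(conf.get("table.schema")).columns.length;
        }
        return totalColumns;
    }

    // After: parse once, then hand the parsed object to every consumer.
    static int readerWithReuse(Map<String, String> conf, int consumers) {
        ParsedSchema schema = parse(conf.get("table.schema"));
        int totalColumns = 0;
        for (int i = 0; i < consumers; i++) {
            totalColumns += schema.columns.length;
        }
        return totalColumns;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("table.schema", "id,name,price"); // hypothetical key and format

        parseCount = 0;
        readerWithReparsing(conf, 3);
        System.out.println("re-parsing: " + parseCount + " parse calls");

        parseCount = 0;
        readerWithReuse(conf, 3);
        System.out.println("reuse: " + parseCount + " parse calls");
    }
}
```

      Both methods return the same result; only the number of parse calls differs, which is the saving this sub-task is after.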

      Attachments

        1. HIVE-10016.patch
          5 kB
          Dong Chen
        2. HIVE-10016.1-parquet.patch
          5 kB
          Dong Chen
        3. HIVE-10016-parquet.patch
          4 kB
          Dong Chen

          People

            Assignee: Dong Chen (dongc)
            Reporter: Dong Chen (dongc)
            Votes: 0
            Watchers: 3
