  1. Apache Drill
  2. DRILL-3209

[Umbrella] Plan reads of Hive tables as native Drill reads when a native reader for the underlying table format exists

    Details

      Description

      All reads against Hive are currently done through the Hive SerDe interface. While this provides the most flexibility, the API is not optimized for reading data into Drill's native data structures. For Parquet- and text-file-backed tables, we can instead plan these reads as Drill native reads. Currently, native reads of these file types provide untyped data. Although Parquet files carry type metadata, we do not currently use that type information while planning. For text files, we read every file as lists of varchars. In both cases, casts will need to be injected so that the native reads produce the same data types as reads through the SerDe interface.
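
      The cast injection described above can be sketched roughly as follows. This is only an illustration, not Drill's planner code: the function name `cast_projections`, the `HIVE_TO_DRILL` type mapping, and the output format are all assumptions made for this example. It relies on the fact that Drill exposes each row of a text file as a `columns` array of varchars, while Parquet columns are addressed by name.

```python
# Hypothetical mapping from Hive column types to Drill SQL cast targets.
# Illustrative and incomplete; not taken from Drill's source.
HIVE_TO_DRILL = {
    "int": "INT",
    "bigint": "BIGINT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "string": "VARCHAR",
    "date": "DATE",
}


def cast_projections(hive_columns, table_format):
    """Build one CAST expression per Hive column for a native read.

    hive_columns: ordered list of (column_name, hive_type) pairs taken
                  from the Hive table definition.
    table_format: "text" or "parquet".
    """
    projections = []
    for index, (name, hive_type) in enumerate(hive_columns):
        drill_type = HIVE_TO_DRILL[hive_type.lower()]
        if table_format == "text":
            # Text rows arrive as an array of varchars, addressed by position.
            source = f"columns[{index}]"
        elif table_format == "parquet":
            # Parquet columns are addressed by name.
            source = f"`{name}`"
        else:
            raise ValueError(f"no native reader assumed for {table_format!r}")
        projections.append(f"CAST({source} AS {drill_type}) AS `{name}`")
    return projections


if __name__ == "__main__":
    schema = [("o_orderkey", "bigint"), ("o_comment", "string")]
    print(", ".join(cast_projections(schema, "text")))
```

      The point of the sketch is that the Hive table definition, not the file, supplies the target types, so the projected columns match what a SerDe-based read would return.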

        Attachments

        1. tpch13-native-scan-on.sys.drill
          25 kB
          Chun Chang
        2. tpch13-native-scan-off.sys.drill
          329 kB
          Chun Chang

              People

              • Assignee: Venki Korukanti (venki387)
              • Reporter: Jason Altekruse (jaltekruse)
              • Votes: 0
              • Watchers: 7
