Hive / HIVE-20200

Huge performance gap when processing ORC files created by Spark


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Hive, ORC
    • Labels:
      None

      Description

      I'm seeing a huge performance difference when running a simple filter query on ORC files created by Spark. The same query performs much better when the files are written by Hive, i.e. after running "create table x as select * from y".
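
      One way to put numbers on the gap is to time the same filter query against both tables. The following is only a minimal sketch, not the reporter's actual test: `run_query` is a hypothetical callable standing in for whatever Hive client is in use, the table names `x` and `y` are taken from the statement above, and the predicate is purely illustrative.

```python
import time
from typing import Callable, Dict


def build_ctas(target: str, source: str) -> str:
    """Build the CTAS statement quoted in the description for the given tables."""
    return f"create table {target} as select * from {source}"


def time_query(run_query: Callable[[str], object], sql: str, repeats: int = 3) -> float:
    """Run `sql` `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(sql)
        best = min(best, time.perf_counter() - start)
    return best


def compare_tables(
    run_query: Callable[[str], object],
    spark_table: str,
    hive_table: str,
    predicate: str,
    repeats: int = 3,
) -> Dict[str, float]:
    """Time the same filter query on the Spark-written and Hive-written tables."""
    timings = {}
    for table in (spark_table, hive_table):
        # Assumption: a simple count with a WHERE clause stands in for
        # the "simple filter query" mentioned in the description.
        sql = f"select count(*) from {table} where {predicate}"
        timings[table] = time_query(run_query, sql, repeats)
    return timings
```

      With a real client, `run_query` would submit the statement and block until it finishes; `compare_tables(run_query, "y", "x", "id = 1")` would then return one best-of-N timing per table.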

       


            People

            • Assignee:
              Unassigned
            • Reporter:
              Vinoth Sathappan (vinoths)
            • Votes:
              0
            • Watchers:
              4

              Dates

              • Created:
                Updated: