Apache Drill / DRILL-3688

Drill should honor "skip.header.line.count" and "skip.footer.line.count" attributes of Hive table

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.6.0
    • Component/s: Storage - Hive
    • 1.1

    Description

      Currently Drill does not honor the "skip.header.line.count" attribute of a Hive table, so header lines are returned as data rows.
      This can also cause format conversion errors, for example when the column is cast to another type.

      Reproduce:

      1. Create a Hive table

      create table h1db.testheader(col0 string)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
      STORED AS TEXTFILE
      tblproperties("skip.header.line.count"="1");
      

      2. Prepare sample data:

      # cat test.data
      col0
      2015-01-01
      

      3. Load sample data into Hive

      LOAD DATA LOCAL INPATH '/xxx/test.data' OVERWRITE INTO TABLE h1db.testheader;
      

      4. Query the table in Hive

      hive> select * from h1db.testheader ;
      OK
      2015-01-01
      Time taken: 0.254 seconds, Fetched: 1 row(s)
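
The Hive result above amounts to dropping the first N lines of the text file, where N is the value of "skip.header.line.count". A minimal Python sketch of that semantics follows. It is an illustration only, not Hive's or Drill's implementation, and `skip_header` is a hypothetical helper name.

```python
from itertools import islice


def skip_header(lines, header_count):
    """Return an iterator over `lines` without the first `header_count` lines."""
    # islice with a start offset skips the leading header lines lazily,
    # so the whole file never has to be held in memory.
    return islice(lines, header_count, None)


# Contents of test.data from step 2, one entry per line.
rows = list(skip_header(["col0", "2015-01-01"], 1))
print(rows)  # ['2015-01-01']
```

With a header count of 0 the input is returned unchanged, which matches a table that does not set the property.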
      

      5. Query the table in Drill

      >  select * from hive.h1db.testheader ;
      +-------------+
      |    col0     |
      +-------------+
      | col0        |
      | 2015-01-01  |
      +-------------+
      2 rows selected (0.257 seconds)
      
      > select cast(col0 as date) from hive.h1db.testheader ;
      Error: SYSTEM ERROR: IllegalFieldValueException: Value 0 for monthOfYear must be in the range [1,12]
      
      Fragment 0:0
      
      [Error Id: 34353702-ca27-440b-a4f4-0c9f79fc8ccd on h1.poc.com:31010]
      
        (org.joda.time.IllegalFieldValueException) Value 0 for monthOfYear must be in the range [1,12]
          org.joda.time.field.FieldUtils.verifyValueBounds():236
          org.joda.time.chrono.BasicChronology.getDateMidnightMillis():613
          org.joda.time.chrono.BasicChronology.getDateTimeMillis():159
          org.joda.time.chrono.AssembledChronology.getDateTimeMillis():120
          org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.memGetDate():261
          org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getDate():218
          org.apache.drill.exec.test.generated.ProjectorGen0.doEval():67
          org.apache.drill.exec.test.generated.ProjectorGen0.projectRecords():62
          org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():172
          org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93
          org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
          org.apache.drill.exec.record.AbstractRecordBatch.next():147
          org.apache.drill.exec.physical.impl.BaseRootExec.next():83
          org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():79
          org.apache.drill.exec.physical.impl.BaseRootExec.next():73
          org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():261
          org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():255
          java.security.AccessController.doPrivileged():-2
          javax.security.auth.Subject.doAs():422
          org.apache.hadoop.security.UserGroupInformation.doAs():1566
          org.apache.drill.exec.work.fragment.FragmentExecutor.run():255
          org.apache.drill.common.SelfCleaningRunnable.run():38
          java.util.concurrent.ThreadPoolExecutor.runWorker():1142
          java.util.concurrent.ThreadPoolExecutor$Worker.run():617
          java.lang.Thread.run():745 (state=,code=0)
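
The cast fails because the header line "col0" reaches the date conversion as if it were data. The sketch below reproduces the same situation in Python with the two values from test.data. It is only an analogy: the stack trace shows Drill parsing dates through `StringFunctionHelpers` and Joda-Time, whereas this example uses Python's `datetime`, and `to_date` is a hypothetical helper.

```python
from datetime import datetime


def to_date(value):
    """Parse a yyyy-MM-dd string; return None when the value is not a date."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        # A header line such as "col0" ends up here.
        return None


for value in ["col0", "2015-01-01"]:
    print(value, "->", to_date(value))
```

Only the header row fails to parse, so skipping it before the cast removes the error.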
      

      The "skip.footer.line.count" attribute should also be taken into account.
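
Skipping a footer is slightly harder than skipping a header, because a reader does not know a line belongs to the footer until it has seen the lines after it. One possible approach, sketched here under the assumption that lines are read sequentially, is to hold back the last N lines in a small buffer. `skip_footer` and the sample footer line "total: 2" are hypothetical and do not come from the issue.

```python
from collections import deque


def skip_footer(lines, footer_count):
    """Yield every line except the last `footer_count`, reading lazily."""
    if footer_count <= 0:
        yield from lines
        return
    held_back = deque()
    for line in lines:
        held_back.append(line)
        # Once more than footer_count lines are buffered, the oldest one
        # cannot be part of the footer and is safe to emit.
        if len(held_back) > footer_count:
            yield held_back.popleft()


data = ["2015-01-01", "2015-01-02", "total: 2"]
print(list(skip_footer(data, 1)))  # ['2015-01-01', '2015-01-02']
```

The buffer never grows beyond `footer_count + 1` lines, so memory use does not depend on the file size.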
      If "skip.header.line.count" or "skip.footer.line.count" has an invalid value in Hive, Drill should throw an appropriate exception.
      Example: Hive table property skip.header.line.count value 'someValue' is non-numeric
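
The validation requested above could look roughly like the following Python sketch. The error text follows the example message in this issue; the function name `parse_skip_count`, the dictionary of table properties, and the default of 0 for a missing property are assumptions made for illustration, not Drill's actual code.

```python
def parse_skip_count(properties, key):
    """Return the table property `key` as an int, or 0 when it is not set."""
    raw = properties.get(key)
    if raw is None:
        # Assumption: an absent property means nothing is skipped.
        return 0
    try:
        return int(raw)
    except ValueError:
        raise ValueError(
            f"Hive table property {key} value '{raw}' is non-numeric"
        ) from None


props = {"skip.header.line.count": "1"}
print(parse_skip_count(props, "skip.header.line.count"))  # 1
print(parse_skip_count(props, "skip.footer.line.count"))  # 0
```

Negative values are not handled in this sketch; a real implementation would need to decide whether to reject them or treat them as 0.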

          People

            arina Arina Ielchiieva
            haozhu Hao Zhu
            Krystal Krystal