  1. Apache Drill
  2. DRILL-2334

Text record reader should fail gracefully when encountering bad records

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Storage - Text & CSV
    • Labels: None

    Description

      The attached file contains one bad record. Running a simple count query on this file fails with an IndexOutOfBoundsException (IOBE) and possibly a schema change exception.

      The hex dump of the file shows a long run of zero bytes (the '*' below stands for additional all-zero lines):

      00001c0 3a 35 35 2e 35 30 35 35 30 00 00 00 00 00 00 00
      00001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      *
      02a01c0 00 00 00 00 00 00 00 00 00 35 35 35 0a 35 35 35
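
To locate a record like this without reading a hex dump, one option is to scan the file for long runs of 0x00 bytes. The following is a minimal sketch, not part of Drill: the class name `NulRunScanner`, its methods, and the `minRun` threshold are all hypothetical. Reading the dump above, and assuming the elided lines are all zeros, the run would start at offset 0x1c9 and end before 0x2a01c9, which is 0x2a0000 (2,752,512) bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Drill class): finds long runs of 0x00 bytes. */
final class NulRunScanner {

    /**
     * Returns one {offset, length} pair for every run of at least
     * minRun consecutive zero bytes in data.
     */
    static List<long[]> findNulRuns(byte[] data, long minRun) {
        List<long[]> runs = new ArrayList<>();
        long runStart = -1; // -1 means "not currently inside a run"
        for (int pos = 0; pos < data.length; pos++) {
            if (data[pos] == 0) {
                if (runStart < 0) {
                    runStart = pos; // a new run begins here
                }
            } else if (runStart >= 0) {
                // The run ended at the previous byte; keep it if long enough.
                if (pos - runStart >= minRun) {
                    runs.add(new long[] {runStart, pos - runStart});
                }
                runStart = -1;
            }
        }
        // A run may extend to the very end of the data.
        if (runStart >= 0 && data.length - runStart >= minRun) {
            runs.add(new long[] {runStart, data.length - runStart});
        }
        return runs;
    }

    /** Convenience wrapper that loads the whole file into memory first. */
    static List<long[]> findNulRuns(Path file, long minRun) throws IOException {
        return findNulRuns(Files.readAllBytes(file), minRun);
    }
}
```

Calling `NulRunScanner.findNulRuns(Path.of("badRecords2.dat"), 1024)` would be one way to confirm the offsets read from the dump. Loading the whole file is acceptable for a 2.63 MB attachment; a streaming variant would be preferable for large inputs.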
      
      0: jdbc:drill:zk=local> select count(*) from `badRecords2.dat`;
      +------------+
      |   EXPR$0   |
      +------------+
      Query failed: RemoteRpcException: Failure while running fragment., You tried to do a batch data read operation when you were in a state of STOP.  You can only do this type of operation when you are in a state of OK or OK_NEW_SCHEMA.
      

      The log file also shows a related IndexOutOfBoundsException (IOBE):

      18:49:00.003 [2b1024e4-5639-b4ec-392e-8d5879c3d4db:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - Failed to read the batch. Stopping...
      java.lang.IndexOutOfBoundsException: index: 374, length: 2752540 (expected: range(0, 65536))
              at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1143) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
              at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:272) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
              at io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:390) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
              at io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:25) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
              at io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:651) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
              at org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:481) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.vector.RepeatedVarCharVector$Mutator.addSafe(RepeatedVarCharVector.java:451) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.store.text.DrillTextRecordReader.next(DrillTextRecordReader.java:172) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
              at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:165) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
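
The trace shows a 2,752,540-byte write being attempted against a buffer whose valid range is (0, 65536). One way a reader could fail gracefully is to check each record's length against a limit before copying it, and then either skip the record or raise an error that names the record and its offset. The sketch below illustrates that idea only; it is not Drill's reader or the fix that was applied, and `BoundedRecordSplitter`, `BadRecordException`, `maxRecordBytes`, and `skipBadRecords` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Drill code): splits newline-delimited records with a size limit. */
final class BoundedRecordSplitter {

    /** Descriptive failure raised in place of a raw IndexOutOfBoundsException. */
    static final class BadRecordException extends RuntimeException {
        final long recordNumber; // 1-based position of the record in the input
        final long byteOffset;   // offset of the record's first byte
        final long length;       // record length in bytes, excluding the newline

        BadRecordException(long recordNumber, long byteOffset, long length, int limit) {
            super("Record " + recordNumber + " at byte offset " + byteOffset
                    + " is " + length + " bytes long, exceeding the limit of "
                    + limit + " bytes");
            this.recordNumber = recordNumber;
            this.byteOffset = byteOffset;
            this.length = length;
        }
    }

    /**
     * Splits data on '\n'. A record longer than maxRecordBytes is dropped
     * when skipBadRecords is true; otherwise a BadRecordException is thrown.
     */
    static List<String> split(byte[] data, int maxRecordBytes, boolean skipBadRecords) {
        List<String> records = new ArrayList<>();
        int start = 0;
        long recordNumber = 0;
        for (int i = 0; i <= data.length; i++) {
            boolean atEnd = (i == data.length);
            if (!atEnd && data[i] != '\n') {
                continue; // still inside the current record
            }
            if (atEnd && start == i) {
                break; // input ended right after a newline; nothing left over
            }
            recordNumber++;
            int length = i - start;
            if (length > maxRecordBytes) {
                if (!skipBadRecords) {
                    throw new BadRecordException(recordNumber, start, length, maxRecordBytes);
                }
                // Otherwise fall through and drop the oversized record.
            } else {
                records.add(new String(data, start, length, StandardCharsets.UTF_8));
            }
            start = i + 1;
        }
        return records;
    }
}
```

Checking the length before the copy keeps the failure at the record boundary, so the error can report which record is bad rather than surfacing as a buffer bounds violation several layers down.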
      

      Attachments

        1. badRecords2.dat
          2.63 MB
          Aman Sinha

            People

              Assignee: Sudheesh Katkam (sudheeshkatkam)
              Reporter: Aman Sinha (amansinha100)
              Votes: 0
              Watchers: 2
