  1. Apache Drill
  2. DRILL-5929

Misleading error for text file with blank line delimiter

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.17.0
    • Component/s: None
    • Labels:
      None

      Description

      Consider the following functional test query:

      select * from table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))
      

      For some reason (yet to be determined), when running this from Java, the line delimiter ended up empty. This causes the following line to fail with an ArrayIndexOutOfBoundsException:

      class TextInput ...
        public final byte nextChar() throws IOException {
          if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0
      
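      The failure mode can be reproduced in isolation. The sketch below is not Drill code: DelimiterCheck, firstByteUnchecked, and requireNonEmpty are hypothetical names, used only to show that indexing a zero-length delimiter array throws, and that an up-front check could report the real problem instead.

```java
import java.nio.charset.StandardCharsets;

public class DelimiterCheck {

  // Mirrors the failing pattern: reads element 0 without checking the length.
  public static byte firstByteUnchecked(byte[] lineSeparator) {
    return lineSeparator[0];
  }

  // Hypothetical guard: reject an empty delimiter before any parsing starts.
  public static byte[] requireNonEmpty(String lineDelimiter) {
    if (lineDelimiter == null || lineDelimiter.isEmpty()) {
      throw new IllegalArgumentException(
          "The text format lineDelimiter must not be empty");
    }
    return lineDelimiter.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    try {
      firstByteUnchecked(new byte[0]);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("Unchecked access failed: " + e.getClass().getSimpleName());
    }
    try {
      requireNonEmpty("");
    } catch (IllegalArgumentException e) {
      System.out.println("Guard failed fast: " + e.getMessage());
    }
  }
}
```

      Where such a check would belong in Drill (plugin config validation, table function handling, or reader setup) is left open here.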

      We then translate the exception:

      class TextReader ...
        public final boolean parseNext() throws IOException {
      ...
          } catch (Exception ex) {
            try {
              throw handleException(ex);
      ...
        private TextParsingException handleException(Exception ex) throws IOException {
      ...
          if (ex instanceof ArrayIndexOutOfBoundsException) {
            // Not clear this exception is still thrown...
      
            ex = UserException
                .dataReadError(ex)
                .message(
                    "Drill failed to read your text file.  Drill supports up to %d columns in a text file.  Your file appears to have more than that.",
                    MAXIMUM_NUMBER_COLUMNS)
                .build(logger);
          }
      

      That is, due to a missing delimiter, we get an index out of bounds exception, which we translate to an error about having too many fields. But, the file itself has only a handful of fields. Thus, the error is completely wrong.

      Then, we compound the error:

        private TextParsingException handleException(Exception ex) throws IOException {
      ...
          throw new TextParsingException(context, message, ex);
      
      class CompliantTextReader ...
        public boolean next() {
      ...
          } catch (IOException | TextParsingException e) {
            throw UserException.dataReadError(e)
                .addContext("Failure while reading file %s. Happened at or shortly before byte position %d.",
                  split.getPath(), reader.getPos())
                .build(logger);
      

      That is, our ArrayIndexOutOfBoundsException became a user exception, which became a text parsing exception, which became a data read error.

      But, this is not a data read error. It is an error in Drill's own validation logic. It is not clear that we should wrap user exceptions in other errors that we then wrap in other user exceptions.
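      One possible way to avoid the repeated wrapping is to translate an exception at most once and to classify index errors as internal errors rather than data errors. The sketch below is only an illustration under those assumptions. WrapOnce, UserFacingException, InternalErrorException, and translate are hypothetical stand-ins, not Drill classes, and this is not necessarily how the issue was fixed.

```java
public class WrapOnce {

  // Hypothetical stand-in for an exception that already carries a user-facing message.
  public static class UserFacingException extends RuntimeException {
    public UserFacingException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Hypothetical stand-in for an internal error, i.e. a bug or missing validation.
  public static class InternalErrorException extends RuntimeException {
    public InternalErrorException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Translates an exception at most once.
  public static RuntimeException translate(Exception ex) {
    if (ex instanceof UserFacingException || ex instanceof InternalErrorException) {
      // Already classified: pass through unchanged instead of wrapping again.
      return (RuntimeException) ex;
    }
    if (ex instanceof ArrayIndexOutOfBoundsException) {
      // An index error points at the reader's own logic, not at the data.
      return new InternalErrorException("Internal error while reading a text file", ex);
    }
    return new UserFacingException("Failed to read the text file", ex);
  }

  public static void main(String[] args) {
    RuntimeException first = translate(new ArrayIndexOutOfBoundsException("index 0"));
    RuntimeException second = translate(first);
    System.out.println(first.getClass().getSimpleName());
    System.out.println("Wrapped again: " + (second != first));
  }
}
```

      With this shape, each layer can call translate without adding another wrapper, and the original cause stays one level below the reported error.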

              People

              • Assignee: Paul Rogers
              • Reporter: Paul Rogers