[DRILL-5929] Misleading error for text file with blank line delimiter - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.11.0
Fix Version/s: 1.17.0
Component/s: None
Labels: None

Description

Consider the following functional test query:

select * from table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))

For some reason (yet to be determined), when running this from Java, the line delimiter ended up empty. This causes the following line to fail with an ArrayIndexOutOfBoundsException:

class TextInput ...
  public final byte nextChar() throws IOException {
    if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0
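
The failure can be reproduced in isolation, and a check at setup time would catch it before any bytes are read. The following is a minimal sketch, not Drill's code: `DelimiterCheck`, `matchesFirstDelimiterByte` and `validateLineDelimiter` are hypothetical names, and where such a check would belong in Drill is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Drill: illustrates the failure and a guard.
final class DelimiterCheck {

  // Mirrors the failing comparison: reads element 0 of the delimiter array.
  static boolean matchesFirstDelimiterByte(byte byteChar, byte[] lineSeparator) {
    return byteChar == lineSeparator[0]; // throws if lineSeparator.length == 0
  }

  // Rejects an empty delimiter up front with a message that names the real problem.
  static byte[] validateLineDelimiter(String lineDelimiter) {
    if (lineDelimiter == null || lineDelimiter.isEmpty()) {
      throw new IllegalArgumentException(
          "The text format lineDelimiter must not be empty.");
    }
    return lineDelimiter.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    try {
      matchesFirstDelimiterByte((byte) '\n', new byte[0]);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("Unchecked empty delimiter: " + e.getClass().getSimpleName());
    }
    try {
      validateLineDelimiter("");
    } catch (IllegalArgumentException e) {
      System.out.println("Checked empty delimiter: " + e.getMessage());
    }
  }
}
```

With a guard of this kind, the empty delimiter would surface as a configuration error rather than as an array index failure deep in the reader.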

We then translate the exception:

class TextReader ...
  public final boolean parseNext() throws IOException {
...
    } catch (Exception ex) {
      try {
        throw handleException(ex);
...
  private TextParsingException handleException(Exception ex) throws IOException {
...
    if (ex instanceof ArrayIndexOutOfBoundsException) {
      // Not clear this exception is still thrown...

      ex = UserException
          .dataReadError(ex)
          .message(
              "Drill failed to read your text file.  Drill supports up to %d columns in a text file.  Your file appears to have more than that.",
              MAXIMUM_NUMBER_COLUMNS)
          .build(logger);
    }

That is, due to a missing delimiter, we get an index out of bounds exception, which we translate to an error about having too many fields. But, the file itself has only a handful of fields. Thus, the error is completely wrong.
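
One way to avoid the wrong message is to make the "too many columns" claim only when there is evidence for it. The sketch below is an illustration under stated assumptions, not the fix applied in Drill: `ErrorTranslation`, `describe` and the `columnsSeen` counter are hypothetical, and the column limit is passed in as a parameter rather than taken from Drill.

```java
// Hypothetical translation helper, not part of Drill.
final class ErrorTranslation {

  // Picks a message for a parsing failure.
  // columnsSeen: assumed count of columns parsed in the current record.
  // maxColumns:  assumed column limit supplied by the caller.
  static String describe(Exception ex, int columnsSeen, int maxColumns) {
    if (ex instanceof ArrayIndexOutOfBoundsException && columnsSeen >= maxColumns) {
      // Only here does the index failure plausibly mean "too many columns".
      return String.format(
          "Drill supports up to %d columns in a text file. "
              + "Your file appears to have more than that.",
          maxColumns);
    }
    // Anything else is reported as what it is: an unexpected internal failure.
    return "Internal error while parsing text file: " + ex;
  }

  public static void main(String[] args) {
    Exception emptyDelimiter = new ArrayIndexOutOfBoundsException("0");
    // A handful of columns seen, far below the limit: not a column-count problem.
    System.out.println(describe(emptyDelimiter, 3, 100));
    // Column counter at the limit: the column-count message is justified.
    System.out.println(describe(emptyDelimiter, 100, 100));
  }
}
```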

Then, we compound the error:

  private TextParsingException handleException(Exception ex) throws IOException {
...
    throw new TextParsingException(context, message, ex);

class CompliantTextReader ...
  public boolean next() {
...
    } catch (IOException | TextParsingException e) {
      throw UserException.dataReadError(e)
          .addContext("Failure while reading file %s. Happened at or shortly before byte position %d.",
            split.getPath(), reader.getPos())
          .build(logger);

That is, our ArrayIndexOutOfBoundsException became a user exception, which became a text parsing exception, which became a data read error.

But this is not a data read error. It is an error in Drill's own validation logic. It is not clear that we should be wrapping user exceptions in other errors that we then wrap in yet more user exceptions.
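
The layering described above can be modeled with plain Java exceptions to show one alternative: if a user-facing exception already exists somewhere in the cause chain, pass it through instead of wrapping it again. This is a sketch only. `UserFacingException` and `ParsingException` are hypothetical stand-ins, not Drill's `UserException` or `TextParsingException`, and no claim is made about how Drill's builders behave.

```java
// Hypothetical model of the wrapping chain, not Drill code.
final class WrapDemo {

  static class UserFacingException extends RuntimeException {
    UserFacingException(String message, Throwable cause) { super(message, cause); }
  }

  static class ParsingException extends RuntimeException {
    ParsingException(String message, Throwable cause) { super(message, cause); }
  }

  // Walks the cause chain down to the original failure.
  static Throwable rootCause(Throwable t) {
    Throwable current = t;
    while (current.getCause() != null) {
      current = current.getCause();
    }
    return current;
  }

  // Returns the outermost user-facing exception in the chain, or null if none.
  static UserFacingException findUserFacing(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      if (c instanceof UserFacingException) {
        return (UserFacingException) c;
      }
    }
    return null;
  }

  // Wraps only when no user-facing exception is already present.
  static UserFacingException toUserFacing(Throwable t, String context) {
    UserFacingException existing = findUserFacing(t);
    return existing != null ? existing : new UserFacingException(context, t);
  }

  public static void main(String[] args) {
    Throwable root = new ArrayIndexOutOfBoundsException("0");
    Throwable user = new UserFacingException("too many columns", root);
    Throwable parsing = new ParsingException("error parsing record", user);

    UserFacingException result = toUserFacing(parsing, "failure while reading file");
    System.out.println("Reused existing user exception: " + (result == user));
    System.out.println("Root cause: " + rootCause(result).getClass().getSimpleName());
  }
}
```

Passing the existing exception through keeps a single user-facing layer, though it does not by itself fix a wrong message chosen at the first translation step.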

Issue Links

Is contained by

DRILL-6986 Table function improvements / issues (UMBRELLA JIRA) [Open]

People

Assignee: Paul Rogers

Reporter: Paul Rogers

Votes: 0

Watchers: 2

Dates

Created: 03/Nov/17 22:07

Updated: 11/Oct/19 12:09

Resolved: 11/Oct/19 12:09