      See DRILL-5490 for background.

      Try this unit test case:

          FixtureBuilder builder = ClusterFixture.builder();
          try (ClusterFixture cluster = builder.build();
               ClientFixture client = cluster.clientFixture()) {
            TextFormatConfig csvFormat = new TextFormatConfig();
            csvFormat.fieldDelimiter = ',';
            csvFormat.skipFirstLine = false;
            csvFormat.extractHeader = true;
            cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat);
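            String sql = "SELECT * FROM `dfs.data`.`csv/test7.csv`";
            // Run the query; the failure described below occurs here.
            client.queryBuilder().sql(sql).printCsv();
          }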

      The test can also be run as a query using your favorite client.

      Using this input file:


      (The first line is blank.)

      The following is the result:

      Exception (no rows returned): org.apache.drill.common.exceptions.UserRemoteException: 
      SYSTEM ERROR: NullPointerException

      The RepeatedVarCharOutput class tries (but fails for the reasons outlined in DRILL-5490) to detect this case.

      The code crashes here in CompliantTextRecordReader.extractHeader():

          String [] fieldNames = ((RepeatedVarCharOutput)hOutput).getTextOutput();

      Because of bad code in RepeatedVarCharOutput.getTextOutput():

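          public String [] getTextOutput () throws ExecutionSetupException {
            // No record was counted (or no field was started): give up and return null.
            if (recordCount == 0 || fieldIndex == -1) {
              return null;
            }
            // Intended to catch a request made part-way through a record.
            if (this.recordStart != characterData) {
              throw new ExecutionSetupException("record text was requested before finishing record");
            }
            // (remainder of the method not shown)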

      Since there is no text on the line, special code elsewhere (see DRILL-5490) elects not to increment the recordCount. (BTW: recordCount is the total count across batches; the in-batch count, batchIndex, was probably what was wanted here.) Since the count is zero, getTextOutput() returns null.

      The author probably thought we'd get a zero-length record in this case, and that the second if-statement would then throw an exception. But see DRILL-5490 for why this code does not actually work.

      The result is one bug (not incrementing the record count) triggering another (returning a null), which masks a third (recordStart is not set correctly, so the exception would not be thrown).

      All that bad code is just fun and games until we get an NPE, however.
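
      To make the failure mode concrete, here is a minimal, self-contained sketch of the null-return pattern and one possible guard. It is not Drill code: HeaderExtractSketch, HeaderOutput, countFieldsUnguarded and fieldNamesGuarded are hypothetical names invented for illustration, and the guard is just one option, not necessarily the fix the project would choose.

      ```java
      public class HeaderExtractSketch {

        /** Hypothetical stand-in for the header output; models only the null-returning branch. */
        static final class HeaderOutput {
          private final int recordCount;
          private final String[] fields;

          HeaderOutput(int recordCount, String[] fields) {
            this.recordCount = recordCount;
            this.fields = fields;
          }

          String[] getTextOutput() {
            if (recordCount == 0) {
              return null; // same pattern as the excerpt above: a blank line was never counted
            }
            return fields;
          }
        }

        /** Unguarded caller: dereferences the result directly, so a null becomes an NPE. */
        static int countFieldsUnguarded(HeaderOutput out) {
          String[] names = out.getTextOutput();
          return names.length;
        }

        /** Guarded caller (an assumption about a possible fix): report the blank header clearly. */
        static String[] fieldNamesGuarded(HeaderOutput out) {
          String[] names = out.getTextOutput();
          if (names == null) {
            throw new IllegalStateException("Header line is blank; cannot extract column names");
          }
          return names;
        }

        public static void main(String[] args) {
          // Models a file whose first (header) line is blank: no record was counted.
          HeaderOutput blank = new HeaderOutput(0, new String[0]);
          try {
            countFieldsUnguarded(blank);
          } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
          }
          try {
            fieldNamesGuarded(blank);
          } catch (IllegalStateException e) {
            System.out.println("guarded: " + e.getMessage());
          }
        }
      }
      ```

      With a guard like this, the user would see a message about the blank header line rather than a bare SYSTEM ERROR. Whether a blank header should be an error or should fall back to default column names is a separate design decision.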




