Apache Drill / DRILL-6774

Wrong data types of empty batches schema for queries with aliases

Details

    Description

      0: jdbc:drill:zk=local> select name as full_name from (select CAST(Product AS VARCHAR) AS name from dfs.`/tmp/empty.json`);
      +------------+
      | full_name  |
      +------------+
      +------------+
      No rows selected (0.177 seconds)
      

      The data type of the full_name column for the above query is reported as INT:OPTIONAL, but it should be VARCHAR:OPTIONAL.
      This can be verified in two ways: 1) with the Drill unit test framework, or 2) by combining the query with another query via the UNION ALL operator.

      1)

        @Test
        @Ignore // TODO: DRILL-6774: The type of the ProductName field should be VARCHAR:OPTIONAL, not INT:OPTIONAL
        public void testRenameProjectWithCastEmptyDirectory() throws Exception {
          final BatchSchema expectedSchema = new SchemaBuilder()
              .addNullable("WeekId", TypeProtos.MinorType.INT)
              .addNullable("ProductName", TypeProtos.MinorType.VARCHAR, 65535)
              .build();
      
          testBuilder()
              .sqlQuery("select WeekId, Product as ProductName from (select CAST(`dir0` as INT) AS WeekId, " +
                  "CAST(Product AS VARCHAR) AS Product from dfs.tmp.`%s`)", EMPTY_DIR_NAME)
              .schemaBaseLine(expectedSchema)
              .build()
              .run();
        }
      
        @Test
        @Ignore // TODO: DRILL-6774: The type of the ProductName field should be VARCHAR:OPTIONAL, not INT:OPTIONAL
        public void testRenameProjectWithCastEmptyJson() throws Exception {
          final BatchSchema expectedSchema = new SchemaBuilder()
              .addNullable("WeekId", TypeProtos.MinorType.INT)
              .addNullable("ProductName", TypeProtos.MinorType.VARCHAR, 65535)
              .build();
      
          testBuilder()
              .sqlQuery("select WeekId, Product as ProductName from (select CAST(`dir0` as INT) AS WeekId, " +
                  "CAST(Product AS VARCHAR) AS Product from cp.`%s`)", SINGLE_EMPTY_JSON)
              .schemaBaseLine(expectedSchema)
              .build()
              .run();
        }
      

      2) The usual result of a query against non-empty data:

      0: jdbc:drill:zk=local> SELECT full_name FROM cp.`employee.json` LIMIT 2;
      +------------------+
      |    full_name     |
      +------------------+
      | Sheri Nowmer     |
      | Derrick Whelply  |
      +------------------+
      2 rows selected (0.207 seconds)
      

      But after a UNION ALL with the empty result above, the full_name values come back as null:

      0: jdbc:drill:zk=local> select name as full_name from (select CAST(Product AS VARCHAR) AS name from dfs.`/tmp/empty.json`) UNION ALL SELECT full_name FROM cp.`employee.json` LIMIT 2;
      +------------+
      | full_name  |
      +------------+
      | null       |
      | null       |
      +------------+
      2 rows selected (0.198 seconds)
      

      Perhaps this is a regression of DRILL-5546, and the solution could be similar to DRILL-6773.
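
The expectation encoded in the ignored unit tests can be pictured with a small standalone model. This is only a sketch under stated assumptions, not Drill code: `AliasTypeSketch`, its `MinorType` enum, and `renameProject` are hypothetical names invented for illustration, and the INT fallback simply mirrors the INT:OPTIONAL type seen in this report. The point it shows is that a projection which only renames a column should take the type already fixed by the inner CAST, and use a default only when the inner projection does not define the column at all.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these names are Drill classes.
class AliasTypeSketch {

  // Stand-in for the two minor types that matter in this report.
  enum MinorType { INT, VARCHAR }

  /**
   * Resolves the output schema of an outer projection that only renames
   * columns produced by an inner projection.
   *
   * @param innerSchema  inner column name mapped to the type set by its CAST
   * @param aliasToInner outer alias mapped to the inner column it renames
   */
  static Map<String, MinorType> renameProject(
      Map<String, MinorType> innerSchema, Map<String, String> aliasToInner) {
    Map<String, MinorType> out = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : aliasToInner.entrySet()) {
      MinorType innerType = innerSchema.get(e.getValue());
      // Fall back to INT only when the inner projection does not define the column.
      out.put(e.getKey(), innerType != null ? innerType : MinorType.INT);
    }
    return out;
  }

  public static void main(String[] args) {
    // Inner projection: CAST(`dir0` AS INT) AS WeekId, CAST(Product AS VARCHAR) AS Product
    Map<String, MinorType> inner = new LinkedHashMap<>();
    inner.put("WeekId", MinorType.INT);
    inner.put("Product", MinorType.VARCHAR);

    // Outer projection: WeekId, Product AS ProductName
    Map<String, String> aliases = new LinkedHashMap<>();
    aliases.put("WeekId", "WeekId");
    aliases.put("ProductName", "Product");

    System.out.println(renameProject(inner, aliases));
  }
}
```

In this model, ProductName resolves to VARCHAR because the alias is looked up against the inner projection's output. The behavior reported here corresponds to the alias missing that lookup and ending up with the INT default, even when no rows are read.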

            People

              Assignee: Unassigned
              Reporter: Vitalii Diravka