Apache Drill / DRILL-2763

[umbrella] Implement INFORMATION_SCHEMA.COLUMNS enough for relevant tools

Details

    Description

      [Edit: Problems marked with "[3216]" were fixed by DRILL-3216.]

      Note: This JIRA report is/was intended in part for tracking specific noticed differences between Drill and standard SQL (not necessarily differences we've already decided to fix), and in part as a reminder of the likelihood of running into problems with tools that we care about (even though that set of problems isn't known now).

      Drill's INFORMATION_SCHEMA.COLUMNS is not compliant with SQL:2011.

      A. COLUMNS columns existing in Drill that are not compliant with standard SQL:

      1. TBD: TABLE_NAME holds the original form of the identifier but might need to be uppercased here to be compliant (or possibly internal matching queries, e.g., for JDBC's getColumns(...), could be case-insensitive).
      2. TBD: COLUMN_NAME holds the original form of the identifier but might need to be uppercased here to be compliant (or possibly internal matching queries, e.g., for JDBC's getColumns(...), could be case-insensitive).
      3. [3216] ORDINAL_POSITION values are zero-based rather than being one-based.
      4. [3216] CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION_RADIX, NUMERIC_SCALE, and NUMERIC_PRECISION use -1 instead of NULL for the "not-applicable" case.
      5. [3216] NUMERIC_PRECISION for non-DECIMAL/NUMERIC exact numeric types (e.g., INTEGER) and approximate numeric types (e.g., DOUBLE) is -1 (logical null) instead of the specified values.
      6. [3216] NUMERIC_SCALE for non-DECIMAL/NUMERIC exact numeric types is -1 instead of 0.
      7. [3216] NUMERIC_SCALE for approximate numeric types is -1 instead of the number of bits of precision (24 and 53).
      8. [3216] NUMERIC_PRECISION_RADIX for integral exact numeric types is -1 instead of 10.
      9. [3216] NUMERIC_PRECISION_RADIX for approximate numeric types is -1 instead of 2.
      10. [3216] CHARACTER_MAXIMUM_LENGTH for types CHAR, BINARY, and VARBINARY is -1 instead of the corresponding length.
      11. [3216] DATA_TYPE values for INTERVAL with YEAR and/or MONTH and for INTERVAL with DAY, HOUR, MINUTE, and/or SECOND are "INTERVAL_YEAR_MONTH" and "INTERVAL_DAY_TIME", respectively, instead of the data type name "INTERVAL".
      12. [3216] DATA_TYPE values for non-atomic types seem to be type descriptors (<data type> syntax; e.g., "VARCHAR(65536) ARRAY") instead of just data type names (e.g., "ARRAY").
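
      To make several of the value-level differences in list A concrete, here is a minimal Python sketch that maps a pre-DRILL-3216-style row to the values the items above say the standard expects. It is not Drill's implementation: the function names, the dict-based row shape, and the type groupings are hypothetical and chosen only for illustration. It covers items 3, 4, 6, 8, 9, 11, and 12; the specific NUMERIC_PRECISION values from item 5 are not given in this report, so they are left untouched.

```python
from typing import Any, Dict, Optional

# Hypothetical type groupings, for illustration only (not taken from Drill).
INTEGRAL_EXACT = {"TINYINT", "SMALLINT", "INTEGER", "BIGINT"}
APPROXIMATE = {"FLOAT", "REAL", "DOUBLE"}

# Columns that used -1 for the "not applicable" case (item 4).
SENTINEL_COLUMNS = (
    "CHARACTER_MAXIMUM_LENGTH",
    "NUMERIC_PRECISION",
    "NUMERIC_PRECISION_RADIX",
    "NUMERIC_SCALE",
)


def null_if_minus_one(value: Optional[int]) -> Optional[int]:
    """Map the legacy -1 sentinel to None (standing in for SQL NULL)."""
    return None if value == -1 else value


def normalize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a legacy COLUMNS row with standard-style values."""
    out = dict(row)

    # Item 3: zero-based ORDINAL_POSITION becomes one-based.
    out["ORDINAL_POSITION"] = row["ORDINAL_POSITION"] + 1

    # Item 4: -1 becomes NULL where the value is not applicable.
    for name in SENTINEL_COLUMNS:
        out[name] = null_if_minus_one(row.get(name))

    data_type = row["DATA_TYPE"]
    if data_type.startswith("INTERVAL_"):
        # Item 11: report the data type name only.
        out["DATA_TYPE"] = "INTERVAL"
    elif data_type.endswith(" ARRAY"):
        # Item 12: "VARCHAR(65536) ARRAY" is a descriptor; the name is "ARRAY".
        out["DATA_TYPE"] = "ARRAY"

    if data_type in INTEGRAL_EXACT:
        # Items 6 and 8: scale 0 and radix 10 for integral exact numerics.
        out["NUMERIC_SCALE"] = 0
        out["NUMERIC_PRECISION_RADIX"] = 10
    elif data_type in APPROXIMATE:
        # Item 9: radix 2 for approximate numerics.
        out["NUMERIC_PRECISION_RADIX"] = 2

    return out
```

      For example, a legacy row for the first INTEGER column of a table (ORDINAL_POSITION 0, all four numeric/length columns -1) would come back with ORDINAL_POSITION 1, NUMERIC_SCALE 0, NUMERIC_PRECISION_RADIX 10, and CHARACTER_MAXIMUM_LENGTH NULL.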

      B. Standard COLUMNS columns that don't exist in Drill and that probably are more relevant:
      1. [3216] CHARACTER_OCTET_LENGTH does not exist. (Drill's JDBC driver needs to return this, and currently tries to compute it itself.)
      2. [3216] DATETIME_PRECISION does not exist. (Drill's JDBC driver probably needs it to compute its getColumns()'s COLUMN_SIZE correctly.)
      3. [3216] INTERVAL_TYPE does not exist. (Drill's JDBC driver needs this to compute its getColumns()'s COLUMN_SIZE correctly once COLUMNS.DATA_TYPE is correct.)
      4. [3216] INTERVAL_PRECISION does not exist. (Drill's JDBC driver needs this to compute its getColumns()'s COLUMN_SIZE correctly.)
      [TBD]:
      5. MAXIMUM_CARDINALITY does not exist. (This might be relevant for JDBC's getColumns()'s COLUMN_SIZE.)
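
      As an illustration of why DATETIME_PRECISION matters for COLUMN_SIZE: JDBC's getColumns() documents COLUMN_SIZE for datetime types as the length in characters of the string representation, which depends on the number of fractional-second digits. The Python sketch below shows that dependency. It is an assumption-laden illustration, not the Drill driver's code: the function name is hypothetical, and the string forms ("yyyy-mm-dd", "hh:mm:ss", "yyyy-mm-dd hh:mm:ss") are assumed.

```python
from typing import Optional

# Lengths of the assumed string forms, without fractional seconds.
BASE_LENGTH = {
    "DATE": len("yyyy-mm-dd"),                # 10
    "TIME": len("hh:mm:ss"),                  # 8
    "TIMESTAMP": len("yyyy-mm-dd hh:mm:ss"),  # 19
}


def datetime_column_size(data_type: str,
                         datetime_precision: Optional[int]) -> Optional[int]:
    """Character length of a datetime value's assumed string form.

    Returns None for types this sketch does not cover.
    """
    base = BASE_LENGTH.get(data_type)
    if base is None:
        return None
    # DATE has no fractional seconds; precision 0 or NULL adds nothing.
    if data_type == "DATE" or not datetime_precision:
        return base
    # One character for the decimal point plus one per fractional digit.
    return base + 1 + datetime_precision
```

      Under these assumptions a TIMESTAMP with DATETIME_PRECISION 3 has size 19 + 1 + 3 = 23, while without DATETIME_PRECISION the driver would have to guess the fractional part.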

      C. Standard COLUMNS columns that don't exist in Drill but are less likely to be relevant (listed for completeness):

      • [3216] COLUMN_DEFAULT
      • CHARACTER_SET_CATALOG
      • CHARACTER_SET_SCHEMA
      • CHARACTER_SET_NAME
      • COLLATION_CATALOG
      • COLLATION_SCHEMA
      • COLLATION_NAME
      • DOMAIN_CATALOG
      • DOMAIN_SCHEMA
      • DOMAIN_NAME
      • UDT_CATALOG
      • UDT_SCHEMA
      • UDT_NAME
      • SCOPE_CATALOG
      • SCOPE_SCHEMA
      • SCOPE_NAME
      • DTD_IDENTIFIER
      • IS_SELF_REFERENCING
      • IS_IDENTITY
      • IDENTITY_GENERATION
      • IDENTITY_START
      • IDENTITY_INCREMENT
      • IDENTITY_MAXIMUM
      • IDENTITY_MINIMUM
      • IDENTITY_CYCLE
      • IS_GENERATED
      • GENERATION_EXPRESSION
      • IS_SYSTEM_TIME_PERIOD_START
      • IS_SYSTEM_TIME_PERIOD_END
      • SYSTEM_TIME_PERIOD_TIMESTAMP_GENERATION
      • IS_UPDATABLE
      • DECLARED_DATA_TYPE
      • DECLARED_NUMERIC_PRECISION
      • DECLARED_NUMERIC_SCALE
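
      A tool that depends on particular COLUMNS columns could check for gaps like those in lists B and C before relying on them. The Python sketch below is a hypothetical helper, not part of Drill: it takes the column names a server actually exposes (obtained by whatever means the tool has) and reports which expected names are absent, comparing case-insensitively. The "more relevant" list simply repeats list B.

```python
from typing import Iterable, List

# List B above: standard columns considered more likely to be relevant.
MORE_RELEVANT = [
    "CHARACTER_OCTET_LENGTH",
    "DATETIME_PRECISION",
    "INTERVAL_TYPE",
    "INTERVAL_PRECISION",
    "MAXIMUM_CARDINALITY",
]


def missing_columns(present: Iterable[str], expected: Iterable[str]) -> List[str]:
    """Return the expected column names not found in `present`.

    Names are compared case-insensitively; the result keeps the
    order of `expected`.
    """
    have = {name.upper() for name in present}
    return [name for name in expected if name.upper() not in have]
```

      Given a server that exposes only TABLE_NAME, COLUMN_NAME, DATA_TYPE, and character_octet_length, missing_columns(..., MORE_RELEVANT) would report the remaining four names from list B.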

            People

              dsbos Daniel Barclay