Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-6361

Provide sqlTypeOf() and modeOf() functions

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 1.13.0
    • 1.14.0
    • None

    Description

      Drill provides a typeof() function to return the type of a column. The returned string, however, has only the base data type. A Drill data type (a "major type") also includes a cardinality (a "mode"). For example, OptionalĀ Int or Required VarChar.

      This type information is useful for handling data conversions. For example, if I could tell that a column value was a Nullable Int, I could guess that it is one Drill invented, and I could merge it, by hand, with the type from another file that had actual values.

      The two options are equivalent. Either provide a modeOf() to just return cardinality, or a dataTypeOf() that returns both. (Maybe the modeOf() might be more useful.)

      Documentation

      Documentation information (extracted from PR):

      sqlTypeOf()

      sqlTypeOf() returns the data type (using the SQL names) whether the column is NULL or not. The SQL name is the one that can be used in a CAST statement. Thus,

      sqlTypeOf( CAST(x AS <type> )) returns <type> as the type name.

      If the type is DECIMAL, then the type also includes precision and scale. Example: DECIMAL(6, 3).

      modeOf()

      modeOf() returns the cardinality (mode) of the column as "NOT NULL", "NULLABLE" or "ARRAY".

      drillTypeOf()

      The drillTypeOf() function that works just like typeOf(), but returns the internal Drill names even if the value is NULL.

      Example

      Here is an example usage that highlights our old friend, "nullable int" for a missing column:

      SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode FROM `json/all-null.json`;
      
      +----------+-----------+
      |  a_type  |  a_mode   |
      +----------+-----------+
      | INTEGER  | NULLABLE  |
      +----------+-----------+
      

      For arrays (repeated) types:

      SELECT sqlTypeOf(columns) as col_type, modeOf(columns) as col_mode
      FROM `csv/cust.csv`;
      
      +--------------------+-----------+
      |      col_type      | col_mode  |
      +--------------------+-----------+
      | CHARACTER VARYING  | ARRAY     |
      +--------------------+-----------+
      

      For non-null types:

      SELECT sqlTypeOf(`name`) AS name_type, 
      modeOf(`name`) AS name_mode FROM `csvh/cust.csvh`;
      
      +--------------------+------------+
      |     name_type      | name_mode  |
      +--------------------+------------+
      | CHARACTER VARYING  | NOT NULL   |
      +--------------------+------------+
      

      Attachments

        Activity

          People

            paul-rogers Paul Rogers
            paul-rogers Paul Rogers
            Vova Vysotskyi Vova Vysotskyi
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: