Details
-
Improvement
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
1.13.0
-
None
Description
Drill provides a typeof() function to return the type of a column. The returned string, however, has only the base data type. A Drill data type (a "major type") also includes a cardinality (a "mode"). For example, OptionalĀ Int or Required VarChar.
This type information is useful for handling data conversions. For example, if I could tell that a column value was a Nullable Int, I could guess that it is one Drill invented, and I could merge it, by hand, with the type from another file that had actual values.
The two options are equivalent. Either provide a modeOf() to just return cardinality, or a dataTypeOf() that returns both. (Maybe the modeOf() might be more useful.)
Documentation
Documentation information (extracted from PR):
sqlTypeOf()
sqlTypeOf() returns the data type (using the SQL names) whether the column is NULL or not. The SQL name is the one that can be used in a CAST statement. Thus,
sqlTypeOf( CAST(x AS <type> )) returns <type> as the type name.
If the type is DECIMAL, then the type also includes precision and scale. Example: DECIMAL(6, 3).
modeOf()
modeOf() returns the cardinality (mode) of the column as "NOT NULL", "NULLABLE" or "ARRAY".
drillTypeOf()
The drillTypeOf() function that works just like typeOf(), but returns the internal Drill names even if the value is NULL.
Example
Here is an example usage that highlights our old friend, "nullable int" for a missing column:
SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode FROM `json/all-null.json`; +----------+-----------+ | a_type | a_mode | +----------+-----------+ | INTEGER | NULLABLE | +----------+-----------+
For arrays (repeated) types:
SELECT sqlTypeOf(columns) as col_type, modeOf(columns) as col_mode FROM `csv/cust.csv`; +--------------------+-----------+ | col_type | col_mode | +--------------------+-----------+ | CHARACTER VARYING | ARRAY | +--------------------+-----------+
For non-null types:
SELECT sqlTypeOf(`name`) AS name_type, modeOf(`name`) AS name_mode FROM `csvh/cust.csvh`; +--------------------+------------+ | name_type | name_mode | +--------------------+------------+ | CHARACTER VARYING | NOT NULL | +--------------------+------------+