Details

    • Sub-task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • Impala 3.3.0
    • None
    • None
    • ghx-label-7

    Description

      DATE values describe a particular year/month/day, in the form YYYY-­MM-­DD. For example, DATE '2013-­01-­01'. Date types do not have a time of day component. The range of values supported for the Date type is 0000-­01-­01 to 9999-­12-­31.

      The initial DATE type support should incluide the following changes:

      • new internal type
      • casting between DATE and other types
      • codegen infrastructure for expression evaluation
      • "IS [NOT] NULL" and "[NOT] IN" predicates
      • common comparison operators
      • BETWEEN operator
      • conditional functions
      • infrastructure changes for builtin scalar functions.
      • some built-in functions: aggregate functions, analytical functions, math functions.
      • support partitioning.
      • text support only.

      These items are tightly coupled and it makes sense to implement them in one change-set.

      Attachments

        Issue Links

          Activity

            Commit 9f05cf79fa385ee6a5245ec7cb9ec1b9302c543d in impala's branch refs/heads/branch-4.3.0 from Michael Smith
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9f05cf79f ]

            IMPALA-10086: Implicit cast comparing char and varchar

            Until IMPALA-7368, Impala allowed comparing char and varchar slots as in

            select * from functional.chars_tiny where cs = vc;

            IMPALA-7368 added DATE type support, and as part of that changed
            function call resolution to use a fit function based on the number of
            arguments that match the call types. Previously the comparison above
            would take the first matching function, which happened to be equality
            between STRING and STRING; CHAR and VARCHAR can both be implicitly cast
            to STRING, so this function worked. With the new function resolution,
            the best fit is equality between VARCHAR and VARCHAR, however implicit
            casting to VARCHAR from CHAR wasn't allowed.

            The behavior before IMPALA-7368 was somewhat accidental; it depended on
            the order that builtin EQ functions are added via
            BinaryPredicate.initBuiltins -> Type.getSupportedTypes. Supported types
            happened to be ordered with STRING preceding VARCHAR and CHAR. The fit
            function makes sense and changing its behavior may have other
            consequences; it also makes sense that CHAR should be castable to
            VARCHAR.

            This change allows implicit cast between matching types. Functionally it
            only changes how we handle char/varchar comparison with wildcard
            char/varchar, because decimals are handled before checking for matching
            types and other type matching is the same as equals. It now allows
            casting to a compatible type when it is a char or varchar and the target
            type is a wildcard version of the same.

            Does not attempt to address differences from CHAR padding (IMPALA-1652).

            Testing:

            • Adds tests covering cast comparison and other implicit conversions.
            • Passed exhaustive test run.

            Change-Id: Ib89d0a391bc8f2152ecd9151c8872a01ba19c436
            Reviewed-on: http://gerrit.cloudera.org:8080/20425
            Reviewed-by: Peter Rozsa <prozsa@cloudera.com>
            Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            jira-bot ASF subversion and git services added a comment - Commit 9f05cf79fa385ee6a5245ec7cb9ec1b9302c543d in impala's branch refs/heads/branch-4.3.0 from Michael Smith [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9f05cf79f ] IMPALA-10086 : Implicit cast comparing char and varchar Until IMPALA-7368 , Impala allowed comparing char and varchar slots as in select * from functional.chars_tiny where cs = vc; IMPALA-7368 added DATE type support, and as part of that changed function call resolution to use a fit function based on the number of arguments that match the call types. Previously the comparison above would take the first matching function, which happened to be equality between STRING and STRING; CHAR and VARCHAR can both be implicitly cast to STRING, so this function worked. With the new function resolution, the best fit is equality between VARCHAR and VARCHAR, however implicit casting to VARCHAR from CHAR wasn't allowed. The behavior before IMPALA-7368 was somewhat accidental; it depended on the order that builtin EQ functions are added via BinaryPredicate.initBuiltins -> Type.getSupportedTypes. Supported types happened to be ordered with STRING preceding VARCHAR and CHAR. The fit function makes sense and changing its behavior may have other consequences; it also makes sense that CHAR should be castable to VARCHAR. This change allows implicit cast between matching types. Functionally it only changes how we handle char/varchar comparison with wildcard char/varchar, because decimals are handled before checking for matching types and other type matching is the same as equals. It now allows casting to a compatible type when it is a char or varchar and the target type is a wildcard version of the same. Does not attempt to address differences from CHAR padding ( IMPALA-1652 ). Testing: Adds tests covering cast comparison and other implicit conversions. Passed exhaustive test run. Change-Id: Ib89d0a391bc8f2152ecd9151c8872a01ba19c436 Reviewed-on: http://gerrit.cloudera.org:8080/20425 Reviewed-by: Peter Rozsa <prozsa@cloudera.com> Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            Commit 9f05cf79fa385ee6a5245ec7cb9ec1b9302c543d in impala's branch refs/heads/master from Michael Smith
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9f05cf79f ]

            IMPALA-10086: Implicit cast comparing char and varchar

            Until IMPALA-7368, Impala allowed comparing char and varchar slots as in

            select * from functional.chars_tiny where cs = vc;

            IMPALA-7368 added DATE type support, and as part of that changed
            function call resolution to use a fit function based on the number of
            arguments that match the call types. Previously the comparison above
            would take the first matching function, which happened to be equality
            between STRING and STRING; CHAR and VARCHAR can both be implicitly cast
            to STRING, so this function worked. With the new function resolution,
            the best fit is equality between VARCHAR and VARCHAR, however implicit
            casting to VARCHAR from CHAR wasn't allowed.

            The behavior before IMPALA-7368 was somewhat accidental; it depended on
            the order that builtin EQ functions are added via
            BinaryPredicate.initBuiltins -> Type.getSupportedTypes. Supported types
            happened to be ordered with STRING preceding VARCHAR and CHAR. The fit
            function makes sense and changing its behavior may have other
            consequences; it also makes sense that CHAR should be castable to
            VARCHAR.

            This change allows implicit cast between matching types. Functionally it
            only changes how we handle char/varchar comparison with wildcard
            char/varchar, because decimals are handled before checking for matching
            types and other type matching is the same as equals. It now allows
            casting to a compatible type when it is a char or varchar and the target
            type is a wildcard version of the same.

            Does not attempt to address differences from CHAR padding (IMPALA-1652).

            Testing:

            • Adds tests covering cast comparison and other implicit conversions.
            • Passed exhaustive test run.

            Change-Id: Ib89d0a391bc8f2152ecd9151c8872a01ba19c436
            Reviewed-on: http://gerrit.cloudera.org:8080/20425
            Reviewed-by: Peter Rozsa <prozsa@cloudera.com>
            Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            jira-bot ASF subversion and git services added a comment - Commit 9f05cf79fa385ee6a5245ec7cb9ec1b9302c543d in impala's branch refs/heads/master from Michael Smith [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9f05cf79f ] IMPALA-10086 : Implicit cast comparing char and varchar Until IMPALA-7368 , Impala allowed comparing char and varchar slots as in select * from functional.chars_tiny where cs = vc; IMPALA-7368 added DATE type support, and as part of that changed function call resolution to use a fit function based on the number of arguments that match the call types. Previously the comparison above would take the first matching function, which happened to be equality between STRING and STRING; CHAR and VARCHAR can both be implicitly cast to STRING, so this function worked. With the new function resolution, the best fit is equality between VARCHAR and VARCHAR, however implicit casting to VARCHAR from CHAR wasn't allowed. The behavior before IMPALA-7368 was somewhat accidental; it depended on the order that builtin EQ functions are added via BinaryPredicate.initBuiltins -> Type.getSupportedTypes. Supported types happened to be ordered with STRING preceding VARCHAR and CHAR. The fit function makes sense and changing its behavior may have other consequences; it also makes sense that CHAR should be castable to VARCHAR. This change allows implicit cast between matching types. Functionally it only changes how we handle char/varchar comparison with wildcard char/varchar, because decimals are handled before checking for matching types and other type matching is the same as equals. It now allows casting to a compatible type when it is a char or varchar and the target type is a wildcard version of the same. Does not attempt to address differences from CHAR padding ( IMPALA-1652 ). Testing: Adds tests covering cast comparison and other implicit conversions. Passed exhaustive test run. Change-Id: Ib89d0a391bc8f2152ecd9151c8872a01ba19c436 Reviewed-on: http://gerrit.cloudera.org:8080/20425 Reviewed-by: Peter Rozsa <prozsa@cloudera.com> Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            Commit 684a54a89e1752534dc713b05185bae1a6b69ba8 in impala's branch refs/heads/master from Attila Jeges
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=684a54a ]

            IMPALA-7368: Change supported year range for DATE values to 1..9999

            Before this patch the supported year range for DATE type started with
            year 0. This contradicts the ANSI SQL standard that defines the valid
            DATE value range to be 0001-01-01 to 9999-12-31.

            Change-Id: Iefdf1c036834763f52d44d0c39a25a1f04e41e07
            Reviewed-on: http://gerrit.cloudera.org:8080/14349
            Reviewed-by: Attila Jeges <attilaj@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            jira-bot ASF subversion and git services added a comment - Commit 684a54a89e1752534dc713b05185bae1a6b69ba8 in impala's branch refs/heads/master from Attila Jeges [ https://gitbox.apache.org/repos/asf?p=impala.git;h=684a54a ] IMPALA-7368 : Change supported year range for DATE values to 1..9999 Before this patch the supported year range for DATE type started with year 0. This contradicts the ANSI SQL standard that defines the valid DATE value range to be 0001-01-01 to 9999-12-31. Change-Id: Iefdf1c036834763f52d44d0c39a25a1f04e41e07 Reviewed-on: http://gerrit.cloudera.org:8080/14349 Reviewed-by: Attila Jeges <attilaj@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            Commit 27fa27e808d74db3e07be419dd0657c4448534e5 in impala's branch refs/heads/master from Attila Jeges
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=27fa27e ]

            IMPALA-8198: DATE: Read from avro.

            This change is a follow-up to IMPALA-7368 and adds support for DATE
            type to the avro scanner.

            Similarly to parquet, avro uses DATE logical type for dates. DATE
            logical type annotates an INT32 that stores the number of days since
            the unix epoch, 1 January 1970.

            This representation introduces an avro interoperability issue between
            Impala and older versions of Hive:

            • Before version 3.1, Hive used Julian calendar to represent dates
              up to 1582-10-05 and Gregorian calendar for dates starting with
              1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost.
            • Impala uses proleptic Gregorian calendar, extending the Gregorian
              calendar backward to dates preceding its official introduction in
              1582-10-15.
              This means that pre-1582-10-15 dates written to an avro table by Hive
              will be read back incorrectly by Impala.

            Note that Hive 3.1 switched to proleptic Gregorian calendar too, so
            for Hive 3.1+ this is no longer an issue.

            Dependency changes:

            • BE uses avro 1.7.4-p5 from native-toolchain.

            Change-Id: I7a9d5b93a22cf3a00244037e187f8c145cacc959
            Reviewed-on: http://gerrit.cloudera.org:8080/13944
            Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            jira-bot ASF subversion and git services added a comment - Commit 27fa27e808d74db3e07be419dd0657c4448534e5 in impala's branch refs/heads/master from Attila Jeges [ https://gitbox.apache.org/repos/asf?p=impala.git;h=27fa27e ] IMPALA-8198 : DATE: Read from avro. This change is a follow-up to IMPALA-7368 and adds support for DATE type to the avro scanner. Similarly to parquet, avro uses DATE logical type for dates. DATE logical type annotates an INT32 that stores the number of days since the unix epoch, 1 January 1970. This representation introduces an avro interoperability issue between Impala and older versions of Hive: Before version 3.1, Hive used Julian calendar to represent dates up to 1582-10-05 and Gregorian calendar for dates starting with 1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost. Impala uses proleptic Gregorian calendar, extending the Gregorian calendar backward to dates preceding its official introduction in 1582-10-15. This means that pre-1582-10-15 dates written to an avro table by Hive will be read back incorrectly by Impala. Note that Hive 3.1 switched to proleptic Gregorian calendar too, so for Hive 3.1+ this is no longer an issue. Dependency changes: BE uses avro 1.7.4-p5 from native-toolchain. Change-Id: I7a9d5b93a22cf3a00244037e187f8c145cacc959 Reviewed-on: http://gerrit.cloudera.org:8080/13944 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            Commit 2bce974990e19788ec359deec50f06d44ec92048 in impala's branch refs/heads/master from Attila Jeges
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=2bce974 ]

            IMPALA-7370: DATE: Read/Write to parquet.

            This change is a follow-up to IMPALA-7368 and adds support for DATE
            type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET
            statements associated with data files that contain dates are also
            supported.

            Parquet uses DATE logical type for dates. DATE logical type annotates
            an INT32 that stores the number of days from the Unix epoch, 1 January
            1970.

            This representation introduces a parquet interoperability issue
            between Impala and older versions of Hive:

            • Before version 3.1, Hive used Julian calendar to represent dates
              up to 1582-10-05 and Gregorian calendar for dates starting with
              1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost.
            • Impala uses proleptic Gregorian calendar, extending the Gregorian
              calendar backward to dates preceding its official introduction in
              1582-10-15.
              This means that pre-1582-10-15 dates written to a parquet table by
              Hive will be read back incorrectly by Impala and vice versa.

            Note that Hive 3.1 switched to proleptic Gregorian calendar too, so
            for Hive 3.1+ this is no longer an issue.

            Change-Id: I67da03754531660bc8de3b6935580d46deae1814
            Reviewed-on: http://gerrit.cloudera.org:8080/13189
            Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            jira-bot ASF subversion and git services added a comment - Commit 2bce974990e19788ec359deec50f06d44ec92048 in impala's branch refs/heads/master from Attila Jeges [ https://gitbox.apache.org/repos/asf?p=impala.git;h=2bce974 ] IMPALA-7370 : DATE: Read/Write to parquet. This change is a follow-up to IMPALA-7368 and adds support for DATE type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET statements associated with data files that contain dates are also supported. Parquet uses DATE logical type for dates. DATE logical type annotates an INT32 that stores the number of days from the Unix epoch, 1 January 1970. This representation introduces a parquet interoperability issue between Impala and older versions of Hive: Before version 3.1, Hive used Julian calendar to represent dates up to 1582-10-05 and Gregorian calendar for dates starting with 1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost. Impala uses proleptic Gregorian calendar, extending the Gregorian calendar backward to dates preceding its official introduction in 1582-10-15. This means that pre-1582-10-15 dates written to a parquet table by Hive will be read back incorrectly by Impala and vice versa. Note that Hive 3.1 switched to proleptic Gregorian calendar too, so for Hive 3.1+ this is no longer an issue. Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Reviewed-on: http://gerrit.cloudera.org:8080/13189 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            Commit b5805de3e65fd1c7154e4169b323bb38ddc54f4f in impala's branch refs/heads/master from Attila Jeges
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=b5805de ]

            IMPALA-7368: Add initial support for DATE type

            DATE values describe a particular year/month/day in the form
            yyyy-MM-dd. For example: DATE '2019-02-15'. DATE values do not have a
            time of day component. The range of values supported for the DATE type
            is 0000-01-01 to 9999-12-31.

            This initial DATE type support covers TEXT and HBASE fileformats only.
            'DateValue' is used as the internal type to represent DATE values.

            The changes are as follows:

            • Support for DATE literal syntax.
            • Explicit casting between DATE and other types (note that invalid
              casts will fail with an error just like invalid DECIMAL_V2 casts,
              while failed casts to other types do no lead to warning or error):
            • from STRING to DATE. The string value must be formatted as
              yyyy-MM-dd HH:mm:ss.SSSSSSSSS. The date component is mandatory,
              the time component is optional. If the time component is
              present, it will be truncated silently.
            • from DATE to STRING. The resulting string value is formatted as
              yyyy-MM-dd.
            • from TIMESTAMP to DATE. The source timestamp's time of day
              component is ignored.
            • from DATE to TIMESTAMP. The target timestamp's time of day
              component is set to 00:00:00.
            • Implicit casting between DATE and other types:
            • from STRING to DATE if the source string value is used in a
              context where a DATE value is expected.
            • from DATE to TIMESTAMP if the source date value is used in a
              context where a TIMESTAMP value is expected.
            • Since STRING -> DATE, STRING -> TIMESTAMP and DATE -> TIMESTAMP
              implicit conversions are now all possible, the existing function
              overload resolution logic is not adequate anymore.
              For example, it resolves the
              if(false, '2011-01-01', DATE '1499-02-02') function call to the
              if(BOOLEAN, TIMESTAMP, TIMESTAMP) version of the overloaded
              function, instead of the if(BOOLEAN, DATE, DATE) version.

            This is clearly wrong, so the function overload resolution logic had
            to be changed to resolve function calls to the best-fit overloaded
            function definition if there are multiple applicable candidates.

            An overloaded function definition is an applicable candidate for a
            function call if each actual parameter in the function call either
            matches the corresponding formal parameter's type (without casting)
            or is implicitly castable to that type.

            When looking for the best-fit applicable candidate, a parameter
            match score (i.e. the number of actual parameters in the function
            call that match their corresponding formal parameter's type without
            casting) is calculated and the applicable candidate with the highest
            parameter match score is chosen.

            There's one more issue that the new resolution logic has to address:
            if two applicable candidates have the same parameter match score and
            the only difference between the two is that the first one requires a
            STRING -> TIMESTAMP implicit cast for some of its parameters while
            the second one requires a STRING -> DATE implicit cast for the same
            parameters then the first candidate has to be chosen not to break
            backward compatibility.
            E.g: year('2019-02-15') function call must resolve to
            year(TIMESTAMP) instead of year(DATE). Note, that year(DATE) is not
            implemented yet, so this is not an issue at the moment but it will
            be in the future.
            When the resolution algorithm considers overloaded function
            definitions, first it orders them lexicographically by the types in
            their parameter lists. To ensure the backward compatible behavior
            Primitivetype.DATE enum value has to come after
            PrimitiveType.TIMESTAMP.

            • Codegen infrastructure changes for expression evaluation.
            • 'IS [NOT] NULL' and '[NOT] IN' predicates.
            • Common comparison operators (including the 'BETWEEN' operator).
            • Infrastructure changes for built-in functions.
            • Some built-in functions: conditional, aggregate, analytical and
              math functions.
            • C++ UDF/UDA support.
            • Support partitioning and grouping by DATE.
            • Beeswax, HiveServer2 support.

            These items are tightly coupled and it makes sense to implement them
            in one change-set.

            Testing:

            • A new partitioned TEXT table 'functional.date_tbl' (and the
              corresponding HBASE table 'functional_hbase.date_tbl') was
              introduced for DATE-related tests.
            • BE and FE tests were extended to cover DATE type.
            • E2E tests:
            • since DATE type is supported for TEXT and HBASE fileformats
              only, most DATE tests were implemented separately in
              tests/query_test/test_date_queries.py.

            Note, that this change-set is not a complete DATE type implementation,
            but it lays the foundation for future work:

            • Add date support to the random query generator.
            • Implement a complete set of built-in functions.
            • Add Parquet support.
            • Add Kudu support.
            • Optionally support Avro and ORC.
              For further details, see IMPALA-6169.

            Change-Id: Iea8155ef09557e0afa2f8b2d0b2dc9d0896dc30f
            Reviewed-on: http://gerrit.cloudera.org:8080/12481
            Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            jira-bot ASF subversion and git services added a comment - Commit b5805de3e65fd1c7154e4169b323bb38ddc54f4f in impala's branch refs/heads/master from Attila Jeges [ https://gitbox.apache.org/repos/asf?p=impala.git;h=b5805de ] IMPALA-7368 : Add initial support for DATE type DATE values describe a particular year/month/day in the form yyyy-MM-dd. For example: DATE '2019-02-15'. DATE values do not have a time of day component. The range of values supported for the DATE type is 0000-01-01 to 9999-12-31. This initial DATE type support covers TEXT and HBASE fileformats only. 'DateValue' is used as the internal type to represent DATE values. The changes are as follows: Support for DATE literal syntax. Explicit casting between DATE and other types (note that invalid casts will fail with an error just like invalid DECIMAL_V2 casts, while failed casts to other types do no lead to warning or error): from STRING to DATE. The string value must be formatted as yyyy-MM-dd HH:mm:ss.SSSSSSSSS. The date component is mandatory, the time component is optional. If the time component is present, it will be truncated silently. from DATE to STRING. The resulting string value is formatted as yyyy-MM-dd. from TIMESTAMP to DATE. The source timestamp's time of day component is ignored. from DATE to TIMESTAMP. The target timestamp's time of day component is set to 00:00:00. Implicit casting between DATE and other types: from STRING to DATE if the source string value is used in a context where a DATE value is expected. from DATE to TIMESTAMP if the source date value is used in a context where a TIMESTAMP value is expected. Since STRING -> DATE, STRING -> TIMESTAMP and DATE -> TIMESTAMP implicit conversions are now all possible, the existing function overload resolution logic is not adequate anymore. For example, it resolves the if(false, '2011-01-01', DATE '1499-02-02') function call to the if(BOOLEAN, TIMESTAMP, TIMESTAMP) version of the overloaded function, instead of the if(BOOLEAN, DATE, DATE) version. This is clearly wrong, so the function overload resolution logic had to be changed to resolve function calls to the best-fit overloaded function definition if there are multiple applicable candidates. An overloaded function definition is an applicable candidate for a function call if each actual parameter in the function call either matches the corresponding formal parameter's type (without casting) or is implicitly castable to that type. When looking for the best-fit applicable candidate, a parameter match score (i.e. the number of actual parameters in the function call that match their corresponding formal parameter's type without casting) is calculated and the applicable candidate with the highest parameter match score is chosen. There's one more issue that the new resolution logic has to address: if two applicable candidates have the same parameter match score and the only difference between the two is that the first one requires a STRING -> TIMESTAMP implicit cast for some of its parameters while the second one requires a STRING -> DATE implicit cast for the same parameters then the first candidate has to be chosen not to break backward compatibility. E.g: year('2019-02-15') function call must resolve to year(TIMESTAMP) instead of year(DATE). Note, that year(DATE) is not implemented yet, so this is not an issue at the moment but it will be in the future. When the resolution algorithm considers overloaded function definitions, first it orders them lexicographically by the types in their parameter lists. To ensure the backward compatible behavior Primitivetype.DATE enum value has to come after PrimitiveType.TIMESTAMP. Codegen infrastructure changes for expression evaluation. 'IS [NOT] NULL' and ' [NOT] IN' predicates. Common comparison operators (including the 'BETWEEN' operator). Infrastructure changes for built-in functions. Some built-in functions: conditional, aggregate, analytical and math functions. C++ UDF/UDA support. Support partitioning and grouping by DATE. Beeswax, HiveServer2 support. These items are tightly coupled and it makes sense to implement them in one change-set. Testing: A new partitioned TEXT table 'functional.date_tbl' (and the corresponding HBASE table 'functional_hbase.date_tbl') was introduced for DATE-related tests. BE and FE tests were extended to cover DATE type. E2E tests: since DATE type is supported for TEXT and HBASE fileformats only, most DATE tests were implemented separately in tests/query_test/test_date_queries.py. Note, that this change-set is not a complete DATE type implementation, but it lays the foundation for future work: Add date support to the random query generator. Implement a complete set of built-in functions. Add Parquet support. Add Kudu support. Optionally support Avro and ORC. For further details, see IMPALA-6169 . Change-Id: Iea8155ef09557e0afa2f8b2d0b2dc9d0896dc30f Reviewed-on: http://gerrit.cloudera.org:8080/12481 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

            People

              attilaj Attila Jeges
              attilaj Attila Jeges
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: