Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1392

Druid default time column not properly recognized

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.9.0
    • Fix Version/s: 1.10.0
    • Component/s: druid
    • Labels:
      None

      Description

      Druid 0.9.1.1
      Calcite 1.9.0

      I'm trying to calculate count per day.
      What I'm doing wrong ?

      !connect jdbc:calcite:schemaFactory=org.apache.calcite.adapter.druid.DruidSchemaFactory admin admin
      0: jdbc:calcite:schemaFactory=org.apache.calc> !describe "table"
      +-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+
      | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB |
      +-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+
      |           | adhoc       | table  | __time      | -5        | BIGINT    | -1          |               | null           | 10             | 1        |         |            |               |                  |
      |
      |           | adhoc       | table    | count       | -5        | BIGINT    | -1          |               | null           | 10             | 1        |         |            |               |                  
      
      0: jdbc:calcite:schemaFactory=org.apache.calc> select sum("count") from "table" group by floor("__time" to DAY);
      2016-09-27 21:50:26,770 [main] ERROR - org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply 'FLOOR' to arguments of type 'FLOOR(<BIGINT>, <INTERVAL DAY>)'. Supported form(s): 'FLOOR(<NUMERIC>)'
      'FLOOR(<DATETIME_INTERVAL>)'
      'FLOOR(<DATE> TO <TIME_UNIT>)'
      'FLOOR(<TIME> TO <TIME_UNIT>)'
      'FLOOR(<TIMESTAMP> TO <TIME_UNIT>)'
      

        Activity

        Hide
        julianhyde Julian Hyde added a comment -

        Did you tell Calcite that "__time" is the timestamp column? If you did, then its type should be TIMESTAMP and FLOOR should work. If you didn't, its type will be VARCHAR.

        Show
        julianhyde Julian Hyde added a comment - Did you tell Calcite that "__time" is the timestamp column? If you did, then its type should be TIMESTAMP and FLOOR should work. If you didn't, its type will be VARCHAR.
        Hide
        maver1ck Maciej Bryński added a comment - - edited

        I didn't.
        And I don't have "__time" column in my data source. I was sure that this is meta column created by calcite itself. And it's type is not VARCHAR but BIGINT

        Especially with this quote from calcite page suggest that there is not nead to do anything.
        https://calcite.apache.org/docs/druid_adapter.html

        Simplifying the modelPermalink
        If less metadata is provided in the model, the Druid adapter can discover it automatically from Druid. Here is a schema equivalent to the previous one but with dimensions, metrics and timestampColumn removed:

        Show
        maver1ck Maciej Bryński added a comment - - edited I didn't. And I don't have "__time" column in my data source. I was sure that this is meta column created by calcite itself. And it's type is not VARCHAR but BIGINT Especially with this quote from calcite page suggest that there is not nead to do anything. https://calcite.apache.org/docs/druid_adapter.html Simplifying the modelPermalink If less metadata is provided in the model, the Druid adapter can discover it automatically from Druid. Here is a schema equivalent to the previous one but with dimensions, metrics and timestampColumn removed:
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Maciej Bryński, have you tried to replace "__time" in your query by "timestamp" and check if it works?

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Maciej Bryński , have you tried to replace "__time" in your query by "timestamp" and check if it works?
        Hide
        xiadao Dragobin added a comment - - edited

        I also met this problem,when calcite call Druid rest api get tables info,there is a default column called "__time" that refers to the time column of the segment,this column's type is Long in restful json result,Calcite convert it to BigInt.

        I want to know how to set this time column to tell Calcite when use simplifying the model style,this model json not setting tables info, I only could use "__time" to excute sql, not use other specify identifier.

        I also dont'n know how to specify time granularity for Druid,and not use such as FLOOR、EXTRACT functions,it will error like this issue show in top.

        Show
        xiadao Dragobin added a comment - - edited I also met this problem,when calcite call Druid rest api get tables info,there is a default column called "__time" that refers to the time column of the segment,this column's type is Long in restful json result,Calcite convert it to BigInt. I want to know how to set this time column to tell Calcite when use simplifying the model style,this model json not setting tables info, I only could use "__time" to excute sql, not use other specify identifier. I also dont'n know how to specify time granularity for Druid,and not use such as FLOOR、EXTRACT functions,it will error like this issue show in top.
        Hide
        maver1ck Maciej Bryński added a comment -

        It doesn't

        2016-09-29 09:48:48,312 [main] ERROR - org.apache.calcite.sql.validate.SqlValidatorException: Column 'timestamp' not found in any table
        
        Show
        maver1ck Maciej Bryński added a comment - It doesn't 2016-09-29 09:48:48,312 [main] ERROR - org.apache.calcite.sql.validate.SqlValidatorException: Column 'timestamp' not found in any table
        Hide
        xiadao Dragobin added a comment -

        If "timestamp" instead of "__time" in sql, I will not work. It tell me not found colunm "timstamp" in tables.

        Show
        xiadao Dragobin added a comment - If "timestamp" instead of "__time" in sql, I will not work. It tell me not found colunm "timstamp" in tables.
        Hide
        julianhyde Julian Hyde added a comment -

        OK, I'm looking into this today.

        Jesus Camacho Rodriguez, Have you run DruidAdapterIT recently? I get a lot of errors, due to the formatting of the time interval in plans (and maybe also timezone issues).

        Show
        julianhyde Julian Hyde added a comment - OK, I'm looking into this today. Jesus Camacho Rodriguez , Have you run DruidAdapterIT recently? I get a lot of errors, due to the formatting of the time interval in plans (and maybe also timezone issues).
        Hide
        maver1ck Maciej Bryński added a comment -

        Any progress on this ?

        Show
        maver1ck Maciej Bryński added a comment - Any progress on this ?
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

        Julian Hyde, if you did not start with it, I can tackle it to include it in 1.10.0. I think we just need to add a special case to handle column with name "__time" when we retrieve the datasource information from Druid. This should be fine, since "__time" is a reserved column name in Druid.

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited Julian Hyde , if you did not start with it, I can tackle it to include it in 1.10.0. I think we just need to add a special case to handle column with name "__time" when we retrieve the datasource information from Druid. This should be fine, since "__time" is a reserved column name in Druid.
        Hide
        julianhyde Julian Hyde added a comment -

        Jesus Camacho Rodriguez, Yes, please fix this. I intended work on it but got stuck on CALCITE-1403; today I'll focus on getting the CALCITE-1403 patch in.

        Show
        julianhyde Julian Hyde added a comment - Jesus Camacho Rodriguez , Yes, please fix this. I intended work on it but got stuck on CALCITE-1403 ; today I'll focus on getting the CALCITE-1403 patch in.
        Hide
        julianhyde Julian Hyde added a comment -

        Maciej Bryński Please don't nag. If there had been progress we would have updated this JIRA.

        Show
        julianhyde Julian Hyde added a comment - Maciej Bryński Please don't nag. If there had been progress we would have updated this JIRA.
        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/a1e50b6 .
        Hide
        maver1ck Maciej Bryński added a comment - - edited

        Not everything is OK. I can't select "day". Is it correct query ?

        0: jdbc:calcite:schemaFactory=org.apache.calc> select sum("count") as "count", floor("__time" to DAY) as "day"  from "table" group by floor("__time" to DAY);
        
        count day
        19259  
        95333  
        29841  
        11364  
        01933  
        Show
        maver1ck Maciej Bryński added a comment - - edited Not everything is OK. I can't select "day". Is it correct query ? 0: jdbc:calcite:schemaFactory=org.apache.calc> select sum( "count" ) as "count" , floor( "__time" to DAY) as "day" from "table" group by floor( "__time" to DAY); count day 19259   95333   29841   11364   01933  
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        This seems an additional problem reading the timestamp column from the result when we specify a certain granularity. I am taking a look at it.

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - This seems an additional problem reading the timestamp column from the result when we specify a certain granularity. I am taking a look at it.
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

        Follow-up fix in http://git-wip-us.apache.org/repos/asf/calcite/commit/6627448.

        Maciej Bryński, could you check that it works properly now? Thanks

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited Follow-up fix in http://git-wip-us.apache.org/repos/asf/calcite/commit/6627448 . Maciej Bryński , could you check that it works properly now? Thanks
        Hide
        maver1ck Maciej Bryński added a comment -

        Yep.
        Previous select worked.
        One more thing.
        How can in set range for timestamp ?

        Show
        maver1ck Maciej Bryński added a comment - Yep. Previous select worked. One more thing. How can in set range for timestamp ?
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Filter the "__time" column accordingly in the WHERE clause of the query.

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Filter the "__time" column accordingly in the WHERE clause of the query.
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Resolved in release 1.10.0 (2016-10-12).

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Resolved in release 1.10.0 (2016-10-12).

          People

          • Assignee:
            jcamachorodriguez Jesus Camacho Rodriguez
            Reporter:
            maver1ck Maciej Bryński
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development