Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-708

Avatica and Calcite to support DatabaseMetaData getTypeInfo

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels:

      Description

      http://docs.oracle.com/javase/7/docs/api/java/sql/DatabaseMetaData.html#getTypeInfo()

      Avatica to call underlying connection database metadata for getTypeInfo to show the database supported type and meta info.

      Calcite to provide the info for the underlying adapter thru SqlTypeName and RelDataType

      1. typeInfo.csv
        3 kB
        Xavier FH Leong

        Activity

        Hide
        yeongwei YeongWei added a comment - - edited

        Hi Julian Hyde,

        Few queries regarding this JIRA from implementation perspective,

        1. For implementations with SchemaFactory and TableFactory, are the information for this metadata to be facilitate by those factory implementation? E.g. CsvSchemaaFactory to be able to return a MAP of supported JDBC Data Types ?

        2. For implementations with JdbcSchema, similarly are the information to be retrieve from the underlying data source after the data source has been created? E.g. Like the JdbcSchema#computeTables() method where it does it connection.getMetaData() ?

        From the various examples given, I think all Schema's implementation does not have this implemented.

        Probable we could add a method like Schema#getSupportedDataTypes ?

        Look forward for your response.

        Thanks!

        Show
        yeongwei YeongWei added a comment - - edited Hi Julian Hyde , Few queries regarding this JIRA from implementation perspective, 1. For implementations with SchemaFactory and TableFactory, are the information for this metadata to be facilitate by those factory implementation? E.g. CsvSchemaaFactory to be able to return a MAP of supported JDBC Data Types ? 2. For implementations with JdbcSchema, similarly are the information to be retrieve from the underlying data source after the data source has been created? E.g. Like the JdbcSchema#computeTables() method where it does it connection.getMetaData() ? From the various examples given, I think all Schema's implementation does not have this implemented. Probable we could add a method like Schema#getSupportedDataTypes ? Look forward for your response. Thanks!
        Hide
        xhoong Xavier FH Leong added a comment -

        I'd completed the first draft implementation, basically using what was defined in the runtime Schema, a query was run to collect all the type info. This will give a sub set of type info that was created by the schema factory, un-realized type are not query-able.

        I take this approach since currently DDL is not officially supported, so in a sense, the schema are not changeable via JDBC or SQL, so exploration of type info are mainly for what was already defined and used, which at the client side use case is to dynamically build SQL based on data type reflection

        However, I do believe in future, we need a way that the type can be reflected back by the adapter, which what YeongWei is mentioning on the support on the Schema#getTypeInfos, where the adapter can overwrite the details info for all the supported data types.

        Show
        xhoong Xavier FH Leong added a comment - I'd completed the first draft implementation, basically using what was defined in the runtime Schema, a query was run to collect all the type info. This will give a sub set of type info that was created by the schema factory, un-realized type are not query-able. I take this approach since currently DDL is not officially supported, so in a sense, the schema are not changeable via JDBC or SQL, so exploration of type info are mainly for what was already defined and used, which at the client side use case is to dynamically build SQL based on data type reflection However, I do believe in future, we need a way that the type can be reflected back by the adapter, which what YeongWei is mentioning on the support on the Schema#getTypeInfos, where the adapter can overwrite the details info for all the supported data types.
        Hide
        julianhyde Julian Hyde added a comment -

        This method is supposed to return the type-info for the whole database, not a particular schema. So I don't think Calcite should be talking to Schema, Table, SchemaFactory or TableFactory. It should basically return the same thing for any instance of Calcite – with some tweaking depending on the current RelDataTypeSystem.

        When Calcite is not involved, JdbcMeta should just return getTypeInfo from the underlying database.

        Show
        julianhyde Julian Hyde added a comment - This method is supposed to return the type-info for the whole database, not a particular schema. So I don't think Calcite should be talking to Schema, Table, SchemaFactory or TableFactory. It should basically return the same thing for any instance of Calcite – with some tweaking depending on the current RelDataTypeSystem. When Calcite is not involved, JdbcMeta should just return getTypeInfo from the underlying database.
        Hide
        xhoong Xavier FH Leong added a comment - - edited

        There's a missing part in the propose solution, not all the information is completed without getting the info from the adapter which implements the RelDataType. And also it can be that not all "default" type is supported by the underlying adapter.

        To support the complete type, I have to query the underlying schema to get the complete data type info, and union with SqlTypeName to get the rest, with some default value.

        My work in progress branch can be found at:

        https://github.com/xhoong/incubator-calcite/tree/708-getTypeInfo

        Show
        xhoong Xavier FH Leong added a comment - - edited There's a missing part in the propose solution, not all the information is completed without getting the info from the adapter which implements the RelDataType. And also it can be that not all "default" type is supported by the underlying adapter. To support the complete type, I have to query the underlying schema to get the complete data type info, and union with SqlTypeName to get the rest, with some default value. My work in progress branch can be found at: https://github.com/xhoong/incubator-calcite/tree/708-getTypeInfo
        Hide
        xhoong Xavier FH Leong added a comment -

        There's an alternate commit, think this is what you are looking for. And is the Schema#getTypeInfo that we are looking for, where the RelDataTypeSystem can be override to represent the underlying adapter connection.

        Still work in progress, working on regression test. Let me know what you think and I'll rebase later. Thanks.

        Show
        xhoong Xavier FH Leong added a comment - There's an alternate commit, think this is what you are looking for. And is the Schema#getTypeInfo that we are looking for, where the RelDataTypeSystem can be override to represent the underlying adapter connection. Still work in progress, working on regression test. Let me know what you think and I'll rebase later. Thanks.
        Hide
        julianhyde Julian Hyde added a comment -

        It seems to me that Calcite needs to be able to generate a full response to getTypeInfo without talking to any of its schemas. (There might not be any schemas... and if there are multiple schemas, which one to pick? If the schema happens to be a JDBC database, its type system will be distorted as it passes through Calcite, so it will look like Calcite's type system.)

        Your implementation looks good. I'd have iterated over SqlTypeName, too. I'd make RelDataTypeSystem a final field in CalciteMetaImpl.

        Your typeInfo(MetaTable) is going to be called once for every table. That is going to be unpleasant for large schemas. Do we really need to get type info from every source? It is turning into a painful requirement.

        Show
        julianhyde Julian Hyde added a comment - It seems to me that Calcite needs to be able to generate a full response to getTypeInfo without talking to any of its schemas. (There might not be any schemas... and if there are multiple schemas, which one to pick? If the schema happens to be a JDBC database, its type system will be distorted as it passes through Calcite, so it will look like Calcite's type system.) Your implementation looks good. I'd have iterated over SqlTypeName, too. I'd make RelDataTypeSystem a final field in CalciteMetaImpl. Your typeInfo(MetaTable) is going to be called once for every table. That is going to be unpleasant for large schemas. Do we really need to get type info from every source? It is turning into a painful requirement.
        Hide
        xhoong Xavier FH Leong added a comment - - edited

        Yes, you are correct, scanning thru all schema is not a good idea. I eliminate that and now it depends on the RelDataTypeSystem from the connection typeFactory. And only covers the type specify in SqlTypeName thru the loops over the Enum.

        Underlying Calcite adapters needs to implement the typeSystem to override the default values.

        One point thou, overriding it also require the implementor to extend CalciteConnectionImpl class or override the connection factory to pass in the typeFactory with the custom typeSystem.

        Show
        xhoong Xavier FH Leong added a comment - - edited Yes, you are correct, scanning thru all schema is not a good idea. I eliminate that and now it depends on the RelDataTypeSystem from the connection typeFactory. And only covers the type specify in SqlTypeName thru the loops over the Enum. Underlying Calcite adapters needs to implement the typeSystem to override the default values. One point thou, overriding it also require the implementor to extend CalciteConnectionImpl class or override the connection factory to pass in the typeFactory with the custom typeSystem.
        Hide
        xhoong Xavier FH Leong added a comment -

        Hi Julian Hyde, need some guidance, I'd make the RelDataTypeSystem be representing for the DatabaseMetaData type info, also found that BasicSqlType basically employs all the default precision on the type. I'd move them instead to the type system, now mostly all types have the default precision represented, eg INTEGER will be INTEGER(10)

        So, now the precision info is available and in the rel nodes and plans, this means all the unit test that do string compare will fail. Want to check with you whether this is the right path, as to me, it shows the type are now consistently represented, and also being push down to the rel nodes. What do you think?

        You can also have a peek on the WIP branch in github. Thanks.

        Show
        xhoong Xavier FH Leong added a comment - Hi Julian Hyde , need some guidance, I'd make the RelDataTypeSystem be representing for the DatabaseMetaData type info, also found that BasicSqlType basically employs all the default precision on the type. I'd move them instead to the type system, now mostly all types have the default precision represented, eg INTEGER will be INTEGER(10) So, now the precision info is available and in the rel nodes and plans, this means all the unit test that do string compare will fail. Want to check with you whether this is the right path, as to me, it shows the type are now consistently represented, and also being push down to the rel nodes. What do you think? You can also have a peek on the WIP branch in github. Thanks.
        Hide
        julianhyde Julian Hyde added a comment -

        If you ask any type its precision, it will tell you. But for certain types such as INTEGER and BOOLEAN, they always have the same precision so we don't bother to print it. If SqlTypeName has sufficient information for you to know that you should print "INTEGER" rather than "INTEGER(10)".

        And similarly that you should print "TIMESTAMP" rather than "TIMESTAMP(3)" because 3 is TIMESTAMP's default precision.

        Show
        julianhyde Julian Hyde added a comment - If you ask any type its precision, it will tell you. But for certain types such as INTEGER and BOOLEAN, they always have the same precision so we don't bother to print it. If SqlTypeName has sufficient information for you to know that you should print "INTEGER" rather than "INTEGER(10)". And similarly that you should print "TIMESTAMP" rather than "TIMESTAMP(3)" because 3 is TIMESTAMP's default precision.
        Hide
        xhoong Xavier FH Leong added a comment -

        Pull request created:
        https://github.com/apache/incubator-calcite/pull/87

        So, I'm using alloPrec to determine if we need to print, static precision eg. INTEGER do not allow precision change.

        Show
        xhoong Xavier FH Leong added a comment - Pull request created: https://github.com/apache/incubator-calcite/pull/87 So, I'm using alloPrec to determine if we need to print, static precision eg. INTEGER do not allow precision change.
        Hide
        xhoong Xavier FH Leong added a comment -

        Output of DatabaseMetaData.getTypeInfo()

        Show
        xhoong Xavier FH Leong added a comment - Output of DatabaseMetaData.getTypeInfo()
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/83707f72 . Thanks Xavier!
        Hide
        jnadeau Jacques Nadeau added a comment -

        Resolved in release 1.4.0-incubating (2015-08-23)

        Show
        jnadeau Jacques Nadeau added a comment - Resolved in release 1.4.0-incubating (2015-08-23)

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            xhoong Xavier FH Leong
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development