  Parquet / PARQUET-76

Hive cannot determine the list of columns automatically based on Parquet serde

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Today we cannot create a Parquet-based Hive table without specifying the column names and types. When we try to define it the following way, we get this error:
      "14/08/20 17:27:46 ERROR ql.Driver: FAILED: SemanticException [Error 10043]: Either list of columns or a custom serializer should be specified"

      CREATE TABLE parquet_test
      ROW FORMAT SERDE
        'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
      STORED AS INPUTFORMAT
        'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
      OUTPUTFORMAT
        'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
      LOCATION
        '/user/pratik/campaigns';
      

      By contrast, if we create a Hive table on top of Avro-based files, we do not need to specify the column names; Hive automatically determines the schema through the SerDe.

      CREATE EXTERNAL TABLE campaigns
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
      STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
      LOCATION '/user/pratik/campaigns'
      TBLPROPERTIES ('avro.schema.url'='hdfs:///user/pratik/campaigns.avsc');
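
      Until the columns can be inferred, the Parquet table can be created by listing them explicitly, which avoids Error 10043. This is only a minimal sketch: the column names and types below are hypothetical placeholders (they are not taken from the actual campaigns data) and would need to be replaced so that they match the fields in the Parquet files.

      ```sql
      -- Hypothetical columns: replace with the real field names and types
      -- found in the Parquet files under the LOCATION below.
      CREATE EXTERNAL TABLE parquet_test (
        campaign_id BIGINT,
        campaign_name STRING
      )
      ROW FORMAT SERDE
        'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
      STORED AS INPUTFORMAT
        'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
      OUTPUTFORMAT
        'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
      LOCATION
        '/user/pratik/campaigns';
      ```

      EXTERNAL is used here so that dropping the table leaves the existing files in place.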
      

        Activity

        singhashish Ashish Singh added a comment - edited

        I have started working on this. I could not assign this JIRA to myself. If someone could do so, that would be helpful.

        singhashish Ashish Singh added a comment -

        Created HIVE-8950, which should take care of this.

        singhashish Ashish Singh added a comment -

        Julien Le Dem, could you assign this to me?

        mpollock Matt Pollock added a comment -

        Is this being worked on? Has it been resolved under another issue?

        With Hive 1.2 I am still seeing similar messages:

         Error in .local(conn, statement, ...) : 
          execute JDBC update query failed in dbSendUpdate (Error while compiling statement: FAILED: SemanticException [Error 10043]: Either list of columns or a custom serializer should be specified) 
        
        spena Sergio Peña added a comment -

        This does not currently work in Hive. The feature is still under review, and there are opinions on HIVE-10593 about using different SQL syntax for it.


          People

          • Assignee:
            singhashish Ashish Singh
            Reporter:
            tispratik Pratik Khadloya
          • Votes:
            5
            Watchers:
            10
