Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      This command fails with the following error:

      hive/bin/hive -e "INSERT OVERWRITE LOCAL DIRECTORY 'abc' select null from zshao_tt"
      FAILED: Error in semantic analysis:
      java.lang.RuntimeException: Internal error: Cannot find ObjectInspector for VOID
      

      When 'null' is replaced by '' it works.

      1. 734.patch
        4 kB
        Ning Zhang

        Issue Links

          Activity

          Hide
          Ashish Thusoo added a comment -

          select null from null from zshao_tt ?? or is it select null from zshao_tt.

          Show
          Ashish Thusoo added a comment - select null from null from zshao_tt ?? or is it select null from zshao_tt.
          Hide
          Zheng Shao added a comment -

          corrected. it's "select null from zshao_tt"

          Show
          Zheng Shao added a comment - corrected. it's "select null from zshao_tt"
          Hide
          Ning Zhang added a comment -

          Thanks guys! I figured that out too. Debugging now.

          Show
          Ning Zhang added a comment - Thanks guys! I figured that out too. Debugging now.
          Hide
          Ning Zhang added a comment -

          The error is threw during semantic analysis where the parser try to instantiate an ObjectInspector for type "void" (corresponding to null).

          The following similar query succeeded:

          select 2 from zshao_tt;

          since the semantic analyzer is able to deduct type (int) from the constant 2.

          Since null can also be treated as a constant, I think we need to write an ObjectInspector for its type (void) too. And the object inspector will always return null for each row.

          Show
          Ning Zhang added a comment - The error is threw during semantic analysis where the parser try to instantiate an ObjectInspector for type "void" (corresponding to null). The following similar query succeeded: select 2 from zshao_tt; since the semantic analyzer is able to deduct type (int) from the constant 2. Since null can also be treated as a constant, I think we need to write an ObjectInspector for its type (void) too. And the object inspector will always return null for each row.
          Hide
          Zheng Shao added a comment -

          +1 on the idea. We have WritableVoidObjectInspector, JavaVoidObjectInspector etc but none are actually implemented or used.
          Can we use JavaVoidObjectInspector for null constant? We probably need to modify ObjectInspectorConverters to allow conversion from void type to any type as well.

          Show
          Zheng Shao added a comment - +1 on the idea. We have WritableVoidObjectInspector, JavaVoidObjectInspector etc but none are actually implemented or used. Can we use JavaVoidObjectInspector for null constant? We probably need to modify ObjectInspectorConverters to allow conversion from void type to any type as well.
          Hide
          Ning Zhang added a comment -

          Thanks for the pointer Zheng. We certainly can use JavaVoidObjecInspector for NULL. However it turns out NULL is more complicated than what it sounds like (see http://en.wikipedia.org/wiki/Null_(SQL) ). NULL has to be handled not only in this simple select cause, but also in much more complex cases in the where-clause and joins. Let's have a talk.

          Show
          Ning Zhang added a comment - Thanks for the pointer Zheng. We certainly can use JavaVoidObjecInspector for NULL. However it turns out NULL is more complicated than what it sounds like (see http://en.wikipedia.org/wiki/Null_(SQL ) ). NULL has to be handled not only in this simple select cause, but also in much more complex cases in the where-clause and joins. Let's have a talk.
          Hide
          Zheng Shao added a comment -

          We do handle NULL in the WHERE clause and "AND", "OR" functions using the 3-value booleans. I am not sure whether we ignore NULL in join/group-by but that seems easy enough to do.
          Do you have a specific example that the current Hive code breaks?

          Show
          Zheng Shao added a comment - We do handle NULL in the WHERE clause and "AND", "OR" functions using the 3-value booleans. I am not sure whether we ignore NULL in join/group-by but that seems easy enough to do. Do you have a specific example that the current Hive code breaks?
          Hide
          Ning Zhang added a comment -

          I wasn't able to get an example in HIVE. I tried to insert NULL to a table by modifying the test query input4_cb_delim.q (actually by only modifying the input data to be

          <value1>\002\012
          \002<value2>\012

          and run the query

          select * from input4_cb where key is not null;

          and both rows are returned. So either the NULL values are not inserted correctly, or the evaluation is not correct about NULL.

          Show
          Ning Zhang added a comment - I wasn't able to get an example in HIVE. I tried to insert NULL to a table by modifying the test query input4_cb_delim.q (actually by only modifying the input data to be <value1>\002\012 \002<value2>\012 and run the query select * from input4_cb where key is not null; and both rows are returned. So either the NULL values are not inserted correctly, or the evaluation is not correct about NULL.
          Hide
          Zheng Shao added a comment -

          With LazySimpleSerDe (the default), we represent NULL by two bytes: "\" and "N". This is in sync with MySQL export format.
          Can you try again with that?

          Show
          Zheng Shao added a comment - With LazySimpleSerDe (the default), we represent NULL by two bytes: "\" and "N". This is in sync with MySQL export format. Can you try again with that?
          Hide
          Ning Zhang added a comment -

          That's great! It works (and my test queries on NULL also works too). In this case I guess it is much simpler since what we need here is just the VOID type ObjectInspector. I'll working on it.

          Show
          Ning Zhang added a comment - That's great! It works (and my test queries on NULL also works too). In this case I guess it is much simpler since what we need here is just the VOID type ObjectInspector. I'll working on it.
          Hide
          Ning Zhang added a comment -

          BTW, I tried joins and it doesn't handle NULL semantics correctly. Here's the data and results:

          Table:

          Key Value
          ------- -------
          NULL 325
          18 NULL

          Query:

          select * from input4_cb a join input4_cb b on a.key = b.value;

          Result:

          NULL 325 18 NULL

          The correct result should be empty set.

          I guess the NULL vs. NULL comparison is incorrectly evaluated.

          Show
          Ning Zhang added a comment - BTW, I tried joins and it doesn't handle NULL semantics correctly. Here's the data and results: Table: Key Value ------- ------- NULL 325 18 NULL Query: select * from input4_cb a join input4_cb b on a.key = b.value; Result: NULL 325 18 NULL The correct result should be empty set. I guess the NULL vs. NULL comparison is incorrectly evaluated.
          Hide
          Ning Zhang added a comment -

          added the JavaVoidObjectInspector support for LazySimpleSerDe. Will add the same support to other serdes.

          Show
          Ning Zhang added a comment - added the JavaVoidObjectInspector support for LazySimpleSerDe. Will add the same support to other serdes.
          Hide
          Zheng Shao added a comment -

          Good finding! Let's fix the join bug on a separate JIRA. Can you open a jira on that?
          I am not sure about why we need to add JavaVoidObjectInspector support to LazySimpleSerDe - data from SerDe already have types (from columns). We only need to handle NULL that user specified in the query. Thoughts?

          Show
          Zheng Shao added a comment - Good finding! Let's fix the join bug on a separate JIRA. Can you open a jira on that? I am not sure about why we need to add JavaVoidObjectInspector support to LazySimpleSerDe - data from SerDe already have types (from columns). We only need to handle NULL that user specified in the query. Thoughts?
          Hide
          Ning Zhang added a comment -

          In the failing query, there is no column corresponding to null (select null from tt), so the semantic analyzer doesn't know the type of the NULL, therefore it is VOID type. That's why JavaVoidObjectInspector needs to be created. It is the similar that in (select 2 from tt), An JavaIntegerObjectInspector is needed for the constant 2.

          Show
          Ning Zhang added a comment - In the failing query, there is no column corresponding to null (select null from tt), so the semantic analyzer doesn't know the type of the NULL, therefore it is VOID type. That's why JavaVoidObjectInspector needs to be created. It is the similar that in (select 2 from tt), An JavaIntegerObjectInspector is needed for the constant 2.
          Hide
          Zheng Shao added a comment -

          Discussed with Ning offline. SemanticAnalyzer.genFileSinkPlan() will be the place for the change - in case the target is a local/remote directory (instead of a table), when we create table column names and types, we should use string type if the input is a void type. Later in the same function, we are using genConversionSelectOperator() to convert the column types.

          So ODBC/JDBC will see this column as string type. Raghu confirmed that MySQL is creating a table with a column typed "BINARY(0)" when running "CREATE TABLE AS SELECT NULL, ..."

          Show
          Zheng Shao added a comment - Discussed with Ning offline. SemanticAnalyzer.genFileSinkPlan() will be the place for the change - in case the target is a local/remote directory (instead of a table), when we create table column names and types, we should use string type if the input is a void type. Later in the same function, we are using genConversionSelectOperator() to convert the column types. So ODBC/JDBC will see this column as string type. Raghu confirmed that MySQL is creating a table with a column typed "BINARY(0)" when running "CREATE TABLE AS SELECT NULL, ..."
          Hide
          Namit Jain added a comment -

          hive> select * from (select NULL from zshao_tt union all select 1 from zshao_tt)x;
          select * from (select NULL from zshao_tt union all select 1 from zshao_tt)x;
          FAILED: Error in semantic analysis: Schema of both sides of union should match: Column _c0 is of type void on first table and type int on second table

          Do you want to handle this also in this patch, or file a new jira for above ?

          Show
          Namit Jain added a comment - hive> select * from (select NULL from zshao_tt union all select 1 from zshao_tt)x; select * from (select NULL from zshao_tt union all select 1 from zshao_tt)x; FAILED: Error in semantic analysis: Schema of both sides of union should match: Column _c0 is of type void on first table and type int on second table Do you want to handle this also in this patch, or file a new jira for above ?
          Hide
          Ning Zhang added a comment -

          Namit, thanks for the comments! I have started a new JIRA HIVE-747 for that. I'll look at this one soon.

          Show
          Ning Zhang added a comment - Namit, thanks for the comments! I have started a new JIRA HIVE-747 for that. I'll look at this one soon.
          Hide
          Ning Zhang added a comment -

          fix attached.

          Show
          Ning Zhang added a comment - fix attached.
          Hide
          Namit Jain added a comment -

          +1

          looks good - will commit if the tests pass

          Show
          Namit Jain added a comment - +1 looks good - will commit if the tests pass
          Hide
          Namit Jain added a comment -

          Committed. Thanks Ning

          Show
          Namit Jain added a comment - Committed. Thanks Ning
          Hide
          Ning Zhang added a comment -

          Cool!

          Thank you very much!

          Ning

          Show
          Ning Zhang added a comment - Cool! Thank you very much! Ning

            People

            • Assignee:
              Ning Zhang
              Reporter:
              Zheng Shao
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development