Hive
  1. Hive
  2. HIVE-3454

Problem with CAST(BIGINT as TIMESTAMP)

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1
    • Fix Version/s: 1.2.0
    • Component/s: Types, UDF
    • Labels:
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      Hide
      The behaviors of converting from BOOLEAN/TINYINT/SMALLINT/INT/BIGINT and converting from FLOAT/DOUBLE to TIMESTAMP were inconsistent. The value of a BOOLEAN/TINYINT/SMALLINT/INT/BIGINT was treated as the time in milliseconds while the value of a FLOAT/DOUBLE was treated as the time in seconds. After the change, all the types during the conversion are interpreted in seconds.
      Show
      The behaviors of converting from BOOLEAN/TINYINT/SMALLINT/INT/BIGINT and converting from FLOAT/DOUBLE to TIMESTAMP were inconsistent. The value of a BOOLEAN/TINYINT/SMALLINT/INT/BIGINT was treated as the time in milliseconds while the value of a FLOAT/DOUBLE was treated as the time in seconds. After the change, all the types during the conversion are interpreted in seconds.
    • Tags:
      timestamp

      Description

      Ran into an issue while working with timestamp conversion.
      CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp()

      Instead, however, a 1970-01-16 timestamp is returned.

      1. HIVE-3454.1.patch.txt
        0.5 kB
        Paul Bergeron
      2. HIVE-3454.3.patch
        302 kB
        Aihua Xu
      3. HIVE-3454.patch
        0.9 kB
        Arijit Banerjee

        Issue Links

          Activity

          Hide
          Richard Nadeau added a comment -

          Additional information/workaround:

          This looks like a nanosecond bug similar to HIVE-3090, so I tried this:

          cast((unix_timestamp() * 1.0) as TIMESTAMP)

          and got this:

          2012-09-17 14:49:39.0

          Which is what it should be, but I would have expected this to be handled without the "conversion".

          cast(round(unix_timestamp(),3) as TIMESTAMP)

          Will also work.

          Show
          Richard Nadeau added a comment - Additional information/workaround: This looks like a nanosecond bug similar to HIVE-3090 , so I tried this: cast((unix_timestamp() * 1.0) as TIMESTAMP) and got this: 2012-09-17 14:49:39.0 Which is what it should be, but I would have expected this to be handled without the "conversion". cast(round(unix_timestamp(),3) as TIMESTAMP) Will also work.
          Hide
          Navis added a comment -

          unix_timestamp() returns seconds but Timestamp expects millisecond value if input type is integer or long. I think you should multiply the value with 1000 for proper converting.

          Show
          Navis added a comment - unix_timestamp() returns seconds but Timestamp expects millisecond value if input type is integer or long. I think you should multiply the value with 1000 for proper converting.
          Hide
          Ryan Harris added a comment -

          These are both valid options to work around the bug. However from the Hive wiki:

          https://cwiki.apache.org/Hive/languagemanual-types.html#LanguageManualTypes-Timestamps
          "Supports traditional UNIX timestamp with optional nanosecond precision.

          Supported conversions:

          Integer numeric types: Interpreted as UNIX timestamp in seconds"

          The documentation is stating that TIMESTAMP is supposed to accept integer numeric types. Nanosecond is optional.

          I assume that the documentation in the wiki is the intended behavior. It isn't working as described in the wiki, that is why I have marked this as a bug.

          Show
          Ryan Harris added a comment - These are both valid options to work around the bug. However from the Hive wiki: https://cwiki.apache.org/Hive/languagemanual-types.html#LanguageManualTypes-Timestamps "Supports traditional UNIX timestamp with optional nanosecond precision. Supported conversions: Integer numeric types: Interpreted as UNIX timestamp in seconds" The documentation is stating that TIMESTAMP is supposed to accept integer numeric types. Nanosecond is optional. I assume that the documentation in the wiki is the intended behavior. It isn't working as described in the wiki, that is why I have marked this as a bug.
          Hide
          Navis added a comment -

          You are right. It's not consistent with description in wiki. This should be fixed ASAP but backward compatibility also be considered. I don't know how to resolve.

          Show
          Navis added a comment - You are right. It's not consistent with description in wiki. This should be fixed ASAP but backward compatibility also be considered. I don't know how to resolve.
          Hide
          Ryan Harris added a comment -

          After commenting on HIVE-3822 I've realized that even with this quasi-workaround, the cast(int/float/double as timestamp) is fundamentally broken.
          When an epoch (assumed GMT) timestamp is passed through the cast() function it is converted to a different time based on the local timezone...this doesn't happen if the timestamp is cast from a formatted date string....this behavior is inconsistent.

          Furthermore, when attempting to use the to_utc_timestamp() function with a epoch date value the implicit cast() poisons the result of the timestamp that is stored by to_utc_timestamp()

          Something is definitely wrong.

          Show
          Ryan Harris added a comment - After commenting on HIVE-3822 I've realized that even with this quasi-workaround, the cast(int/float/double as timestamp) is fundamentally broken. When an epoch (assumed GMT) timestamp is passed through the cast() function it is converted to a different time based on the local timezone...this doesn't happen if the timestamp is cast from a formatted date string....this behavior is inconsistent. Furthermore, when attempting to use the to_utc_timestamp() function with a epoch date value the implicit cast() poisons the result of the timestamp that is stored by to_utc_timestamp() Something is definitely wrong.
          Hide
          Paul Bergeron added a comment -

          Using java.sql.Date instead of java.util.Date because java.sql.Date implements a timezone agnostic method of representing timestamps (always UTC, which is what a unix epoch timestamp is defined as)

          Show
          Paul Bergeron added a comment - Using java.sql.Date instead of java.util.Date because java.sql.Date implements a timezone agnostic method of representing timestamps (always UTC, which is what a unix epoch timestamp is defined as)
          Hide
          Arijit Banerjee added a comment -

          I think a good approach would be to fix it in a way that it can work with both double and long values. This is what I did and works for me. Let me know what you think.

          In org.apache.hadoop.hive.serde2.objectinspector.primitive PrimitiveObjectInspectorUtils.java edit getTimestamp method and convert the long object to double.

          getTimestamp(Object o, PrimitiveObjectInspector oi)

          case LONG:
          //Timestamp conversion from LONG is messy. Converting to double.
          long tsLongPrimitive=((LongObjectInspector) oi).get(o);
          result =TimestampWritable.doubleToTimestamp(tsLongPrimitive*1.0);
          break;

          Show
          Arijit Banerjee added a comment - I think a good approach would be to fix it in a way that it can work with both double and long values. This is what I did and works for me. Let me know what you think. In org.apache.hadoop.hive.serde2.objectinspector.primitive PrimitiveObjectInspectorUtils.java edit getTimestamp method and convert the long object to double. getTimestamp(Object o, PrimitiveObjectInspector oi) case LONG: //Timestamp conversion from LONG is messy. Converting to double. long tsLongPrimitive=((LongObjectInspector) oi).get(o); result =TimestampWritable.doubleToTimestamp(tsLongPrimitive*1.0); break;
          Hide
          Arijit Banerjee added a comment -

          Adding a patch for fixing the issue. Apply to trunk's serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java.

          Show
          Arijit Banerjee added a comment - Adding a patch for fixing the issue. Apply to trunk's serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java.
          Hide
          Arijit Banerjee added a comment -

          Query : select cast(unix_timestamp()*1.0 as timestamp) , cast(unix_timestamp() as timestamp) from mycrime limit 1;

          Output(After patching with HIVE-3454.patch) - both returns same and correct timestamps.

          2013-06-18 11:45:04 2013-06-18 11:45:04

          Show
          Arijit Banerjee added a comment - Query : select cast(unix_timestamp()*1.0 as timestamp) , cast(unix_timestamp() as timestamp) from mycrime limit 1; Output(After patching with HIVE-3454 .patch) - both returns same and correct timestamps. 2013-06-18 11:45:04 2013-06-18 11:45:04
          Hide
          Kostiantyn Kudriavtsev added a comment -

          Hi there, when is this patch going to be applied to trunk? it seems it's enough important issue to be included

          Show
          Kostiantyn Kudriavtsev added a comment - Hi there, when is this patch going to be applied to trunk? it seems it's enough important issue to be included
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12588389/HIVE-3454.patch

          SUCCESS: +1 4789 tests passed

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/671/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/671/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12588389

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12588389/HIVE-3454.patch SUCCESS: +1 4789 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/671/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/671/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12588389
          Hide
          Arijit Banerjee added a comment -

          Could this patch be checked in?

          Show
          Arijit Banerjee added a comment - Could this patch be checked in?
          Hide
          David P Moore added a comment -

          I see that this is still an issue. I found this today when trying to convert from integer to timestamp. In order to get this to work I had to add milliseconds (3 digits after the unix epoch) and also use a bigint intstead of int.

          I am using version 0.12. Any idea when this will be put in the main trunk?

          Show
          David P Moore added a comment - I see that this is still an issue. I found this today when trying to convert from integer to timestamp. In order to get this to work I had to add milliseconds (3 digits after the unix epoch) and also use a bigint intstead of int. I am using version 0.12. Any idea when this will be put in the main trunk?
          Hide
          Venkata Ramana G added a comment -

          I am using 0.12 release still this problems exists. Inconsistancy in casting.

          select cast(cast(cast(1000 as INT) as TIMESTAMP) as INT) from src limit 1; result: 1
          select cast(cast(cast(1000 as BIGINT) as TIMESTAMP) as BIGINT) from src limit 1; result: 1
          select cast(cast(cast(1000.0 as DOUBLE) as TIMESTAMP) as DOUBLE) from src limit 1; result: 1000.0

          converting int/bigint/tinyint to timestamp and back to int/bigint/tinyint changes the value.
          int/bigint/tinyint/boolean conversion to timestamp, should take input in seconds, just like double.

          Issue raised in sep/2012, still problem persists. This requires a fix.

          Show
          Venkata Ramana G added a comment - I am using 0.12 release still this problems exists. Inconsistancy in casting. select cast(cast(cast(1000 as INT) as TIMESTAMP) as INT) from src limit 1; result: 1 select cast(cast(cast(1000 as BIGINT) as TIMESTAMP) as BIGINT) from src limit 1; result: 1 select cast(cast(cast(1000.0 as DOUBLE) as TIMESTAMP) as DOUBLE) from src limit 1; result: 1000.0 converting int/bigint/tinyint to timestamp and back to int/bigint/tinyint changes the value. int/bigint/tinyint/boolean conversion to timestamp, should take input in seconds, just like double. Issue raised in sep/2012, still problem persists. This requires a fix.
          Hide
          Aihua Xu added a comment -

          Update the primitive types including boolean, byte, short ,int, long to be consistently represent the time in seconds when we convert them into timestamp.

          Show
          Aihua Xu added a comment - Update the primitive types including boolean, byte, short ,int, long to be consistently represent the time in seconds when we convert them into timestamp.
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12697289/HIVE-3454.2.patch

          ERROR: -1 due to 6 failed/errored test(s), 7526 tests executed
          Failed tests:

          TestCustomAuthentication - did not produce a TEST-*.xml file
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_decimal_date
          org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_timestamp
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_decimal_date
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_casts
          org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_decimal_date
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2705/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2705/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2705/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 6 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12697289 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12697289/HIVE-3454.2.patch ERROR: -1 due to 6 failed/errored test(s), 7526 tests executed Failed tests: TestCustomAuthentication - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_decimal_date org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_timestamp org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_decimal_date org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_casts org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_decimal_date Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2705/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2705/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2705/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed This message is automatically generated. ATTACHMENT ID: 12697289 - PreCommit-HIVE-TRUNK-Build
          Hide
          Aihua Xu added a comment -

          Unit tests baseline updated.

          Show
          Aihua Xu added a comment - Unit tests baseline updated.
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12697536/HIVE-3454.3.patch

          SUCCESS: +1 7541 tests passed

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2732/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2732/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2732/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12697536 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12697536/HIVE-3454.3.patch SUCCESS: +1 7541 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2732/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2732/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2732/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12697536 - PreCommit-HIVE-TRUNK-Build
          Hide
          Aihua Xu added a comment -

          +Brock Noland Can you take a look at the code?

          Show
          Aihua Xu added a comment - + Brock Noland Can you take a look at the code?
          Hide
          Brock Noland added a comment -

          Seems reasonable to me! Sergio Peña could you look as you have more experience here than I do.

          Show
          Brock Noland added a comment - Seems reasonable to me! Sergio Peña could you look as you have more experience here than I do.
          Hide
          Jason Dere added a comment -

          We may need to mark this as an incompatible change - folks have been using this behavior (inconsistent as it is) for a while now.
          Also, HIVE-9298 added TimestampParser capable of interpreting numeric text input as milliseconds since Unix epoch. Should we change this to seconds to make it consistent with the changes done here?

          Show
          Jason Dere added a comment - We may need to mark this as an incompatible change - folks have been using this behavior (inconsistent as it is) for a while now. Also, HIVE-9298 added TimestampParser capable of interpreting numeric text input as milliseconds since Unix epoch. Should we change this to seconds to make it consistent with the changes done here?
          Hide
          Aihua Xu added a comment -

          Jason Dere Thanks for pointing that out. I marked it as incompatible change (of course technically it's not incompatible).
          Regarding the TimestampParser, since MillisDateFormatParser is just one of the parser to support, I think it should be good without any change. Probably we can support SecondsDateFormatParser in the future if it's necessary.

          Show
          Aihua Xu added a comment - Jason Dere Thanks for pointing that out. I marked it as incompatible change (of course technically it's not incompatible). Regarding the TimestampParser, since MillisDateFormatParser is just one of the parser to support, I think it should be good without any change. Probably we can support SecondsDateFormatParser in the future if it's necessary.
          Hide
          Brock Noland added a comment -

          I hate new configuration...but if people are depending on this should we make the behavior change configurable?

          Show
          Brock Noland added a comment - I hate new configuration...but if people are depending on this should we make the behavior change configurable?
          Hide
          Aihua Xu added a comment -

          Me either and I feel a new configuration may make things more complicated. From the comments above, actually the users are more using workarounds (converting int/bigint to float/double first before interpreting it as timestamp). My opinion is not to add a new configuration and keep things simple.

          Show
          Aihua Xu added a comment - Me either and I feel a new configuration may make things more complicated. From the comments above, actually the users are more using workarounds (converting int/bigint to float/double first before interpreting it as timestamp). My opinion is not to add a new configuration and keep things simple.
          Hide
          Aihua Xu added a comment -

          Me either and I feel a new configuration may make things more complicated. From the comments above, actually the users are more using workarounds (converting int/bigint to float/double first before interpreting it as timestamp). My opinion is not to add a new configuration and keep things simple.

          Show
          Aihua Xu added a comment - Me either and I feel a new configuration may make things more complicated. From the comments above, actually the users are more using workarounds (converting int/bigint to float/double first before interpreting it as timestamp). My opinion is not to add a new configuration and keep things simple.
          Hide
          Sergio Peña added a comment -

          It looks very good.

          I agree with Aihua Xu, if users were using workarounds, then they shouldn't have any problem with this change, as float/double would work fine.The only thing is what if users are using integer values to get the timestamp? If they know that they need to pass milliseconds, they they could have these values in their queries; and now this change would give them unexpected values. However, the wiki states that seconds is the allowed number, so these users are using it incorrectly.

          I agree with this change so that it fixes the bug, and complies with the wiki page.

          +1

          Show
          Sergio Peña added a comment - It looks very good. I agree with Aihua Xu , if users were using workarounds, then they shouldn't have any problem with this change, as float/double would work fine.The only thing is what if users are using integer values to get the timestamp? If they know that they need to pass milliseconds, they they could have these values in their queries; and now this change would give them unexpected values. However, the wiki states that seconds is the allowed number, so these users are using it incorrectly. I agree with this change so that it fixes the bug, and complies with the wiki page. +1
          Hide
          Aihua Xu added a comment -

          Discussed with Brock offline, seems more reasonable to make it configurable so that we won't break existing customers. Set "int.timestamp.conversion.in.seconds" to true in hive-site.xml to make the behavior consistent. We will gradually switch to the seconds interpretation.

          Show
          Aihua Xu added a comment - Discussed with Brock offline, seems more reasonable to make it configurable so that we won't break existing customers. Set "int.timestamp.conversion.in.seconds" to true in hive-site.xml to make the behavior consistent. We will gradually switch to the seconds interpretation.
          Hide
          Jason Dere added a comment -

          Can you make the naming of the HiveConf parameter more consistent with the rest of the HiveConf variable names (starts with "hive.").

          Show
          Jason Dere added a comment - Can you make the naming of the HiveConf parameter more consistent with the rest of the HiveConf variable names (starts with "hive.").
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12699343/HIVE-3454.3.patch

          SUCCESS: +1 7553 tests passed

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2817/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2817/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2817/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12699343 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12699343/HIVE-3454.3.patch SUCCESS: +1 7553 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2817/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2817/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2817/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12699343 - PreCommit-HIVE-TRUNK-Build
          Hide
          Aihua Xu added a comment -

          Thanks Jason Dere for reviewing. Just updated the parameter name.

          Show
          Aihua Xu added a comment - Thanks Jason Dere for reviewing. Just updated the parameter name.
          Hide
          Aihua Xu added a comment -

          Thanks Jason Dere for reviewing. Just updated the parameter name.

          Show
          Aihua Xu added a comment - Thanks Jason Dere for reviewing. Just updated the parameter name.
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12699474/HIVE-3454.3.patch

          ERROR: -1 due to 1 failed/errored test(s), 7557 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2820/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2820/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2820/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 1 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12699474 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12699474/HIVE-3454.3.patch ERROR: -1 due to 1 failed/errored test(s), 7557 tests executed Failed tests: org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2820/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2820/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2820/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12699474 - PreCommit-HIVE-TRUNK-Build
          Hide
          Aihua Xu added a comment -

          The test failure is unrelated to the change.

          Show
          Aihua Xu added a comment - The test failure is unrelated to the change.
          Hide
          Brock Noland added a comment - - edited

          Have we tested this as part of an MR job? I don't think that the hive-site.xml is shipped as part of MR jobs. If that is true, how about we do as follows:

          1) Add method public static void initialize(Configuration) to TimestampWritable
          2) Call this method from AbstractSerDe.initialize which should be called, with configuration, in all the right places.
          3) In TimestampWritable.initialize you can use the static HiveConf.getBoolVar

          a bit kludgy but it should work. This all assuming the current impl doesn't work.

          "timestamp conversion."

          I think we need a space after this.

          Show
          Brock Noland added a comment - - edited Have we tested this as part of an MR job? I don't think that the hive-site.xml is shipped as part of MR jobs. If that is true, how about we do as follows: 1) Add method public static void initialize(Configuration) to TimestampWritable 2) Call this method from AbstractSerDe.initialize which should be called, with configuration, in all the right places. 3) In TimestampWritable.initialize you can use the static HiveConf.getBoolVar a bit kludgy but it should work. This all assuming the current impl doesn't work. "timestamp conversion." I think we need a space after this.
          Hide
          Aihua Xu added a comment -

          Yeah. I have tested with an MR job and it picks up the hive-site.xml without the problem with hiveserver2 or CLI.

          Show
          Aihua Xu added a comment - Yeah. I have tested with an MR job and it picks up the hive-site.xml without the problem with hiveserver2 or CLI.
          Hide
          Jason Dere added a comment -

          If this config setting is initialized once in a static block, then for hiveserver2 all subsequent sessions would be stuck with the initial setting regardless of the config settings of the session, right? During the MR jobs, would we then see the sec/msec behavior flip to use the session's config settings since the static variable is being initialized for the first time in MR task?

          Show
          Jason Dere added a comment - If this config setting is initialized once in a static block, then for hiveserver2 all subsequent sessions would be stuck with the initial setting regardless of the config settings of the session, right? During the MR jobs, would we then see the sec/msec behavior flip to use the session's config settings since the static variable is being initialized for the first time in MR task?
          Hide
          Aihua Xu added a comment -

          Yeah. Thanks for pointing that out. I also notice that problem. We won't be able to override during the session with that approach and also we could have problem if we run in the real cluster as Brock pointed out. I'm trying a different approach.

          Show
          Aihua Xu added a comment - Yeah. Thanks for pointing that out. I also notice that problem. We won't be able to override during the session with that approach and also we could have problem if we run in the real cluster as Brock pointed out. I'm trying a different approach.
          Hide
          Aihua Xu added a comment -

          Read the configuration from passed in Configuration object from the initialize() in AbstractSerDe class. Since AvroSerDe overrides the function and OrcSerDe doesn't inherit from AbstractSerDe right now, added in these two classes as well. It will support the session override.

          Show
          Aihua Xu added a comment - Read the configuration from passed in Configuration object from the initialize() in AbstractSerDe class. Since AvroSerDe overrides the function and OrcSerDe doesn't inherit from AbstractSerDe right now, added in these two classes as well. It will support the session override.
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12699952/HIVE-3454.4.patch

          ERROR: -1 due to 1 failed/errored test(s), 7568 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2836/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2836/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2836/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 1 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12699952 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12699952/HIVE-3454.4.patch ERROR: -1 due to 1 failed/errored test(s), 7568 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2836/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2836/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2836/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12699952 - PreCommit-HIVE-TRUNK-Build
          Hide
          Brock Noland added a comment -

          +1 LGTM

          best we can do in this situation.

          Show
          Brock Noland added a comment - +1 LGTM best we can do in this situation.
          Hide
          Jason Dere added a comment -

          TimestampWritable.intToTimeStampInSeconds is still static, which means we could run into issues in HiveServer2 with concurrent queries. Maybe this should be thread-local.
          Hmm, yeah I'm not sure where the best place is to call TimestampWritable.initialize() .. I was going to recommend Driver.compile(), though we would also have to add it to MR task initialization somewhere if there is such a place.

          Show
          Jason Dere added a comment - TimestampWritable.intToTimeStampInSeconds is still static, which means we could run into issues in HiveServer2 with concurrent queries. Maybe this should be thread-local. Hmm, yeah I'm not sure where the best place is to call TimestampWritable.initialize() .. I was going to recommend Driver.compile(), though we would also have to add it to MR task initialization somewhere if there is such a place.
          Hide
          Aihua Xu added a comment -

          Jason Dere I spent some time looking into the parallel in Hive. Hive can handle queries in parallel or a complex query breaking into subtasks executed in parallel. In both cases, Hive hands over the multiple jobs to hadoop and those jobs won't affect each other. I have simulated the cases by debugging them from different sessions and it works as expected.

          Show
          Aihua Xu added a comment - Jason Dere I spent some time looking into the parallel in Hive. Hive can handle queries in parallel or a complex query breaking into subtasks executed in parallel. In both cases, Hive hands over the multiple jobs to hadoop and those jobs won't affect each other. I have simulated the cases by debugging them from different sessions and it works as expected.
          Hide
          Aihua Xu added a comment -

          The unit test failure is unrelated to the change. Jason Dere More comments on my input above?

          Show
          Aihua Xu added a comment - The unit test failure is unrelated to the change. Jason Dere More comments on my input above?
          Hide
          Jason Dere added a comment -

          I guess the only possible issue here might be for expressions that get evaluated during compilation as opposed to during the tasks, something like cast(0 as timestamp) which would get evaluated by constant folding. Not sure if there is any other possible work/tasks that happen locally in-process.

          Show
          Jason Dere added a comment - I guess the only possible issue here might be for expressions that get evaluated during compilation as opposed to during the tasks, something like cast(0 as timestamp) which would get evaluated by constant folding. Not sure if there is any other possible work/tasks that happen locally in-process.
          Hide
          Aihua Xu added a comment -

          If the job runs locally in-process, like cast(0 as timestamp), SerDe is still used for intermediate output and thus the property still gets set and works fine.

          Show
          Aihua Xu added a comment - If the job runs locally in-process, like cast(0 as timestamp), SerDe is still used for intermediate output and thus the property still gets set and works fine.
          Hide
          Jason Dere added a comment -

          If concurrent compilation is possible, then there is still the possibility of multiple threads setting/accessing the same static variable and messing up the config setting for the other threads. You're adding this to TimestampWritable - there is even a member in that class that was changed to thread-local as a result of HIVE-4516.

          Show
          Jason Dere added a comment - If concurrent compilation is possible, then there is still the possibility of multiple threads setting/accessing the same static variable and messing up the config setting for the other threads. You're adding this to TimestampWritable - there is even a member in that class that was changed to thread-local as a result of HIVE-4516 .
          Hide
          Aihua Xu added a comment -

          I see your points that multiple threads run in single mapper or reducer access that class and that may cause concurrency issue. I will try to call the initialize() from Mapper and Reducer's initialization function to make sure it initializes once. And also Driver.compile() (I will check what would be the good place for the local mode).

          Show
          Aihua Xu added a comment - I see your points that multiple threads run in single mapper or reducer access that class and that may cause concurrency issue. I will try to call the initialize() from Mapper and Reducer's initialization function to make sure it initializes once. And also Driver.compile() (I will check what would be the good place for the local mode).
          Hide
          Aihua Xu added a comment -

          Updated to initialize for local mode in Driver.runInternal() and configure() of ExecMapper and ExecReducer.

          Show
          Aihua Xu added a comment - Updated to initialize for local mode in Driver.runInternal() and configure() of ExecMapper and ExecReducer.
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12701998/HIVE-3454.4.patch

          SUCCESS: +1 7580 tests passed

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2921/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2921/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2921/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12701998 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701998/HIVE-3454.4.patch SUCCESS: +1 7580 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2921/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2921/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2921/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12701998 - PreCommit-HIVE-TRUNK-Build
          Hide
          Aihua Xu added a comment -

          Jason Dere Can you review the latest patch? Right now it should be set once for each Mapper and Reducer.

          Show
          Aihua Xu added a comment - Jason Dere Can you review the latest patch? Right now it should be set once for each Mapper and Reducer.
          Hide
          Jason Dere added a comment -
          • Can you make intToTimeStampInSeconds thread-local?
          • According to Vikram Dixit K, Tez uses separate classes, the equivalent to ExecMapper/ExecReducer are MapRecordProcessor/ReduceRecordProcessor. It looks like Spark also has its own equivalents (SparkMapRecordHandler/SparkMapReduceHandler). If you want we can defer those changes and open a follow-up Jira to do that work.
          Show
          Jason Dere added a comment - Can you make intToTimeStampInSeconds thread-local? According to Vikram Dixit K , Tez uses separate classes, the equivalent to ExecMapper/ExecReducer are MapRecordProcessor/ReduceRecordProcessor. It looks like Spark also has its own equivalents (SparkMapRecordHandler/SparkMapReduceHandler). If you want we can defer those changes and open a follow-up Jira to do that work.
          Hide
          Aihua Xu added a comment -

          #Actually we don't want it to be thread-local since we want all threads, e.g, within Mapper to have the same settings. If we set to be thread-local, the threads within the same mapper which don't set intToTimeStampInSeconds will have incorrect behaviors. Of course, the original problem was that we can't access hive properties from TimestampWritable class and we have to set from outside.

          #I will create a follow-up jira to do the rest (Tez and Spark).

          Show
          Aihua Xu added a comment - #Actually we don't want it to be thread-local since we want all threads, e.g, within Mapper to have the same settings. If we set to be thread-local, the threads within the same mapper which don't set intToTimeStampInSeconds will have incorrect behaviors. Of course, the original problem was that we can't access hive properties from TimestampWritable class and we have to set from outside. #I will create a follow-up jira to do the rest (Tez and Spark).
          Hide
          Sergey Shelukhin added a comment -

          Please don't use the static intToTimeStampInSeconds, it is not thread-safe

          Show
          Sergey Shelukhin added a comment - Please don't use the static intToTimeStampInSeconds, it is not thread-safe
          Hide
          Jason Dere added a comment -

          Actually we don't want it to be thread-local since we want all threads, e.g, within Mapper to have the same settings. If we set to be thread-local, the threads within the same mapper which don't set intToTimeStampInSeconds will have incorrect behaviors. Of course, the original problem was that we can't access hive properties from TimestampWritable class and we have to set from outside.

          Asking around a bit about HIVE-7926, it sounds like LLAP will potentially have multiple queries running at the same time in different threads, which would make a single static variable problematic. It also sounds like each thread should in theory have the equivalent of something like ExecMapper.configure() where thread-local context is initialized, so I think that any threads running map/reduce work should have an opportunity to set intToTimeStampInSeconds. I would think that things would be similar for Spark as well, though someone with more familiarity with Hive-Spark execution would be better suited to answer that question.

          Show
          Jason Dere added a comment - Actually we don't want it to be thread-local since we want all threads, e.g, within Mapper to have the same settings. If we set to be thread-local, the threads within the same mapper which don't set intToTimeStampInSeconds will have incorrect behaviors. Of course, the original problem was that we can't access hive properties from TimestampWritable class and we have to set from outside. Asking around a bit about HIVE-7926 , it sounds like LLAP will potentially have multiple queries running at the same time in different threads, which would make a single static variable problematic. It also sounds like each thread should in theory have the equivalent of something like ExecMapper.configure() where thread-local context is initialized, so I think that any threads running map/reduce work should have an opportunity to set intToTimeStampInSeconds. I would think that things would be similar for Spark as well, though someone with more familiarity with Hive-Spark execution would be better suited to answer that question.
          Hide
          Brock Noland added a comment -

          HOS won't have this issue as the executors are tied to a user session. I think we should commit the patch as is and create a follow-up to address the issue with LLAP.

          Show
          Brock Noland added a comment - HOS won't have this issue as the executors are tied to a user session. I think we should commit the patch as is and create a follow-up to address the issue with LLAP.
          Hide
          Jason Dere added a comment -

          I'm willing to defer that work to a different Jira. +1

          But I'm surprised this would not be an issue with HOS as well .. can HOS have multiple threads doing work for more than one query at the same time? If so then HOS would get hit by this too since this setting is relying on a static variable.

          Show
          Jason Dere added a comment - I'm willing to defer that work to a different Jira. +1 But I'm surprised this would not be an issue with HOS as well .. can HOS have multiple threads doing work for more than one query at the same time? If so then HOS would get hit by this too since this setting is relying on a static variable.
          Hide
          Brock Noland added a comment -

          an HOS have multiple threads doing work for more than one query at the same time?

          no it cannot. HOS does do concurrent tasks but not across queries.

          Show
          Brock Noland added a comment - an HOS have multiple threads doing work for more than one query at the same time? no it cannot. HOS does do concurrent tasks but not across queries.
          Hide
          Aihua Xu added a comment -

          Was not aware of LLAP. Seems reasonable to defer the issue since I guess LLAP needs to handle many other similar thread-safe problems like this. Correct me if I'm wrong with the hadoop execution: currently (without LLAP) each ExecMapper and ExecReducer runs in its own JVM and it could spin off new threads. Each ExecMapper and ExecReducer handles one job from a query and we want those threads to share that same static variable value.

          With LLAP, if ExecMapper runs in a thread and doesn't spin off new threads, then thread-local would work, but we still need to handle current existing scenario. If new threads are spun off, thread-local actually won't work.

          Show
          Aihua Xu added a comment - Was not aware of LLAP. Seems reasonable to defer the issue since I guess LLAP needs to handle many other similar thread-safe problems like this. Correct me if I'm wrong with the hadoop execution: currently (without LLAP) each ExecMapper and ExecReducer runs in its own JVM and it could spin off new threads. Each ExecMapper and ExecReducer handles one job from a query and we want those threads to share that same static variable value. With LLAP, if ExecMapper runs in a thread and doesn't spin off new threads, then thread-local would work, but we still need to handle current existing scenario. If new threads are spun off, thread-local actually won't work.
          Hide
          Sergey Shelukhin added a comment -

          It just seems like a bad design pattern irrespective of LLAP. Gunther Hagleitner did some work in operator pipeline to clean up statics and such, he can comment. What is the real scope of this variable? It should be scoped accordingly. Even a singleton map would be better...

          Show
          Sergey Shelukhin added a comment - It just seems like a bad design pattern irrespective of LLAP. Gunther Hagleitner did some work in operator pipeline to clean up statics and such, he can comment. What is the real scope of this variable? It should be scoped accordingly. Even a singleton map would be better...
          Hide
          Gunther Hagleitner added a comment -

          I agree with Sergey Shelukhin. These statics aren't just a pain w/ multi threading, they also hurt if you just re-use the same jvm for multiple things. This can happen in tez, spark and if we decide to run stuff on the client/in hs2 afaik.

          More specifically though: This config should take effect at compile time. There is no reason to evaluate the condition for each value in each row at run time. We should be able to just install the appropriate udf when we compile the query, no?

          Show
          Gunther Hagleitner added a comment - I agree with Sergey Shelukhin . These statics aren't just a pain w/ multi threading, they also hurt if you just re-use the same jvm for multiple things. This can happen in tez, spark and if we decide to run stuff on the client/in hs2 afaik. More specifically though: This config should take effect at compile time. There is no reason to evaluate the condition for each value in each row at run time. We should be able to just install the appropriate udf when we compile the query, no?
          Hide
          Aihua Xu added a comment -

          I agree that for non thread-safe statics, it will cause problems for multi-threading, while for constants or immutable objects, I guess you would share across threads rather than creating each copy for each thread. This is the case of a java primitive that we initialize once and use/read later by threads.

          Regarding UDF approach, I looked into that and it seems promising, while the current approach actually only evaluates the value when the type is timestamp. I guess it would be the same as UDF approach.

          Show
          Aihua Xu added a comment - I agree that for non thread-safe statics, it will cause problems for multi-threading, while for constants or immutable objects, I guess you would share across threads rather than creating each copy for each thread. This is the case of a java primitive that we initialize once and use/read later by threads. Regarding UDF approach, I looked into that and it seems promising, while the current approach actually only evaluates the value when the type is timestamp. I guess it would be the same as UDF approach.
          Hide
          Aihua Xu added a comment -

          Of course, since we need to support LLAP, UDF approach seems to be the right approach.

          Show
          Aihua Xu added a comment - Of course, since we need to support LLAP, UDF approach seems to be the right approach.
          Hide
          Aihua Xu added a comment -

          I'm looking into the solution Gunther Hagleitner suggested. I will break the tasks into separate tasks, so that this jira will focus on the right/simple behavior and a followup to focus on making it configurable so that we can easily revert in the future and also focus on that only.

          Show
          Aihua Xu added a comment - I'm looking into the solution Gunther Hagleitner suggested. I will break the tasks into separate tasks, so that this jira will focus on the right/simple behavior and a followup to focus on making it configurable so that we can easily revert in the future and also focus on that only.
          Hide
          Aihua Xu added a comment -

          Brock Noland, Jason Dere and Gunther Hagleitner I created HIVE-9917 for the followup change. I will mark the current change as incompatible. How does that sound?

          Show
          Aihua Xu added a comment - Brock Noland , Jason Dere and Gunther Hagleitner I created HIVE-9917 for the followup change. I will mark the current change as incompatible. How does that sound?
          Hide
          Chao Sun added a comment -

          +1. Breaking into separate tasks makes sense to me, but it's better to have others' inputs (I'm no expert on this matter), before we commit this.

          Show
          Chao Sun added a comment - +1. Breaking into separate tasks makes sense to me, but it's better to have others' inputs (I'm no expert on this matter), before we commit this.
          Hide
          Jason Dere added a comment -

          That's fine to break into separate tasks, we'll just need to make sure the followup config task gets in before the next release.

          Show
          Jason Dere added a comment - That's fine to break into separate tasks, we'll just need to make sure the followup config task gets in before the next release.
          Hide
          Jason Dere added a comment -

          This has been marked patch available - which one should we be looking at - HIVE-3454.3.patch?

          Show
          Jason Dere added a comment - This has been marked patch available - which one should we be looking at - HIVE-3454 .3.patch?
          Hide
          Aihua Xu added a comment -

          Yes. That's right. That patch is to change in the correct way to interpreting all the datatypes as seconds. Thanks for looking.

          Show
          Aihua Xu added a comment - Yes. That's right. That patch is to change in the correct way to interpreting all the datatypes as seconds. Thanks for looking.
          Hide
          Jason Dere added a comment -

          I've just started looking at vectorized code .. looks like the corresponding change to make for the vectorized path will be in MathExpr.doubleToTimestamp(). CC'ing Matt McCline in case there is any more details to add here.

          Show
          Jason Dere added a comment - I've just started looking at vectorized code .. looks like the corresponding change to make for the vectorized path will be in MathExpr.doubleToTimestamp(). CC'ing Matt McCline in case there is any more details to add here.
          Hide
          Aihua Xu added a comment -

          Thanks Jason Dere. We need to correct MathErpr.longToTimestamp() function since we have inconsistency with the int/long type, Correct?

          Show
          Aihua Xu added a comment - Thanks Jason Dere . We need to correct MathErpr.longToTimestamp() function since we have inconsistency with the int/long type, Correct?
          Hide
          Jason Dere added a comment -

          Whoops, yes it is longToTimestamp() that should be fixed, since the point of this is to correct the int/long to timestamp behavior.

          Show
          Jason Dere added a comment - Whoops, yes it is longToTimestamp() that should be fixed, since the point of this is to correct the int/long to timestamp behavior.
          Hide
          Aihua Xu added a comment -

          Made additional changes to vectorized functions and unit tests.

          Show
          Aihua Xu added a comment - Made additional changes to vectorized functions and unit tests.
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12705949/HIVE-3454.3.patch

          ERROR: -1 due to 1 failed/errored test(s), 7820 tests executed
          Failed tests:

          org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3091/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3091/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3091/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 1 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12705949 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12705949/HIVE-3454.3.patch ERROR: -1 due to 1 failed/errored test(s), 7820 tests executed Failed tests: org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3091/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3091/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3091/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12705949 - PreCommit-HIVE-TRUNK-Build
          Hide
          Jason Dere added a comment -

          So this patch did not require any changes to MathExpr? How does the behavior of the vectorized cast change?

          Show
          Jason Dere added a comment - So this patch did not require any changes to MathExpr? How does the behavior of the vectorized cast change?
          Hide
          Aihua Xu added a comment -

          Attached should be the right patch now. Maybe I uploaded the right patch first but I worried test run was not started (but seems the test was run against the right patch), so I uploaded again but the wrong one. Thanks for catching it.

          Show
          Aihua Xu added a comment - Attached should be the right patch now. Maybe I uploaded the right patch first but I worried test run was not started (but seems the test was run against the right patch), so I uploaded again but the wrong one. Thanks for catching it.
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12706162/HIVE-3454.3.patch

          ERROR: -1 due to 3 failed/errored test(s), 7819 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in
          org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTypeCasts.testCastLongToTimestamp
          org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3105/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3105/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3105/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 3 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12706162 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12706162/HIVE-3454.3.patch ERROR: -1 due to 3 failed/errored test(s), 7819 tests executed Failed tests: org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTypeCasts.testCastLongToTimestamp org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3105/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3105/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3105/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed This message is automatically generated. ATTACHMENT ID: 12706162 - PreCommit-HIVE-TRUNK-Build
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12706362/HIVE-3454.3.patch

          ERROR: -1 due to 1 failed/errored test(s), 7819 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3107/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3107/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3107/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 1 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12706362 - PreCommit-HIVE-TRUNK-Build

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12706362/HIVE-3454.3.patch ERROR: -1 due to 1 failed/errored test(s), 7819 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3107/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3107/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3107/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12706362 - PreCommit-HIVE-TRUNK-Build
          Hide
          Aihua Xu added a comment -

          This error is unrelated to the change and seems random.

          Show
          Aihua Xu added a comment - This error is unrelated to the change and seems random.
          Hide
          Jason Dere added a comment -

          +1

          Show
          Jason Dere added a comment - +1
          Hide
          Chao Sun added a comment -

          Committed to trunk. Thanks Aihua!

          Show
          Chao Sun added a comment - Committed to trunk. Thanks Aihua!
          Hide
          Aihua Xu added a comment -

          Thanks Chao Sun and Jason Dere.

          Show
          Aihua Xu added a comment - Thanks Chao Sun and Jason Dere .

            People

            • Assignee:
              Aihua Xu
              Reporter:
              Ryan Harris
            • Votes:
              6 Vote for this issue
              Watchers:
              17 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development