HIVE-189

Various TestCliDriver tests fail

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: Testing Infrastructure
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      On a fresh checkout of trunk a number of tests fail in org.apache.hadoop.hive.cli.TestCliDriver:
      testCliDriver_sample5
      testCliDriver_input3_limit
      testCliDriver_udf5
      testCliDriver_sample3

      Attachments

      1. TEST-org.apache.hadoop.hive.cli.TestCliDriver.txt (677 kB, Johan Oskarsson)
      2. TEST-org.apache.hadoop.hive.cli.TestCliDriver.txt (802 kB, Johan Oskarsson)
      3. sample3.q.out (2 kB, Johan Oskarsson)
      4. patch-189.txt (0.7 kB, Ashish Thusoo)
      5. patch-189_6.txt (36 kB, Ashish Thusoo)
      6. patch-189_5.txt (24 kB, Ashish Thusoo)
      7. patch-189_4.txt (24 kB, Ashish Thusoo)
      8. patch-189_3.txt (23 kB, Ashish Thusoo)
      9. patch-189_2.txt (24 kB, Ashish Thusoo)

        Issue Links

          Activity

          Carl Steinbach made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Carl Steinbach made changes -
          Fix Version/s 0.3.0 [ 12313637 ]
          Fix Version/s 0.6.0 [ 12314524 ]
          Zheng Shao made changes -
          Fix Version/s 0.6.0 [ 12314524 ]
          Fix Version/s 0.2.0 [ 12313565 ]
          Ashish Thusoo made changes -
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Ashish Thusoo added a comment -

          committed.

          Johan Oskarsson added a comment -

          +1. With the latest patch all tests pass just fine on my machine. Thanks for looking into this Ashish!

          Ashish Thusoo made changes -
          Attachment patch-189_6.txt [ 12399410 ]
          Ashish Thusoo added a comment -

          Ok. This one should work. The problem was the ordering of the sample queries, and I have fixed those. Please verify and let me know. I will check in once this passes for you.

          Johan Oskarsson added a comment -

          As you suspected the sample tests still fail with the new patch.

          Ashish Thusoo made changes -
          Attachment patch-189_5.txt [ 12399179 ]
          Ashish Thusoo added a comment -

          Another patch after resolving the conflicts. Can you try once more? I think this one will also fail, but I just wanted to make sure.

          Johan Oskarsson made changes -
          Attachment sample3.q.out [ 12398578 ]
          Johan Oskarsson added a comment -

          This is the output from /build/ql/test/logs/clientpositive/sample3.q.out after running the test.

          Johan Oskarsson added a comment -

          It seems the latest patch doesn't apply cleanly anymore. I was going to run the test again and see if I could help figure out what's going on.

          Ashish Thusoo added a comment -

          From the diffs I do not actually see any difference in the lines that are reported as differing in the two test cases. Is there extra space at the end of one of these results? Puzzled...

          Johan Oskarsson made changes -
          Johan Oskarsson added a comment -

          Tried out the latest patch but unfortunately both of the sample methods still fail. The rest works fine though. I don't have time to look deeper at why right now. Attaching log file for the test run.

          Zheng Shao added a comment -

          +1

          Ashish Thusoo made changes -
          Attachment patch-189_4.txt [ 12398364 ]
          Ashish Thusoo added a comment -

          Fixed the unit test output.

          Ashish Thusoo made changes -
          Attachment patch-189_3.txt [ 12398362 ]
          Ashish Thusoo added a comment -

          Removed the comparator.

          Ashish Thusoo added a comment -

          You are right! Will upload a new patch.

          Prasad Chakka added a comment -

          ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java:216: I think the default FileStatus comparator is the same as the one you defined here, so a custom comparator is not necessary.

          Ashish Thusoo made changes -
          Attachment patch-189_2.txt [ 12398348 ]
          Ashish Thusoo added a comment -

          Changed the query in input2_limit.q.

          Also fixed getBucketPath so that it is now deterministic.

          Ashish Thusoo added a comment -

          sample3 and sample5 are suffering from the same sorting issue. I think the related code is in getBucketPath in ql/metadata/Partition.java, and I think we should be sorting the files there as well. This is picked up in TableSample.java to create the paths needed.

          Johan Oskarsson added a comment -

          Sounds like a good solution to me.

          Do you have any idea what might be happening in testCliDriver_sample5 and testCliDriver_sample3? Similar issue with file sorting?

          Ashish Thusoo added a comment -

          On thinking more about this, I think we should change the test for input3_limit.q to make it deterministic. Instead of

          INSERT OVERWRITE TABLE T2 SELECT a.key, a.value from T1 a LIMIT 20;

          We should have

          INSERT OVERWRITE TABLE T2 SELECT * FROM (SELECT * FROM T1 DISTRIBUTE BY key SORT BY key, value) T LIMIT 20;

          SELECT * FROM T2

          That would ensure that the rows we get are the top 20 rows.

          The order in which these files are processed is controlled by Hadoop, so I don't think we have much control over whether kv1.txt or kv2.txt gets processed first.

          Thoughts?
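
A rough sketch of why sorting before the limit helps, using plain Java lists rather than Hive itself. The class DeterministicLimit and its methods are hypothetical names introduced only for illustration, and the two input lists stand in for the same rows arriving with kv1.txt first or kv2.txt first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; plain Java lists stand in for table rows.
final class DeterministicLimit {
    /** Takes the first n rows in arrival order, so the result depends on that order. */
    static List<String> firstN(List<String> rows, int n) {
        return rows.stream().limit(n).collect(Collectors.toList());
    }

    /** Sorts before limiting, so the result is the same for any arrival order. */
    static List<String> sortedFirstN(List<String> rows, int n) {
        return rows.stream().sorted().limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same four rows, as if kv1.txt or kv2.txt had been read first.
        List<String> kv1First = Arrays.asList("k1", "k3", "k2", "k4");
        List<String> kv2First = Arrays.asList("k2", "k4", "k1", "k3");

        System.out.println(firstN(kv1First, 2));       // [k1, k3]
        System.out.println(firstN(kv2First, 2));       // [k2, k4]
        System.out.println(sortedFirstN(kv1First, 2)); // [k1, k2]
        System.out.println(sortedFirstN(kv2First, 2)); // [k1, k2]
    }
}
```

The unsorted limit returns different rows for the two arrival orders, while the sorted limit returns the same rows in both cases, which is the property a golden-file test needs.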

          Johan Oskarsson added a comment -

          Zheng: Only one of the tests is fixed by the patch: the udf5 test. The others work sometimes but mostly fail. I suspect they might all be related to file list sort order, but so far I have only confirmed it with one of them, as mentioned above.
          There are a few options as far as I can see: commit the udf5 fix and open a new ticket for the other three tests, or try to fix all the tests in this ticket. I don't mind either way.

          Ashish Thusoo added a comment -

          The problematic code is in

          FetchTask.java:getNextPath()

          We are using fs.listStatus there to get the list of files, and I do not think that guarantees order.
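
A minimal sketch of the general remedy, not the code from the patch: sort a directory listing before iterating over it, so the processing order no longer depends on what the file system happens to return. The class SortedListing and the method listSorted are hypothetical, and java.io.File is used here as a stand-in for Hadoop's file listing so the example stays self-contained.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper; java.io.File stands in for Hadoop's file listing API.
final class SortedListing {
    /** Lists a directory and sorts the entries so callers see a stable order. */
    static File[] listSorted(File dir) {
        File[] files = dir.listFiles();   // the API does not specify any order
        if (files == null) {
            return new File[0];           // not a directory, or an I/O error
        }
        Arrays.sort(files);               // File is Comparable and sorts by pathname
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("listing");
        Files.createFile(dir.resolve("kv2.txt"));
        Files.createFile(dir.resolve("kv1.txt"));
        for (File f : listSorted(dir.toFile())) {
            System.out.println(f.getName());   // kv1.txt, then kv2.txt
        }
    }
}
```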

          Ashish Thusoo made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Ashish Thusoo added a comment -

          Submitting the patch for review.

          Zheng Shao added a comment -

          I believe the LIMIT statement does take the data from the first file first, so the file ordering does matter.

          The other 3 tests should be fixed by Ashish's patch, correct? If so, please let me know and I will commit Ashish's patch and close this jira (and please open a new one for the limit problem).

          Johan Oskarsson added a comment -

          I'm starting to suspect that these are related to File.list ordering just like HIVE-90. I had a closer look at input3_limit.q. The important bits:
          <snip>
          LOAD DATA LOCAL INPATH '../data/files/kv1.txt' INTO TABLE T1;
          LOAD DATA LOCAL INPATH '../data/files/kv2.txt' INTO TABLE T1;
          <snip>
          INSERT OVERWRITE TABLE T2 SELECT a.key, a.value from T1 a LIMIT 20;
          <snip>

          It seems that the insert select statement with a 20 limit is meant to select from the kv1.txt file, but on my system it picks the kv2.txt file first. Thus the final output will be incorrect. Could someone confirm that this differs from a run where it passes?

          Johan Oskarsson added a comment -

          For whatever reason the three other tests in this ticket description have started failing for me again in a similar fashion to the log already attached to this ticket. Am I the only one to see this?

          Johan Oskarsson added a comment -

          I've tested this solution and it works for me; the udf5 test passes.

          Ashish Thusoo made changes -
          Assignee Ashish Thusoo [ athusoo ]
          Ashish Thusoo made changes -
          Attachment patch-189.txt [ 12397323 ]
          Ashish Thusoo added a comment -

          Added setting of TZ in the ant file.

          Do let me know if this works and I can then submit it for review.

          Ashish Thusoo added a comment -

          There are two ways to solve this, and I think we should do both in the long run:

          1. Add a variable to hive-site.xml to indicate the timezone. This variable, if present, would override the timezone set on the machine. We can then set that variable in the hive-site.xml used for the tests.
          2. Make all the datetime functions timezone-aware, as they are in other RDBMSs (MySQL, Oracle, etc.), and then change the test query to take the timezone.

          Both of these would solve the problem and would be desirable feature extensions as well.

          Johan Oskarsson added a comment -

          Looking at the code, it seems you're right: it just dumps the output of the command to a file and then runs the command-line diff against it and the predefined correct test output file. Since the Unix timestamp in the query is GMT, the UDFFromUnixTime class will format it differently in each timezone. Any suggestions on how to solve this? Having a unit test tied to a specific timezone doesn't sound like a good idea.
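
To illustrate the effect with the two values from the udf5 diff in this ticket, here is a small standalone sketch using java.time. It is not the UDFFromUnixTime implementation; the class TimezoneFormatting and the method utcToZone are hypothetical names. It shows that the same instant renders as 23:32:20 in UTC and 15:32:20 in US/Pacific.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration, not Hive's UDFFromUnixTime.
final class TimezoneFormatting {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Reads a UTC wall-clock string and renders the same instant in another zone. */
    static String utcToZone(String utcText, String zoneId) {
        return LocalDateTime.parse(utcText, FMT)
                .atZone(ZoneOffset.UTC)
                .withZoneSameInstant(ZoneId.of(zoneId))
                .format(FMT);
    }

    public static void main(String[] args) {
        System.out.println(utcToZone("2008-11-11 23:32:20", "UTC"));        // 2008-11-11 23:32:20
        System.out.println(utcToZone("2008-11-11 23:32:20", "US/Pacific")); // 2008-11-11 15:32:20
    }
}
```

A golden output file recorded on a US/Pacific machine therefore cannot match the output produced on a UTC server unless the zone is pinned for the test run.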

          David Phillips added a comment -

          Set your TZ environment variable to "US/Pacific" before running the tests:

          export TZ=US/Pacific

          David Phillips made changes -
          Link This issue relates to HIVE-137 [ HIVE-137 ]
          Johan Oskarsson added a comment -

          Since it looks like the error is caused by some date/time issue it is worth mentioning that the server where the tests ran is set to UTC.

          Johan Oskarsson made changes -
          Field Original Value New Value
          Attachment TEST-org.apache.hadoop.hive.cli.TestCliDriver.txt [ 12396794 ]
          Johan Oskarsson added a comment -

          Log file for the test run that fails.

          Johan Oskarsson added a comment -

          It seems three of the four tests are now finishing properly.
          The fourth one still fails:
          [junit] diff -I (file:)\|(tmp/hive.*) /storage/home/chartgen/hive-trunk/ql/../build/ql/test/logs/udf5.q.out /storage/home/chartgen/hive-trunk/ql/src/test/results/clientpositive/udf5.q.out
          [junit] 43c43
          [junit] < 2008-11-11 23:32:20 2008-11-11 1 11 2008 1 11 2008
          [junit] ---
          [junit] > 2008-11-11 15:32:20 2008-11-11 1 11 2008 1 11 2008
          [junit] Exception: Client execution results dailed with error code = 1
          [junit] junit.framework.AssertionFailedError: Client execution results dailed with error code = 1
          [junit] at junit.framework.Assert.fail(Assert.java:47)
          [junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf5(TestCliDriver.java:3034)

          I'll attach the log of the test run

          Ashish Thusoo added a comment -

          I am getting a clean build and all tests succeeded on the latest branch. Are you still seeing this?

          Johan Oskarsson created issue -

            People

            • Assignee: Ashish Thusoo
            • Reporter: Johan Oskarsson