Hive / HIVE-160

sampling in a subquery is broken

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed
    • Attachment: hive-160.1.patch (256 kB, Raghotham Murthy)

      Activity

      Transition           Time In Source Status   Execution Times   Last Executer    Last Execution Date
      Open -> Resolved     216d 20h 58m            1                 Namit Jain       16/Jul/09 00:35
      Resolved -> Closed   884d 31m                1                 Carl Steinbach   17/Dec/11 00:07
      Carl Steinbach made changes -
      Status Resolved [ 5 ] Closed [ 6 ]
      Namit Jain made changes -
      Status Open [ 1 ] Resolved [ 5 ]
      Hadoop Flags [Reviewed]
      Fix Version/s 0.4.0 [ 12313714 ]
      Resolution Fixed [ 1 ]
      Namit Jain added a comment -

      Committed. Thanks Raghu

      Raghotham Murthy added a comment -

      Filed HIVE-638 to fix sampling in subqueries properly

      Namit Jain added a comment -

      +1

      The code changes look good. I will commit if the tests look good and pass.

      Raghotham Murthy made changes -
      Attachment hive-160.1.patch [ 12413497 ]
      Raghotham Murthy added a comment -

      No, the problem is that input pruning does not work well when done over parse structures (QB). We should do it over the operator tree. The current patch is a temporary fix for this bug. It always adds a sampling predicate to the where clause irrespective of whether there was input pruning or not. The final fix will be modeled after the partition pruning code that Ashish is fixing.

      I also modified the tests so that srcbucket has an integer key. This allows better testing of the case where a predicate is added to the where clause: 'bucket 1 out of 2' will return the even keys and 'bucket 2 out of 2' will return the odd keys.
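
The behaviour described in this comment can be illustrated with a minimal Python sketch. It is not the code from the patch: the function names are hypothetical, and it assumes that the hash of an integer key is the key itself and that the added predicate has the form "hash(key) modulo number of buckets equals bucket number minus one". Under those assumptions, filtering rows with the predicate gives the even/odd split mentioned above, whether or not the input was pruned.

```python
def in_bucket(key: int, bucket: int, num_buckets: int) -> bool:
    """Return True when `key` belongs to the 1-based `bucket` out of `num_buckets`.

    Assumption (not taken from the patch): an integer key hashes to itself,
    and the sign bit is masked so the modulo result is never negative.
    """
    if not 1 <= bucket <= num_buckets:
        raise ValueError("bucket must be between 1 and num_buckets")
    return (key & 0x7FFFFFFF) % num_buckets == bucket - 1


def sample(keys, bucket: int, num_buckets: int):
    """Apply the sampling predicate as a plain filter over all rows."""
    return [k for k in keys if in_bucket(k, bucket, num_buckets)]


keys = list(range(10))      # stand-in for the integer keys of srcbucket
evens = sample(keys, 1, 2)  # 'bucket 1 out of 2'
odds = sample(keys, 2, 2)   # 'bucket 2 out of 2'
```

Because the filter is always applied, the result stays correct even when no input pruning takes place; pruning would only reduce how many rows are read before the filter runs.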

      Zheng Shao added a comment -

      So it's resolved, right? Will you close this issue?

      Raghotham Murthy added a comment -

      Sampling within a sub-query does not seem to prune the input. A filter is added and the result seems correct.

      Raghotham Murthy made changes -
      Assignee Raghotham Murthy [ rsm ]
      Ashish Thusoo made changes -
      Priority Major [ 3 ] Critical [ 2 ]
      Ashish Thusoo added a comment -

      I think this one has been resolved by Raghu or Namit?

      Jeff Hammerbacher made changes -
      Field Original Value New Value
      Component/s Query Processor [ 12312586 ]
      Jeff Hammerbacher added a comment -

      Adding to "Query Processor" component.

      Venky Iyer created issue -

        People

        • Assignee: Raghotham Murthy
        • Reporter: Venky Iyer
        • Votes: 0
        • Watchers: 2
