HIVE-4957

Restrict number of bit vectors, to prevent out of Java heap memory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

      Description

      Normally, increasing the number of bit vectors increases calculation accuracy. For example,

      select compute_stats(a, 40) from test_hive;
      

      generally gets better accuracy than

      select compute_stats(a, 16) from test_hive;
      

      But a larger number of bit vectors also makes the query run slower. Once the number of bit vectors exceeds 50, increasing it no longer improves accuracy. It still increases memory usage, though, and crashes Hive if the number is too large. Hive currently does not prevent users from passing a ridiculously large number of bit vectors in a 'compute_stats' query.

      For example,

      select compute_stats(a, 999999999) from column_eight_types;
      

      crashes Hive.

      2012-12-20 23:21:52,247 Stage-1 map = 0%,  reduce = 0%
      2012-12-20 23:22:11,315 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.29 sec
      MapReduce Total cumulative CPU time: 290 msec
      Ended Job = job_1354923204155_0777 with errors
      Error during job, obtaining debugging information...
      Job Tracking URL: http://cs-10-20-81-171.cloud.cloudera.com:8088/proxy/application_1354923204155_0777/
      Examining task ID: task_1354923204155_0777_m_000000 (and more) from job job_1354923204155_0777
      
      Task with the most failures(4): 
      -----
      Task ID:
        task_1354923204155_0777_m_000000
      
      URL:
        http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1354923204155_0777&tipid=task_1354923204155_0777_m_000000
      -----
      Diagnostic Messages for this Task:
      Error: Java heap space
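
A guard of the kind this issue asks for could look like the sketch below. It is not the committed patch: the class name `BitVectorLimit`, the method `checkNumBitVectors`, and the cap of 1024 are all assumptions made for illustration. The idea is only to reject an out-of-range argument up front, before any bit vectors are allocated, so that the query fails fast with a clear message instead of exhausting the task's heap.

```java
/** Hypothetical helper; the names and the limit are illustrative, not taken from the Hive patch. */
final class BitVectorLimit {

    /** Assumed upper bound on the number of bit vectors a caller may request. */
    static final int MAX_BIT_VECTORS = 1024;

    private BitVectorLimit() {
        // Static utility class; not meant to be instantiated.
    }

    /**
     * Returns numBitVectors unchanged if it lies in [1, MAX_BIT_VECTORS].
     * Otherwise throws, so nothing is allocated for an unreasonable request.
     */
    static int checkNumBitVectors(int numBitVectors) {
        if (numBitVectors < 1 || numBitVectors > MAX_BIT_VECTORS) {
            throw new IllegalArgumentException(
                "Number of bit vectors must be between 1 and " + MAX_BIT_VECTORS
                    + ", got " + numBitVectors);
        }
        return numBitVectors;
    }
}
```

Under these assumptions, `BitVectorLimit.checkNumBitVectors(16)` and `BitVectorLimit.checkNumBitVectors(40)` pass through unchanged, while `BitVectorLimit.checkNumBitVectors(999999999)` throws an `IllegalArgumentException` before any memory is reserved.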
      
      1. HIVE-4957.1.patch
        4 kB
        Shreepadma Venugopalan
      2. HIVE-4957.2.patch
        5 kB
        Shreepadma Venugopalan

        Activity

        Shreepadma Venugopalan added a comment -

        RB: https://reviews.apache.org/r/14250/

        Brock Noland added a comment -

        LGTM, let's see what the tests say.

        Carl Steinbach added a comment -

        Comments on reviewboard. Thanks.

        Shreepadma Venugopalan added a comment -

        New patch addresses review comments.

        Hive QA added a comment -

        Overall: +1 all checks pass

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12606199/HIVE-4957.2.patch

        SUCCESS: +1 4078 tests passed

        Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/987/testReport
        Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/987/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        

        This message is automatically generated.

        Brock Noland added a comment -

        +1

        Carl do you have any more concerns?

        Brock Noland added a comment -

        Thank you for the contribution Shreepadma! I have committed this to trunk!

        Shreepadma Venugopalan added a comment -

        Thanks, Brock!

        Hudson added a comment -

        FAILURE: Integrated in Hive-trunk-hadoop1-ptest #209 (See https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/209/)
        HIVE-4957 - Restrict number of bit vectors, to prevent out of Java heap memory (Shreepadma Venugopalan via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534337)

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
        • /hive/trunk/ql/src/test/queries/clientnegative/compute_stats_long.q
        • /hive/trunk/ql/src/test/results/clientnegative/compute_stats_long.q.out
        Hudson added a comment -

        FAILURE: Integrated in Hive-trunk-hadoop2-ptest #147 (See https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/147/)
        HIVE-4957 - Restrict number of bit vectors, to prevent out of Java heap memory (Shreepadma Venugopalan via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534337)

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
        • /hive/trunk/ql/src/test/queries/clientnegative/compute_stats_long.q
        • /hive/trunk/ql/src/test/results/clientnegative/compute_stats_long.q.out
        Hudson added a comment -

        FAILURE: Integrated in Hive-trunk-h0.21 #2413 (See https://builds.apache.org/job/Hive-trunk-h0.21/2413/)
        HIVE-4957 - Restrict number of bit vectors, to prevent out of Java heap memory (Shreepadma Venugopalan via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534337)

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
        • /hive/trunk/ql/src/test/queries/clientnegative/compute_stats_long.q
        • /hive/trunk/ql/src/test/results/clientnegative/compute_stats_long.q.out
        Hudson added a comment -

        ABORTED: Integrated in Hive-trunk-hadoop2 #515 (See https://builds.apache.org/job/Hive-trunk-hadoop2/515/)
        HIVE-4957 - Restrict number of bit vectors, to prevent out of Java heap memory (Shreepadma Venugopalan via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534337)

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
        • /hive/trunk/ql/src/test/queries/clientnegative/compute_stats_long.q
        • /hive/trunk/ql/src/test/results/clientnegative/compute_stats_long.q.out

          People

          • Assignee: Shreepadma Venugopalan
          • Reporter: Brock Noland
          • Votes: 0
          • Watchers: 4
