Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: None
    • Labels: None

    Description

      MiniMr is currently extremely slow. I ran udf_using.q on MiniMr; the execution time breaks down as follows:

      Total time - 13m59s
      Junit reported time for testcase - 50s
      Most of the time is spent in creating/loading/analyzing initial tables - ~12m
      Cleanup - ~1m

      The overhead of running MiniMr tests is huge compared to the actual test runtime.

      I ran the same test without the init script:
      Total time - 2m17s
      Junit reported time for testcase - 52s
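
      To make the comparison concrete, the share of wall-clock time spent outside the test case can be worked out from the four figures above. This is only an arithmetic sketch; the helper names are made up for illustration and nothing here comes from the patch.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' reading into seconds."""
    return minutes * 60 + seconds


def overhead_share(total_s, test_s):
    """Fraction of the total run that is not the JUnit-reported test time."""
    return (total_s - test_s) / total_s


# Figures quoted in the description above.
with_init_total = to_seconds(13, 59)     # 839 s, JUnit time 50 s
without_init_total = to_seconds(2, 17)   # 137 s, JUnit time 52 s

with_init_pct = round(overhead_share(with_init_total, 50) * 100, 1)
without_init_pct = round(overhead_share(without_init_total, 52) * 100, 1)
saved_s = with_init_total - without_init_total

print(with_init_pct)     # 94.0
print(without_init_pct)  # 62.0
print(saved_s)           # 702 (about 11m42s)
```

      Under these numbers roughly 94% of the run with the init script is overhead, and dropping the script saves about 11m42s, which is consistent with the ~12m attributed to creating/loading/analyzing the initial tables.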

      I also noticed that some tests do not have to run on MiniMr. For example, udf_using.q does not require MiniMr: it just reads from and writes to HDFS, which we can do in MiniTez/MiniLlap, and those are way faster. Most tests access only a few of the initial tables and read only a few rows from them. We can fix those tests to load just the tables that the test requires instead of all initial tables. After rewriting and moving over the unwanted tests, we can also remove the q_init_script.sql initialization for MiniMr, which should cut the runtime down a lot.
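
      One way to find out which initial tables a given test needs is to scan its .q file for table names. The sketch below is a hypothetical helper, not what the attached patches do: the function name `tables_needed` and the table names in `INITIAL_TABLES` are illustrative assumptions, and the comment stripping is deliberately naive (it does not handle `--` inside string literals).

```python
import re

# Illustrative subset of table names; an assumption, not the real init script contents.
INITIAL_TABLES = {"src", "src1", "srcpart", "alltypesorc"}


def tables_needed(query_text, initial_tables=INITIAL_TABLES):
    """Return the sorted initial tables that the text of a .q file refers to."""
    # Drop "--" line comments so commented-out queries are ignored.
    code = "\n".join(line.split("--", 1)[0] for line in query_text.splitlines())
    # Collect whole identifiers, so "src" is not found inside "srcpart".
    identifiers = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code.lower()))
    return sorted(identifiers & set(initial_tables))


example = """
-- uses src1 in a comment only
SELECT key, value FROM src LIMIT 5;
SELECT count(*) FROM srcpart WHERE ds = '2008-04-08';
"""

print(tables_needed(example))  # ['src', 'srcpart']
```

      A test whose result is an empty list would be a candidate for running with no initial tables at all.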

      Attachments

        1. HIVE-14627.3.patch
          52 kB
          Prasanth Jayachandran
        2. HIVE-14627.2.patch
          4 kB
          Prasanth Jayachandran
        3. HIVE-14627.1.patch
          3 kB
          Prasanth Jayachandran

          People

            Assignee: Prasanth Jayachandran (prasanth_j)
            Reporter: Prasanth Jayachandran (prasanth_j)