  1. Hive
  2. HIVE-17718 Hive on Spark Debugging Improvements
  3. HIVE-19525

Spark task logs print PLAN PATH an excessive number of times


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Spark
    • Labels: None

    Description

      The Spark task logs contain a ton of lines like this: Utilities - PLAN PATH = hdfs://localhost:59527/.../apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/6ebceb49-7a76-4159-9082-5bba44391e30/hive_2018-05-14_07-28-44_672_8205774950452575544-1/-mr-10006/bf14c0b5-a014-4ee8-8ddf-fdb7453eb0f0/map.xml

      It seems to be printed multiple times per task exception. I'm not sure where it is coming from, but it's too verbose. It should be changed to DEBUG level. Furthermore, given that we use Utilities#getBaseWork any time we need to access a MapWork or ReduceWork object, we should make the method slightly more efficient. Right now it borrows a Kryo instance from a pool and does a bunch of work to set the classloader, and only then checks the cache to see if the work object has already been created. It should check the cache before doing any of that.
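
      The proposed ordering (cache lookup first, expensive setup only on a miss, plan path logged at a debug-equivalent level) could be sketched roughly as follows. This is not the actual Utilities#getBaseWork code or the attached patch: the class PlanCacheSketch, its loader function, and the load counter are hypothetical stand-ins, and java.util.logging's FINE level is used in place of Hive's logger at DEBUG so that the example is self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration only; names here are not taken from Hive.
public class PlanCacheSketch {
    private static final Logger LOG = Logger.getLogger(PlanCacheSketch.class.getName());

    // Stand-in for the cache of already-deserialized work objects, keyed by plan path.
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    // Stand-in for the expensive path (borrowing a Kryo, setting the classloader, deserializing).
    private final Function<String, Object> loader;

    // Counts how many times the expensive path actually ran.
    private final AtomicInteger expensiveLoads = new AtomicInteger();

    public PlanCacheSketch(Function<String, Object> loader) {
        this.loader = loader;
    }

    public Object getBaseWork(String planPath) {
        // Check the cache before doing any expensive setup.
        Object cached = cache.get(planPath);
        if (cached != null) {
            return cached;
        }

        // Only reached on a cache miss; FINE plays the role of DEBUG here.
        LOG.log(Level.FINE, "PLAN PATH = {0}", planPath);

        expensiveLoads.incrementAndGet();
        Object work = loader.apply(planPath);
        if (work != null) {
            // ConcurrentHashMap rejects null values, so only cache real results.
            cache.put(planPath, work);
        }
        return work;
    }

    public int expensiveLoadCount() {
        return expensiveLoads.get();
    }
}
```

      With this ordering, repeated calls for the same plan path return the cached object without touching the expensive path or emitting another log line.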

      Attachments

        1. HIVE-19525.2.patch
          8 kB
          Bharath Krishna
        2. HIVE-19525.1.patch
          8 kB
          Bharath Krishna

            People

              Assignee: bharos92 Bharath Krishna
              Reporter: stakiar Sahil Takiar