Hive / HIVE-28212

MiniHS2: use a base folder which is more likely writable on the local FS

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • Fix Version/s: 4.1.0
    • None

    Description

      We hardcode an HDFS session dir in MiniHS2, as shown below:
      https://github.com/apache/hive/blob/2d855b27d31db6476f18870651db6987816bb5e3/itests/util/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java#L307

            baseFsDir = new Path(new Path(fs.getUri()), "/base");
      

      This can lead to problems when running Tez local mode with MiniHS2: Tez mirrors the HDFS contents to a local folder, and this later leads to a confusing message like:

      2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error starting DAGAppMaster
      java.io.FileNotFoundException: /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb (No such file or directory)
      	at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
      	at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
      	at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_292]
      	at org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84) ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
      	at org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
      	at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
      	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
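
      The trace shows the HDFS path being opened with a plain FileInputStream, i.e. on the local file system, where a top-level /base usually cannot be created by a regular user. A minimal sketch of that check, using only java.nio (the class and method names are hypothetical and this is not Tez or Hive code):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalMirrorCheck {

    /** Walks up from p until an existing path is found; returns null if none exists. */
    static Path deepestExistingAncestor(Path p) {
        Path cur = p.toAbsolutePath().normalize();
        while (cur != null && !Files.exists(cur)) {
            cur = cur.getParent();
        }
        return cur;
    }

    /** True if the current user could plausibly create the directory tree for p. */
    static boolean likelyCreatable(Path p) {
        Path ancestor = deepestExistingAncestor(p);
        return ancestor != null && Files.isDirectory(ancestor) && Files.isWritable(ancestor);
    }

    public static void main(String[] args) {
        // "/base/scratch" mirrors the hardcoded MiniHS2 base dir on the local FS.
        Path mirrored = Paths.get("/base/scratch");
        // A location under the JVM temp dir, for comparison.
        Path underTmp = Paths.get(System.getProperty("java.io.tmpdir"), "hive", "scratch");

        System.out.println(mirrored + " creatable: " + likelyCreatable(mirrored));
        System.out.println(underTmp + " creatable: " + likelyCreatable(underTmp));
    }
}
```

      For a non-root user the first path typically resolves to the file system root as its deepest existing ancestor, which is not writable; the result depends on the machine and the user running the test.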
      

      By the way, this confusing message will be fixed in TEZ-4555, but we still need to use something other than /base.
      It doesn't make sense to hack a different folder into Tez for local mode; instead we should change the hardcoded "/base" in MiniHS2, which is likely to be more durable and also solves the above-mentioned problem.

      Currently, Hive's default scratch dir is /tmp/hive.
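
      One possible way to pick a base folder that is more likely writable is to try a short list of candidates and take the first one that can be created. This is only a sketch under assumptions: the class name, the helper firstWritable and the candidate list are hypothetical, and it is not the change that was committed for this issue.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BaseDirChooser {

    /** Returns the first candidate that exists (or can be created) and is writable. */
    static Path firstWritable(Path... candidates) {
        for (Path candidate : candidates) {
            try {
                // Creates missing parents too; a no-op if the directory already exists.
                Files.createDirectories(candidate);
                if (Files.isWritable(candidate)) {
                    return candidate;
                }
            } catch (IOException | SecurityException e) {
                // This candidate is not usable for the current user; try the next one.
            }
        }
        throw new IllegalStateException("no writable base directory among the candidates");
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        // Assumed candidates: Hive's default scratch dir first, then the JVM temp dir.
        Path base = firstWritable(Paths.get("/tmp/hive"), tmp.resolve("hive"));
        System.out.println("base dir: " + base);
    }
}
```

      Note that the helper creates the chosen directory as a side effect, which suits a test fixture like MiniHS2 but may not be wanted elsewhere.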

            People

              Assignee: László Bodor (abstractdog)
              Reporter: László Bodor (abstractdog)