Hadoop HDFS / HDFS-2209

Make MiniDFS easier to embed in other apps

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20.203.0
    • Fix Version/s: 0.23.0, 0.24.0
    • Component/s: test
    • Labels:
      None

      Description

      I've been deploying MiniDFSCluster for some testing, and while using it and reading through the code I noted several issues and improvement opportunities. Most are minor since it's a test tool, but there is a risk of synchronization problems that does need addressing; the rest is feature creep.

      Field nameNode should be marked as volatile, as the shutdown operation can run in a different thread than startup. Better still,
      add synchronized methods to set and get the field, as well as to shut down.
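
      A minimal sketch of the synchronized-accessor approach. `ClusterHolder` and the stand-in `Service` interface are hypothetical names used for illustration; they are not the real MiniDFSCluster or NameNode types.

```java
/** Stand-in for the NameNode; hypothetical, for illustration only. */
interface Service {
    void stop();
}

/** Hypothetical holder showing synchronized access to a shared field. */
class ClusterHolder {
    // Guarded by "this": every read and write goes through a synchronized method,
    // so a shutdown thread always sees the value written by the startup thread.
    private Service nameNode;

    synchronized void setNameNode(Service service) {
        nameNode = service;
    }

    synchronized Service getNameNode() {
        return nameNode;
    }

    /** Safe to call from any thread, and safe to call more than once. */
    synchronized void shutdown() {
        if (nameNode != null) {
            nameNode.stop();
            nameNode = null;
        }
    }
}
```

      With all access synchronized on the same object, the field itself no longer needs to be volatile; marking it volatile alone would fix visibility but not make a check-then-stop sequence atomic.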

      The data dir is set from System properties.

          base_dir = new File(System.getProperty("test.build.data", "build/test/data"), "dfs/");
          data_dir = new File(base_dir, "data");
      

      This is done in formatDataNodeDirs(), corruptBlockOnDataNode() and the constructor.

      Improvement: have a test property in the conf file, and only read the system property if this is unset. This will enable
      multiple MiniDFSClusters to come up in the same JVM, handle shutdown/startup race conditions better, and avoid the
      "java.io.IOException: Cannot lock storage build/test/data/dfs/name1. The directory is already locked." messages.
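
      The lookup order proposed above could be sketched as follows. This is an illustration only: `java.util.Properties` stands in for the Hadoop configuration object so the example is self-contained, and the class name `BaseDirResolver` and the key `hdfs.minidfs.basedir` are assumptions, not names taken from this issue.

```java
import java.io.File;
import java.util.Properties;

/** Hypothetical helper: resolve the cluster base dir, preferring the configuration. */
class BaseDirResolver {
    // Assumed key name, for illustration only.
    static final String CONF_KEY = "hdfs.minidfs.basedir";

    static File resolveBaseDir(Properties conf) {
        String configured = conf.getProperty(CONF_KEY);
        if (configured != null && !configured.isEmpty()) {
            // A per-cluster setting wins, so two clusters in one JVM can use different dirs.
            return new File(configured);
        }
        // Otherwise fall back to the existing system-property behaviour.
        return new File(System.getProperty("test.build.data", "build/test/data"), "dfs/");
    }

    static File resolveDataDir(Properties conf) {
        return new File(resolveBaseDir(conf), "data");
    }
}
```

      Giving each cluster its own configuration with a distinct base dir keeps their storage directories apart, which is what avoids the lock conflict.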

      Messages should go to commons logging and not System.err and System.out. This lets containers capture and stream the output better,
      and includes more diagnostics such as timestamp and thread ID.

      The class could benefit from a method to return the FS URI, rather than just the FS. This currently has to be worked around with some tricks involving a cached configuration.
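
      One possible shape for such an accessor, as a rough sketch: it builds the URI from a host and port that the cluster is assumed to already know. The class and method names are hypothetical, and the host is assumed to be a hostname or IPv4 literal (an IPv6 literal would need brackets).

```java
import java.net.URI;

/** Hypothetical sketch of an FS URI accessor; not the actual MiniDFSCluster API. */
class FsUriSketch {
    /** Builds an hdfs:// URI for a namenode at the given host and port. */
    static URI fileSystemUri(String host, int port) {
        return URI.create("hdfs://" + host + ":" + port);
    }
}
```

      A caller could then pass the URI to whatever needs to locate the filesystem, without digging it out of a cached configuration.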

      waitActive() could get confused if "localhost" maps to an IPv6 address. Better to ask for 127.0.0.1 as the hostname; JUnit
      test runs may need to be set up to force IPv4 too.
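
      A small sketch of the IPv4 suggestion, using only standard `java.net` classes. The class and method names are hypothetical. Passing the literal address avoids any dependence on how "localhost" resolves on the host.

```java
import java.net.InetSocketAddress;

/** Hypothetical helper: always address the loopback interface over IPv4. */
class LoopbackSketch {
    static InetSocketAddress ipv4Loopback(int port) {
        // An IPv4 literal is parsed directly, so it cannot come back as ::1.
        return new InetSocketAddress("127.0.0.1", port);
    }
}
```

      For test runs, the standard JVM system property `java.net.preferIPv4Stack=true` (for example passed as `-Djava.net.preferIPv4Stack=true`) is one way to force IPv4 across the whole process.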

      injectBlocks has a spelling error in the IOException message: it says "SumulatedFSDataset", where "SimulatedFSDataset" is the correct spelling.

      1. HDFS-2209.patch
        29 kB
        Steve Loughran
      2. HDFS-2209.patch
        29 kB
        Steve Loughran
      3. HDFS-2209.patch
        29 kB
        Steve Loughran
      4. HDFS-2209.patch
        29 kB
        Steve Loughran
      5. HDFS-2209.patch
        24 kB
        Steve Loughran

          People

          • Assignee: Steve Loughran
          • Reporter: Steve Loughran
          • Votes: 0
          • Watchers: 5

              Time Tracking

              • Original Estimate: 1h
              • Remaining Estimate: 2h
              • Time Spent: 1.5h