Hadoop Common / HADOOP-1989

Add support for simulated Data Nodes - helpful for testing and performance benchmarking of the Name Node without having a large cluster



    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None


      The proposal is to add an implementation of a Simulated Data Node.
      This will

      • allow one to test certain parts of the system (especially the Name Node and its protocols) much more easily and efficiently.
      • allow one to run performance benchmarks on the Name Node without having a large cluster.
      • allow one to inject faults for testing (e.g. random faults based on probability parameters).

      The idea is that the Simulated Data Node will

      • discard any data written to blocks (but remember the blocks and their sizes)
      • generate fixed data on the fly when blocks are read (e.g. a block is a fixed set of bytes or a repeated sequence of strings).
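
The two behaviours above can be sketched roughly as follows. This is a minimal illustration only, not the code in the attached patch: the class name `SimulatedBlockStore`, its methods, and the fill byte are all hypothetical, and blocks are identified by a plain `long` for simplicity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: remembers block lengths only, never the bytes. */
class SimulatedBlockStore {
    // Assumed fill pattern; the issue only says "fixed set of bytes".
    static final byte FILL_BYTE = (byte) 'a';

    // Block id -> total number of bytes written to that block.
    private final Map<Long, Long> blockLengths = new HashMap<>();

    /** Records how many bytes were written and discards the payload. */
    void write(long blockId, byte[] data) {
        blockLengths.merge(blockId, (long) data.length, Long::sum);
    }

    /** Returns the remembered length, or -1 if the block is unknown. */
    long length(long blockId) {
        return blockLengths.getOrDefault(blockId, -1L);
    }

    /** Generates the block contents on the fly from the remembered length. */
    byte[] read(long blockId) {
        Long len = blockLengths.get(blockId);
        if (len == null) {
            throw new IllegalArgumentException("unknown block " + blockId);
        }
        byte[] out = new byte[Math.toIntExact(len)];
        Arrays.fill(out, FILL_BYTE);
        return out;
    }
}
```

Because only the lengths are kept, memory use grows with the number of blocks rather than with the amount of data written, which is what would let a single process stand in for many Data Nodes.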

      The Simulated Data Node can also be used for fault injection.
      The data node can be parameterized with probabilities that allow one to control:

      • Delays on reads, writes, creates, etc.
      • IO Exceptions
      • Loss of blocks
      • Failures
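
One way to picture the probability parameters listed above is a small hook that the simulated node calls before each operation. The sketch below is an assumption about how this could look, not the design in the patch: `FaultInjector`, its constructor parameters, and `beforeOperation` are hypothetical names, and only delays and IO exceptions are covered.

```java
import java.io.IOException;
import java.util.Random;

/** Hypothetical fault injector driven by probability parameters. */
class FaultInjector {
    private final double ioErrorProbability; // chance an operation throws
    private final double delayProbability;   // chance an operation is delayed
    private final long delayMillis;          // length of an injected delay
    private final Random random;

    FaultInjector(double ioErrorProbability, double delayProbability,
                  long delayMillis, long seed) {
        this.ioErrorProbability = ioErrorProbability;
        this.delayProbability = delayProbability;
        this.delayMillis = delayMillis;
        this.random = new Random(seed); // seeded so test runs are repeatable
    }

    /** Intended to be called before a simulated read, write, or create. */
    void beforeOperation(String op) throws IOException {
        if (random.nextDouble() < delayProbability) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller.
                Thread.currentThread().interrupt();
            }
        }
        if (random.nextDouble() < ioErrorProbability) {
            throw new IOException("injected fault during " + op);
        }
    }
}
```

A probability of 0.0 disables a fault entirely and 1.0 makes it fire on every call, since `Random.nextDouble()` returns a value in [0.0, 1.0). Block loss and node failure could be modelled with further parameters in the same style.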


        1. SimulatedStoragePatchSubmit.txt
          70 kB
          Sanjay Radia
        2. SimulatedStoragePatchSubmit5.txt
          84 kB
          Sanjay Radia
        3. SimulatedStoragePatchSubmit6.txt
          86 kB
          Sanjay Radia
        4. SimulatedStoragePatchSubmit7.txt
          86 kB
          Sanjay Radia
        5. SimulatedStoragePatchSubmit8.txt
          86 kB
          Sanjay Radia
        6. SimulatedStoragePatchSubmit9.patch
          1.0 kB
          Sanjay Radia



            sanjay.radia Sanjay Radia
            sanjay.radia Sanjay Radia