Hadoop Map/Reduce / MAPREDUCE-6685

LocalDistributedCacheManager can have overlapping filenames

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      LocalDistributedCacheManager has this setup:

      AtomicLong uniqueNumberGenerator = new AtomicLong(System.currentTimeMillis());

      to create this temporary filename:

      new FSDownload(localFSFileContext, ugi, conf, new Path(destPath, Long.toString(uniqueNumberGenerator.incrementAndGet())), resource);

      when using LocalJobRunner. When two or more LocalJobRunner instances start on the same machine, their counters can be seeded with the same timestamp, or with timestamps so close together that the ranges of generated numbers overlap. Either way, two runners can end up with the same temporary filename.
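
The collision described above can be reproduced without Hadoop. The following is a minimal sketch, assuming only the seeding and naming scheme quoted in this description; the class `SeedCollisionDemo` and the helper `nextName` are hypothetical names used for illustration and are not part of LocalDistributedCacheManager.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SeedCollisionDemo {
    // Hypothetical helper mirroring the naming scheme quoted above:
    // the next counter value, as a decimal string, becomes the file name.
    public static String nextName(AtomicLong generator) {
        return Long.toString(generator.incrementAndGet());
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // Case 1: two runners start in the same millisecond, so the seeds match.
        AtomicLong runnerA = new AtomicLong(now);
        AtomicLong runnerB = new AtomicLong(now);
        String a1 = nextName(runnerA); // now + 1
        String b1 = nextName(runnerB); // now + 1
        System.out.println("same seed collides: " + a1.equals(b1));

        // Case 2: a third runner starts 2 ms later. Its first name (now + 3)
        // matches the third name handed out by runnerA.
        AtomicLong runnerC = new AtomicLong(now + 2);
        String c1 = nextName(runnerC); // now + 3
        nextName(runnerA);             // now + 2
        String a3 = nextName(runnerA); // now + 3
        System.out.println("close seeds collide: " + a3.equals(c1));
    }
}
```

Both checks print `true`: the counter only guarantees uniqueness within one generator, not across generators whose starting values are equal or nearly equal.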

      Given the following assumptions:

      1) The timestamp may be the same, in which case the most common starting random seed will also be the same.
      2) Process ID will very likely be unique, but will likely be close in value.
      3) Thread ID is not guaranteed to be unique.

      A unique ID based on PID as a seed (in addition to the timestamp) should be a better unique identifier for temporary filenames.
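
One way to picture the suggestion is sketched below. This is not the attached patch, whose contents are not shown here. It is a hypothetical variation that keeps the PID as a visible prefix of the name instead of mixing it into a numeric seed. The class `PidScopedNames` and its methods are invented for illustration, and `ProcessHandle.current().pid()` requires Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PidScopedNames {
    private final long pid;
    private final AtomicLong counter;

    // pid and startMillis are passed in explicitly so the behaviour is easy to test.
    public PidScopedNames(long pid, long startMillis) {
        this.pid = pid;
        this.counter = new AtomicLong(startMillis);
    }

    // Convenience factory for the running JVM (assumes Java 9+ for ProcessHandle).
    public static PidScopedNames forCurrentProcess() {
        return new PidScopedNames(ProcessHandle.current().pid(),
                                  System.currentTimeMillis());
    }

    // Names look like "<pid>_<counter>", so two processes that share a
    // starting timestamp still produce different names.
    public String nextName() {
        return pid + "_" + counter.incrementAndGet();
    }

    public static void main(String[] args) {
        PidScopedNames first = new PidScopedNames(4001L, 1000L);
        PidScopedNames second = new PidScopedNames(4002L, 1000L);
        System.out.println(first.nextName());  // 4001_1001
        System.out.println(second.nextName()); // 4002_1001
    }
}
```

This only separates distinct processes on one machine. Two runners inside the same JVM share a PID, and per assumption 3 the thread ID cannot be relied on to tell them apart, so such runners would still need to share a single counter.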

        Attachments

        1. MAPREDUCE-6685.001.patch
          3 kB
          Ray Chiang
        2. MAPREDUCE-6685.002.patch
          3 kB
          Ray Chiang

              People

              • Assignee: Ray Chiang (rchiang)
              • Reporter: Ray Chiang (rchiang)
              • Votes: 0
              • Watchers: 2