  1. Hadoop Common
  2. HADOOP-731

Sometimes when a dfs file is accessed and one copy has a checksum error, the I/O command fails even if another copy is intact.


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.2
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels:
      None

      Description

      For a particular file [alas, the file no longer exists; I had to move on], both

      $dfs -cp foo bar

      and

      $dfs -get foo local

      failed with a checksum error. The dfs browser's download function did retrieve the file, so either that function doesn't verify checksums or, more likely, it read a different copy.

      When a checksum fails on one copy of a redundantly stored file, I would prefer that dfs try a different copy, mark the bad one as not existing [which should eventually cause a fresh copy to be made from one of the good copies], and let the call continue to work and deliver bytes.
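
The requested behavior can be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patch: `ReplicaFailover`, `Replica`, `markedCorrupt`, and `readWithFailover` are hypothetical names, copies are modeled as in-memory byte arrays, and CRC32 stands in for whatever checksum dfs actually records.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch; none of these names come from the Hadoop code base.
final class ReplicaFailover {

    /** One stored copy of the data plus the checksum recorded at write time. */
    static final class Replica {
        final String node;
        final byte[] data;
        final long expectedCrc;
        boolean markedCorrupt = false; // assumed trigger for later re-replication

        Replica(String node, byte[] data, long expectedCrc) {
            this.node = node;
            this.data = data;
            this.expectedCrc = expectedCrc;
        }
    }

    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    /**
     * Returns the bytes of the first copy whose checksum verifies.
     * Copies that fail verification are flagged instead of failing the read.
     */
    static byte[] readWithFailover(List<Replica> replicas) {
        for (Replica r : replicas) {
            if (r.markedCorrupt) {
                continue; // already known to be bad
            }
            if (crc(r.data) == r.expectedCrc) {
                return r.data;
            }
            r.markedCorrupt = true; // remember the bad copy, then try the next one
        }
        // Unchecked wrapper used only to keep the sketch short.
        throw new UncheckedIOException(
            new IOException("all copies failed checksum verification"));
    }

    public static void main(String[] args) {
        byte[] good = "hello dfs".getBytes(StandardCharsets.UTF_8);
        long sum = crc(good);
        byte[] bad = good.clone();
        bad[0] ^= 0x01; // simulate a single flipped bit in one copy

        List<Replica> replicas = Arrays.asList(
            new Replica("node-a", bad, sum),
            new Replica("node-b", good, sum));

        byte[] out = readWithFailover(replicas);
        System.out.println(new String(out, StandardCharsets.UTF_8));
        System.out.println("node-a marked corrupt: " + replicas.get(0).markedCorrupt);
    }
}
```

In this sketch the read only fails when every copy is bad, which is the difference from the behavior reported above.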

      Ideally, if all copies have checksum errors but it's possible to piece together a good copy from them, I would like that to be done.
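
Piecing together a good copy could look something like the sketch below. It assumes that checksums are recorded per fixed-size chunk, that the recorded checksums themselves are trustworthy, and that every copy has the same number of chunks. `ChunkSalvage` and `salvage` are hypothetical names, and CRC32 is again only a stand-in.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch; not taken from the Hadoop code base.
final class ChunkSalvage {

    static long crc(byte[] chunk) {
        CRC32 c = new CRC32();
        c.update(chunk, 0, chunk.length);
        return c.getValue();
    }

    /**
     * replicas[r][i] is chunk i as stored in copy r; expected[i] is the
     * checksum recorded for chunk i. For each chunk, the first copy that
     * verifies is used. Returns null if some chunk is bad in every copy.
     */
    static byte[] salvage(byte[][][] replicas, long[] expected) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < expected.length; i++) {
            byte[] goodChunk = null;
            for (byte[][] replica : replicas) {
                if (crc(replica[i]) == expected[i]) {
                    goodChunk = replica[i];
                    break;
                }
            }
            if (goodChunk == null) {
                return null; // this chunk cannot be recovered from any copy
            }
            out.write(goodChunk, 0, goodChunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] c0 = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] c1 = "def".getBytes(StandardCharsets.UTF_8);
        long[] expected = { crc(c0), crc(c1) };

        byte[] badC0 = c0.clone();
        badC0[0] ^= 0x01; // copy 0 has a damaged first chunk
        byte[] badC1 = c1.clone();
        badC1[0] ^= 0x01; // copy 1 has a damaged second chunk

        byte[][][] replicas = { { badC0, c1 }, { c0, badC1 } };
        byte[] result = salvage(replicas, expected);
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

Here neither copy is intact as a whole, yet the read can still succeed because each chunk is good in at least one copy.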

      -dk

        Attachments

        1. hadoop-731-7.patch
          9 kB
          Wendy Chien

            People

            • Assignee: Wendy Chien (wchien)
            • Reporter: Dick King (dking)
