  1. Hadoop Common
  2. HADOOP-3294

distcp leaves empty blocks after successful execution


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.16.3
    • Fix Version/s: 0.16.4
    • Component/s: util
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      I copied around 40 TB between two Hadoop clusters, with distcp running on the source cluster.

      The job was successful, but one destination file was empty because its only block was empty.
      None of the distcp log files mention this file.

      There were a couple of messages in the namenode server log of the destination cluster referencing the file:

      hadoop-xxx-namenode-yyy.log.2008-04-19:2008-04-19 02:19:15,666 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: destinationDir/_distcp_tmp_z0g93p/fileName. blk-9209890281741927376
      hadoop-xxx-namenode-yyy.log.2008-04-19:2008-04-19 02:54:45,820 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on destinationDir/_distcp_tmp_z0g93p/fileName file does not exist.

      distcp should not rely on the user to double-check the result.
      Would it make sense to add a reducer that compares destination file sizes with source file sizes and takes appropriate action on a mismatch?
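
      The size comparison proposed above could look roughly like the following. This is only a sketch of the check itself, not the distcp implementation or the attached patch: the function name `find_size_mismatches` and the two dictionaries (relative path to length in bytes) are hypothetical stand-ins for listings that would have to be collected from the source and destination clusters.

```python
from typing import Dict, List, Tuple


def find_size_mismatches(
    source_sizes: Dict[str, int],
    dest_sizes: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (relative_path, source_size, dest_size) for files whose sizes differ.

    A file that is missing from the destination is reported with dest_size -1.
    """
    mismatches = []
    for path, src_size in sorted(source_sizes.items()):
        dst_size = dest_sizes.get(path, -1)  # -1 marks "not present at destination"
        if dst_size != src_size:
            mismatches.append((path, src_size, dst_size))
    return mismatches


# Hypothetical listings: relative path -> file length in bytes.
source = {"dir/a": 1024, "dir/b": 2048, "dir/c": 512}
dest = {"dir/a": 1024, "dir/b": 0}  # "dir/b" was copied as an empty file

for path, src_size, dst_size in find_size_mismatches(source, dest):
    print(f"{path}: source={src_size} dest={dst_size}")
```

      A check of this kind would flag the empty destination file described in this report, since its length (0) differs from the source length. It compares sizes only, so it would not detect content corruption that preserves the file length.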

      Attachments

        1. 3294_20080423.patch
          1 kB
          Tsz-wo Sze
        2. 3294_20080423b_0.16.patch
          2 kB
          Tsz-wo Sze
        3. 3294_20080423b.patch
          2 kB
          Tsz-wo Sze


          People

            Assignee: Tsz-wo Sze (szetszwo)
            Reporter: Christian Kunz (ckunz)
            Votes: 0
            Watchers: 2

