Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-4403

DFSClient can infer checksum type when not provided by reading first byte

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 2.0.2-alpha
    • 2.0.3-alpha
    • hdfs-client
    • None
    • Reviewed
    • Hide
      The HDFS implementation of getFileChecksum() can now operate correctly against earlier-version datanodes which do not include the checksum type information in their checksum response. The checksum type is automatically inferred by issuing a read of the first byte of each block.
      Show
      The HDFS implementation of getFileChecksum() can now operate correctly against earlier-version datanodes which do not include the checksum type information in their checksum response. The checksum type is automatically inferred by issuing a read of the first byte of each block.

    Description

      HDFS-3177 added the checksum type to OpBlockChecksumResponseProto, but the new protobuf field is optional, with a default of CRC32. This means that this API, when used against an older cluster (like earlier 0.23 releases) will falsely return CRC32 even if that cluster has written files with CRC32C. This can cause issues for distcp, for example.

      Instead of defaulting the protobuf field to CRC32, we can leave it with no default, and if the OpBlockChecksumResponseProto has no checksum type set, the client can send OP_READ_BLOCK to read the first byte of the block, then grab the checksum type out of that response (which has always been present)

      Attachments

        1. hdfs-4403.txt
          13 kB
          Todd Lipcon
        2. hdfs-4403.txt
          13 kB
          Todd Lipcon

        Activity

          People

            tlipcon Todd Lipcon
            tlipcon Todd Lipcon
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: