  1. Hadoop HDFS
  2. HDFS-10214

Checkpoint cannot be done by the Standby NameNode, because a checkpoint may cause DataNode blockReport, blockReceivedAndDeleted, and heartbeat RPC timeouts when the object count is > 100000000.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.5.0, 2.6.4
    • Fix Version/s: None
    • Component/s: ha, namenode
    • Labels: None
    • Environment: 500 DataNodes.

      137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s)

    Description

      Current cluster status:
      137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s).

      During a checkpoint, saving the namespace takes more than 5 minutes.

      DataNode RPCs time out while the namespace is being saved.

      The Standby NameNode then skips those DataNode RPC requests (because the RPC timed out, the DataNode has already closed the socket channel).

      As a result, there are many corrupt files after a failover.

      So the checkpoint should perhaps be done by another component, not by the Standby NameNode.
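
      To illustrate the timing argument above, here is a minimal sketch in Python. It is a toy model, not Hadoop code: it assumes the Standby NameNode cannot serve DataNode RPCs while the namespace is being saved, and the function names and the 60-second RPC timeout are hypothetical values chosen for illustration. Only the 5-minute save duration comes from this report, used here as a lower bound.

```python
# Toy model, not Hadoop code: names and the timeout value are assumptions.

def wait_seconds(arrival_s: float, save_end_s: float) -> float:
    """Seconds a request arriving at arrival_s waits until the save ends."""
    return max(0.0, save_end_s - arrival_s)


def times_out(arrival_s: float, save_end_s: float, rpc_timeout_s: float) -> bool:
    """True if the caller gives up before the NameNode can serve the request."""
    return wait_seconds(arrival_s, save_end_s) > rpc_timeout_s


SAVE_NAMESPACE_S = 300.0  # "more than 5 min" in the report; 300 s is a lower bound
RPC_TIMEOUT_S = 60.0      # assumed client-side timeout, not taken from the report

# A request sent when the save starts waits the full 300 s and times out.
print(times_out(0.0, SAVE_NAMESPACE_S, RPC_TIMEOUT_S))    # True
# A request sent 250 s into the save waits only 50 s and is still served.
print(times_out(250.0, SAVE_NAMESPACE_S, RPC_TIMEOUT_S))  # False
```

      Under these assumptions, any blockReport, blockReceivedAndDeleted, or heartbeat call that arrives more than one timeout before the save finishes is abandoned by the DataNode, which matches the skipped requests described above.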

            People

              Assignee: Unassigned
              Reporter: ChenFolin
              Votes: 0
              Watchers: 6

                Time Tracking

                  Original Estimate: 672h
                  Remaining Estimate: 672h
                  Time Spent: Not Specified