  1. Hadoop HDFS
  2. HDFS-14603 Über-JIRA: HDFS RBF stabilization phase II
  3. HDFS-15079

RBF: Client maybe get an unexpected result with network anomaly


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.4.0
    • Component/s: rbf

    Description

      I found a critical problem in RBF. HDFS-15078 resolves it in some scenarios, but I have no idea for an overall resolution.
      The problem is as follows:

      1. A client configured with two routers (r0, r1) creates an HDFS file via r0, gets an exception, and fails over to r1.
      2. r0 has already sent the create RPC to the NameNode (1st create).
      3. The client creates the HDFS file via r1 (2nd create).
      4. The client writes the HDFS file and finally closes it (3rd close).

      The NameNode may receive the RPCs in the following order:

      1. 2nd create
      2. 3rd close
      3. 1st create

      Because overwrite is true by default, the late 1st create replaces the file that was just written with an empty file. This is a critical problem.
      We have encountered this problem ourselves. Many Hive and Spark jobs run on our cluster, and it occurs occasionally.
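
The reordering described above can be reproduced with a minimal sketch. It uses a toy in-memory namespace, not the real NameNode or Router code; the class name `ReorderedCreateSketch`, the path, and the 1024-byte length are made up for illustration. The only assumption is that create with overwrite=true replaces an existing file with an empty one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReorderedCreateSketch {

    /** Toy stand-in for the NameNode namespace: path -> file length in bytes. */
    static final class ToyNamespace {
        private final Map<String, Long> files = new HashMap<>();

        /** Models create(): with overwrite=true an existing file is replaced by an empty one. */
        void create(String path, boolean overwrite) {
            if (files.containsKey(path) && !overwrite) {
                throw new IllegalStateException("File already exists: " + path);
            }
            files.put(path, 0L);
        }

        /** Models the client finishing its write and closing the file at the given length. */
        void close(String path, long length) {
            files.put(path, length);
        }

        long length(String path) {
            return files.get(path);
        }
    }

    /** Applies the RPCs in the given arrival order and returns the final file length. */
    static long replay(List<String> arrivalOrder, boolean overwrite) {
        ToyNamespace nn = new ToyNamespace();
        String path = "/tmp/example"; // hypothetical path
        for (String rpc : arrivalOrder) {
            switch (rpc) {
                case "1st create": // sent by r0 before the client failed over
                case "2nd create": // sent by r1 after the failover
                    nn.create(path, overwrite);
                    break;
                case "3rd close":
                    nn.close(path, 1024L); // arbitrary amount of written data
                    break;
                default:
                    throw new IllegalArgumentException("Unknown RPC: " + rpc);
            }
        }
        return nn.length(path);
    }

    public static void main(String[] args) {
        long inOrder = replay(List.of("1st create", "2nd create", "3rd close"), true);
        long reordered = replay(List.of("2nd create", "3rd close", "1st create"), true);
        System.out.println("in order:  " + inOrder + " bytes");
        System.out.println("reordered: " + reordered + " bytes");
    }
}
```

In this toy model the in-order sequence leaves a 1024-byte file, while the reordered sequence leaves a 0-byte file. With overwrite=false the late 1st create is rejected instead of truncating the file.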

      Attachments

        1. HDFS-15079.001.patch
          37 kB
          Hui Fei
        2. HDFS-15079.002.patch
          49 kB
          Hui Fei
        3. UnexpectedOverWriteUT.patch
          11 kB
          Hui Fei


          People

            Assignee: ZanderXu (xuzq_zander)
            Reporter: Hui Fei (ferhui)
            Votes: 0
            Watchers: 11

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 2h 40m
