Hadoop HDFS / HDFS-265

Revisit append

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None

    Description

      HADOOP-1700 and related issues put a lot of effort into providing the first implementation of append. However, append is a complex feature, and some issues that initially seemed trivial turned out to need careful design. This jira revisits append, aiming for a design and implementation that support semantics acceptable to its users.

      Attachments

        1. a.sh
          1 kB
          Tsz-wo Sze
        2. appendDesign.pdf
          408 kB
          Konstantin Shvachko
        3. appendDesign.pdf
          580 kB
          Hairong Kuang
        4. appendDesign1.pdf
          624 kB
          Hairong Kuang
        5. appendDesign2.pdf
          868 kB
          Hairong Kuang
        6. appendDesign3.pdf
          869 kB
          Hairong Kuang
        7. AppendSpec.pdf
          48 kB
          Hairong Kuang
        8. AppendTestPlan.html
          64 kB
          Konstantin I Boudnik
        9. AppendTestPlan.html
          65 kB
          Konstantin I Boudnik
        10. AppendTestPlan.html
          65 kB
          Konstantin I Boudnik
        11. AppendTestPlan.html
          65 kB
          Konstantin I Boudnik
        12. AppendTestPlan.html
          65 kB
          Konstantin I Boudnik
        13. AppendTestPlan.html
          63 kB
          Konstantin I Boudnik
        14. AppendTestPlan.html
          62 kB
          Konstantin I Boudnik
        15. AppendTestPlan.html
          62 kB
          Konstantin I Boudnik
        16. AppendTestPlan.html
          55 kB
          Konstantin I Boudnik
        17. TestPlanAppend.html
          50 kB
          Konstantin I Boudnik

        Issue Links

        1. Factor out BlockInfo from BlocksMap (Sub-task, Closed, Konstantin Shvachko)
        2. Redesign DataNode volumeMap to include all types of Replicas (Sub-task, Resolved, Hairong Kuang)
        3. Introduce BlockInfoUnderConstruction to reflect block replica states while writing. (Sub-task, Resolved, Konstantin Shvachko)
        4. Create new tests for Append's hflush (Sub-task, Resolved, Konstantin I Boudnik)
        5. Create new tests for lease recovery (Sub-task, Closed, Konstantin I Boudnik)
        6. Create new tests for block recovery (Sub-task, Closed, Hairong Kuang)
        7. Create new tests for pipeline (Sub-task, Closed, Konstantin I Boudnik)
        8. Create stress tests for append feature (Sub-task, Resolved, Unassigned)
        9. Support hflush at DFSClient (Sub-task, Resolved, Hairong Kuang)
        10. DataNode uses ReplicaBeingWritten to support dfs writes/hflush (Sub-task, Resolved, Hairong Kuang)
        11. Break FSDatasetInterface#writeToBlock() into writeToTemporary, writeToRBW, and append (Sub-task, Resolved, Hairong Kuang)
        12. Add a "rbw" sub directory to DataNode data directory (Sub-task, Resolved, Hairong Kuang)
        13. DataNode restarts may introduce corrupt/duplicated/lost replicas when handling detached replicas (Sub-task, Resolved, Hairong Kuang)
        14. Add a test for NameNode.getBlockLocations(..) to check read from un-closed file. (Sub-task, Resolved, Tsz-wo Sze)
        15. Introduce block committing logic during new block allocation and file close. (Sub-task, Resolved, Konstantin Shvachko)
        16. When opening a file for read, make the file length available to client. (Sub-task, Resolved, Tsz-wo Sze)
        17. Extend Block report to include under-construction replicas (Sub-task, Resolved, Konstantin Shvachko)
        18. Datanode should serve up to visible length of a replica for read requests (Sub-task, Resolved, Tsz-wo Sze)
        19. Change block write protocol to support pipeline recovery (Sub-task, Resolved, Hairong Kuang)
        20. Create new functional test for a block report. (Sub-task, Closed, Konstantin I Boudnik)
        21. Allow client to get a new generation stamp from NameNode (Sub-task, Resolved, Hairong Kuang)
        22. Block report processing for append (Sub-task, Resolved, Konstantin Shvachko)
        23. Create functional tests for new design of the block report (Sub-task, Closed, Konstantin I Boudnik)
        24. Support replica recovery initialization in datanode (Sub-task, Resolved, Tsz-wo Sze)
        25. Client support pipeline recovery (Sub-task, Resolved, Hairong Kuang)
        26. Support replica update in datanode (Sub-task, Resolved, Tsz-wo Sze)
        27. SafeMode should count only complete blocks. (Sub-task, Resolved, Konstantin Shvachko)
        28. Support pipeline close and close recovery (Sub-task, Resolved, Hairong Kuang)
        29. Lease recovery, concurrency support. (Sub-task, Resolved, Konstantin Shvachko)
        30. Replace BlockInfo.isUnderConstruction() with isComplete() (Sub-task, Resolved, Konstantin Shvachko)
        31. Simplify the codes in the replica related classes (Sub-task, Resolved, Unassigned)
        32. Remove unused legacy protocol methods. (Sub-task, Resolved, Konstantin Shvachko)
        33. Block recovery for primary data-node (Sub-task, Resolved, Konstantin Shvachko)
        34. DFSClient cannot read all the available bytes (Sub-task, Resolved, Tsz-wo Sze)
        35. Remove deprecated methods from InterDatanodeProtocol. (Sub-task, Closed, Konstantin Shvachko)
        36. Data-node upgrade problem (Sub-task, Resolved, Hairong Kuang)
        37. Unnecessary info message from DFSClient (Sub-task, Resolved, Hairong Kuang)
        38. DFSIO for append (Sub-task, Closed, Konstantin Shvachko)
        39. TestFileAppend2 sometimes hangs (Sub-task, Resolved, Hairong Kuang)
        40. TestFileAppend3#TC7 sometimes hangs (Sub-task, Closed, Hairong Kuang)
        41. NPE in FSDataset.updateReplicaUnderRecovery(..) (Sub-task, Closed, Konstantin Shvachko)
        42. Create block recovery tests that handle errors (Sub-task, Closed, Hairong Kuang)

          People

            Assignee: Hairong Kuang
            Reporter: Hairong Kuang
            Votes: 5
            Watchers: 54
