  1. Hadoop HDFS
  2. HDFS-8031

Follow-on work for erasure coding phase I (striping layout)

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

        Issue Links

        1. Block Readers and Writers used in both client side and datanode side | Sub-task | Resolved | Li Bo
        2. Erasure Coding: unifying common constructs like coding work, block reader and block writer across client and DataNode | Sub-task | Resolved | Li Bo
        3. WebHDFS: Support EC commands through webhdfs | Sub-task | Resolved | Uma Maheswara Rao G
        4. Handle hflush and hsync in the best optimal way possible during online Erasure encoding | Sub-task | Resolved | Vinayakumar B
        5. Erasure Coding: Add more EC zone management APIs (get/list EC zone(s)) | Sub-task | Resolved | Yi Liu
        6. Allow to configure the system default EC schema | Sub-task | Resolved | Kai Zheng
        7. ECManager should be able to manage multiple ECSchemas | Sub-task | Resolved | Unassigned
        8. Erasure Coding: Persist erasure coding policies in NameNode | Sub-task | Resolved | Sammi Chen
        9. Erasure coding: NameNode manages multiple erasure coding policies | Sub-task | Resolved | Rui Li
        10. [umbrella] Adding metrics for Erasure Coding | Sub-task | Resolved | Li Bo
        11. ECSchema supports for offline EditsVisitor over an OEV XML file | Sub-task | Resolved | Xinwei Qin
        12. Erasure Coding: Expose refreshECSchemas command to reload predefined schemas | Sub-task | Resolved | Rakesh R
        13. Erasure coding: revisit how to store EC schema and cellSize in NameNode | Sub-task | Resolved | Yi Liu
        14. Add MODIFY and REMOVE ECSchema editlog operations | Sub-task | Resolved | Xinwei Qin
        15. Erasure Coding: local and remote block reader for coding work in DataNode | Sub-task | Resolved | Zhe Zhang
        16. Erasure Coding: Create FileStatus isErasureCoded() method | Sub-task | Resolved | Rakesh R
        17. Erasure Coding: Update last cellsize calculation according to whether the erasure codec has chunk boundary | Sub-task | Resolved | Yi Liu
        18. Erasure Coding: local and remote block writer for coding work in DataNode | Sub-task | Resolved | Li Bo
        19. [umbrella] Erasure Coding worker and support in DataNode | Sub-task | Resolved | Li Bo
        20. Erasure Coding: Correctly calculate last striped block length in DFSStripedInputStream if it's under construction. | Sub-task | Resolved | Yi Liu
        21. Add computation time metrics to datanode for ECWorker | Sub-task | Resolved | Sammi Chen
        22. Add bytes count metrics to datanode for ECWorker | Sub-task | Resolved | Sammi Chen
        23. Erasure Coding: Allow concat striped files if they have the same ErasureCodingPolicy | Sub-task | Resolved | Walter Su
        24. Add tasks count metrics to datanode for ECWorker | Sub-task | Resolved | Li Bo
        25. Ec files can't be deleted into Trash because of that Trash isn't EC zone. | Sub-task | Resolved | Brahma Reddy Battula
        26. Relax permission checking for EC related operations | Sub-task | Resolved | Andrew Wang
        27. Merge HDFS-8227 into EC branch | Sub-task | Resolved | Haohui Mai
        28. Create EC zone should not need superuser privilege | Sub-task | Resolved | Yong Zhang
        29. Erasure Coding: optimize client writing by making the writing of data and parity concurrently | Sub-task | Resolved | Li Bo
        30. Add blocks count metrics to datanode for ECWorker | Sub-task | Resolved | Li Bo
        31. Erasure Coding: cache ErasureCodingZone | Sub-task | Resolved | Walter Su
        32. Remove hard-coded chunk size in favor of ECZone | Sub-task | Resolved | Kai Sasaki
        33. Remove hard-coded values in favor of EC schema | Sub-task | Resolved | Kai Sasaki
        34. Erasure Coding: update invalidateBlock(..) logic for striped block | Sub-task | Resolved | Walter Su
        35. Erasure Coding: use thread pool for StripedDataStreamer | Sub-task | Resolved | Rui Gao
        36. Erasure Coding: revisit buffer used for encoding and decoding. | Sub-task | Resolved | Sammi Chen
        37. Erasure Coding: Add EC-related Metrics to NN (separate striped blocks count from UnderReplicatedBlocks count) | Sub-task | Resolved | Manoj Govindassamy
        38. Erasure coding: use simple replication for internal blocks on decommissioning datanodes | Sub-task | Resolved | Rakesh R
        39. Erasure Coding: Revisit the long and int datatypes usage in striping logic | Sub-task | Resolved | Rakesh R
        40. Erasure Coding: block group ID displayed in WebUI is not consistent with fsck | Sub-task | Resolved | Unassigned
        41. Remove the use of hard-coded cell size value in balancer Dispatcher | Sub-task | Resolved | Walter Su
        42. Erasure coding: merge HDFS-8499 to EC branch and refactor BlockInfoStriped | Sub-task | Resolved | Zhe Zhang
        43. Use ByteBuffer in striping positional read | Sub-task | Resolved | Sammi Chen
        44. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) | Sub-task | Resolved | Kai Zheng
        45. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping write | Sub-task | Resolved | Kai Zheng
        46. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping recovery on DataNode side | Sub-task | Resolved | Kai Zheng
        47. Refactor DFSInputStream#ReaderStrategy | Sub-task | Resolved | Sammi Chen
        48. Erasure Coding: the log of each streamer should show its index | Sub-task | Resolved | Li Bo
        49. Erasure coding: a comprehensive I/O throughput benchmark tool | Sub-task | Resolved | Rui Li
        50. Tolerate multiple failures in DFSStripedOutputStream | Sub-task | Resolved | Walter Su
        51. Erasure Coding: client fails to write large file when one datanode fails | Sub-task | Resolved | Li Bo
        52. Erasure Coding: add tests for taking snapshots on EC files | Sub-task | Resolved | Rakesh R
        53. Simplify Erasure Coding Zone DiskSpace quota exceeded exception error message | Sub-task | Resolved | Rui Gao
        54. Erasure Coding: cover more test situations of datanode failure during client writing | Sub-task | Resolved | Li Bo
        55. Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client | Sub-task | Resolved | Zhe Zhang
        56. Erasure Coding: Lease recovery for striped file | Sub-task | Resolved | Walter Su
        57. Add InterfaceAudience annotation to the erasure coding classes | Sub-task | Resolved | Rakesh R
        58. Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers | Sub-task | Resolved | Jing Zhao
        59. Inconsistent default value of dfs.datanode.stripedread.buffer.size | Sub-task | Resolved | Walter Su
        60. Erasure coding: Add apache license header in TestFileStatusWithECPolicy.java | Sub-task | Resolved | Surendra Singh Lilhore
        61. Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group | Sub-task | Resolved | Rakesh R
        62. Wait previous ErasureCodingWork to finish before schedule another one | Sub-task | Resolved | Walter Su
        63. Erasure coding: client should update and commit block based on acknowledged size | Sub-task | Resolved | Sammi Chen
        64. Erasure coding: friendly log information for write operations with some failed streamers | Sub-task | Resolved | Li Bo
        65. Erasure coding: updateBlockForPipeline sometimes returns non-striped block for striped file | Sub-task | Resolved | Unassigned
        66. Erasure coding: some EC tests are missing timeout | Sub-task | Resolved | Rui Gao
        67. Erasure Coding: DFS GetErasureCodingPolicy API on a non-existent file should be handled properly | Sub-task | Resolved | Rakesh R
        68. Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) | Sub-task | Resolved | Rui Gao
        69. Erasure coding: an erasure codec throughput benchmark tool | Sub-task | Resolved | Unassigned
        70. Use byte array for internal block indices in a striped block | Sub-task | Resolved | Jing Zhao
        71. Erasure Coding: Wrong limit setting of target ByteBuffer | Sub-task | Resolved | Kai Sasaki
        72. Move ErasureCodingPolicyManager to FSDirectory | Sub-task | Resolved | Walter Su
        73. getListing wrongly associates Erasure Coding policy to pre-existing replicated files under an EC directory | Sub-task | Resolved | Jing Zhao
        74. ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block | Sub-task | Resolved | Jing Zhao
        75. Erasure Coding: allow to use multiple EC policies in striping related tests | Sub-task | Resolved | Rui Li
        76. Make existing DFSClient#getFileChecksum() work for striped blocks | Sub-task | Resolved | Kai Zheng
        77. Refactoring ErasureCodingWorker into smaller reusable constructs | Sub-task | Resolved | Kai Zheng
        78. Erasure Coding: allow to use multiple EC policies in striping related tests [Part 2] | Sub-task | Resolved | Rui Li
        79. Correctly update DataNode's scheduled block size when writing small EC file | Sub-task | Resolved | Jing Zhao
        80. Streamer threads may leak if failure happens when closing the striped outputstream | Sub-task | Resolved | Jing Zhao
        81. Erasure Coding: allow to use multiple EC policies in striping related tests [Part 3] | Sub-task | Resolved | Rui Li
        82. Correctly handle EC reconstruction work caused by not enough racks | Sub-task | Resolved | Jing Zhao
        83. Erasure Coding: Postpone the recovery work for a configurable time period | Sub-task | Resolved | Li Bo
        84. Erasure Coding: Improve few exception handling logic of ErasureCodingWorker | Sub-task | Resolved | Rakesh R
        85. Erasure Coding: Improve exception handling in ErasureCodingWorker#ReconstructAndTransferBlock | Sub-task | Resolved | Yiqun Lin
        86. Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data | Sub-task | Resolved | Rakesh R
        87. BlockManager#countNodes should be able to detect duplicated internal blocks | Sub-task | Resolved | Jing Zhao
        88. BlockManager#chooseExcessReplicasStriped may weaken rack fault tolerance | Sub-task | Resolved | Jing Zhao
        89. Missing block exception should carry locatedBlocks information | Sub-task | Resolved | Mingliang Liu
        90. shouldProcessOverReplicated should not count number of pending replicas | Sub-task | Resolved | Jing Zhao
        91. Erasure Coding: Avoids scheduling multiple reconstruction tasks for a striped block at the same time | Sub-task | Open | Sammi Chen
        92. Erasure Coding: Sort located striped blocks based on decommissioned states | Sub-task | Resolved | Rakesh R
        93. Erasure Coding: support small cluster whose #DataNode < # (Blocks in a BlockGroup) | Sub-task | Resolved | Li Bo
        94. StripedFileTestUtil#readAll flaky | Sub-task | Resolved | Mingliang Liu
        95. Fix intermittent test failure of TestDataNodeErasureCodingMetrics | Sub-task | Resolved | Rakesh R
        96. Erasure Coding: Recompute block checksum for a particular range less than file size on the fly by reconstructing missed block | Sub-task | Resolved | Rakesh R
        97. Allow only suitable storage policies to be set on striped files | Sub-task | Resolved | Uma Maheswara Rao G
        98. BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy | Sub-task | Resolved | Manoj Govindassamy
        99. Add EC policy and storage policy related usage summarization function to dfs du command | Sub-task | Resolved | Sammi Chen
        100. Erasure Coding: Document about the current allowed storage policies for EC Striped mode files | Sub-task | Resolved | Uma Maheswara Rao G
        101. Erasure Coding: Add removeErasureCodingPolicy API | Sub-task | Resolved | Unassigned
        102. Correctly report missing EC blocks in FSCK | Sub-task | Resolved | Takanobu Asanuma
        103. When there are unrecoverable ec block groups, Namenode Web UI shows "There are X missing blocks." but doesn't show the block names. | Sub-task | Resolved | Takanobu Asanuma
        104. FBR processing may generate incorrect reportedBlock-blockGroup mapping | Sub-task | Resolved | Jing Zhao
        105. Switch from "raw" to "system" xattr namespace for erasure coding policy | Sub-task | Resolved | Andrew Wang
        106. BlockManager#isInNewRack should consider decommissioning nodes | Sub-task | Resolved | Jing Zhao
        107. Distcp should not copy replication factor if source file is erasure coded | Sub-task | Resolved | Manoj Govindassamy
        108. fsck -list-corruptfileblocks does not report corrupt EC files | Sub-task | Resolved | Takanobu Asanuma
        109. Report erasure coding policy of EC files in Fsck | Sub-task | Resolved | Wei-Chiu Chuang
        110. OIV tool should make an EC file explicit | Sub-task | Resolved | Manoj Govindassamy
        111. Ability to specify per-file EC policy at create time | Sub-task | Resolved | Sammi Chen
        112. Support an XOR policy XOR-2-1-64k in HDFS | Sub-task | Resolved | Sammi Chen
        113. Introduce separate stats for Replicated and Erasure Coded Blocks apart from the current Aggregated stats | Sub-task | Resolved | Manoj Govindassamy
        114. Correct typos in native erasure coding dump code | Sub-task | Resolved | László Bence Nagy
        115. Improve test coverage for ISA-L native coder | Sub-task | Open | Huafeng Wang
        116. Add ability to unset and change directory EC policy | Sub-task | Resolved | Sammi Chen
        117. Provide replicated EC policy to replicate files | Sub-task | Resolved | Sammi Chen
        118. Document dfs.client.read.striped configuration in hdfs-default.xml | Sub-task | Resolved | Rakesh R
        119. Add assertions to BlockInfo#addStorage to protect from breaking reportedBlock-blockGroup mapping | Sub-task | Resolved | Takanobu Asanuma
        120. Report blockIds of internal blocks for EC files in Fsck | Sub-task | Resolved | Takanobu Asanuma
        121. Support an erasure coding policy using RS 10 + 4 | Sub-task | Resolved | Wei Zhou
        122. Enforce set of enabled EC policies on the NameNode | Sub-task | Resolved | Andrew Wang
        123. Call RawErasureEncoder and RawErasureDecoder release() methods | Sub-task | Resolved | Sammi Chen
        124. Erasure Coding: Support Parity Blocks placement onto same nodes hosting Data Blocks when DataNodes are insufficient | Sub-task | Resolved | Manoj Govindassamy
        125. Support ErasureCoding section in OIV XML/ReverseXML | Sub-task | Resolved | Huafeng Wang
        126. Add javadoc for storage policy and erasure coding policy | Sub-task | Resolved | Kai Sasaki
        127. Inotify should support erasure coding policy op as replica meta change | Sub-task | Resolved | Huafeng Wang

            People

            • Assignee: Unassigned
            • Reporter: Zhe Zhang (zhz)
            • Votes: 2
            • Watchers: 38
