Apache Ozone / HDDS-3816

Erasure Coding

Details

    Description

      We propose implementing Erasure Coding (EC) in Apache Ozone to provide storage-efficient durability. With EC in place, Ozone can offer the same or better fault tolerance than replication while saving 50% or more storage space.
      The HDFS project already has native (ISA-L) and Java codecs implemented; we can leverage the same or a similar codec design.
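
      The storage-saving claim above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes 3-way replication as the baseline and Reed-Solomon style schemes with d data units and p parity units, such as RS(6,3) and RS(10,4). Neither the baseline nor these schemes is named in this issue, and the function names are hypothetical.

```python
from fractions import Fraction


def replication_cost(copies):
    """Return (raw bytes stored per logical byte, replica losses tolerated)."""
    return Fraction(copies), copies - 1


def ec_cost(data_units, parity_units):
    """Return (raw bytes stored per logical byte, unit losses tolerated)."""
    return Fraction(data_units + parity_units, data_units), parity_units


def saving_vs_replication(copies, data_units, parity_units):
    """Fraction of raw storage saved by EC relative to N-way replication."""
    rep_raw, _ = replication_cost(copies)
    ec_raw, _ = ec_cost(data_units, parity_units)
    return 1 - ec_raw / rep_raw


if __name__ == "__main__":
    # Assumed example schemes; the issue itself does not specify any.
    for d, p in [(6, 3), (10, 4)]:
        raw, tolerated = ec_cost(d, p)
        saving = saving_vs_replication(3, d, p)
        print(f"RS({d},{p}): {float(raw):.2f}x raw storage, "
              f"tolerates {tolerated} lost units, "
              f"saves {float(saving):.0%} vs 3x replication")
```

      Under these assumptions, RS(6,3) stores 1.5 raw bytes per logical byte and survives three lost units, compared with 3 raw bytes and two lost replicas for 3-way replication, which is a 50% saving.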

      However, the critical part, the EC data layout design, is still in progress; we will post the design doc soon.

      Also see HDDS-5351, which tracks a number of prerequisites for the EC feature that were committed directly to the master branch.

      Attachments

        1. Ozone EC v3.pdf
          1.16 MB
          Marton Elek
        2. Ozone EC Container groups and instances.pdf
          205 kB
          Marton Elek
        3. EC-read-write-path.pdf
          170 kB
          Marton Elek
        4. Apache Ozone Erasure Coding-V2.pdf
          294 kB
          Uma Maheswara Rao G
        5. Erasure Coding in Apache Hadoop Ozone.pdf
          375 kB
          Marton Elek

        Issue Links

          #. Summary | Type | Status | Assignee
          1. Introduce the ReplicationConfig and modify the proto files | Sub-task | Closed | Marton Elek
          2. Persist replicationIndex on datanode side | Sub-task | Resolved | Marton Elek
          3. Enhance SCMServerProtocol with using ReplicationConfig | Sub-task | Closed | Marton Elek
          4. EC: allocateBlock API of OM->SCM should take the option EC policy/parameters, which can be used by SCM to select the pipeline provider | Sub-task | Closed | Uma Maheswara Rao G
          5. EC: Add replicaIndex to the RPC protocols | Sub-task | Resolved | Marton Elek
          6. EC: Introduce EC replication type | Sub-task | Resolved | Marton Elek
          7. EC: Implement basic EC pipeline provider | Sub-task | Resolved | Stephen O'Donnell
          8. EC: Create ECReplicationConfig on client side based on input string | Sub-task | Resolved | Marton Elek
          9. EC: Implement the ECKeyOutputStream which should handle the EC mode writes | Sub-task | Resolved | Uma Maheswara Rao G
          10. EC: Add ECReplicationConfig into KeyInfo proto | Sub-task | Resolved | Uma Maheswara Rao G
          11. EC: Extend PipelineManager.createPipeline API to support excluded nodes | Sub-task | Resolved | Stephen O'Donnell
          12. EC: Allow EC blocks to be requests from OM | Sub-task | Resolved | Stephen O'Donnell
          13. EC: Add missing break in switch statement when requesting EC blocks | Sub-task | Resolved | Stephen O'Donnell
          14. EC: Add configuration to set an EC container placement Policy | Sub-task | Resolved | Janus Chow
          15. allocateContainer should handle ec replication config | Sub-task | Resolved | Uma Maheswara Rao G
          16. ContainerStateMap should handle ecReplication config map | Sub-task | Resolved | Stephen O'Donnell
          17. EC: ECReplicationConfig should be immutable | Sub-task | Resolved | István Fajth
          18. EC: ContainerPlacementPolicyFactory#getPolicyInternal() should not be public | Sub-task | Resolved | István Fajth
          19. OMKeyRequest#createFileInfo should handle ECReplicationConfig | Sub-task | Resolved | Uma Maheswara Rao G
          20. EC: Implement ECBlockInputStream to read a single EC Block Group. | Sub-task | Resolved | Stephen O'Donnell
          21. EC: openKey/createFile should return whether the file created in EC mode | Sub-task | Resolved | Uma Maheswara Rao G
          22. EC: Make ECReplicationConfig stored as bucket level attributes. (Example Encryption info) | Sub-task | Resolved | Uma Maheswara Rao G
          23. EC: Create a new as many racks as possible placement policy for EC | Sub-task | Resolved | Janus Chow
          24. EC: Add padding and generate parity if the last stripe is not full | Sub-task | Resolved | Uma Maheswara Rao G
          25. EC: Pipeline builder should copy replica Indexes from original pipeline | Sub-task | Resolved | Stephen O'Donnell
          26. EC: commit key should consolidate and create one keyLocationInfo per blockGrp | Sub-task | Resolved | Uma Maheswara Rao G
          27. EC: Provide replication config option from CLI when creating bucket. | Sub-task | Resolved | Uma Maheswara Rao G
          28. EC: Resolve findbugs warnings after branch merge | Sub-task | Resolved | Stephen O'Donnell
          29. EC: ECBlockOutputstream commitKey should create one keyLocationInfo per logical block | Sub-task | Resolved | Stephen O'Donnell
          30. EC: Adapt KeyInputStream to read EC Blocks | Sub-task | Resolved | Stephen O'Donnell
          31. EC: Add Codec and chunkSize to ECReplicationConfig | Sub-task | Resolved | Stephen O'Donnell
          32. EC: ECKeyoutputStream should use codec and chunksize from ECReplicationConfig | Sub-task | Resolved | Uma Maheswara Rao G
          33. EC: We should improve the behavior of BasicRootedOzoneClientAdapterImpl#getDefaultReplication with defaults configs moved to server. | Sub-task | Resolved | Uma Maheswara Rao G
          34. EC: In BasicRootedOzoneClientAdapterImpl, Inherit bucket default replication config only in the case of EC. | Sub-task | Resolved | Uma Maheswara Rao G
          35. EC: Remove hard coded chunksize and get from from ReplicationConfig | Sub-task | Resolved | Stephen O'Donnell
          36. EC: Writing a large buffer to an EC file duplicates first chunk in block 1 and 2 | Sub-task | Resolved | Uma Maheswara Rao G
          37. EC: Fix TestRootedOzoneFileSystem.testBucketDefaultsShouldBeInheritedToFileForEC failure in branch | Sub-task | Resolved | Uma Maheswara Rao G
          38. EC: ECKeyOutputStream#close fails if we write the partial chunk | Sub-task | Resolved | Uma Maheswara Rao G
          39. EC: ECKeyOutputStream persists blocks in random order | Sub-task | Resolved | Stephen O'Donnell
          40. EC: Implement seek on ECKeyInputStream | Sub-task | Resolved | Stephen O'Donnell
          41. EC: Implement seek on ECBlockInputStream | Sub-task | Resolved | Stephen O'Donnell
          42. EC: Adopt EC related utility from Hadoop source repository | Sub-task | Resolved | Unassigned
          43. EC: Refactor ECBlockOutputStreamEntry to accommodate all block group related ECBlockOuputStreams. | Sub-task | Resolved | István Fajth
          44. EC: Write should handle node failures. | Sub-task | Resolved | Uma Maheswara Rao G
          45. EC: Integrate the Codec changes into EC Streams. | Sub-task | Resolved | Uma Maheswara Rao G
          46. EC: Fix the compile issue in TestOzoneECClient (Due to concurrent commits) | Sub-task | Resolved | Uma Maheswara Rao G
          47. EC: Implement an Input Stream to reconstruct EC blocks ondemand | Sub-task | Resolved | Stephen O'Donnell
          48. EC: Implement seek on ECBlockReconstructedStripeInputStream | Sub-task | Resolved | Stephen O'Donnell
          49. EC: Review the current flush API and clean up | Sub-task | Resolved | Uma Maheswara Rao G
          50. EC: ECBlockReconstructedStripeInputStream should handle block read failures and continue reading | Sub-task | Resolved | Stephen O'Donnell
          51. EC: Fix the replication config handling in OMDirectoryCreateRequest#dirKeyInfoBuilderNoACL | Sub-task | Resolved | Uma Maheswara Rao G
          52. EC: Fix TestOmMetrics in merge branch | Sub-task | Resolved | Uma Maheswara Rao G
          53. EC: Create ECBlockReconstructedInputStream to wrap ECBlockReconstructedStripeInputStream | Sub-task | Resolved | Stephen O'Donnell
          54. EC: Fix TestOzoneShellHA failures post master merge with EC branch | Sub-task | Resolved | Uma Maheswara Rao G
          55. EC: Optimize ECBlockReconstructedStripeInputStream where there are no missing data indexes. | Sub-task | Resolved | Stephen O'Donnell
          56. EC: Change CLI bucket default replication option name to "-type" | Sub-task | Resolved | Uma Maheswara Rao G
          57. EC: Track the failed servers to add into the excludeList when invoking allocateBlock | Sub-task | Patch Available | Uma Maheswara Rao G
          58. Centralize string based replication config validation via ReplicationConfigValidator | Sub-task | Resolved | István Fajth
          59. EC: Create ECInputStream wrapper to choose between reconstruction and normal reads. | Sub-task | Open | Stephen O'Donnell
          60. EC: Review the TODOs in GRPC Xceiver client and fix them. | Sub-task | Open | István Fajth
          61. EC: Onboard EC into upgrade framework | Sub-task | Open | István Fajth
          62. EC: Support native ISA-L based encoding in Ozone | Sub-task | Open | Marton Elek
          63. EC: CreateBucketHandler should use ReplicationConfig Validator | Sub-task | Open | István Fajth
          64. EC: Enhance EC replication config parsing from string and enable validation of it | Sub-task | Open | István Fajth
          65. EC: WritableEcContainerProvider should dynamically adjust the open container groups | Sub-task | Open | Unassigned
          66. EC: Create reusable buffer pool shared by all EC input and output streams | Sub-task | Open | Unassigned
          67. EC: Provide set replicationConfig option to bucket | Sub-task | Patch Available | Uma Maheswara Rao G
          68. EC: Client side exclude nodes list should expire after certain time period or based on the list size. | Sub-task | Open | Unassigned
          69. EC: ECBlockReconstructedStripeInputStream should read from blocks in parallel | Sub-task | Open | Stephen O'Donnell
          70. EC: handleStripeFailure should retry | Sub-task | Open | Uma Maheswara Rao G
          71. EC: Provide CLI option to reset the bucket replication config | Sub-task | Open | Unassigned

            People

              Assignee: Uma Maheswara Rao G (umamaheswararao)
              Reporter: Uma Maheswara Rao G (umamaheswararao)
              Votes: 1
              Watchers: 31

              Dates

                Created:
                Updated: