  1. Apache Ozone
  2. HDDS-3816

Erasure Coding

Details

    Description

      We propose to implement Erasure Coding (EC) in Apache Ozone to provide efficient storage. With EC in place, Ozone can provide the same or better fault tolerance while saving 50% or more storage space.
      The HDFS project already has native codecs (ISA-L) and Java codecs implemented, so we can leverage the same or a similar codec design.

      However, the critical part of the EC data layout design is still in progress; we will post the design doc soon.

      Also see HDDS-5351, which tracks a number of prerequisites for the EC feature that were committed directly to the master branch.
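
      The storage saving claimed in the description can be checked with simple arithmetic. The sketch below is illustrative only and is not Ozone code: the class and method names are hypothetical, the baseline is assumed to be 3-way replication, and the 6+3 scheme is an assumed example (10+4 is the policy named in one of the sub-tasks). An EC scheme with d data blocks and p parity blocks stores (d + p) / d raw bytes per byte of user data and can lose any p blocks of a group.

```java
/** Compares the raw storage cost of Reed-Solomon style EC schemes with N-way replication. */
public class EcStorageOverhead {

    /** Raw bytes stored per byte of user data for a scheme with the given block counts. */
    public static double rawBytesPerDataByte(int dataBlocks, int parityBlocks) {
        return (double) (dataBlocks + parityBlocks) / dataBlocks;
    }

    /** Fraction of raw storage saved relative to keeping `replicas` full copies. */
    public static double savingsVersusReplication(int dataBlocks, int parityBlocks, int replicas) {
        return 1.0 - rawBytesPerDataByte(dataBlocks, parityBlocks) / replicas;
    }

    public static void main(String[] args) {
        int replicas = 3; // assumed baseline: 3-way replication, which tolerates 2 lost copies
        int[][] schemes = {{6, 3}, {10, 4}}; // {data blocks, parity blocks}
        for (int[] s : schemes) {
            System.out.printf(
                "EC %d+%d: %.2fx raw storage, %.0f%% saved vs %dx replication, tolerates %d lost blocks%n",
                s[0], s[1],
                rawBytesPerDataByte(s[0], s[1]),
                100.0 * savingsVersusReplication(s[0], s[1], replicas),
                replicas,
                s[1]);
        }
    }
}
```

      Under these assumptions a 6+3 layout needs 1.5 raw bytes per data byte instead of 3, which is where a figure of 50% comes from, while tolerating three lost blocks rather than two lost replicas.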

      Attachments

        1. Apache Ozone Erasure Coding-V2.pdf
          294 kB
          Uma Maheswara Rao G
        2. EC-read-write-path.pdf
          170 kB
          Marton Elek
        3. Erasure Coding in Apache Hadoop Ozone.pdf
          375 kB
          Marton Elek
        4. Ozone EC Container groups and instances.pdf
          205 kB
          Marton Elek
        5. Ozone EC v3.pdf
          1.16 MB
          Marton Elek

        Issue Links

          1.
          EC: Introduce the ReplicationConfig and modify the proto files Sub-task Closed Marton Elek
          2.
          EC: Persist replicationIndex on datanode side Sub-task Resolved Marton Elek
          3.
          Enhance SCMServerProtocol with using ReplicationConfig Sub-task Closed Marton Elek
          4.
          EC: allocateBlock API of OM->SCM should take the option EC policy/parameters, which can be used by SCM to select the pipeline provider Sub-task Closed Uma Maheswara Rao G
          5.
          EC: Add replicaIndex to the RPC protocols Sub-task Resolved Marton Elek
          6.
          EC: Introduce EC replication type Sub-task Resolved Marton Elek
          7.
          EC: Implement basic EC pipeline provider Sub-task Resolved Stephen O'Donnell
          8.
          EC: Create ECReplicationConfig on client side based on input string Sub-task Resolved Marton Elek
          9.
          EC: Implement the ECKeyOutputStream which should handle the EC mode writes Sub-task Resolved Uma Maheswara Rao G
          10.
          EC: Add ECReplicationConfig into KeyInfo proto Sub-task Resolved Uma Maheswara Rao G
          11.
          EC: Extend PipelineManager.createPipeline API to support excluded nodes Sub-task Resolved Stephen O'Donnell
          12.
          EC: Allow EC blocks to be requested from OM Sub-task Resolved Stephen O'Donnell
          13.
          EC: Add missing break in switch statement when requesting EC blocks Sub-task Resolved Stephen O'Donnell
          14.
          EC: Add configuration to set an EC container placement Policy Sub-task Resolved Janus Chow
          15.
          EC: allocateContainer should handle ec replication config Sub-task Resolved Uma Maheswara Rao G
          16.
          ContainerStateMap should handle ecReplication config map Sub-task Resolved Stephen O'Donnell
          17.
          EC: ECReplicationConfig should be immutable Sub-task Resolved István Fajth
          18.
          EC: ContainerPlacementPolicyFactory#getPolicyInternal() should not be public Sub-task Resolved István Fajth
          19.
          OMKeyRequest#createFileInfo should handle ECReplicationConfig Sub-task Resolved Uma Maheswara Rao G
          20.
          EC: Implement ECBlockInputStream to read a single EC Block Group. Sub-task Resolved Stephen O'Donnell
          21.
          EC: openKey/createFile should return whether the file created in EC mode Sub-task Resolved Uma Maheswara Rao G
          22.
          EC: Make ECReplicationConfig stored as bucket level attributes. (Example Encryption info) Sub-task Resolved Uma Maheswara Rao G
          23.
          EC: Create a new as many racks as possible placement policy for EC Sub-task Resolved Janus Chow
          24.
          EC: Add padding and generate parity if the last stripe is not full Sub-task Resolved Uma Maheswara Rao G
          25.
          EC: Pipeline builder should copy replica Indexes from original pipeline Sub-task Resolved Stephen O'Donnell
          26.
          EC: commit key should consolidate and create one keyLocationInfo per blockGrp Sub-task Resolved Uma Maheswara Rao G
          27.
          EC: Provide replication config option from CLI when creating bucket. Sub-task Resolved Uma Maheswara Rao G
          28.
          EC: Resolve findbugs warnings after branch merge Sub-task Resolved Stephen O'Donnell
          29.
          EC: ECBlockOutputstream commitKey should create one keyLocationInfo per logical block Sub-task Resolved Stephen O'Donnell
          30.
          EC: Adapt KeyInputStream to read EC Blocks Sub-task Resolved Stephen O'Donnell
          31.
          EC: Add Codec and chunkSize to ECReplicationConfig Sub-task Resolved Stephen O'Donnell
          32.
          EC: ECKeyoutputStream should use codec and chunksize from ECReplicationConfig Sub-task Resolved Uma Maheswara Rao G
          33.
          EC: We should improve the behavior of BasicRootedOzoneClientAdapterImpl#getDefaultReplication with defaults configs moved to server. Sub-task Resolved Uma Maheswara Rao G
          34.
          EC: In BasicRootedOzoneClientAdapterImpl, Inherit bucket default replication config only in the case of EC. Sub-task Resolved Uma Maheswara Rao G
          35.
          EC: Remove hard coded chunksize and get it from ReplicationConfig Sub-task Resolved Stephen O'Donnell
          36.
          EC: Writing a large buffer to an EC file duplicates first chunk in block 1 and 2 Sub-task Resolved Uma Maheswara Rao G
          37.
          EC: Fix TestRootedOzoneFileSystem.testBucketDefaultsShouldBeInheritedToFileForEC failure in branch Sub-task Resolved Uma Maheswara Rao G
          38.
          EC: ECKeyOutputStream#close fails if we write the partial chunk Sub-task Resolved Uma Maheswara Rao G
          39.
          EC: ECKeyOutputStream persists blocks in random order Sub-task Resolved Stephen O'Donnell
          40.
          EC: Implement seek on ECKeyInputStream Sub-task Resolved Stephen O'Donnell
          41.
          EC: Implement seek on ECBlockInputStream Sub-task Resolved Stephen O'Donnell
          42.
          EC: Adopt EC related utility from Hadoop source repository Sub-task Resolved Uma Maheswara Rao G
          43.
          EC: Refactor ECBlockOutputStreamEntry to accommodate all block group related ECBlockOutputStreams. Sub-task Resolved István Fajth
          44.
          EC: Write should handle node failures. Sub-task Resolved Uma Maheswara Rao G
          45.
          EC: Integrate the Codec changes into EC Streams. Sub-task Resolved Uma Maheswara Rao G
          46.
          EC: Fix the compile issue in TestOzoneECClient (Due to concurrent commits) Sub-task Resolved Uma Maheswara Rao G
          47.
          EC: Implement an Input Stream to reconstruct EC blocks on demand Sub-task Resolved Stephen O'Donnell
          48.
          EC: Implement seek on ECBlockReconstructedStripeInputStream Sub-task Resolved Stephen O'Donnell
          49.
          EC: Review the current flush API and clean up Sub-task Resolved Uma Maheswara Rao G
          50.
          EC: ECBlockReconstructedStripeInputStream should handle block read failures and continue reading Sub-task Resolved Stephen O'Donnell
          51.
          EC: Fix the replication config handling in OMDirectoryCreateRequest#dirKeyInfoBuilderNoACL Sub-task Resolved Uma Maheswara Rao G
          52.
          EC: Fix TestOmMetrics in merge branch Sub-task Resolved Uma Maheswara Rao G
          53.
          EC: Create ECBlockReconstructedInputStream to wrap ECBlockReconstructedStripeInputStream Sub-task Resolved Stephen O'Donnell
          54.
          EC: Fix TestOzoneShellHA failures post master merge with EC branch Sub-task Resolved Uma Maheswara Rao G
          55.
          EC: Optimize ECBlockReconstructedStripeInputStream where there are no missing data indexes. Sub-task Resolved Stephen O'Donnell
          56.
          EC: Change CLI bucket default replication option name to "-type" Sub-task Resolved Uma Maheswara Rao G
          57.
          EC: Track the failed servers to add into the excludeList when invoking allocateBlock Sub-task Resolved Uma Maheswara Rao G
          58.
          Centralize string based replication config validation via ReplicationConfigValidator Sub-task Resolved István Fajth
          59.
          EC: Create ECInputStream wrapper to choose between reconstruction and normal reads. Sub-task Resolved Stephen O'Donnell
          60.
          EC: Provide set replicationConfig option to bucket Sub-task Resolved Uma Maheswara Rao G
          61.
          Fix ReplicationConfig related test failures that happened due to merging HDDS-5997 to the EC branch Sub-task Resolved István Fajth
          62.
          EC: handleStripeFailure should retry Sub-task Resolved Uma Maheswara Rao G
          63.
          EC: ECBlockReconstructedStripeInputStream should read from blocks in parallel Sub-task Resolved Stephen O'Donnell
          64.
          EC: Provide CLI option to reset the bucket replication config Sub-task Resolved Uma Maheswara Rao G
          65.
          EC: HandleStripeFailure should not release the cachebuffers. Sub-task Resolved Uma Maheswara Rao G
          66.
          EC: Create reusable buffer pool shared by all EC input and output streams Sub-task Resolved Stephen O'Donnell
          67.
          EC: Client side exclude nodes list should expire after certain time period or based on the list size. Sub-task Resolved Uma Maheswara Rao G
          68.
          EC: Provide toString in ECReplicationConfig Sub-task Resolved Uma Maheswara Rao G
          69.
          EC: Pipeline creator should ignore creating pipelines for ZERO factor Sub-task Resolved Uma Maheswara Rao G
          70.
          EC: Review the TODOs in GRPC Xceiver client and fix them. Sub-task Resolved István Fajth
          71.
          EC: Document the Ozone EC Sub-task Resolved Uma Maheswara Rao G
          72.
          EC: Introduce a gRPC client implementation for EC with really async WriteChunk and PutBlock Sub-task Resolved István Fajth
          73.
          EC: Replication config from bucket should be refreshed in o3fs. Sub-task Resolved Uma Maheswara Rao G
          74.
          EC: put command should create EC key if bucket is EC Sub-task Resolved Uma Maheswara Rao G
          75.
          EC: Bucket does not display correct EC replication details Sub-task Resolved Stephen O'Donnell
          76.
          EC: Fix flakiness of tests around node failures Sub-task Resolved Uma Maheswara Rao G
          77.
          EC: putBlock should pass close flag true on end of block group/close file. Sub-task Resolved Uma Maheswara Rao G
          78.
          EC: Container Info command with json switch fails for EC containers Sub-task Resolved Stephen O'Donnell
          79.
          EC: Smoketest for ozone admin datanode expects exactly 3 nodes Sub-task Resolved Attila Doroszlai
          80.
          EC: Datanode Chunk Validator fails on encountering EC pipeline Sub-task Resolved Attila Doroszlai
          81.
          EC: Create EC acceptance test environment and some basic tests Sub-task Resolved Stephen O'Donnell
          82.
          EC: Replication Manager should skip EC Containers Sub-task Resolved Stephen O'Donnell
          83.
          EC: Parity blocks are incorrectly padded with zeros to the chunk size (#3043) Sub-task Resolved Uma Maheswara Rao G
          84.
          EC: Add replica index to the output in the container info command Sub-task Resolved Stephen O'Donnell
          85.
          EC: Recon UI failing because of EC replication factor Sub-task Resolved Uma Maheswara Rao G
          86.
          EC: Read with stopped but not dead nodes gives IllegalStateException rather than InsufficientNodesException Sub-task Resolved Stephen O'Donnell
          87.
          EC: Pipelines for closed containers should contain correct replica indexes Sub-task Resolved Stephen O'Donnell
          88.
          EC: Fix unaligned stripe write failure due to length overflow. Sub-task Resolved Mark Gui
          89.
          EC: Freon ockg support EC write Sub-task Resolved mingchao zhao
          90.
          EC: Apply fix for HDFS-16422 to the Ozone EC libraries Sub-task Resolved Stephen O'Donnell
          91.
          EC: Fix new checkstyle rule warnings in EC branch Sub-task Resolved Uma Maheswara Rao G
          92.
          EC: Make cluster-wide EC configuration take effect. Sub-task Resolved Kaijie Chen
          93.
          EC: Handle Replication Factor to consider EC Config in Recon UI Sub-task Resolved Uma Maheswara Rao G
          94.
          EC: Fix large write with multiple stripes upon stripe failure. Sub-task Resolved Mark Gui
          95.
          EC: Fix todo items in TestECKeyOutputStream Sub-task Resolved Stephen O'Donnell
          96.
          EC: Adapt java side native coder classes from hadoop. Sub-task Resolved Uma Maheswara Rao G
          97.
          EC: Update help strings for replication config Sub-task Resolved Kaijie Chen
          98.
          EC: Fix the race condition in TestECBlockReconstructedStripeInputStream Sub-task Resolved Uma Maheswara Rao G
          99.
          EC: Freon randomKeys EC key support Sub-task Resolved Kaijie Chen
          100.
          EC: Fix read big file failure with EC policy 10+4. Sub-task Resolved Mark Gui
          101.
          EC: PartialStripe failure handling logic is writing padding bytes also to DNs Sub-task Resolved Uma Maheswara Rao G
          102.
          EC: Do not throw NotImplementedException in flush() Sub-task Resolved Kaijie Chen
          103.
          EC: Refactor ECKeyOutputStream#write() Sub-task Resolved Kaijie Chen
          104.
          EC: Fix allocateBlockIfFull condition in ECKeyOutputStream#write() Sub-task Resolved Kaijie Chen
          105.
          EC: Calculate EC replication correctly when updating bucket usage Sub-task Resolved Stephen O'Donnell
          106.
          EC: Key Info command should not display legacy replication fields as they duplicate ReplicationConfig Sub-task Resolved Stephen O'Donnell
          107.
          EC: Exclude pipeline upon container close instead of exclude DNs. Sub-task Resolved Mark Gui
          108.
          EC: Automatically switch to next OutputStream in ECBlockOutputStreamEntry Sub-task Resolved Kaijie Chen
          109.
          EC: Discard pre-allocated blocks to eliminate worthless retries. Sub-task Resolved Mark Gui
          110.
          EC: Review use of ReplicationConfig.getLegacyFactor() in the codebase Sub-task Resolved Uma Maheswara Rao G
          111.
          EC: add padding problem Sub-task Resolved cchenaxchen
          112.
          EC: Ensure EC container usage is updated correctly when handling reports Sub-task Resolved Stephen O'Donnell
          113.
          EC: EC keys can't be created via S3 interfaces Sub-task Resolved Uma Maheswara Rao G
          114.
          EC: ListPipelines command should consider EC Containers Sub-task Resolved Stephen O'Donnell
          115.
          EC: Fix broken future chain and cleanup unnecessary validation function. Sub-task Resolved Mark Gui
          116.
          EC: Priority of replication config in Ozone FS Sub-task Resolved Kaijie Chen
          117.
          EC: Fix too many idle threads during reconstruct read. Sub-task Resolved Mark Gui
          118.
          EC: Avoid allocating buffers in EC Reconstruction Streams until first read Sub-task Resolved Stephen O'Donnell
          119.
          EC: OzoneManagerRequestHandler needs to handle ECReplicationConfig Sub-task Resolved Stephen O'Donnell
          120.
          EC: Container list command should allow filtering of EC containers Sub-task Resolved Stephen O'Donnell
          121.
          EC: Fix allocateBlock failure due to inaccurate excludedNodes check. Sub-task Resolved Mark Gui
          122.
          EC: Fix number of preAllocatedBlocks for EC keys Sub-task Resolved Kaijie Chen
          123.
          EC: [Refactor-1] Check isFullCell inside handleDataWrite Sub-task Resolved Kaijie Chen
          124.
          EC: Adjust requested size by EC DataNum in WritableECContainerProvider Sub-task Resolved Stephen O'Donnell
          125.
          EC: OmMultipartKeyInfo needs to handle ECReplicationConfig Sub-task Resolved Attila Doroszlai
          126.
          EC: Reconstructed Input Streams should free resources after reading to end of block Sub-task Resolved Stephen O'Donnell
          127.
          EC: Improve exception message in ByteBufferEncodingState Sub-task Resolved cchenaxchen
          128.
          EC: Handle ECReplicationConfig in initiateMultipartUpload API Sub-task Resolved Uma Maheswara Rao G
          129.
          EC: the offset is less than writeoffset Sub-task Resolved cchenaxchen
          130.
          EC: Fix ISA-l load hadoop native lib UnsatisfiedLinkError Sub-task Resolved cchenaxchen
          131.
          EC: OzoneMultipartUpload needs to handle ECReplicationConfig Sub-task Resolved Attila Doroszlai
          132.
          EC: OzoneMultipartUploadPartListParts needs to handle ECReplicationConfig Sub-task Resolved Attila Doroszlai
          133.
          EC: write when the datanode not enough Sub-task Resolved cchenaxchen
          134.
          EC: Overwriting an EC key with a Ratis key fails Sub-task Resolved Uma Maheswara Rao G
          135.
          EC: Execute S3 acceptance tests with EC Sub-task Resolved Attila Doroszlai
          136.
          EC: Unify replication-related CLI params Sub-task Resolved Attila Doroszlai
          137.
          EC: [Forward compatibility issue] New client to older server could fail due to the unavailability for client default replication config Sub-task Resolved István Fajth
          138.
          EC: scm CheckAndRecoverECContainer command Sub-task Resolved Unassigned
          139.
          EC: Onboard EC into upgrade framework Sub-task Resolved István Fajth

            People

              umamaheswararao Uma Maheswara Rao G
              umamaheswararao Uma Maheswara Rao G
              Votes: 1
              Watchers: 39
