Apache Ozone / HDDS-7593

Supporting HSync and lease recovery

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None

    Description

      This is the umbrella Jira covering the design and implementation of the hsync/hflush API and lease recovery support in Ozone. The feature enables new use cases such as HBase and Solr, whose write-ahead logs and transaction logs are flushed to Ozone constantly.

      A design doc will be added shortly.
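
      To illustrate the write pattern this feature targets, the following is a minimal sketch of a write-ahead log that forces each record to stable storage before acknowledging it. It uses only the JDK's FileChannel.force on a local temporary file as a stand-in; on Ozone, an application would presumably call hsync() on a Hadoop FSDataOutputStream instead, which is an assumption about usage rather than something stated in this issue. The class and method names (WalSyncSketch, appendAndSync, writeAndReadBack) are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical sketch: a local-file analogue of the "append, then sync" WAL pattern.
public class WalSyncSketch {

    // Appends one record and forces it to the storage device before returning.
    // FileChannel.force(true) plays the role that hsync() would play on a
    // distributed file system (assumption made for illustration only).
    static void appendAndSync(FileChannel channel, String record) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        channel.force(true);
    }

    // Writes the records to a temporary log and reads them back while the
    // writer is still open, mimicking a reader that sees synced data of an
    // unclosed file.
    static List<String> writeAndReadBack(String... records) throws IOException {
        Path wal = Files.createTempFile("wal-sketch", ".log");
        try (FileChannel channel = FileChannel.open(
                wal, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (String record : records) {
                appendAndSync(channel, record);
            }
            return Files.readAllLines(wal, StandardCharsets.UTF_8);
        } finally {
            // The channel is already closed when this block runs.
            Files.deleteIfExists(wal);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAndReadBack("put k1=v1", "put k2=v2"));
    }
}
```

      Syncing after every record trades throughput for durability; real write-ahead logs often batch several records per sync, and the performance sub-tasks listed under this issue suggest that the cost of each sync matters here as well.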

        Sub-Tasks

          1. Ozone client change to support HSync | Sub-task | Resolved | Tsz-wo Sze
          2. Add support for application to probe output stream capability | Sub-task | Resolved | Wei-Chiu Chuang
          3. Ozone Manager change to support HSync | Sub-task | Resolved | Wei-Chiu Chuang
          4. hsync: Change KeyOutputStream to update length in OM | Sub-task | Resolved | Unassigned
          5. Potential discrepancy of key creation time may cause premature open key clean up | Sub-task | Resolved | Tsz-wo Sze
          6. OM lease recovery for hsync'ed files | Sub-task | Resolved | Tsz-wo Sze
          7. [hsync] Recon throws ClassCastException | Sub-task | Resolved | Arafat Khan
          8. [hsync] Outputstream in encrypted buckets do not return the correct stream capabilities | Sub-task | Resolved | Wei-Chiu Chuang
          9. input stream does not refresh expired block token | Sub-task | Resolved | Wei-Chiu Chuang
          10. Change Get Key Info to return HSync info | Sub-task | Resolved | Wei-Chiu Chuang
          11. Ozone file systems to support Hadoop's PathCapabilities interface | Sub-task | Resolved | Wei-Chiu Chuang
          12. Add hsync metrics in OM | Sub-task | Resolved | Tsz-wo Sze
          13. Implement client initiated lease recovery | Sub-task | Resolved | Wei-Chiu Chuang
          14. Add a flag to disable hsync by default | Sub-task | Resolved | Wei-Chiu Chuang
          15. [hsync] KeyOutputStream is not thread safe | Sub-task | Resolved | Wei-Chiu Chuang
          16. LeaseRecovery failing with NullPointer exception | Sub-task | Resolved | Wei-Chiu Chuang
          17. [hsync] Add a CLI to recover lease | Sub-task | Resolved | Wei-Chiu Chuang
          18. OM crash with NPE in OMKeyCommitRequest due to missing user info | Sub-task | Resolved | Sumit Agrawal
          19. Quota needs to be updated correctly for Hsync | Sub-task | Resolved | Sumit Agrawal
          20. TestHSync is no longer flaky | Sub-task | Resolved | Tsz-wo Sze
          21. [hsync] reject renaming open file | Sub-task | Resolved | Wei-Chiu Chuang
          22. O3fs/ofs to support setTimes() API | Sub-task | Resolved | Wei-Chiu Chuang
          23. [hsync] OMKeyRequest: Detect allocated but uncommitted blocks | Sub-task | Resolved | Wei-Chiu Chuang
          24. Disallow overwriting a hsync'ed key | Sub-task | Resolved | Wei-Chiu Chuang
          25. Support setSafeMode(), isFileClosed() FileSystem API | Sub-task | Resolved | Wei-Chiu Chuang
          26. [hsync] HBase RegionServer input stream not shut down properly | Sub-task | Resolved | Unassigned
          27. OM to reject hsync if ozone.fs.hsync.enabled is false | Sub-task | Resolved | Wei-Chiu Chuang
          28. [hsync] A freon tool to focus on hsync/hflush performance | Sub-task | Resolved | Wei-Chiu Chuang
          29. ozone freon --server is broken by HDDS-6176 | Sub-task | Resolved | Wei-Chiu Chuang
          30. ChunkInputStream should use new token after pipeline refresh | Sub-task | Resolved | Attila Doroszlai
          31. [hsync] File recovery support in OM | Sub-task | Resolved | Sammi Chen
          32. [hsync] File recovery support in Client | Sub-task | Resolved | Sammi Chen
          33. hsync: Interface to retrieve block info and finalize block in DN through ratis | Sub-task | Resolved | Ashish Kumar
          34. [hsync] DataNode to deserialize Ratis transaction only once | Sub-task | Resolved | Tsz-wo Sze
          35. Make recoverLease call idempotent | Sub-task | Resolved | Sammi Chen
          36. [hsync] Make Putblock performance acceptable - Skeleton code | Sub-task | Resolved | Wei-Chiu Chuang
          37. [hsync] Cache serialized block token in output stream to reduce heap consumption | Sub-task | Resolved | Wei-Chiu Chuang
          38. Introduce soft limit support for lease recovery | Sub-task | Resolved | Ashish Kumar
          39. Add admin CLI to list open files | Sub-task | Resolved | Siyao Meng
          40. [hsync] Redesign the lease recovery protocol so block length is updated correctly at OM | Sub-task | Resolved | Unassigned
          41. [hsync] Rebase HDDS-7593 branch onto master | Sub-task | Resolved | Wei-Chiu Chuang
          42. [hsync] write after lease recovery does not fail | Sub-task | Resolved | Sammi Chen
          43. [hsync] Make Putblock performance acceptable - DataNode side | Sub-task | Resolved | Wei-Chiu Chuang
          44. [hsync] Support hard limit and auto recovery for hsync file | Sub-task | Resolved | Ashish Kumar
          45. [hsync] Reduce updating block length times at OM during hsync | Sub-task | Resolved | Sammi Chen
          46. two client parallel perform commit with Hsync can cause dataloss | Sub-task | Resolved | Siyao Meng
          47. Deleted file reappears after HSync | Sub-task | Resolved | Siyao Meng
          48. Add hsync metadata to hsync'ed keys in OpenKeyTable as well | Sub-task | Resolved | Siyao Meng
          49. Migrate tests to JUnit5 | Sub-task | Resolved | Ashish Kumar
          50. [hsync] Make Putblock performance acceptable - Client side | Sub-task | Resolved | Wei-Chiu Chuang
          51. Merge recent commits from master to HDDS-7593 | Sub-task | Resolved | Wei-Chiu Chuang
          52. [hsync] Make Putblock performance acceptable | Sub-task | Resolved | Wei-Chiu Chuang
          53. DataNode doesn't set proper DatanodeVersion when registering with SCM | Sub-task | Resolved | Siyao Meng
          54. [hsync] Output stream should support direct byte buffer | Sub-task | Resolved | Wei-Chiu Chuang
          55. [hsync] disk usage thread aborts if ratis log rolls very quickly | Sub-task | Resolved | Unassigned
          56. [hsync] Revisit configuration keys for incremental chunk list after HDDS-9884 | Sub-task | Resolved | Wei-Chiu Chuang
          57. [hsync] MockDatanodeStorage.writeChunk should make a copy of byte string | Sub-task | Resolved | Wei-Chiu Chuang
          58. Merge recent commits from master (7c8160fe) to HDDS-7593 | Sub-task | Resolved | Siyao Meng
          59. [hsync] Refresh block token immediately if block token expires | Sub-task | Resolved | Wei-Chiu Chuang
          60. OzoneFSInputStream to support ByteBufferPositionedReadable | Sub-task | Resolved | Ashish Kumar
          61. [hsync] Add a Freon tool to measure client to DataNode round-trip latency | Sub-task | Resolved | Wei-Chiu Chuang
          62. [hsync] Combine WriteData and PutBlock requests into one | Sub-task | Resolved | Wei-Chiu Chuang
          63. [LeaseRecovery] OM shuts down with "SecretKey client must have been initialized already" | Sub-task | Resolved | Sammi Chen
          64. [hsync] improve block token refresh message | Sub-task | Resolved | Wei-Chiu Chuang
          65. [hsync] Add OpenTracing traces to client side read path | Sub-task | Resolved | Wei-Chiu Chuang
          66. Merge master 97038ef to feature branch HDDS-7593 | Sub-task | Resolved | Siyao Meng
          67. [hsync] Merge recent commits from master #4 | Sub-task | Resolved | Wei-Chiu Chuang
          68. HBase WAL splitting fails due to lease recovery | Sub-task | Resolved | Sammi Chen
          69. [hsync] OMKeyCommitRequest should reject if client id doesn't match | Sub-task | Resolved | Chung En Lee
          70. [hsync] lease recovery contract test class not substantiated | Sub-task | Resolved | Chung En Lee
          71. [hsync] Show deleted hsync keys in ListOpenFile CLI | Sub-task | Resolved | Ashish Kumar
          72. Merge recent commits from master to HDDS-7593 | Sub-task | Resolved | Ashish Kumar
          73. Show overwritten hsync keys in ListOpenFile CLI | Sub-task | Resolved | Sammi Chen
          74. Wrong count is displaying in listOpenFile CLI | Sub-task | Resolved | Unassigned
          75. [hsync] Output stream lastChunkBuffer should use direct buffer | Sub-task | Resolved | Ashish Kumar
          76. [hsync] Flush to only wait for majority of DataNodes | Sub-task | Resolved | Unassigned
          77. Remove unused UserGroupInformation object in DataNode token verifier | Sub-task | Resolved | Wei-Chiu Chuang
          78. Freon tool DN-Echo to support GRPC and Ratis read/write mode | Sub-task | Resolved | Wei-Chiu Chuang
          79. [hsync] Increase default value for hdds.container.ratis.log.appender.queue.num-elements | Sub-task | Resolved | Wei-Chiu Chuang
          80. Intermittent failure in TestLeaseRecovery.testFinalizeBlockFailure | Sub-task | Resolved | Ashish Kumar
          81. recoverLease should close underlying streams | Sub-task | Resolved | Ashish Kumar
          82. [hsync] 6th merge from master | Sub-task | Resolved | Wei-Chiu Chuang
          83. Merge master branch 611066a to HDDS-7593 dev branch | Sub-task | Resolved | Siyao Meng
          84. [hsync] Parameterize TestBlockOutputStream on ozone.client.stream.putblock.piggybacking | Sub-task | Resolved | Wei-Chiu Chuang
          85. [hsync] Remove block token from Ratis log once verified | Sub-task | Resolved | Wei-Chiu Chuang
          86. [hsync] Investigate why DataNode Echo throughput is so low | Sub-task | Resolved | Wei-Chiu Chuang
          87. [hsync] Client side metrics | Sub-task | Resolved | Wei-Chiu Chuang
          88. [File Lease] OM adds request message handler | Sub-task | Resolved | Wei-Chiu Chuang
          89. [File Lease] Client side lease renewer thread and request message | Sub-task | Resolved | Wei-Chiu Chuang
          90. [File Lease] OM adds FileLeaseManager | Sub-task | Resolved | Wei-Chiu Chuang
          91. [hsync] 7th merge from master | Sub-task | Resolved | Wei-Chiu Chuang
          92. [hsync] Block finalization should also merge last chunk to blockDataTable | Sub-task | Resolved | Wei-Chiu Chuang
          93. [hsync] Adopt Ratis 3.1.0 when it's released | Sub-task | Resolved | Chung En Lee
          94. Add a few interesting ContainerStateMachine metrics in CSMMetrics | Sub-task | Resolved | Wei-Chiu Chuang
          95. [hsync] 8th merge from master | Sub-task | Resolved | Siyao Meng
          96. [hsync] Checking disk capacity at every write request is expensive for HBase | Sub-task | Resolved | Attila Doroszlai
          97. [hsync] Add a freon tool to benchmark hsync/write concurrency | Sub-task | Resolved | Duong
          98. [hsync] 9th merge from master | Sub-task | Resolved | Siyao Meng
          99. [hsync] De-synchronize hsync API | Sub-task | Resolved | Duong
          100. [hsync] Improve BlockOutputStream's BufferPool to support variable buffer allocation from concurrent hsync | Sub-task | Resolved | Duong
          101. [hsync] Replace expensive VolumeUsage.getMinVolumeFreeSpace() | Sub-task | Resolved | Wei-Chiu Chuang
          102. [hsync] Instantiates audit parameter lazily in DataNode dispatch handler | Sub-task | Resolved | Wei-Chiu Chuang
          103. Fix ContainerOpsLatencies metrics | Sub-task | Resolved | Duong
          104. KeyOutputStream flakiness when running write and hsync concurrently | Sub-task | Resolved | Duong
          105. [hsync] Support renaming open files | Sub-task | Resolved | Unassigned
          106. [hsync] Make Putblock performance acceptable - Tool cleanup, guardrails | Sub-task | Resolved | Wei-Chiu Chuang
          107. Increase hdds.datanode.handler.count | Sub-task | Resolved | Wei-Chiu Chuang
          108. Increase ipc.server.read.threadpool.size | Sub-task | Resolved | Wei-Chiu Chuang
          109. [hsync] Add new OM layout version | Sub-task | Resolved | Wei-Chiu Chuang
          110. [hsync] DataNode should verify HBASE_SUPPORT layout version for every PutBlock | Sub-task | Resolved | Wei-Chiu Chuang
          111. [hsync] Add Ozone Manager protocol version | Sub-task | Resolved | Wei-Chiu Chuang
          112. TestBlockOutputStream.testWriteMoreThanFlushSize is flaky | Sub-task | Resolved | Ashish Kumar
          113. [hsync] Merge HDDS-7593 feature branch into master | Sub-task | Resolved | Ashish Kumar
          114. Merge recent commits from master to HDDS-7593 | Sub-task | Resolved | Ashish Kumar
          115. [hsync] Add DN layout version (HBASE_SUPPORT/version 8) upgrade test | Sub-task | Resolved | Wei-Chiu Chuang
          116. [hsync] Change XceiverClientRatis.watchForCommit to async | Sub-task | Resolved | Tsz-wo Sze
          117. [hsync] Move HBASE_SUPPORT layout upgrade test into its own test | Sub-task | Resolved | Wei-Chiu Chuang
          118. [hsync] Add a client config to limit write concurrency on the same key | Sub-task | Resolved | Siyao Meng
          119. [hsync] Revert config default ozone.fs.hsync.enabled to false | Sub-task | Resolved | Siyao Meng
          120. [hsync] Block ECKeyOutputStream from calling hsync and hflush | Sub-task | Resolved | Siyao Meng
          121. ContainerStateMachine should not crash because of CHUNK_FILE_INCONSISTENCY | Sub-task | Resolved | Duong
          122. Flakiness in KeyOutputStream exception handling | Sub-task | Resolved | Duong
          123. [hsync] Enable PutBlock piggybacking and incremental chunk list by default | Sub-task | Resolved | Wei-Chiu Chuang
          124. Change RatisBlockOutputStream to use HDDS-11174 | Sub-task | Resolved | Tsz-wo Sze
          125. [hsync] Remove hsync and hflush capability check in ContentGenerator | Sub-task | Resolved | Hemant Kumar
          126. Use OMLayoutFeature.HBASE_SUPPORT for HSYNC | Sub-task | Resolved | Hemant Kumar
          127. [hsync] Add upgrade tests | Sub-task | Resolved | Hemant Kumar
          128. [hsync] Add a config as HBase-related features master switch | Sub-task | Resolved | Siyao Meng
          129. [hsync] Remove KeyOutputStreamSemaphore logs | Sub-task | Resolved | Chung En Lee

            People

              Assignee: Wei-Chiu Chuang (weichiu)
              Reporter: Wei-Chiu Chuang (weichiu)
              Votes: 0
              Watchers: 12

                Time Tracking

                  Estimated: 72h
                  Remaining: 72h
                  Logged: Not Specified