  1. Apache Ozone
  2. HDDS-10670

[Hsync] Overwrite using "sh key put" shouldn't be allowed on hsynced files.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: OM
    • Labels: None

    Description

      Scenario: Run sh key put on a key that has been hsynced and is still open.

      Observations:
      Open a file, write data, and hsync:

      Data written on iteration : 1
      Data written on iteration : 2
      Data written on iteration : 3
      Data written on iteration : 4
      Data written on iteration : 5
      Data written on iteration : 6
      Data written on iteration : 7
      Data written on iteration : 8
      Data written on iteration : 9
      Data written on iteration : 10
      Hsync completed on file, counter: 10 

      --- Here the code sleeps for 60 seconds ---

      Check that the file is open and hsynced:

      ozone admin om lof --service-id=ozone1712556312 --prefix=/hsyncvol/hsyncbuck/
      1151 total open files (est.). Showing 1 open files (limit 100) under path prefix:
        /hsyncvol/hsyncbuck/
      Client ID        Creation time    Hsync'ed    Open File Path
      112241980463843099    1712676703855    Yes        /hsyncvol/hsyncbuck/-9223372036849133055/File_new_11.txt
      Reached the end of the list.

      Key info at this point:

      ozone sh key info hsyncvol/hsyncbuck/hsync/File_new_11.txt
      24/04/09 15:31:48 INFO protocolPB.OmTransportFactory: Loading OM transport implementation org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory as specified by configuration.
      24/04/09 15:31:49 INFO client.ClientTrustManager: Loading certificates for client.
      {
        "volumeName" : "hsyncvol",
        "bucketName" : "hsyncbuck",
        "name" : "hsync/File_new_11.txt",
        "dataSize" : 51200,
        "creationTime" : "2024-04-09T15:31:43.855Z",
        "modificationTime" : "2024-04-09T15:31:45.145Z",
        "replicationConfig" : {
          "replicationFactor" : "THREE",
          "requiredNodes" : 3,
          "replicationType" : "RATIS"
        },
        "metadata" : {
          "hsyncClientId" : "112241980463843099"
        },
        "ozoneKeyLocations" : [ {
          "containerID" : 2,
          "localID" : 113750153625604824,
          "length" : 51200,
          "offset" : 0,
          "keyOffset" : 0
        } ],
        "file" : true
      } 

      Now overwrite the same key using sh key put:

       ozone sh key put hsyncvol/hsyncbuck/hsync/File_new_11.txt /etc/passwd
      24/04/09 15:32:08 INFO protocolPB.OmTransportFactory: Loading OM transport implementation org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory as specified by configuration.
      24/04/09 15:32:09 INFO client.ClientTrustManager: Loading certificates for client.
      24/04/09 15:32:10 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-xceiverclientmetrics.properties,hadoop-metrics2.properties
      24/04/09 15:32:10 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
      24/04/09 15:32:10 INFO impl.MetricsSystemImpl: XceiverClientMetrics metrics system started
      24/04/09 15:32:10 INFO metrics.MetricRegistries: Loaded MetricRegistries class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl 

      Key info, successfully overwritten:

      ozone sh key info hsyncvol/hsyncbuck/hsync/File_new_11.txt
      24/04/09 15:32:17 INFO protocolPB.OmTransportFactory: Loading OM transport implementation org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory as specified by configuration.
      24/04/09 15:32:17 INFO client.ClientTrustManager: Loading certificates for client.
      {
        "volumeName" : "hsyncvol",
        "bucketName" : "hsyncbuck",
        "name" : "hsync/File_new_11.txt",
        "dataSize" : 6644,
        "creationTime" : "2024-04-09T15:31:43.855Z",
        "modificationTime" : "2024-04-09T15:32:11.283Z",
        "replicationConfig" : {
          "replicationFactor" : "THREE",
          "requiredNodes" : 3,
          "replicationType" : "RATIS"
        },
        "metadata" : { },
        "ozoneKeyLocations" : [ {
          "containerID" : 1,
          "localID" : 113750153625604825,
          "length" : 6644,
          "offset" : 0,
          "keyOffset" : 0
        } ],
        "file" : true
      } 
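      The two key info documents differ in a few telling fields: the size, the block location, and the hsyncClientId metadata entry. The following Python sketch makes that comparison explicit. It is only an illustration: the helper names load_key_info and key_info_diff are hypothetical, the sample documents are trimmed to the fields being compared, and it assumes the JSON body starts at the first "{" in the CLI output (true for the log lines shown in this report, not guaranteed in general).

```python
import json

def load_key_info(cli_output: str) -> dict:
    """Parse `ozone sh key info` output, skipping any log lines before the JSON.

    Assumption: the first "{" in the text opens the JSON document.
    """
    return json.loads(cli_output[cli_output.index("{"):])

def key_info_diff(before: dict, after: dict) -> dict:
    """Summarise what changed between two key info documents."""
    def blocks(info: dict) -> set:
        return {(loc["containerID"], loc["localID"])
                for loc in info["ozoneKeyLocations"]}
    return {
        "was_hsynced": "hsyncClientId" in before.get("metadata", {}),
        "still_hsynced": "hsyncClientId" in after.get("metadata", {}),
        "size_changed": before["dataSize"] != after["dataSize"],
        "blocks_replaced": blocks(before).isdisjoint(blocks(after)),
        "same_creation_time": before["creationTime"] == after["creationTime"],
    }

# Trimmed copies of the two outputs from this report.
BEFORE = """
24/04/09 15:31:49 INFO client.ClientTrustManager: Loading certificates for client.
{
  "dataSize" : 51200,
  "creationTime" : "2024-04-09T15:31:43.855Z",
  "metadata" : { "hsyncClientId" : "112241980463843099" },
  "ozoneKeyLocations" : [ { "containerID" : 2, "localID" : 113750153625604824 } ]
}
"""

AFTER = """
24/04/09 15:32:17 INFO client.ClientTrustManager: Loading certificates for client.
{
  "dataSize" : 6644,
  "creationTime" : "2024-04-09T15:31:43.855Z",
  "metadata" : { },
  "ozoneKeyLocations" : [ { "containerID" : 1, "localID" : 113750153625604825 } ]
}
"""

if __name__ == "__main__":
    for field, value in key_info_diff(load_key_info(BEFORE), load_key_info(AFTER)).items():
        print(f"{field}: {value}")
```

      For the two outputs in this report, the comparison shows that the hsync marker was present before the put and absent afterwards, while the creation time is unchanged.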

      But the key is still open in the OpenKeyTable:

      ozone admin om lof --service-id=ozone1712556312 --prefix=/hsyncvol/hsyncbuck/
      1151 total open files (est.). Showing 1 open files (limit 100) under path prefix:
        /hsyncvol/hsyncbuck/
      Client ID        Creation time    Hsync'ed    Open File Path
      112241980463843099    1712676703855    Yes        /hsyncvol/hsyncbuck/-9223372036849133055/File_new_11.txt
      Reached the end of the list.
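
      The open-file row can be cross-checked against the first key info output: the client ID equals the hsyncClientId recorded there, and the creation time is the same instant expressed in epoch milliseconds. A minimal Python sketch of that check follows. The OpenFileEntry type and parse_open_file_row helper are hypothetical names, and the parser assumes the four whitespace-separated columns shown in this report rather than any documented output format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

@dataclass
class OpenFileEntry:
    client_id: str
    created: datetime
    hsynced: bool
    path: str

def parse_open_file_row(row: str) -> OpenFileEntry:
    """Split one data row of the open-files listing into its four columns.

    Assumption: columns are separated by whitespace and the path is last.
    """
    client_id, created_ms, hsynced, path = row.split(None, 3)
    return OpenFileEntry(
        client_id=client_id,
        # Integer millisecond arithmetic avoids float rounding.
        created=EPOCH + timedelta(milliseconds=int(created_ms)),
        hsynced=(hsynced == "Yes"),
        path=path.strip(),
    )

# Row copied from the listing in this report.
ROW = ("112241980463843099    1712676703855    Yes        "
       "/hsyncvol/hsyncbuck/-9223372036849133055/File_new_11.txt")

# hsyncClientId from the key info taken before the overwrite.
HSYNC_CLIENT_ID = "112241980463843099"

if __name__ == "__main__":
    entry = parse_open_file_row(ROW)
    print("same client:", entry.client_id == HSYNC_CLIENT_ID)
    print("created:", entry.created.isoformat(timespec="milliseconds"))
    print("hsynced:", entry.hsynced)
```

      The parsed creation time corresponds to 2024-04-09T15:31:43.855 UTC, the creationTime reported by key info, so the open entry belongs to the original hsync writer and not to the later put. Note that the listed path contains a numeric component where the key name has the hsync directory, so matching the two by full path would not work here.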

      Perform LeaseRecovery:

      ozone debug recover --path=ofs://ozone1712556312//hsyncvol/hsyncbuck/hsync/File_new_11.txt
      24/04/09 15:32:32 INFO protocolPB.OmTransportFactory: Loading OM transport implementation org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory as specified by configuration.
      24/04/09 15:32:32 INFO client.ClientTrustManager: Loading certificates for client.
      Lease recovery SUCCEEDED on ofs://ozone1712556312//hsyncvol/hsyncbuck/hsync/File_new_11.txt 

      The key is still open:

      ozone admin om lof --service-id=ozone1712556312 --prefix=/hsyncvol/hsyncbuck/
      1151 total open files (est.). Showing 1 open files (limit 100) under path prefix:
        /hsyncvol/hsyncbuck/
      Client ID        Creation time    Hsync'ed    Open File Path
      112241980463843099    1712676703855    Yes        /hsyncvol/hsyncbuck/-9223372036849133055/File_new_11.txt
      Reached the end of the list.

      --- The 60-second pause ends and the code resumes ---
      The client is still able to write data to the file and hsync:

      Data written on iteration : 11
      Data written on iteration : 12
      Data written on iteration : 13
      Data written on iteration : 14
      Data written on iteration : 15
      Data written on iteration : 16
      Data written on iteration : 17
      Data written on iteration : 18
      Data written on iteration : 19
      Data written on iteration : 20
      Hsync completed on file, counter: 20 

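      Lease recovery appears to break when such an overwrite is allowed: recovery reports SUCCEEDED, yet the key stays in the OpenKeyTable and the original client can keep writing and hsyncing.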

       


            People

              Assignee: Unassigned
              Reporter: Pratyush Bhatt (pratyush.bhatt)