  1. Hadoop Common
  2. HADOOP-16829 Über-jira: S3A Hadoop 3.3.1 features
  3. HADOOP-17597

Add option to downgrade S3A rejection of Syncable to warning

Details

    • Sub-task
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 3.3.1
    • 3.3.1
    • None
      The S3A output streams now raise UnsupportedOperationException on calls to Syncable.hsync() or Syncable.hflush(). This is to make absolutely clear to programs trying to use the syncable API that the stream doesn't save any data at all until close. Programs which use this to flush their write ahead logs will fail immediately, rather than appear to succeed but without saving any data.

      To downgrade the API calls to simply printing a warning, set fs.s3a.downgrade.syncable.exceptions to true. This will not change the other behaviour: no data is saved.

      Object stores are not filesystems.

    Description

      The Hadoop Filesystem Syncable API is intended to meet the requirements laid out in [Stonebraker81], Operating System Support for Database Management:

      The service required from an OS buffer manager is a selected force out which would push the intentions list and the commit flag to disk in the proper order. Such a service is not present in any buffer manager known to us.

      It's an expensive operation, so expensive that Syncable.hsync() isn't even called on DFSOutputStream.close().

      Even though S3A does not manifest any data until close() is called, applications coming from HDFS may call Syncable methods and expect them to persist data with the durability guarantees offered by HDFS.

      Since the output stream hardening of HADOOP-13327, S3A throws UnsupportedOperationException to indicate that the synchronization semantics of Syncable absolutely cannot be met.

      As a result, applications which have been calling the Syncable APIs are finding the calls failing. In the absence of exception handling to recognise that the durability semantics are not being met, they fail.

      If the user and the application actually expect data to be persisted, this is the correct behaviour. The data cannot be persisted this way.

      If, however, they were calling this on HDFS more as a flush() than as the full and expensive DBMS-class persistence call, then this failure is unwelcome. The applications really need to catch the UnsupportedOperationException raised by S3A, or by any other FS strictly reporting failures, report the problem and use some other means of safe data storage.
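
      The catch-and-fall-back pattern described above could look roughly like the sketch below. It is illustrative only: the Syncable interface here is a local stand-in that mirrors the shape of org.apache.hadoop.fs.Syncable so the example runs without Hadoop on the classpath, and RejectingStream, DurableStream and trySync() are hypothetical names, not S3A code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class SyncFallbackSketch {

    /** Local stand-in mirroring the shape of org.apache.hadoop.fs.Syncable. */
    interface Syncable {
        void hflush() throws IOException;
        void hsync() throws IOException;
    }

    /** Hypothetical stream that strictly rejects sync calls. */
    static final class RejectingStream implements Syncable {
        @Override public void hflush() {
            throw new UnsupportedOperationException("hflush is not supported");
        }
        @Override public void hsync() {
            throw new UnsupportedOperationException("hsync is not supported");
        }
    }

    /** Hypothetical stream that accepts sync calls and counts them. */
    static final class DurableStream implements Syncable {
        int syncs;
        @Override public void hflush() { }
        @Override public void hsync() { syncs++; }
    }

    /**
     * Try to sync. Returns true if hsync() completed, false if the stream
     * reported that it cannot sync, in which case the caller must use some
     * other means of safe storage (for example, closing the file).
     */
    static boolean trySync(Syncable out) {
        try {
            out.hsync();
            return true;
        } catch (UnsupportedOperationException e) {
            // Report the problem instead of silently assuming durability.
            System.err.println("WARN: hsync() rejected: " + e.getMessage());
            return false;
        } catch (IOException e) {
            // A real I/O failure is a different problem: propagate it.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("durable stream synced: " + trySync(new DurableStream()));
        System.out.println("rejecting stream synced: " + trySync(new RejectingStream()));
    }
}
```

      The important design point is that a false return is not an error to ignore: the caller has learned that nothing is durable until close() and has to change strategy accordingly.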

      Even better, they can use hasPathCapability() on the FS or hasCapability() on the stream to probe before even opening a file or trying to sync it. The hasCapability() probe on a stream was actually implemented in Hadoop 2.x precisely to allow applications to identify when a stream could not meet the guarantees (e.g. some of the encrypted streams, file:// before HADOOP-13...)
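
      A probe-before-sync check might be sketched as follows. This is an assumption-laden illustration: StreamCapabilities below is a local stand-in for the Hadoop interface of the same name (one hasCapability(String) method, with "hsync" and "hflush" believed to match the capability names Hadoop uses), while FakeStream, Durability and chooseStrategy() are hypothetical helpers invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class SyncProbeSketch {

    /** Local stand-in for the shape of org.apache.hadoop.fs.StreamCapabilities. */
    interface StreamCapabilities {
        String HFLUSH = "hflush";
        String HSYNC = "hsync";
        boolean hasCapability(String capability);
    }

    /** Hypothetical stream declaring a fixed set of capabilities. */
    static final class FakeStream implements StreamCapabilities {
        private final Set<String> capabilities;

        FakeStream(String... caps) {
            this.capabilities = new HashSet<>(Arrays.asList(caps));
        }

        @Override public boolean hasCapability(String capability) {
            return capabilities.contains(capability.toLowerCase(Locale.ROOT));
        }
    }

    /** Hypothetical durability strategies an application could pick from. */
    enum Durability { SYNC_EACH_RECORD, PERSIST_ON_CLOSE_ONLY }

    /** Decide how to persist data before ever calling hsync(). */
    static Durability chooseStrategy(StreamCapabilities out) {
        return out.hasCapability(StreamCapabilities.HSYNC)
            ? Durability.SYNC_EACH_RECORD
            : Durability.PERSIST_ON_CLOSE_ONLY;
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy(new FakeStream("hsync", "hflush")));
        System.out.println(chooseStrategy(new FakeStream()));
    }
}
```

      Probing first means the application never has to hit the exception path: a write-ahead log can refuse to start on a store that only persists on close, rather than failing on its first sync.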

      Until they can correct their code, I propose adding an option for S3A to downgrade the exception to a warning:

      fs.s3a.downgrade.syncable.exceptions

      This will

      • Log once per process at WARN
      • downgrade the calls to noop()
      • increment counters in S3A stats and IO stats of invocations of the Syncable methods. This will allow for stats gathering to let us identify which applications need fixing in cloud deployments
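
      The three behaviours listed above can be modelled in a few lines. The sketch below is a toy model, not the S3A implementation: the class and field names are hypothetical, the boolean constructor argument stands in for the fs.s3a.downgrade.syncable.exceptions setting, and counting invocations even when the call is rejected is an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

final class DowngradeSyncableSketch {

    /** Process-wide flag so the warning is printed only once. */
    private static final AtomicBoolean WARNED = new AtomicBoolean(false);

    /** Number of warnings actually printed; exposed for illustration. */
    static final AtomicLong WARNINGS_LOGGED = new AtomicLong();

    /** Hypothetical output stream that cannot persist data before close(). */
    static final class Stream {
        private final boolean downgrade;
        final AtomicLong hsyncCalls = new AtomicLong();
        final AtomicLong hflushCalls = new AtomicLong();

        /** @param downgrade stands in for the proposed configuration option. */
        Stream(boolean downgrade) {
            this.downgrade = downgrade;
        }

        void hsync() {
            hsyncCalls.incrementAndGet();
            handleSyncable("hsync");
        }

        void hflush() {
            hflushCalls.incrementAndGet();
            handleSyncable("hflush");
        }

        private void handleSyncable(String operation) {
            if (!downgrade) {
                // Strict mode: fail fast, the durability semantics cannot be met.
                throw new UnsupportedOperationException(
                    operation + " is not supported: data is only saved on close()");
            }
            // Downgraded mode: warn once per process, then do nothing.
            if (WARNED.compareAndSet(false, true)) {
                WARNINGS_LOGGED.incrementAndGet();
                System.err.println("WARN: " + operation
                    + " called on a stream that saves no data until close()");
            }
        }
    }

    public static void main(String[] args) {
        Stream lenient = new Stream(true);
        lenient.hsync();
        lenient.hflush();
        System.out.println("hsync calls counted: " + lenient.hsyncCalls.get());
        System.out.println("hflush calls counted: " + lenient.hflushCalls.get());
    }
}
```

      Note that in both modes the stream still saves nothing on hsync() or hflush(); the option only changes whether the caller is told loudly or quietly.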

      Testing: copy the hsync tests but expect exceptions to be swallowed and stats to be collected

      Also: the UnsupportedOperationException text will link to this JIRA.

            People

              stevel@apache.org Steve Loughran
              stevel@apache.org Steve Loughran