Hadoop Common / HADOOP-15478

WASB: hflush() and hsync() regression

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.0, 3.0.2
    • Fix Version/s: 2.10.0, 3.1.1
    • Component/s: fs/azure
    • Labels: None
    • Release Note: WASB: Bug fix for recent regression in hflush() and hsync().

    Description

      HADOOP-14520 introduced a regression in hflush() and hsync(). Previously, in the default case where users upload data as block blobs, these calls were no-ops. HADOOP-14520 accidentally enabled hflush() and hsync() by default, so each call now immediately uploads any data buffered in the stream to storage as a new block. This behavior is undesirable because a block blob is limited to 50,000 blocks, and every flush consumes one of them. Spark invokes hflush() frequently, so Spark users are now seeing failures caused by exceeding the block limit.
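
To illustrate why frequent flushes exhaust the block limit, here is a minimal sketch of a buffered block-blob writer. It is a toy model, not the WASB implementation: the class `BlockBlobFlushSketch`, the `flushEnabled` switch, and the helper `blocksAfter` are hypothetical names introduced for this example, and the model ignores the uploads a real stream performs when its buffer fills. The only figure taken from the description is the 50,000-block limit.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Toy model of a block-blob output stream; NOT the actual WASB code. */
public class BlockBlobFlushSketch {
    // Block blob limit cited in the issue description.
    static final int MAX_BLOCKS = 50_000;

    // Hypothetical switch: true models the regression, false models the old no-op behavior.
    private final boolean flushEnabled;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final List<Integer> blockSizes = new ArrayList<>();

    BlockBlobFlushSketch(boolean flushEnabled) {
        this.flushEnabled = flushEnabled;
    }

    void write(byte[] data) {
        buffer.write(data, 0, data.length);
    }

    /** No-op unless flushing is enabled; when enabled, each call uploads one block. */
    void hflush() {
        if (flushEnabled) {
            uploadBlock();
        }
    }

    /** Closing always uploads whatever is still buffered. */
    void close() {
        uploadBlock();
    }

    private void uploadBlock() {
        if (buffer.size() == 0) {
            return; // nothing buffered, so no block is consumed
        }
        if (blockSizes.size() >= MAX_BLOCKS) {
            throw new IllegalStateException("block limit of " + MAX_BLOCKS + " exceeded");
        }
        blockSizes.add(buffer.size());
        buffer.reset();
    }

    int blockCount() {
        return blockSizes.size();
    }

    /** Writes one small record per iteration, calling hflush() after each, then closes. */
    static int blocksAfter(boolean flushEnabled, int records) {
        BlockBlobFlushSketch stream = new BlockBlobFlushSketch(flushEnabled);
        byte[] record = {42};
        for (int i = 0; i < records; i++) {
            stream.write(record);
            stream.hflush();
        }
        stream.close();
        return stream.blockCount();
    }

    public static void main(String[] args) {
        System.out.println("no-op hflush, 60000 records: " + blocksAfter(false, 60_000) + " block(s)");
        try {
            blocksAfter(true, 60_000);
        } catch (IllegalStateException e) {
            System.out.println("eager hflush, 60000 records: " + e.getMessage());
        }
    }
}
```

In this model, a writer that flushes after every record fails once it has committed 50,000 blocks, no matter how small each record is, while the no-op variant commits a single block on close.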

      Attachments

        1. HADOOP-15478.001.patch
          16 kB
          Thomas Marquardt
        2. HADOOP-15478-002.patch
          16 kB
          Steve Loughran

          People

            Assignee: Thomas Marquardt (tmarquardt)
            Reporter: Thomas Marquardt (tmarquardt)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved: