The metrics collection of SPARK-20703 can trigger premature failure if the newly written object isn't actually visible yet: that is, if after writer.close(), a getFileStatus(path) call raises a FileNotFoundException.
Strictly speaking, a file not being immediately visible goes against the fundamental expectations of the Hadoop FS APIs, namely fully consistent data & metadata across all operations, with immediate global visibility of all changes. However, not all object stores make that guarantee, whether for newly created data or for updated blobs. And so spurious FNFEs can get raised, ones which should have gone away by the time the actual task is committed; if they haven't, the job is in deep trouble anyway.
What to do?
- Leave as is: fail fast, and so catch blobstores/blobstore clients which don't behave as required. One issue here: will that trigger task retries, and what happens then, etc.
- Swallow the FNFE and hope the file is observable later.
- Swallow all IOEs and hope that whatever problem the FS has is transient.
Options 2 & 3 aren't going to collect metrics in the event of an FNFE, or at least not the counter of bytes written.
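Options 2 & 3 could be sketched roughly like this (a hedged illustration only: the `Store` interface and method names here are hypothetical stand-ins for the `FileSystem.getFileStatus()` probe made after `writer.close()`, not the actual Spark code path):

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class VisibilityProbe {

    /** Hypothetical stand-in for the FS probe; throws FNFE if the object isn't visible yet. */
    interface Store {
        long fileSize(String path) throws IOException;
    }

    /**
     * Returns the bytes-written counter, or -1 ("unknown") rather than
     * failing the task when the store can't (yet) report the file.
     */
    static long bytesWrittenOrUnknown(Store store, String path) {
        try {
            return store.fileSize(path);
        } catch (FileNotFoundException e) {
            // Option 2: new object not yet visible; skip the metric.
            return -1L;
        } catch (IOException e) {
            // Option 3: assume whatever problem the FS has is transient; skip the metric.
            return -1L;
        }
    }

    public static void main(String[] args) {
        Store visible = path -> 42L;
        Store missing = path -> { throw new FileNotFoundException(path); };
        System.out.println(bytesWrittenOrUnknown(visible, "part-0000")); // 42
        System.out.println(bytesWrittenOrUnknown(missing, "part-0000")); // -1
    }
}
```

Note the catch order matters: FileNotFoundException is a subclass of IOException, so option 2 and option 3 only differ in whether the second catch clause exists at all.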