Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 1.1.0
Description
The HDFS sink calls syncFs() in HDFSSequenceFile. However, syncFs() is not available in legacy Hadoop 0.20.2, which may still be widely used, whereas sync() is available in all Hadoop versions. In Hadoop's SequenceFile, syncFs() is itself implemented through a sync() call, made on the underlying output stream:
/** create a sync point */
public void sync() throws IOException {
  if (sync != null && lastSyncPos != out.getPos()) {
    out.writeInt(SYNC_ESCAPE);    // mark the start of the sync
    out.write(sync);              // write sync
    lastSyncPos = out.getPos();   // update lastSyncPos
  }
}

/** flush all currently written data to the file system */
public void syncFs() throws IOException {
  if (out != null) {
    out.sync();                   // flush contents to file system
  }
}
Therefore, using sync() in HDFSSequenceFile may be better.
@Override
public void sync() throws IOException {
  //writer.syncFs(); //for hadoop 0.20.205.0+
  writer.sync(); //support hadoop 0.20.2+
}
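The linked issue FLUME-1595 reports that durability suffers when syncFs() is not used. One possible alternative is therefore to look the method up at runtime and fall back to sync() only when syncFs() is missing. The following is a minimal sketch of that idea, not the change made for this issue. The classes LegacyWriter and NewerWriter and the helper syncBestEffort are hypothetical stand-ins for illustration; they are not Hadoop or Flume APIs.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class SyncCompat {

    /** Hypothetical stand-in for a writer from an older Hadoop: sync() only. */
    public static class LegacyWriter {
        public int syncCalls = 0;
        public void sync() throws IOException { syncCalls++; }
    }

    /** Hypothetical stand-in for a newer writer: sync() and syncFs(). */
    public static class NewerWriter extends LegacyWriter {
        public int syncFsCalls = 0;
        public void syncFs() throws IOException { syncFsCalls++; }
    }

    /**
     * Invokes syncFs() if the writer's class declares it, otherwise sync().
     * Returns the name of the method that was actually invoked.
     */
    public static String syncBestEffort(Object writer) throws IOException {
        Method method;
        String name;
        try {
            method = writer.getClass().getMethod("syncFs");
            name = "syncFs";
        } catch (NoSuchMethodException e) {
            try {
                // Fall back to the method that exists in every version.
                method = writer.getClass().getMethod("sync");
                name = "sync";
            } catch (NoSuchMethodException e2) {
                throw new IOException("writer has neither syncFs() nor sync()", e2);
            }
        }
        try {
            method.invoke(writer);
        } catch (InvocationTargetException e) {
            // Unwrap the exception thrown by the invoked method.
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        } catch (IllegalAccessException e) {
            throw new IOException(e);
        }
        return name;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(syncBestEffort(new LegacyWriter())); // prints "sync"
        System.out.println(syncBestEffort(new NewerWriter()));  // prints "syncFs"
    }
}
```

In a real sink the Method lookup would normally be done once and cached, since reflection on every sync adds overhead.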
Attachments
Issue Links
- breaks: FLUME-1595 HDFS SequenceFile implementation is not durable due to not using syncFs() (Resolved)