Details
- Type: Bug
- Status: Resolved
- Priority: P2
- Resolution: Invalid
Description
HIFIO (HadoopInputFormatIO) is currently under review in PR https://github.com/apache/beam/pull/1994. As per the discussion in https://lists.apache.org/thread.html/803857877804165e798cf31edf079e6603eb9682b7690d52124c31e7@%3Cdev.beam.apache.org%3E, we'd like to check HIFIO in as-is and then unify it with HdfsIO, since the two share a lot of code.
dhalperi@google.com has mentioned: "the FileInputFormat reader gets to call some special APIs that the generic InputFormat reader cannot – so they are not completely redundant. Specifically, FileInputFormat reader can do size-based splitting."
Dan recommended: "See if we can 'inline' the FileInputFormat specific parts of HdfsIO inside of HadoopInputFormatIO via reflection. If so, we can get the best of both worlds with shared code."
This seems reasonable to me.
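For illustration, here is a rough sketch of how the reflection check could look. It is not taken from the PR or from HdfsIO. The class `ReflectiveSplitSupport` and its methods are hypothetical names; the only real identifier is the fully qualified name of Hadoop's `FileInputFormat`. The idea is that the generic reader looks the class up by name at runtime and only takes the size-based splitting path when the configured InputFormat turns out to be a subclass of it.

```java
// Hypothetical helper: none of these names come from Beam or from the PR.
final class ReflectiveSplitSupport {
    // Fully qualified name of Hadoop's file-based InputFormat base class.
    static final String FILE_INPUT_FORMAT =
        "org.apache.hadoop.mapreduce.lib.input.FileInputFormat";

    private ReflectiveSplitSupport() {}

    /**
     * Returns true if {@code candidate} is the class named {@code baseClassName}
     * or a subclass of it. The base class is resolved by name at runtime, so
     * this code has no compile-time dependency on it.
     */
    public static boolean isSubclassOf(Class<?> candidate, String baseClassName) {
        try {
            // Resolve through the candidate's own class loader, without
            // running the base class's static initializers.
            Class<?> base =
                Class.forName(baseClassName, false, candidate.getClassLoader());
            return base.isAssignableFrom(candidate);
        } catch (ClassNotFoundException | LinkageError e) {
            // Base class is not visible from this loader: treat as "not a subclass".
            return false;
        }
    }

    /**
     * Assumed decision point: choose size-based splitting only when the
     * InputFormat is file-based, otherwise fall back to the generic path.
     */
    public static String splitStrategyFor(Class<?> inputFormatClass) {
        return isSubclassOf(inputFormatClass, FILE_INPUT_FORMAT)
            ? "size-based"
            : "generic";
    }
}
```

In a unified IO, the "size-based" branch would then be the place to invoke the FileInputFormat-specific APIs reflectively, while every other InputFormat keeps going through the shared generic code.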
Issue Links
- is related to: BEAM-2016 Delete HDFSFileSource/Sink (Resolved)