Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
We currently create Avro container files with the HDFS sink, but we have no fine-grained control over when a file is rolled and a new one is created.
Because an Avro container file carries a single schema for all of its records, we need to roll the file whenever the Avro schema version changes: close the current file and open a new one for the new schema.
Suggested changes:
1) Make BucketWriter.java public and make the required fields and methods protected, so that the class can be extended and its behavior customized. For example, shouldRotate should be overridable and should receive the current Event, so that the rotation decision can depend on the event being processed.
2) In HDFSEventSink.java, make the initializeBucketWriter method protected and provide getters for the private properties.
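A minimal sketch of what suggestion 1 would enable, written against simplified stand-in classes rather than the real Flume code. SketchEvent, SketchBucketWriter, SchemaAwareBucketWriter and the "schema.version" header are all hypothetical names chosen for illustration; the actual BucketWriter has a different constructor, fields and shouldRotate signature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Flume event: only headers are modeled.
final class SketchEvent {
    final Map<String, String> headers;

    SketchEvent(Map<String, String> headers) {
        this.headers = headers;
    }
}

// Hypothetical stand-in for BucketWriter after the proposed change:
// shouldRotate is protected and receives the event being appended.
class SketchBucketWriter {
    private final List<List<SketchEvent>> closedFiles = new ArrayList<>();
    private List<SketchEvent> currentFile = new ArrayList<>();

    // Default policy in this sketch: never rotate based on the event.
    protected boolean shouldRotate(SketchEvent event) {
        return false;
    }

    public void append(SketchEvent event) {
        // Always consult the hook so subclasses can track state per event.
        boolean rotate = shouldRotate(event);
        if (rotate && !currentFile.isEmpty()) {
            closedFiles.add(currentFile);      // "close" the current file
            currentFile = new ArrayList<>();   // "open" a new one
        }
        currentFile.add(event);
    }

    public int closedFileCount() {
        return closedFiles.size();
    }

    public int openFileEventCount() {
        return currentFile.size();
    }
}

// Subclass that rolls the file when the schema version header changes.
class SchemaAwareBucketWriter extends SketchBucketWriter {
    // Assumed header name; the ticket does not say where the version lives.
    static final String SCHEMA_HEADER = "schema.version";

    private String currentSchema;

    @Override
    protected boolean shouldRotate(SketchEvent event) {
        String incoming = event.headers.get(SCHEMA_HEADER);
        boolean changed = currentSchema != null && !currentSchema.equals(incoming);
        currentSchema = incoming; // remember the version for the next event
        return changed;
    }
}

class SchemaRollDemo {
    public static void main(String[] args) {
        SchemaAwareBucketWriter writer = new SchemaAwareBucketWriter();
        for (String version : new String[] {"1", "1", "2"}) {
            writer.append(new SketchEvent(
                Collections.singletonMap(SchemaAwareBucketWriter.SCHEMA_HEADER, version)));
        }
        System.out.println("closed files: " + writer.closedFileCount());
        System.out.println("events in open file: " + writer.openFileEventCount());
    }
}
```

The point of the sketch is the shape of the extension hook: with the event passed to a protected shouldRotate, a subclass can compare the incoming schema version with the one used for the open file and force a roll, without changing the sink's size, count or time based rolling.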