FLINK-10003

Encoder interface inefficient when wanting to use more sophisticated output streams

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: Connectors / Common
    • Labels:
      None

      Description

      The StreamingFileSink uses the Encoder interface to serialize data.

      public interface Encoder<IN> extends Serializable {
          void encode(IN element, OutputStream stream) throws IOException;
      }

      Except for strings, the implementation must be provided by the user.
      Using any OutputStream implementation that is more convenient than the base OutputStream (such as DataOutputStream) requires creating a new stream for every single record. If the chosen implementation may buffer data, users additionally have to call flush() after each record.
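
      The pattern described above can be sketched roughly as follows. This is an illustration only: the Encoder interface is redeclared locally so the example compiles without Flink, and LongEncoder, PerRecordWrapping and encodeAll are hypothetical names that do not come from the Flink code base. An in-memory stream stands in for the part file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;

class PerRecordWrapping {

    // Local copy of the interface quoted in this issue, so the sketch compiles without Flink.
    interface Encoder<IN> extends Serializable {
        void encode(IN element, OutputStream stream) throws IOException;
    }

    // Hypothetical user encoder: writes each Long as 8 bytes via DataOutputStream.
    static class LongEncoder implements Encoder<Long> {
        @Override
        public void encode(Long element, OutputStream stream) throws IOException {
            // A new wrapper is allocated for every record, because the encoder
            // only ever receives the raw stream.
            DataOutputStream out = new DataOutputStream(stream);
            out.writeLong(element);
            // Required whenever the wrapper may buffer. close() is not an option,
            // since it would also close the underlying part-file stream.
            out.flush();
        }
    }

    // Encodes the given values into an in-memory stream that stands in for a part file.
    static byte[] encodeAll(long... values) throws IOException {
        ByteArrayOutputStream partFile = new ByteArrayOutputStream();
        Encoder<Long> encoder = new LongEncoder();
        for (long v : values) {
            encoder.encode(v, partFile);
        }
        return partFile.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(encodeAll(1L, 2L, 3L).length + " bytes written");
    }
}
```

      Each call to encode() pays for one wrapper allocation plus one flush(), which is the per-record overhead this issue is about.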

      Instead, we could allow users to specify an optional factory for the streams, which would be called once per part file, and modify the Encoder interface to have a generic type parameter for the output stream.
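
      One possible shape for this proposal is sketched below. Everything here is an assumption made for illustration: StreamFactory, TypedEncoder and writePartFile are hypothetical names, not existing or planned Flink API, and the loop in writePartFile only stands in for what the sink would do internally.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;

class StreamFactorySketch {

    // Hypothetical: wraps the raw part-file stream, called once per part file.
    interface StreamFactory<S extends OutputStream> extends Serializable {
        S create(OutputStream rawPartFileStream) throws IOException;
    }

    // Hypothetical variant of Encoder with a generic output-stream type.
    interface TypedEncoder<IN, S extends OutputStream> extends Serializable {
        void encode(IN element, S stream) throws IOException;
    }

    // Stand-in for the sink: creates one wrapper, encodes all records, flushes once.
    static byte[] writePartFile(long... values) throws IOException {
        ByteArrayOutputStream partFile = new ByteArrayOutputStream();

        StreamFactory<DataOutputStream> factory =
                raw -> new DataOutputStream(new BufferedOutputStream(raw));
        TypedEncoder<Long, DataOutputStream> encoder =
                (element, stream) -> stream.writeLong(element);

        DataOutputStream out = factory.create(partFile); // once per part file
        for (long v : values) {
            encoder.encode(v, out); // no per-record allocation or flush
        }
        out.flush(); // single flush before the part file is finalized
        return partFile.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writePartFile(1L, 2L).length + " bytes written");
    }
}
```

      In this sketch the encoder no longer creates or flushes a wrapper itself. The code that owns the part file would become responsible for flushing, which is a design question the proposal leaves open.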

            People

            • Assignee: Unassigned
            • Reporter: Chesnay Schepler (chesnay)
            • Votes: 0
            • Watchers: 3
