Flink / FLINK-7214

Add a sink that writes to ORCFile on HDFS

Details

    Description

      The ORC file format is currently one of the most efficient storage formats on HDFS in terms of both storage footprint and query speed, and it is a well-supported standard.

      This feature would receive an input stream, map its columns to the columns of a Hive table, and write the data to HDFS in ORC format. It would need to support Hive bucketing and dynamic Hive partitioning, and to generate the appropriate metadata in the Hive database.
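
To make the partitioning and bucketing requirement more concrete, here is a minimal sketch in plain Java of how a sink might pick the target file for one record. It is not part of Flink or Hive: the class `HivePathSketch`, its methods, the column names, and the table root `/warehouse/events` are all hypothetical. It also assumes that Java's `hashCode` is an acceptable stand-in for Hive's own bucketing hash, and that bucket files follow a zero-padded `000002_0` naming pattern. Writing the ORC data and registering partitions in the Hive database are left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: computes where a record would land in a
// partitioned, bucketed table layout. No Flink, Hive, or ORC calls are made.
class HivePathSketch {

    // Bucket index for a key. Assumption: Java's hashCode stands in for
    // Hive's hash function; the mask keeps the result non-negative.
    static int bucketFor(Object key, int numBuckets) {
        int hash = (key == null) ? 0 : key.hashCode();
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    // Builds "col=value/col=value" from the partition columns, in the
    // order they are declared. Escaping of special characters is omitted.
    static String partitionDir(Map<String, Object> row, List<String> partitionColumns) {
        StringBuilder sb = new StringBuilder();
        for (String col : partitionColumns) {
            if (sb.length() > 0) {
                sb.append('/');
            }
            sb.append(col).append('=').append(row.get(col));
        }
        return sb.toString();
    }

    // Full target path: table root, then partition directory, then bucket file.
    static String targetPath(String tableRoot, Map<String, Object> row,
                             List<String> partitionColumns,
                             String bucketColumn, int numBuckets) {
        int bucket = bucketFor(row.get(bucketColumn), numBuckets);
        String bucketFile = String.format(Locale.ROOT, "%06d_0", bucket);
        return tableRoot + "/" + partitionDir(row, partitionColumns) + "/" + bucketFile;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("user_id", 42);
        row.put("country", "US");
        row.put("dt", "2017-07-18");

        // prints /warehouse/events/dt=2017-07-18/country=US/000002_0
        System.out.println(targetPath("/warehouse/events", row,
                Arrays.asList("dt", "country"), "user_id", 8));
    }
}
```

Because the partition directory is derived from the record itself, new partitions appear as new values arrive, which is the "dynamic" part of the request. A real sink would also need to keep one open ORC writer per partition and bucket, and add each new partition to the Hive metadata.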

          People

            Assignee: Unassigned
            Reporter: Robert Rapplean (Mythobeast)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated: