FLINK-9407

Support orc rolling sink writer

      Description

      Currently, the rolling sink supports only StringWriter, SequenceFileWriter, and AvroKeyValueSinkWriter. I suggest adding an ORC writer for the rolling sink.
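
To make the request concrete: a writer for a rolling sink generally has to accept rows, flush them to the current part file, and report how many bytes are durable so the sink can decide when to roll. The following is a minimal, dependency-free sketch of that shape. `RowWriter` and `BatchingRowWriter` are hypothetical names made up for illustration; they are not Flink's `Writer` interface and not the code in the PR. The in-memory stream stands in for a part file, and newline-delimited text stands in for the ORC encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical writer contract (an assumption, not Flink's actual interface). */
interface RowWriter<T> {
    void open(OutputStream out);
    void write(T row) throws IOException;
    long flush() throws IOException; // returns total bytes written since open()
    void close() throws IOException;
}

/** Buffers rows and emits them in batches, as a columnar writer typically would. */
final class BatchingRowWriter implements RowWriter<String> {
    private final int batchSize;
    private final List<String> batch = new ArrayList<>();
    private OutputStream out;
    private long bytesWritten;

    BatchingRowWriter(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    @Override
    public void open(OutputStream out) {
        this.out = out;
        this.bytesWritten = 0;
        this.batch.clear();
    }

    @Override
    public void write(String row) throws IOException {
        batch.add(row);
        // Emit a full batch immediately; partial batches wait for flush().
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    @Override
    public long flush() throws IOException {
        for (String row : batch) {
            byte[] bytes = (row + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(bytes);
            bytesWritten += bytes.length;
        }
        batch.clear();
        out.flush();
        return bytesWritten;
    }

    @Override
    public void close() throws IOException {
        flush(); // do not lose a partially filled batch
        out.close();
    }
}

public class RollingWriterSketch {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream partFile = new ByteArrayOutputStream();
        BatchingRowWriter writer = new BatchingRowWriter(2);
        writer.open(partFile);
        writer.write("Sagar,26,false");
        writer.write("Sagar,30,false");
        writer.write("Sagar,34,false");
        long position = writer.flush();
        System.out.println("bytes flushed: " + position);
        writer.close();
    }
}
```

The point of the sketch is the interaction between batching and `flush()`: with short checkpoint intervals, a sink that flushes on every checkpoint produces many small batches, which is one reason the compression and performance tests mentioned in this ticket matter.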

      For reference, here is the testing done so far.

      I tested the PR and verified the results with Spark SQL. The rows written by the sink read back correctly, as shown below. I will add more tests over the next few days, including performance under compression with short checkpoint intervals, and more unit tests.

      scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
      res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]
      
      scala>
      
      scala> res1.registerTempTable("tablerice")
      warning: there was one deprecation warning; re-run with -deprecation for details
      
      scala> spark.sql("select * from tablerice")
      res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]
      
      scala> res3.show(3)
      +-----+---+-------+
      | name|age|married|
      +-----+---+-------+
      |Sagar| 26|  false|
      |Sagar| 30|  false|
      |Sagar| 34|  false|
      +-----+---+-------+
      only showing top 3 rows
      

              People

              • Assignee: Unassigned
              • Reporter: mingleizhang zhangminglei