FLINK-1919

Add HCatOutputFormat for Tuple data types

    Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: DataSet API
    • Labels:

      Description

      It would be good to have an OutputFormat that can write data to HCatalog tables.

      The Hadoop `HCatOutputFormat` expects `HCatRecord` objects and writes them to HCatalog tables. We can do the same by creating these `HCatRecord` objects in a Map function that precedes a `HadoopOutputFormat` wrapping the Hadoop `HCatOutputFormat`.
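
      The map-then-write arrangement described above can be sketched in plain Java. This is only an illustration under stated assumptions: `PageVisit`, `MapThenWriteSketch`, and `mapAndWrite` are hypothetical names, a `List<Object>` stands in for `HCatRecord`, and a `Consumer` stands in for the `HadoopOutputFormat` that wraps the Hadoop `HCatOutputFormat`. No Flink or HCatalog classes are used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical application type; not part of Flink or HCatalog.
final class PageVisit {
    final String url;
    final long count;

    PageVisit(String url, long count) {
        this.url = url;
        this.count = count;
    }
}

final class MapThenWriteSketch {
    // Stand-in for the Map function: turns one element into a record.
    // A List<Object> stands in for HCatRecord here (assumption).
    static final Function<PageVisit, List<Object>> TO_RECORD =
            v -> Arrays.<Object>asList(v.url, v.count);

    // Stand-in for the dataflow: map every element, then hand each record
    // to the sink, which plays the role of the wrapped output format.
    static <T> void mapAndWrite(List<T> input,
                                Function<T, List<Object>> mapper,
                                Consumer<List<Object>> sink) {
        for (T element : input) {
            sink.accept(mapper.apply(element));
        }
    }
}
```

      The point of the arrangement is that the sink only ever sees records, so all knowledge of the application type stays in the map step.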

      Better support for Flink Tuples can be added by implementing a custom `HCatOutputFormat` that also depends on the Hadoop `HCatOutputFormat` but internally converts Flink Tuples to `HCatRecords`. This would also include checking whether the schema of the HCatalog table matches the Flink Tuple type. For data types other than Tuples, the OutputFormat could either require a preceding Map function that converts to `HCatRecords`, or let users specify a MapFunction and invoke it internally.
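
      A minimal sketch of the schema check and the Tuple-to-record conversion, again in plain Java with stand-ins. The names `TableSchemaSketch`, `TupleToRecordSketch`, `checkSchema`, and `toRecord` are hypothetical; an `Object[]` stands in for the fields of a Flink Tuple and a `List<Object>` for an `HCatRecord`. How the real implementation maps HCatalog column types to Java types is not specified in this issue, so the sketch simply compares Java classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an HCatalog table schema:
// ordered column names with the Java type expected for each column.
final class TableSchemaSketch {
    final List<String> names;
    final List<Class<?>> types;

    TableSchemaSketch(List<String> names, List<Class<?>> types) {
        if (names.size() != types.size()) {
            throw new IllegalArgumentException("names and types differ in length");
        }
        this.names = names;
        this.types = types;
    }
}

final class TupleToRecordSketch {
    // Run once before writing: the tuple must have one field per column,
    // and each field type must be assignable to the column type.
    static void checkSchema(Class<?>[] tupleFieldTypes, TableSchemaSketch schema) {
        if (tupleFieldTypes.length != schema.names.size()) {
            throw new IllegalArgumentException("tuple arity " + tupleFieldTypes.length
                    + " does not match column count " + schema.names.size());
        }
        for (int i = 0; i < tupleFieldTypes.length; i++) {
            if (!schema.types.get(i).isAssignableFrom(tupleFieldTypes[i])) {
                throw new IllegalArgumentException("field " + i + " has type "
                        + tupleFieldTypes[i].getName() + " but column '"
                        + schema.names.get(i) + "' expects "
                        + schema.types.get(i).getName());
            }
        }
    }

    // Run per element: copy the tuple fields, in order, into a new record.
    static List<Object> toRecord(Object[] tupleFields) {
        return new ArrayList<>(Arrays.asList(tupleFields));
    }
}
```

      Checking the types once up front, rather than per record, lets a mismatch fail before any data is written; whether the actual pull request does it this way is not stated here.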

      We already have a Flink `HCatInputFormat` that does this in the reverse direction, i.e., it emits Flink Tuples from HCatalog tables.

        Activity

        James_cao James Cao added a comment -

        pull request for this issue:
        https://github.com/apache/flink/pull/1079

        James_cao James Cao added a comment -

        Thanks!

        fhueske Fabian Hueske added a comment -

        Hi James Cao,

        I don't think anybody is working on this issue.
        I'll assign it to you.

        Thanks, Fabian

        James_cao James Cao added a comment -

        Hello, is anyone working on this issue now? If not, I would like to contribute to it. Thanks


          People

          • Assignee: James Cao (James_cao)
          • Reporter: Fabian Hueske (fhueske)
          • Votes: 1
          • Watchers: 3