FLINK-9126

Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

Details

    Description

      Currently, the DataSet API can only emit data read from a Cassandra source as a Tuple. This change would allow the data to be emitted as a custom POJO that the user has created and annotated using the Datastax API. It would remove the need for the DataSet to create very long Tuples and then map them to the custom POJO.
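
      To illustrate the difference between the two approaches, here is a minimal, self-contained sketch in plain Java. It does not use Flink or the Datastax driver: the Column annotation, the POJO fields, and the class PojoMappingSketch are all hypothetical stand-ins invented for this example, and the reflection-based mapper only approximates the general idea of annotation-driven mapping, not how the Datastax API is actually implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

public class PojoMappingSketch {

    /** Hypothetical stand-in for a column annotation; NOT the Datastax annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Column {
        String name();
    }

    /** Hypothetical POJO; the field and column names are invented for illustration. */
    public static class CustomCassandraPojo {
        @Column(name = "id") public String id;
        @Column(name = "batch_id") public int batchId;
        @Column(name = "counter") public long counter;

        public CustomCassandraPojo() {}
    }

    /** Tuple-style approach: the caller must know the position and type of every column. */
    public static CustomCassandraPojo fromPositional(Object[] row) {
        CustomCassandraPojo pojo = new CustomCassandraPojo();
        pojo.id = (String) row[0];
        pojo.batchId = (Integer) row[1];
        pojo.counter = (Long) row[2];
        return pojo;
    }

    /** Annotation-driven approach: annotated fields are filled by column name via reflection. */
    public static <T> T fromAnnotations(Map<String, Object> row, Class<T> type) {
        try {
            T pojo = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                Column column = field.getAnnotation(Column.class);
                if (column != null && row.containsKey(column.name())) {
                    field.setAccessible(true);
                    field.set(pojo, row.get(column.name()));
                }
            }
            return pojo;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not map row to " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // A row keyed by column name, standing in for a result row from a query.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", "abc");
        row.put("batch_id", 7);
        row.put("counter", 42L);

        CustomCassandraPojo pojo = fromAnnotations(row, CustomCassandraPojo.class);
        System.out.println(pojo.id + " " + pojo.batchId + " " + pojo.counter); // prints "abc 7 42"
    }
}
```

      With the positional version, every added or reordered column means touching the mapping code by hand. With the annotated version, the mapping lives on the POJO itself, which is the convenience this issue is asking for.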

       

      The changes to the CassandraInputFormat object would be minimal, but would require importing the Datastax API into the class. Another option is to make a similar, but slightly different class called CassandraPojoInputFormat.

      I already have code for this working in my own project, but I would like other opinions on the best way to implement it.

       

      //Example of its use in main

      CassandraPojoInputFormat<CustomCassandraPojo> cassandraInputFormat = new CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, CustomCassandraPojo.class);
      cassandraInputFormat.configure(null);
      cassandraInputFormat.open(null);

      DataSet<CustomCassandraPojo> outputTestSet = exEnv.createInput(cassandraInputFormat, TypeInformation.of(new TypeHint<CustomCassandraPojo>(){}));

       

      //The class that I currently have set up

      CassandraPojoInputFormatText.rtf

       

      I will create another Jira issue for the output version next if this is approved.

      Attachments

        1. CassandraPojoInputFormatText.rtf
          9 kB
          Jeffrey Carter

          People

            Jicaar Jeffrey Carter
            Votes: 1
            Watchers: 3

              Time Tracking

                Estimated: 24h
                Remaining: 24h
                Logged: Not Specified