  1. Cassandra
  2. CASSANDRA-3928

Bulk loading to Cassandra with a Python Hadoop job


Details

    Description

      Could we have an OutputFormat that bulk loads data into Cassandra from a Hadoop job written in Python?
      I have a very complex Hadoop job written in Python that processes test data and generates structured data in a sequential file. I then read this data and stream it to Cassandra using BulkOutputFormat.
      Is there any way to avoid writing to the sequential file and instead process and stream the data directly to Cassandra (with a Hadoop job written in Python)?
      What would be a possible solution for this?
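
For illustration, here is a minimal sketch of the Python side of such a job, assuming it runs under Hadoop Streaming, where a task reads lines on stdin and writes key<TAB>value lines on stdout. The comma-separated input layout, the `column:value` encoding, and the function names `parse_line`, `to_streaming_record`, and `run` are hypothetical placeholders, not part of the actual job. Whether Cassandra's bulk output path can consume such records directly from a streaming job is exactly the open question of this issue; the sketch only shows the records being emitted to stdout rather than written to an intermediate file.

```python
def parse_line(line):
    """Turn one raw input line into (row_key, column, value), or None if malformed.

    The comma-separated layout is a placeholder for the job's real parsing logic.
    """
    parts = [p.strip() for p in line.rstrip("\n").split(",")]
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


def to_streaming_record(row_key, column, value):
    """Format a record as key<TAB>value, the line shape Hadoop Streaming uses."""
    return "%s\t%s:%s" % (row_key, column, value)


def run(lines, out):
    """Process an iterable of lines and write one record per valid line.

    Returns the number of records written.
    """
    written = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip malformed input instead of failing the task
        out.write(to_streaming_record(*parsed) + "\n")
        written += 1
    return written


# As a streaming script, the entry point would be (after `import sys`):
#     run(sys.stdin, sys.stdout)
```

Keeping the parsing and formatting in plain functions means the same logic can be exercised locally with an in-memory buffer before it is wired into a cluster job.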

      Attachments

        Activity


          People

            brandon.williams Brandon Williams
            samarthg1986 Samarth Gahire
            Brandon Williams
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 48h (original estimate)
                Remaining: 48h
                Logged: Not Specified
