CASSANDRA-7776

Allow multiple MR jobs to concurrently write to the same column family from the same node using CqlBulkOutputFormat


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 2.1.1
    • None

    Description

      After the sstable files are written, every file in the specified output directory is loaded (transferred) to the remote Cassandra cluster. If multiple writes to the same table (i.e. the same directory) occur on one node, the separate load processes end up transferring the same sstable files multiple times. Furthermore, if cleanup of successfully loaded output directories is enabled (CASSANDRA-7777), write/load contention could cause errors.

      A simple remedy is to use a unique output directory for each MR job.
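
      As a rough illustration of the idea, the sketch below creates a fresh directory for each job by appending a random UUID to the table name. The class name UniqueOutputDirs, the method create, and the <base>/<keyspace>/<table>-<uuid> layout are assumptions made for this example; they are not taken from the actual patch or from the CqlBulkOutputFormat API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper, not part of Cassandra: gives each job its own
// output directory so concurrent jobs never share sstable files.
final class UniqueOutputDirs {

    // Assumed layout: <baseDir>/<keyspace>/<table>-<random UUID>
    static Path create(Path baseDir, String keyspace, String table) throws IOException {
        Path dir = baseDir.resolve(keyspace).resolve(table + "-" + UUID.randomUUID());
        // Creates any missing parent directories as well.
        return Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("bulk-output");
        // Two jobs writing to the same table on the same node
        // receive two distinct directories.
        Path first = create(base, "my_keyspace", "my_table");
        Path second = create(base, "my_keyspace", "my_table");
        System.out.println(first);
        System.out.println(second);
    }
}
```

      With a layout like this, each load process would only see the files its own job wrote, and cleaning up one job's directory would not touch files another job is still writing or loading.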

          People

            Paul Pak (sixpak32577)
            Piotr Kolaczkowski