Cassandra / CASSANDRA-7776

Allow multiple MR jobs to concurrently write to the same column family from the same node using CqlBulkOutputFormat

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 2.1.1
    • Component/s: None
    • Labels:

      Description

      After the SSTable files are written, every file in the specified output directory is loaded (transferred) to the remote Cassandra cluster. If multiple jobs on the same node write to the same table (and therefore to the same directory), each job's load process transfers every SSTable file it finds there, so the same files are transferred multiple times. Furthermore, if cleanup of successful output directories is enabled (CASSANDRA-7777), write/load contention between the jobs could cause errors.

      A simple remedy is to give each MR job its own unique output directory.
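
      As a rough illustration of the unique-directory idea (not the attached patch), the sketch below places a random UUID between a base directory and the keyspace/table path, so two jobs writing to the same table on the same node never share a directory. The class name `UniqueOutputDir`, the helper `uniqueOutputDir`, and the `<base>/<uuid>/<keyspace>/<table>` layout are all assumptions made for this example.

```java
import java.io.File;
import java.util.UUID;

public class UniqueOutputDir {

    // Hypothetical helper (not part of Cassandra's API): builds
    // <baseDir>/<random UUID>/<keyspace>/<table>, so each call yields
    // a directory that no other job will write to or load from.
    static File uniqueOutputDir(String baseDir, String keyspace, String table) {
        String jobToken = UUID.randomUUID().toString();
        File jobRoot = new File(baseDir, jobToken);
        return new File(new File(jobRoot, keyspace), table);
    }

    public static void main(String[] args) {
        String base = System.getProperty("java.io.tmpdir");

        // Two jobs targeting the same keyspace and table on the same node.
        File first = uniqueOutputDir(base, "ks", "tbl");
        File second = uniqueOutputDir(base, "ks", "tbl");

        System.out.println("job 1 writes to: " + first);
        System.out.println("job 2 writes to: " + second);
        System.out.println("same directory?  " + first.equals(second));
    }
}
```

      Keeping the keyspace and table as the last two path components is a choice made here so that the directory still identifies the target table; whether the loader relies on that layout should be checked against the Cassandra version in use.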

        Attachments

        1. trunk-7776-v1.txt
          1 kB
          Paul Pak

            People

            • Assignee:
              sixpak32577 Paul Pak
              Reporter:
              sixpak32577 Paul Pak
              Authors:
              Paul Pak
              Reviewers:
              Piotr Kolaczkowski
