CASSANDRA-7776

Allow multiple MR jobs to concurrently write to the same column family from the same node using CqlBulkOutputFormat

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • 2.1.1
    • None

    Description

      After the sstable files are written, every file in the specified output directory is loaded (transferred) to the remote Cassandra cluster. If multiple writes to the same table (i.e. the same directory) occur on a node, each load process transfers everything in that directory, so the same sstable files are sent multiple times. Furthermore, if cleanup of successful output directories is enabled (CASSANDRA-7777), write/load contention could cause errors.

      This can be remedied simply by using a unique output directory for each MR job.
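
      A minimal sketch of the idea, not the attached patch: the class name `UniqueOutputDir`, the method `forJob`, and the `<base>/<uuid>/<keyspace>/<table>` layout are all assumptions made for illustration. The random component is placed above the keyspace and table directories on the assumption that the loader derives the keyspace and table names from the last two path components.

```java
import java.io.File;
import java.util.UUID;

// Hypothetical helper, not part of Cassandra: gives each job its own
// output directory so concurrent jobs never share sstable files.
final class UniqueOutputDir {

    // Builds <baseDir>/<random-uuid>/<keyspace>/<table>.
    // The path is only computed here; the caller creates it (e.g. mkdirs()).
    static File forJob(File baseDir, String keyspace, String table) {
        String jobId = UUID.randomUUID().toString();
        File jobRoot = new File(baseDir, jobId);
        return new File(new File(jobRoot, keyspace), table);
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"));
        // Two jobs writing to the same table get distinct directories.
        System.out.println(forJob(base, "ks", "tbl"));
        System.out.println(forJob(base, "ks", "tbl"));
    }
}
```

      With a layout like this, each load process sees only the files its own job wrote, and cleaning up one job's directory cannot remove files another job is still writing or loading.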

      Attachments

        1. trunk-7776-v1.txt
          1 kB
          Paul Pak

          People

            sixpak32577 Paul Pak
            sixpak32577 Paul Pak
            Paul Pak
            Piotr Kolaczkowski
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: