CASSANDRA-1473

Implement a Cassandra aware Hadoop mapreduce.Partitioner


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Incomplete

    Description

      When using an IPartitioner that does not sort data in byte order (RandomPartitioner, for example) with Cassandra's Hadoop integration, Hadoop is unaware of the output order of the data.

      We can make Hadoop aware of the proper order of the output data by implementing Hadoop's mapreduce.Partitioner. Hadoop will then handle sorting all of the data according to Cassandra's IPartitioner, and the writing clients will be able to connect to fewer Cassandra nodes.
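
A minimal sketch of the idea, not the implementation proposed for this issue. It assumes RandomPartitioner-style tokens (the absolute value of the key's MD5 digest, in the range 0 to 2^127) and assumes each Hadoop partition owns an equal slice of that token space. The class and method names are hypothetical; a real version would extend org.apache.hadoop.mapreduce.Partitioner, whose getPartition(key, value, numPartitions) method this mirrors, and would probably use the ring's actual token ranges rather than equal slices.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: maps row keys to partitions in token order.
// Not part of Cassandra or Hadoop; all names here are illustrative.
final class TokenRangePartitionSketch {
    // Assumed token space of a RandomPartitioner-style ring: 0 .. 2^127.
    static final BigInteger TOKEN_SPACE = BigInteger.ONE.shiftLeft(127);

    // Token = absolute value of the MD5 digest read as a signed integer.
    static BigInteger token(byte[] key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key);
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Assign a token to one of numPartitions equal slices of the token
    // space, so that a larger token never lands in a lower partition.
    static int partitionForToken(BigInteger token, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        BigInteger slice = token
                .multiply(BigInteger.valueOf(numPartitions))
                .divide(TOKEN_SPACE);
        // A token equal to 2^127 would overflow the last slice; clamp it.
        return Math.min(slice.intValue(), numPartitions - 1);
    }

    // Same shape as Partitioner.getPartition, minus the unused value.
    static int getPartition(byte[] key, int numPartitions) {
        return partitionForToken(token(key), numPartitions);
    }
}
```

With this mapping, each reducer receives keys from one contiguous token range, which is what would let a writing client talk only to the nodes that own that range.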

          People

            Assignee: Unassigned
            Reporter: Stu Hood (stuhood)
            Votes: 4
            Watchers: 3