CASSANDRA-1278

Make bulk loading into Cassandra less crappy, more pluggable

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.8.1
    • Component/s: Tools
    • Labels:
      None

      Description

      Currently, bulk loading into Cassandra is a black art. People are either told to just do it responsibly with Thrift or a higher-level client, or they have to explore the contrib/bmt example (http://wiki.apache.org/cassandra/BinaryMemtable). That contrib module requires delving into the code to find out how it works and then applying it to the problem at hand. With either method, the user also needs to keep in mind that it is possible to overload the cluster, which will hopefully be addressed in CASSANDRA-685.

      This improvement would create a contrib module or a set of documents dealing with bulk loading. It could perhaps also include code in the core to make bulk loading more pluggable for different types of external clients.

      Bulk loading their data is something that many people who are new to Cassandra need to do.

        Attachments

        1. 0001-Add-bulk-loader-utility-v2.patch
          36 kB
          Sylvain Lebresne
        2. 0001-Add-bulk-loader-utility.patch
          33 kB
          Sylvain Lebresne
        3. 1278-cassandra-0.7-v2.txt
          338 kB
          Matthew F. Dennis
        4. 1278-cassandra-0.7.1.txt
          159 kB
          Matthew F. Dennis
        5. 1278-cassandra-0.7.txt
          159 kB
          Matthew F. Dennis

              People

              • Assignee:
                slebresne Sylvain Lebresne
                Reporter:
                jeromatron Jeremy Hanna
                Reviewer:
                Jonathan Ellis
              • Votes:
                2
                Watchers:
                16

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  40h
                  Remaining:
                  0h
                  Logged:
                  40h 40m