Cassandra / CASSANDRA-8404

CQLSSTableLoader cannot create an SSTable for a CSV file of 10M rows


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Cannot Reproduce
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
    • Severity:
      Normal
    • Since Version:

      Description

      I am able to create an SSTable from one file of 10M rows but not from another. The data file that works is subscribers1.gz; the one that does not work is subscribers2.gz. Both files have the same values in the first column but different values in the second column. I do not understand why CQLSSTableLoader fails for a different set of data.
      The program expects unzipped text files, so please unzip the files before running it. I observe high GC activity once the program has processed around 5.2M lines of subscribers2.gz. It continues to about 5.8M lines with very frequent full GC runs, and it cannot process beyond 5.8M rows because it runs out of memory.
      I have attached Test1.java and the cassandra.yaml I used to create the SSTable. The classpath includes all jars from the lib folder of the extracted apache-cassandra-2.1.1-bin.tar.gz.
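
      To pin down where the heap starts to fill, a progress reporter along the following lines could be wrapped around the read loop. This is only a sketch using the standard JDK: the class name HeapProgress and its methods are hypothetical and are not part of the attached Test1.java, and the SSTable writing step is left as a comment because the table schema is not shown here.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not taken from Test1.java.
class HeapProgress {

    /** Total number of collections reported by all collectors in this JVM. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Heap currently in use, in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    /** Reads every line and prints heap and GC figures every reportEvery lines. */
    static long countLines(Reader in, long reportEvery) {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            while (reader.readLine() != null) {
                lines++;
                // The real program would parse the line and add the row to the SSTable writer here.
                if (lines % reportEvery == 0) {
                    System.out.printf("lines=%d usedHeapMB=%d gcCount=%d%n",
                            lines, usedHeapMb(), totalGcCount());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HeapProgress <unzipped data file>");
            return;
        }
        // Report every 100,000 lines; the interval is an arbitrary choice.
        long total = countLines(new FileReader(args[0]), 100_000);
        System.out.println("total lines: " + total);
    }
}
```

      Running the same loop over both unzipped files, with the row-writing step added, would show whether used heap grows steadily for subscribers2 only, or for both files at different rates.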

      Jira does not allow attachments larger than 10 MB, so I am sharing the data files on Google Drive.
      Link to download subscribers1.gz:
      https://drive.google.com/file/d/0B6_-ugKWlrfoOTRTa2FCNTFWU2c/view?usp=sharing

      Link to download subscribers2.gz:
      https://drive.google.com/file/d/0B6_-ugKWlrfocndycm9yM21rN0E/view?usp=sharing

        Attachments

        1. Test1.java
          2 kB
          Manish
        2. cassandra.yaml
          35 kB
          Manish

            People

            • Assignee:
              Unassigned
            • Reporter:
              manish.kothawade Manish
            • Votes:
              0
            • Watchers:
              3
