Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-7271

Bulk data loading in Cassandra causing OOM

Agile BoardAttach filesAttach ScreenshotBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Not A Problem
    • None
    • None

    Description

      I am trying to load data from a csv file into Cassandra table using SSTableSimpleUnsortedWriter. As the latest maven cassandra dependencies have some issues with it, I have taken the next beta (rc) version cut as suggested in CASSANDRA-7218. But after taking it, I am facing issues with bulk data loading

      Here is the piece of code which loads data:

      public void loadData(TableDefinition tableDefinition, InputStream csvInputStream){
      
          createDataInDBFormat(tableDefinition, csvInputStream);
      
          Path dbFilePath = Paths.get(TEMP_DIR, keyspace, tableDefinition.getName());
          //BulkLoader.main(new String[]{"-d","localhost",dbFilePath.toUri().getPath()});
      
          try {
              JMXServiceURL jmxUrl = new JMXServiceURL(String.format(
                      "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", cassandraHost, cassandraJMXPort));
              JMXConnector connector = JMXConnectorFactory.connect(jmxUrl, new HashMap<String, Object>());
              MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
              ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
              StorageServiceMBean storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
      
              storageBean.bulkLoad(dbFilePath.toUri().getPath());
              connector.close();
      
          } catch (IOException | MalformedObjectNameException e) {
              e.printStacktrace()
          }
          FileUtils.deleteQuietly(dbFilePath.toFile());
      }
      
      private void createDataInDBFormat(TableDefinition tableDefinition, InputStream csvInputStream) {
          try(Reader reader = new InputStreamReader(csvInputStream)){
      
              String tableName = tableDefinition.getName();
              File directory = Paths.get(TEMP_DIR, keyspace, tableName).toFile();
              directory.mkdirs();
      
              String yamlPath = "file:\\"+CASSANDRA_HOME+File.separator+"conf"+File.separator+"cassandra.yaml";
              System.setProperty("cassandra.config", yamlPath);
      
              SSTableSimpleUnsortedWriter writer = new SSTableSimpleUnsortedWriter(
                      directory, new Murmur3Partitioner(), keyspace,
                      tableName, AsciiType.instance, null, 10);
              long timestamp = System.currentTimeMillis() * 1000;
      
              CSVReader csvReader = new CSVReader(reader);
              String[] colValues = null;
              List<ColumnDefinition> columnDefinitions = tableDefinition.getColumnDefinitions();
              while((colValues = csvReader.readNext()) != null){
                  if(colValues.length != 0){
                      writer.newRow(bytes(colValues[0]));
                      for(int index = 1; index< colValues.length; index++){
                          ColumnDefinition columnDefinition = columnDefinitions.get(index);       
                          writer.addColumn(bytes(columnDefinition.getName()), bytes(colValues[index]), timestamp);
                      }
                  }
              }
              csvReader.close();
              writer.close();
          } catch (IOException e) {
              e.printStacktrace();
          }
      }
      

      On trying to run loadData, it is giving me the following exception:

      11:23:18.035 [45742123@qtp-1703018180-0] ERROR com.adaequare.common.config.TransactionPerRequestFilter.doInTransactionWithoutResult 39 - Problem in executing request : [http://localhost:8081/mapro-engine/rest/masterdata/pumpData]. Cause :org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
      javax.servlet.ServletException: org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
              at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:392) ~[jersey-container-servlet-core-2.6.jar:na]
              at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:381) ~[jersey-container-servlet-core-2.6.jar:na]
              at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:344) ~[jersey-container-servlet-core-2.6.jar:na]
              at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:219) ~[jersey-container-servlet-core-2.6.jar:na]
              at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) ~[jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166) [jetty-6.1.25.jar:6.1.25]
              at com.adaequare.common.config.TransactionPerRequestFilter$1.doInTransactionWithoutResult(TransactionPerRequestFilter.java:37) ~[mapro-commons-1.0.jar:na]
              at org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:34) [spring-tx-4.0.3.RELEASE.jar:4.0.3.RELEASE]
      :4.0.3.RELEASE]
              at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) [spring-tx-4.0.3.RELEASE.jar:4.0.3.RELEASE]
              at com.adaequare.common.config.TransactionPerRequestFilter.doFilter(TransactionPerRequestFilter.java:33) [mapro-commons-1.0.jar:na]
              at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.Server.handle(Server.java:326) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) [jetty-6.1.25.jar:6.1.25]
              at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) [jetty-util-6.1.25.jar:6.1.25]
      Caused by: org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
              at org.glassfish.jersey.servlet.internal.ResponseWriter.rethrow(ResponseWriter.java:249) ~[jersey-container-servlet-core-2.6.jar:na]
              at org.glassfish.jersey.servlet.internal.ResponseWriter.failure(ResponseWriter.java:231) ~[jersey-container-servlet-core-2.6.jar:na]
              at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:433) ~[jersey-server-2.6.jar:na]
              at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:265) ~[jersey-server-2.6.jar:na]
              at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) ~[jersey-common-2.6.jar:na]
              at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) ~[jersey-common-2.6.jar:na]
              at org.glassfish.jersey.internal.Errors.process(Errors.java:315) ~[jersey-common-2.6.jar:na]
              at org.glassfish.jersey.internal.Errors.process(Errors.java:297) ~[jersey-common-2.6.jar:na]
              at org.glassfish.jersey.internal.Errors.process(Errors.java:267) ~[jersey-common-2.6.jar:na]
              at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:319) ~[jersey-common-2.6.jar:na]
              at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:236) ~[jersey-server-2.6.jar:na]
              at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1028) ~[jersey-server-2.6.jar:na]
              at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:373) ~[jersey-container-servlet-core-2.6.jar:na]
              ... 26 common frames omitted
      Caused by: java.lang.OutOfMemoryError: Java heap space
      

      Here are my GRADLE_OPTS: -Xms1024m -Xmx1024m -XX:MaxPermSize=256m -Xdebug -Xrunjdwp:transport=dt_socket,address=9999,server=y,suspend=n

      And my csv file hardly contains 2 lines of data. Not sure what is causing OOM here? Is there some problem with the latest code I have taken or am I missing something here?

      Attachments

        1. Classes.zip
          4 kB
          Prasanth Gullapalli
        2. BulkLoadFiles.zip
          12 kB
          Prasanth Gullapalli

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            mshuler Michael Shuler Assign to me
            prasanthnath Prasanth Gullapalli
            Michael Shuler
            Prasanth Gullapalli Prasanth Gullapalli
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment