  1. TinkerPop
  2. TINKERPOP-2454

OOM error when running gremlin queries asynchronously with JAVA

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: 3.4.6, 3.4.8
    • Fix Version/s: None
    • Component/s: driver
    • Labels:
      None

      Description

      We have created a REST API that executes a Gremlin query on JanusGraph and returns the result in JSON format. The API works fine for small result sets. But for large result sets, when we hit the API asynchronously, it gives the following error (max heap size -Xmx4g):

      java.lang.OutOfMemoryError: GC overhead limit exceeded

      I am using curl with a trailing & to hit the API asynchronously, so that the requests run concurrently:

      curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
      curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
      curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
      curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &

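      Code to connect to JanusGraph:

      import java.util.Iterator;
      import org.apache.tinkerpop.gremlin.driver.Client;
      import org.apache.tinkerpop.gremlin.driver.Cluster;
      import org.apache.tinkerpop.gremlin.driver.Result;
      import org.apache.tinkerpop.gremlin.driver.ResultSet;

      Cluster cluster = Cluster.open(config);
      Client connect = cluster.connect();

      ResultSet submit = connect.submit(gremlin);
      Iterator<Result> resultIterator = submit.iterator();
      int count = 0;
      while (resultIterator.hasNext()) {
          // Adding each result to a list is commented out to check for the OOM error.
          resultIterator.next(); // consume the result so that the loop advances
          count++;
      }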
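      The loop above can also hand results off in small batches instead of collecting them into one list. The following is a minimal sketch under that assumption, using only the Java standard library. `BatchDrain` and `drainInBatches` are hypothetical names, not part of gremlin-driver. It only bounds what the application code itself keeps referenced and does not change how much the driver buffers internally.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of gremlin-driver. */
final class BatchDrain {
    private BatchDrain() {}

    /**
     * Pulls items from the iterator and passes them to the sink in batches of
     * at most batchSize, so this code references at most one batch at a time.
     * Returns the total number of items consumed.
     */
    static <T> int drainInBatches(Iterator<T> source, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int total = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            total++;
            if (batch.size() == batchSize) {
                sink.accept(batch);
                // Start a fresh list so the previous batch can be garbage collected
                // once the sink is done with it.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // flush the final partial batch
        }
        return total;
    }
}
```

      With the iterator from the snippet above, a call could look like `BatchDrain.drainInBatches(resultIterator, 200, batch -> writeJson(batch))`, where `writeJson` stands for whatever (hypothetical) method streams a batch into the HTTP response.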

       
      Configurations,

      config.setProperty("connectionPool.maxContentLength", "50000000");
      config.setProperty("connectionPool.maxInProcessPerConnection", "30");
      config.setProperty("connectionPool.maxSize", "30");
      config.setProperty("connectionPool.minSize", "1");
      config.setProperty("connectionPool.resultIterationBatchSize", "200");
       

      Gremlin driver,
       org.apache.tinkerpop:gremlin-driver:3.4.6
       

      The query returns around 17K records, about 80 MB in total.

      How can a large result set be handled like a cursor, so that not all the data is loaded into memory?
      Is there any configuration that I am missing?
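      Since the error only shows up when several large requests run at the same time, one possible mitigation at the API layer is to cap how many of them are in flight. This is a minimal sketch under that assumption, using only the Java standard library. `QueryGate` is a hypothetical name, this is not a driver setting, and it does not reduce the memory needed by a single request.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper for illustration: limits how many tasks run at once. */
final class QueryGate {
    private final Semaphore permits;

    QueryGate(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Waits for a free permit, runs the task, and always returns the permit. */
    <T> T run(Supplier<T> task) {
        permits.acquireUninterruptibly();
        try {
            return task.get();
        } finally {
            permits.release(); // released even if the task throws
        }
    }

    /** Number of tasks that could start right now without waiting. */
    int availablePermits() {
        return permits.availablePermits();
    }
}
```

      A request handler could wrap the query execution as `gate.run(() -> executeQuery(gremlin))`, where `executeQuery` is a placeholder for the submit-and-iterate code shown earlier. Callers beyond the limit wait for a permit instead of adding another large result set to the heap.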

      From profiling, it is clear that the gremlin driver is causing the issue but I am not sure how to fix it and release the memory.

      Please let me know if you need more details.

       

      Thanks.

        Attachments

        1. jc8rc.png
          488 kB
          Vikas Yadav
        2. wUqam.png
          421 kB
          Vikas Yadav

            People

            • Assignee:
              Unassigned
              Reporter:
              vikasyadav Vikas Yadav