  1. Ignite
  2. IGNITE-12543

Putting a List<List<SomeObject>> greatly inflates the stored data size.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6
    • Fix Version/s: 2.9
    • Component/s: thin client
    • Labels: None
    • Flags: Important
    • Ignite Flags: Docs Required, Release Notes Required

    Description

      I use the Java thin client from Ignite 2.6.

       

      When I put data of the form List<List<SomeObject>>, the original 200 KB of data had grown to 50 MB when queried on the Ignite servers.

      A heap dump showed that the list elements were accumulated repeatedly, which increased the data size.

       

      I checked the doWriteBinaryObject() method in org.apache.ignite.internal.binary.BinaryWriterExImpl.java:

      // org.apache.ignite.internal.binary.BinaryWriterExImpl.java
      
          public void doWriteBinaryObject(@Nullable BinaryObjectImpl po) {
              if (po == null)
                  out.writeByte(GridBinaryMarshaller.NULL);
              else {
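                  // array() is the object's backing array; the report indicates it is
                  // shared with the enclosing collection rather than limited to this object.
                  byte[] poArr = po.array();
                  // 1 type byte + 4 length bytes + payload + 4 offset bytes
                  out.unsafeEnsure(1 + 4 + poArr.length + 4);
                  out.unsafeWriteByte(GridBinaryMarshaller.BINARY_OBJ);
                  out.unsafeWriteInt(poArr.length);
                  out.writeByteArray(poArr);
                  // Offset of this object inside the array written above
                  out.unsafeWriteInt(po.start());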
              }
          }
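
To illustrate why this write pattern can inflate the payload, here is a minimal, self-contained sketch. It is not Ignite code: `SharedArrayDemo`, `View`, `writeWholeArray` and `writeSliceOnly` are hypothetical names, and the sketch assumes that every nested object is a window into one backing array shared by all of its siblings, as the report suggests. `writeSliceOnly` is included only for comparison and is not claimed to be the fix that was applied.

```java
import java.io.ByteArrayOutputStream;

final class SharedArrayDemo {
    /** Hypothetical stand-in for a binary object: a window into a shared backing array. */
    static final class View {
        final byte[] backing;
        final int start;
        final int length;

        View(byte[] backing, int start, int length) {
            this.backing = backing;
            this.start = start;
            this.length = length;
        }
    }

    /** Writes a 4-byte big-endian int. */
    static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(v >>> 24);
        out.write(v >>> 16);
        out.write(v >>> 8);
        out.write(v);
    }

    /** Same shape as the quoted method: type byte, array length, whole array, start offset. */
    static void writeWholeArray(ByteArrayOutputStream out, View v) {
        out.write(1);
        writeInt(out, v.backing.length);
        out.write(v.backing, 0, v.backing.length);
        writeInt(out, v.start);
    }

    /** For comparison: copy only the object's own bytes, so the offset becomes 0. */
    static void writeSliceOnly(ByteArrayOutputStream out, View v) {
        out.write(1);
        writeInt(out, v.length);
        out.write(v.backing, v.start, v.length);
        writeInt(out, 0);
    }

    /** Returns {bytes written with whole-array copies, bytes written with slices}. */
    static int[] measure(int objects, int objectSize) {
        byte[] backing = new byte[objects * objectSize];
        ByteArrayOutputStream whole = new ByteArrayOutputStream();
        ByteArrayOutputStream slices = new ByteArrayOutputStream();
        for (int i = 0; i < objects; i++) {
            View v = new View(backing, i * objectSize, objectSize);
            writeWholeArray(whole, v);
            writeSliceOnly(slices, v);
        }
        return new int[] {whole.size(), slices.size()};
    }

    public static void main(String[] args) {
        // 50 objects of 4 KB each share one 200 KB backing array.
        int[] sizes = measure(50, 4096);
        System.out.println("whole-array writes: " + sizes[0] + " bytes");
        System.out.println("slice-only writes:  " + sizes[1] + " bytes");
    }
}
```

Under these assumptions, 50 objects sharing a 200 KB array produce roughly 10 MB of output with the whole-array variant, while the slice-only variant stays close to 200 KB.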
      

       

      The current Ignite implementation stores data of the form List<List<SomeObject>> as follows.

      In the marshalling stage, take for example a List (5 members) of List (10 members) of SomeObject (size: 200 KB):

      the list's elements are duplicated as many as 10 × 5 times.

      If the data above is 200 KB in size and holds five lists of ten objects each,

      50 copies are stored, so 200 KB × 10 × 5 = 10 MB of data is used for cache storage and transfer.
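
The multiplication quoted above can be checked with a few lines of Java. This is only a back-of-the-envelope sketch: `DuplicationEstimate` and `inflatedBytes` are hypothetical names, and it assumes that each of the 5 × 10 nested elements carries one full 200 KB copy. With those figures the product is 10,240,000 bytes, roughly 10 MB.

```java
final class DuplicationEstimate {
    /** Bytes written if every nested element carries a full copy of the payload. */
    static long inflatedBytes(int outerLists, int innerElements, long payloadBytes) {
        long copies = (long) outerLists * innerElements;
        return copies * payloadBytes;
    }

    public static void main(String[] args) {
        long payload = 200L * 1024; // 200 KB, the figure from the report
        long total = inflatedBytes(5, 10, payload);
        System.out.println(total + " bytes, " + (total / payload) + "x the payload");
    }
}
```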

      We confirmed that this increase in data size occupies heap memory and causes failures such as OOM errors and GC problems.

      Redundant data is unnecessarily used for cache storage and network transport.

      When cache data is looked up, only the relevant part of the full data is read, based on the stored position information, so the correct data is still retrieved.
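
The reason reads still return correct data can be sketched as follows. The layout here (a 4-byte length followed by UTF-8 bytes) is a simplified assumption for illustration, not Ignite's binary format, and `OffsetReadDemo`, `pack` and `readAt` are hypothetical names. A reader that is handed the whole array plus a start offset simply ignores every byte outside the object it was asked for.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class OffsetReadDemo {
    /** Packs strings back to back, each as a 4-byte length followed by UTF-8 bytes. */
    static byte[] pack(String... values) {
        byte[][] encoded = new byte[values.length][];
        int size = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += 4 + encoded[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] e : encoded)
            buf.putInt(e.length).put(e);
        return buf.array();
    }

    /** Reads the single record that begins at the given offset; the rest of the array is ignored. */
    static String readAt(byte[] backing, int start) {
        int len = ByteBuffer.wrap(backing).getInt(start);
        return new String(backing, start + 4, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] backing = pack("alpha", "beta", "gamma");
        // "beta" starts after the 9 bytes used by "alpha" (4-byte length + 5 bytes).
        System.out.println(readAt(backing, 9));
    }
}
```

This is why the duplication goes unnoticed functionally: each copy of the array is larger than needed, but the offset always points at the right record.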

      The current implementation is functionally correct, but its inefficient algorithm wastes memory and network bandwidth.

      This can have very serious consequences. Please check.

       

       

            People

              Assignee: Unassigned
              Reporter: redcomet LEE PYUNG BEOM
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h