SOLR-9454: Reduce object allocation during indexing because of JavaBinCodec.writeExternString()


Details

    Description

      I set up Java Flight Recorder to profile indexing a 650MB JSON file using bin/post on a 2-shard, 2-replica setup. The profile shows that the JavaBinCodec.writeExternString(String) method contributes a lot of garbage during indexing in SolrCloud. More specifically, it contributes ~1GB of HashMap$Node objects and ~450MB of HashMap$Node[] objects.

      Most of this allocation happens because every request is serialized using a new instance of JavaBinUpdateRequestCodec, which internally allocates a new HashMap for storing the extern strings.

      We should explore keeping a global extern string map to eliminate redundant allocations.
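
      As a rough illustration of the allocation pattern described above, the sketch below models an encoder that keeps its extern strings in a per-instance HashMap. All names here (ExternStringSketch, Encoder, encodeRequest) are hypothetical and the wire format is simplified; this is not the actual JavaBinCodec code, only a minimal model of why a new codec instance per request produces fresh HashMap$Node and HashMap$Node[] objects every time.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a codec that externs repeated strings; not Solr's actual class. */
public class ExternStringSketch {

    /** Encoder with per-instance state: every new Encoder allocates its own HashMap. */
    static final class Encoder {
        private final Map<String, Integer> externStrings = new HashMap<>();
        private final DataOutputStream out;

        Encoder(DataOutputStream out) {
            this.out = out;
        }

        /** First occurrence writes 0 plus the string; repeats write only the 1-based index. */
        void writeExternString(String s) throws IOException {
            Integer idx = externStrings.get(s);
            if (idx != null) {
                out.writeInt(idx);
                return;
            }
            out.writeInt(0);
            out.writeUTF(s);
            // Each new key allocates a HashMap$Node; the first put also allocates the Node[] table.
            externStrings.put(s, externStrings.size() + 1);
        }
    }

    /** Encodes one "request" with a brand-new Encoder and returns the number of bytes written. */
    public static int encodeRequest(List<String> fieldNames) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        Encoder encoder = new Encoder(out); // fresh map per request, garbage once the request is done
        for (String name : fieldNames) {
            encoder.writeExternString(name);
        }
        out.flush();
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        List<String> fields = Arrays.asList("id", "title", "id", "title", "id");
        // Simulate several update requests; each one rebuilds the same extern string map.
        for (int i = 0; i < 3; i++) {
            System.out.println("request " + i + " encoded bytes: " + encodeRequest(fields));
        }
    }
}
```

      In this model the same few field names are re-inserted into a new map for every request, so the map, its table array, and its nodes all become garbage as soon as the request has been written. Note that the indices are assigned in order of first appearance within one stream, so any shared or long-lived map would have to stay consistent with what the reader expects.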

      Attachments

        1. HashMapNode_Allocations.png
          163 kB
          Shalin Shekhar Mangar
        2. HashMapNodeArray_Allocations.png
          162 kB
          Shalin Shekhar Mangar
        3. javabin-optimization.patch
          1 kB
          Noble Paul
        4. SOLR-9454.patch
          5 kB
          Noble Paul


          People

            Assignee: Unassigned
            Reporter: Shalin Shekhar Mangar (shalin)
