  1. Lucy
  2. LUCY-326

C lib: Possible memory leak in SnowStemmer when the schema provided to the indexer is not DECREF'd

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.6.1
    • Fix Version/s: None
    • Component/s: C bindings
    • Labels:
      None
    • Environment:

      linux

      Description

      In my C library I create a static global struct when the program is loaded; it holds some runtime variables and a lucy_Schema pointer. A destroy function cleans up the runtime data and also DECREFs the schema. When I index some documents by passing this schema to the indexer and call the destroy function before the program using the library exits, valgrind reports no memory leaks; only "still reachable" is non-zero, because of the lucy_bootstrap_parcel function.
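
      For illustration, the setup described above might look roughly like the following minimal sketch. All names here (fake_schema_t, runtime_t, runtime_init, runtime_destroy) are hypothetical, and fake_schema_t is only a stand-in for a refcounted object such as lucy_Schema; none of this is Lucy's API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as lucy_Schema. */
typedef struct {
    int refcount;
} fake_schema_t;

/* Runtime state kept in a static global, as described in the report. */
typedef struct {
    fake_schema_t *schema;
    int initialized;
} runtime_t;

static runtime_t g_runtime;

/* Drop one reference and free the object when the count reaches zero. */
static void fake_schema_decref(fake_schema_t *s) {
    if (s != NULL && --s->refcount == 0) {
        free(s);
    }
}

/* Create the runtime data. Returns 0 on success, -1 on allocation failure. */
int runtime_init(void) {
    if (g_runtime.initialized) {
        return 0;
    }
    fake_schema_t *s = malloc(sizeof *s);
    if (s == NULL) {
        return -1;
    }
    s->refcount = 1;
    g_runtime.schema = s;
    g_runtime.initialized = 1;
    return 0;
}

/* Release the runtime data. Safe to call more than once. */
void runtime_destroy(void) {
    if (!g_runtime.initialized) {
        return;
    }
    fake_schema_decref(g_runtime.schema);
    g_runtime.schema = NULL;
    g_runtime.initialized = 0;
}

/* Report whether the runtime data currently exists. */
int runtime_is_initialized(void) {
    return g_runtime.initialized;
}
```

      A program using such a library could register runtime_destroy with atexit() so that the cleanup also runs on a normal exit, which is one way to avoid depending on every caller remembering to invoke it.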

      On the other hand, if I do not call the destroy function before exit, I would expect to see only an increase in the "still reachable" blocks in the valgrind output, but I also see "possibly lost", as follows:

      ---------------------------------------------------------------------------------------------------

      ==16942== 70 bytes in 1 blocks are possibly lost in loss record 147 of 178
      ==16942== at 0x4C29B78: realloc (vg_replace_malloc.c:785)
      ==16942== by 0x4F86CC4: increase_size (utilities.c:332)
      ==16942== by 0x4F87865: replace_s (utilities.c:360)
      ==16942== by 0x4EF4195: SN_set_current (api.c:62)
      ==16942== by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
      ==16942== by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
      ==16942== by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
      ==16942== by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
      ==16942== by 0x4F15368: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
      ==16942== by 0x4F15368: LUCY_Inverter_Add_Field_IMP (Inverter.c:181)
      ==16942== by 0x4F14E91: LUCY_Inverter_Add_Field (Inverter.h:296)
      ==16942== by 0x4F14E91: LUCY_Inverter_Invert_Doc_IMP (Inverter.c:109)
      ==16942== by 0x4F63164: LUCY_Inverter_Invert_Doc (Inverter.h:275)
      ==16942== by 0x4F63164: LUCY_SegWriter_Add_Doc_IMP (SegWriter.c:109)
      ==16942== by 0x4F7E069: LUCY_Indexer_Add_Doc (Indexer.h:260)
      ==16942== by 0x4F7F23F: index_messages_json (Search.c:432)
      ==16942==
      ==16942== LEAK SUMMARY:
      ==16942== definitely lost: 0 bytes in 0 blocks
      ==16942== indirectly lost: 0 bytes in 0 blocks
      ==16942== possibly lost: 70 bytes in 1 blocks
      ==16942== still reachable: 246,683 bytes in 5,077 blocks
      ==16942== suppressed: 0 bytes in 0 blocks

      ---------------------------------------------------------------------------------------------------
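
      One possible reading of why this block shows up as "possibly lost" and not "still reachable": Valgrind uses "possibly lost" when the only pointers it can find point into the interior of a heap block instead of to its start. The increase_size and replace_s frames in the trace are in Snowball's utilities.c, which appears to keep a small header in front of each string buffer and to hand out a pointer just past that header. The sketch below illustrates that allocation pattern with hypothetical names (buf_create, buf_grow, buf_free); it is a simplified illustration under that assumption, not Snowball's actual code.

```c
#include <stdlib.h>

/* Two ints of bookkeeping (capacity, size) sit in front of the data. */
#define BUF_HEAD (2 * sizeof(int))
#define BUF_CAPACITY(p) (((int *)(p))[-2])
#define BUF_SIZE(p)     (((int *)(p))[-1])

/* Allocate a buffer and return a pointer PAST the header.
 * Callers only ever hold this interior pointer, never the block start. */
unsigned char *buf_create(int capacity) {
    void *mem = malloc(BUF_HEAD + (size_t)capacity);
    if (mem == NULL) {
        return NULL;
    }
    unsigned char *p = (unsigned char *)mem + BUF_HEAD;
    BUF_CAPACITY(p) = capacity;
    BUF_SIZE(p) = 0;
    return p;
}

/* Grow the buffer. Returns the new interior pointer, or NULL on failure
 * (in which case the old buffer is left untouched). */
unsigned char *buf_grow(unsigned char *p, int new_capacity) {
    void *mem = realloc(p - BUF_HEAD, BUF_HEAD + (size_t)new_capacity);
    if (mem == NULL) {
        return NULL;
    }
    unsigned char *q = (unsigned char *)mem + BUF_HEAD;
    BUF_CAPACITY(q) = new_capacity;
    return q;
}

/* Free the buffer by stepping back to the real start of the block. */
void buf_free(unsigned char *p) {
    if (p != NULL) {
        free(p - BUF_HEAD);
    }
}
```

      If a buffer like this is still referenced at exit but never released, a leak checker sees only the interior pointer, so it cannot classify the block as plainly reachable. Under this reading, the "possibly lost" entry would be the same kind of not-freed-at-exit memory as the "still reachable" blocks, just reported in a different category.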

      Another program that only searches (no indexing) shows similar behaviour. Its valgrind output is below:

      -----------------------------------------------------------------------------------------------------

      ==16949==
      ==16949== HEAP SUMMARY:
      ==16949== in use at exit: 229,312 bytes in 5,061 blocks
      ==16949== total heap usage: 34,993 allocs, 29,932 frees, 1,791,083 bytes allocated
      ==16949==
      ==16949== 37 bytes in 1 blocks are possibly lost in loss record 96 of 177
      ==16949== at 0x4C29B78: realloc (vg_replace_malloc.c:785)
      ==16949== by 0x4F86CC4: increase_size (utilities.c:332)
      ==16949== by 0x4F87865: replace_s (utilities.c:360)
      ==16949== by 0x4EF4195: SN_set_current (api.c:62)
      ==16949== by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
      ==16949== by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
      ==16949== by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
      ==16949== by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
      ==16949== by 0x4EF35F3: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
      ==16949== by 0x4EF35F3: LUCY_Analyzer_Split_IMP (Analyzer.c:48)
      ==16949== by 0x4F5AAC8: LUCY_Analyzer_Split (Analyzer.h:211)
      ==16949== by 0x4F5AAC8: LUCY_QParser_Expand_Leaf_IMP (QueryParser.c:916)
      ==16949== by 0x4F59ECA: LUCY_QParser_Expand (QueryParser.h:298)
      ==16949== by 0x4F59ECA: LUCY_QParser_Parse_IMP (QueryParser.c:207)
      ==16949== by 0x4F7E358: LUCY_QParser_Parse (QueryParser.h:284)
      ==16949== by 0x4F7F492: get_query (Search.c:483)
      ==16949==
      ==16949== LEAK SUMMARY:
      ==16949== definitely lost: 0 bytes in 0 blocks
      ==16949== indirectly lost: 0 bytes in 0 blocks
      ==16949== possibly lost: 37 bytes in 1 blocks
      ==16949== still reachable: 229,275 bytes in 5,060 blocks
      ==16949== suppressed: 0 bytes in 0 blocks
      ==16949== Reachable blocks (those to which a pointer was found) are not shown.
      ==16949== To see them, rerun with: --leak-check=full --show-leak-kinds=all

      ----------------------------------------------------------------------------------------------------

      If I remove the SnowStemmer from the analyzers, the issue does not occur (only "still reachable" is non-zero).

            People

            • Assignee:
              Unassigned
              Reporter:
              serkanmulayim@gmail.com Serkan Mulayim