[PYLUCENE-2] Memory leak when searching in real time reader - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Invalid
Labels:
- memory_leak
- real_time_reader
Environment: Ubuntu 9.10, Python 2.6, PyLucene 3.0

Description

Our setup and code are as follows.
We have 31 index directories in /tmp (about 5 million records across our indexes). We want real-time search, so we use writer.getReader() to get a real-time reader.
We then run the search repeatedly, and eventually (after about 10 minutes) a Java 'out of memory' error occurs.

import os
from lucene import (initVM, CLASSPATH, File, SimpleFSDirectory, IndexWriter,
                    IndexSearcher, MultiSearcher, QueryParser,
                    StandardAnalyzer, Version)

# Start the JVM with a fixed 100 MB heap
initVM(CLASSPATH, initialheap='100m', maxheap='100m')
keywordQuery = QueryParser(Version.LUCENE_CURRENT, "content", StandardAnalyzer(Version.LUCENE_CURRENT)).parse("when AND you")

# Open one writer per index directory under /tmp
writers = []
for i in range(1, 32):
    dir = os.path.join("/tmp", str(i))
    luceneDir = SimpleFSDirectory(File(dir))
    writer = IndexWriter(luceneDir, StandardAnalyzer(Version.LUCENE_CURRENT), False, IndexWriter.MaxFieldLength.LIMITED)
    writer.setRAMBufferSizeMB(32.0)
    writer.setUseCompoundFile(True)
    writer.setMergeFactor(10)
    writers.append(writer)

# Search repeatedly, getting a fresh real-time reader from every writer each time
while True:
    searchersList = []
    readers = []
    for writer in writers:
        reader = writer.getReader()
        searcher = IndexSearcher(reader)
        searchersList.append(searcher)
        readers.append(reader)
    multiSearcherInstance = MultiSearcher(searchersList)
    # IndexerCons.TOP_DOC_NUMBER is a constant from our own code (not shown here)
    docs = multiSearcherInstance.search(keywordQuery, IndexerCons.TOP_DOC_NUMBER).scoreDocs

    # Close everything opened in this iteration
    multiSearcherInstance.close()
    for searcher in searchersList:
        searcher.close()
    for reader in readers:
        reader.close()

When we use a normal reader (opened directly from the directories) instead of the real-time reader, the test runs fine, with no 'out of memory' error.
The bug may come from Java Lucene; I am not sure.
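
One thing that may be worth ruling out when reproducing this: the loop above closes its readers only if the search call returns normally. The following is a minimal sketch of closing every acquired reader in a finally block. It does not use PyLucene at all; StubReader, search_once and failing_search are hypothetical stand-ins introduced only for illustration, and nothing here is claimed to be the cause of the reported error.

```python
class StubReader(object):
    """Hypothetical stand-in for a reader obtained from writer.getReader()."""
    open_count = 0  # number of stub readers currently open

    def __init__(self):
        StubReader.open_count += 1
        self.closed = False

    def close(self):
        # Closing twice is harmless; only the first call decrements the count
        if not self.closed:
            self.closed = True
            StubReader.open_count -= 1


def search_once(open_reader, n_indexes, run_search):
    """Open one reader per index, run the search, and always close the readers."""
    readers = []
    try:
        for _ in range(n_indexes):
            readers.append(open_reader())
        return run_search(readers)
    finally:
        # Runs whether run_search returned or raised
        for reader in readers:
            reader.close()


def failing_search(readers):
    raise RuntimeError("simulated search failure")


# A successful pass and a failing pass both leave no stub reader open
result = search_once(StubReader, 31, lambda readers: len(readers))
try:
    search_once(StubReader, 31, failing_search)
except RuntimeError:
    pass
```

In the real loop the same shape would wrap the getReader() calls and the MultiSearcher search, with the close calls moved into the finally block.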

People

Assignee: Unassigned
Reporter: feng xiaojie
Votes: 0
Watchers: 0

Dates

Created: 17/Mar/10 06:35
Updated: 02/Apr/10 19:16
Resolved: 02/Apr/10 19:16

Time Tracking

Estimated: 336h
Remaining: 336h
Logged: Not Specified