Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Invalid
Environment: Ubuntu 9.10, Python 2.6, PyLucene 3.0
Description
Our code is shown below.
We have 31 index directories in /tmp (about 5 million records in our indexes). We want real-time search, so we use writer.getReader() to obtain a real-time reader.
We then run the search repeatedly, and eventually (after about 10 minutes) a Java 'out of memory' error occurs.
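import os
from lucene import (initVM, CLASSPATH, File, IndexSearcher, IndexWriter,
                    MultiSearcher, QueryParser, SimpleFSDirectory,
                    StandardAnalyzer, Version)

# Start the JVM with a fixed 100 MB heap.
initVM(CLASSPATH, initialheap='100m', maxheap='100m')
keywordQuery = QueryParser(Version.LUCENE_CURRENT, "content", StandardAnalyzer(Version.LUCENE_CURRENT)).parse("when AND you")

# Open one IndexWriter per index directory under /tmp.
writers = []
for i in range(1, 32):
    # str(i) is assumed here; the code as posted passed the builtin str itself.
    dir = os.path.join("/tmp", str(i))
    luceneDir = SimpleFSDirectory(File(dir))
    writer = IndexWriter(luceneDir, StandardAnalyzer(Version.LUCENE_CURRENT), False, IndexWriter.MaxFieldLength.LIMITED)
    writer.setRAMBufferSizeMB(32.0)
    writer.setUseCompoundFile(True)
    writer.setMergeFactor(10)
    writers.append(writer)

# Search repeatedly, getting a fresh real-time reader from every writer each time.
while True:
    searchersList = []
    readers = []
    for writer in writers:
        reader = writer.getReader()
        searcher = IndexSearcher(reader)
        searchersList.append(searcher)
        readers.append(reader)
    multiSearcherInstance = MultiSearcher(searchersList)
    # IndexerCons.TOP_DOC_NUMBER is a constant from our own code (not shown).
    docs = multiSearcherInstance.search(keywordQuery, IndexerCons.TOP_DOC_NUMBER).scoreDocs
    # Release everything opened in this iteration.
    multiSearcherInstance.close()
    for searcher in searchersList:
        searcher.close()
    for reader in readers:
        reader.close()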
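One observation about the loop above: the readers and searchers are closed only if the search call returns normally. The sketch below shows a try/finally variant of the per-iteration cleanup. It is not claimed to be the cause of the error reported here. FakeResource, close_all and run_iteration are hypothetical names used only for illustration, with stand-in objects in place of the Lucene readers so the snippet runs without PyLucene.

```python
class FakeResource(object):
    """Hypothetical stand-in for a reader or searcher; it only tracks close()."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


def close_all(resources):
    """Close every resource in reverse order, continuing past failures."""
    errors = []
    for res in reversed(resources):
        try:
            res.close()
        except Exception as exc:
            errors.append(exc)
    return errors


def run_iteration(openers, do_search):
    """Open all resources, run the search, and always close what was opened."""
    opened = []
    try:
        for opener in openers:
            opened.append(opener())
        return do_search(opened)
    finally:
        close_all(opened)


# Usage: even when the search raises, both stand-in readers end up closed.
resources = []


def open_fake_reader():
    res = FakeResource("reader")
    resources.append(res)
    return res


def failing_search(opened):
    raise RuntimeError("search failed")


try:
    run_iteration([open_fake_reader, open_fake_reader], failing_search)
except RuntimeError:
    pass

all_closed = all(res.closed for res in resources)
print(all_closed)
```

In the real loop, the openers would correspond to the writer.getReader() calls and do_search to the MultiSearcher query.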
When we use a normal reader (opened directly from the directories) instead of the real-time reader, the test runs fine and no 'out of memory' error occurs.
The bug may come from Java Lucene; I am not sure.
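As a sanity check on the configuration in the script, the arithmetic below compares the configured writer RAM buffers with the JVM heap limit. All three numbers are taken from the report; worst_case_buffer_mb is a hypothetical helper name. This is only an upper bound, not a diagnosis: a writer's RAM buffer fills only as documents are added, the script shown adds none, and memory held by the readers themselves is not counted.

```python
# Values taken from the script in this report.
NUM_WRITERS = 31        # one IndexWriter per index directory
RAM_BUFFER_MB = 32.0    # writer.setRAMBufferSizeMB(32.0)
MAX_HEAP_MB = 100.0     # initVM(..., maxheap='100m')


def worst_case_buffer_mb(num_writers, ram_buffer_mb):
    """Upper bound on buffered memory if every writer filled its RAM buffer."""
    return num_writers * ram_buffer_mb


total_mb = worst_case_buffer_mb(NUM_WRITERS, RAM_BUFFER_MB)
fits_in_heap = total_mb <= MAX_HEAP_MB

print("worst-case writer buffers: %.0f MB" % total_mb)
print("max heap: %.0f MB" % MAX_HEAP_MB)
print("fits in heap: %s" % fits_in_heap)
```

If indexing runs alongside the search loop, the configured buffers alone could exceed the 100 MB heap, so the heap size or the buffer size may be worth revisiting before attributing the error to Lucene.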