Lucene - Core: LUCENE-1313

Near Realtime Search (using a built in RAMDirectory)

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 2.4.1
    • Fix Version/s: 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Enable near realtime (NRT) search in Lucene without external
      dependencies. When RAM NRT is enabled, the implementation adds a
      RAMDirectory to IndexWriter. Flushes go to the RAM directory
      unless it has no available space. Merges are likewise completed
      in the RAM directory until no more RAM is available.

      IW.optimize and IW.commit flush the RAM directory to the primary
      directory; all other operations try to keep segments in RAM
      until there is no more space.
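
The flush policy described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not code from any of the attached patches: the class `RamFirstFlushSketch`, its byte budget, and the string segment names are all hypothetical, and merges are not modeled. It shows only the routing rule: a flush stays in RAM while it fits, spills to the primary directory otherwise, and a commit moves everything held in RAM to the primary directory.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a RAM-first flush policy; hypothetical, not Lucene code. */
class RamFirstFlushSketch {
    private final long ramBudgetBytes;   // assumed cap on the RAM directory
    private long ramUsedBytes = 0;
    final List<String> ramSegments = new ArrayList<>();
    final List<String> primarySegments = new ArrayList<>();

    RamFirstFlushSketch(long ramBudgetBytes) {
        this.ramBudgetBytes = ramBudgetBytes;
    }

    /** Routes a flushed segment to RAM if it fits, otherwise to the primary directory. */
    String flush(String segmentName, long sizeBytes) {
        if (ramUsedBytes + sizeBytes <= ramBudgetBytes) {
            ramSegments.add(segmentName);
            ramUsedBytes += sizeBytes;
            return "ram";
        }
        primarySegments.add(segmentName);
        return "primary";
    }

    /** Models IW.commit / IW.optimize: every RAM segment moves to the primary directory. */
    void commit() {
        primarySegments.addAll(ramSegments);
        ramSegments.clear();
        ramUsedBytes = 0;
    }

    long ramUsedBytes() {
        return ramUsedBytes;
    }

    public static void main(String[] args) {
        RamFirstFlushSketch writer = new RamFirstFlushSketch(100);
        System.out.println(writer.flush("_0", 60)); // fits in the RAM budget
        System.out.println(writer.flush("_1", 60)); // would exceed the budget
        writer.commit();
        System.out.println(writer.primarySegments);
    }
}
```

A fixed byte budget is the simplest stand-in for "no available space"; the issue text does not say how the available space is measured.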

      1. TestLuceneNRT.java
        6 kB
        Jingkei Ly
      2. LUCENE-1313.patch
        46 kB
        Jason Rutherglen
      3. LUCENE-1313.patch
        35 kB
        Jason Rutherglen
      4. LUCENE-1313.patch
        60 kB
        Jason Rutherglen
      5. LUCENE-1313.patch
        39 kB
        Jason Rutherglen
      6. LUCENE-1313.patch
        37 kB
        Jason Rutherglen
      7. LUCENE-1313.patch
        36 kB
        Jason Rutherglen
      8. LUCENE-1313.patch
        22 kB
        Jason Rutherglen
      9. LUCENE-1313.patch
        22 kB
        Jason Rutherglen
      10. LUCENE-1313.patch
        14 kB
        Jason Rutherglen
      11. LUCENE-1313.patch
        36 kB
        Jason Rutherglen
      12. LUCENE-1313.patch
        33 kB
        Jason Rutherglen
      13. LUCENE-1313.patch
        175 kB
        Jason Rutherglen
      14. LUCENE-1313.patch
        166 kB
        Jason Rutherglen
      15. LUCENE-1313.patch
        175 kB
        Jason Rutherglen
      16. LUCENE-1313.patch
        175 kB
        Jason Rutherglen
      17. LUCENE-1313.patch
        173 kB
        Jason Rutherglen
      18. LUCENE-1313.patch
        170 kB
        Jason Rutherglen
      19. LUCENE-1313.patch
        131 kB
        Jason Rutherglen
      20. LUCENE-1313.patch
        129 kB
        Jason Rutherglen
      21. LUCENE-1313.patch
        109 kB
        Jason Rutherglen
      22. LUCENE-1313.patch
        69 kB
        Jason Rutherglen
      23. LUCENE-1313.patch
        47 kB
        Jason Rutherglen
      24. LUCENE-1313.patch
        52 kB
        Jason Rutherglen
      25. LUCENE-1313.patch
        49 kB
        Jason Rutherglen
      26. LUCENE-1313.patch
        37 kB
        Jason Rutherglen
      27. LUCENE-1313.patch
        21 kB
        Jason Rutherglen
      28. LUCENE-1313.patch
        14 kB
        Jason Rutherglen
      29. LUCENE-1313.jar
        5 kB
        Jason Rutherglen
      30. LUCENE-1313.patch
        12 kB
        Jason Rutherglen
      31. LUCENE-1313.patch
        473 kB
        Jason Rutherglen
      32. lucene-1313.patch
        474 kB
        Jason Rutherglen
      33. lucene-1313.patch
        698 kB
        Jason Rutherglen
      34. lucene-1313.patch
        680 kB
        Jason Rutherglen
      35. lucene-1313.patch
        1.88 MB
        Jason Rutherglen

          Activity

          Jason Rutherglen created issue
          Jason Rutherglen made changes -
          Field Original Value New Value
          Attachment lucene-1313.patch [ 12384453 ]
          Jason Rutherglen made changes -
          Attachment lucene-1313.patch [ 12384545 ]
          Jason Rutherglen made changes -
          Attachment lucene-1313.patch [ 12384629 ]
          Jason Rutherglen made changes -
          Attachment lucene-1313.patch [ 12386303 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12391299 ]
          Jason Rutherglen made changes -
          Summary | Ocean Realtime Search | Realtime Search
          Affects Version/s 2.4.1 [ 12313516 ]
          Description Provides realtime search using Lucene. Conceptually, updates are divided into discrete transactions. Each transaction is recorded to a transaction log, which is similar to the MySQL binary log. Deletes from the transaction are applied to the existing indexes. Document additions are made to an in-memory InstantiatedIndex. The transaction is then complete. After each transaction, TransactionSystem.getSearcher() may be called, which allows searching over the index including the latest transaction.

          TransactionSystem is the main class. Methods similar to IndexWriter are provided for updating. getSearcher returns a Searcher class.

          - getSearcher()
          - addDocument(Document document)
          - addDocument(Document document, Analyzer analyzer)
          - updateDocument(Term term, Document document)
          - updateDocument(Term term, Document document, Analyzer analyzer)
          - deleteDocument(Term term)
          - deleteDocument(Query query)
          - commitTransaction(List<Document> documents, Analyzer analyzer, List<Term> deleteByTerms, List<Query> deleteByQueries)

          Sample code:

          {code}
          // setup
          FSDirectoryMap directoryMap = new FSDirectoryMap(new File("/testocean"), "log");
          LogDirectory logDirectory = directoryMap.getLogDirectory();
          TransactionLog transactionLog = new TransactionLog(logDirectory);
          TransactionSystem system = new TransactionSystem(transactionLog, new SimpleAnalyzer(), directoryMap);

          // transaction: add a single document
          Document d = new Document();
          d.add(new Field("contents", "hello world", Field.Store.YES, Field.Index.TOKENIZED));
          system.addDocument(d);

          // search (the query below is only an example; any Query works)
          Query query = new TermQuery(new Term("contents", "hello"));
          OceanSearcher searcher = system.getSearcher();
          ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
          System.out.println(hits.length + " total results");
          // print at most the first 10 hits
          for (int i = 0; i < hits.length && i < 10; i++) {
            Document hit = searcher.doc(hits[i].doc);
            System.out.println(i + " " + hits[i].score + " " + hit.get("contents"));
          }
          {code}

          There is a test class org.apache.lucene.ocean.TestSearch that was used for basic testing.

          A sample disk directory structure is as follows:

          |/snapshot_105_00.xml | XML file listing the indexes, with their generation numbers, that make up a snapshot. Each transaction creates a new snapshot file. In this file name, 105 is the snapshot ID, also known as the transaction ID, and 00 is the minor version of the snapshot, which corresponds to a merge. A merge is a minor snapshot version because the data does not change, only the underlying structure of the index.|
          |/3 | Directory containing an on disk Lucene index|
          |/log | Directory containing log files|
          |/log/log00000001.bin | Log file. As new log files are created the suffix number is incremented|
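
The snapshot file naming above can be decoded mechanically. The following is a minimal sketch, assuming the name always has the shape `snapshot_<id>_<minor>.xml`; that pattern is inferred from the single example in the table, and the class `SnapshotNameSketch` is hypothetical rather than part of the Ocean patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that splits a snapshot file name into its two numbers. */
class SnapshotNameSketch {
    // Assumed pattern, inferred from the example "snapshot_105_00.xml".
    private static final Pattern NAME = Pattern.compile("snapshot_(\\d+)_(\\d+)\\.xml");

    /** Returns {snapshotId, minorVersion}, or null if the name does not match. */
    static long[] parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        long[] parts = parse("snapshot_105_00.xml");
        System.out.println("snapshot id = " + parts[0] + ", minor version = " + parts[1]);
    }
}
```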

          Realtime search with transactional semantics.

          Possible future directions:
            * Optimistic concurrency
            * Replication

          Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.

          I think this issue can hold realtime benchmarks which include indexing and searching concurrently.
          Priority Major [ 3 ] Minor [ 4 ]
          Fix Version/s 2.9 [ 12312682 ]
          Component/s Index [ 12310232 ]
          Component/s contrib/* [ 12312028 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12404393 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.jar [ 12404907 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12405804 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12405961 ]
          Yonik Seeley made changes -
          Link This issue depends upon LUCENE-1618 [ LUCENE-1618 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12406953 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12406961 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12406973 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12407030 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12407201 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12407841 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12408521 ]
          Jason Rutherglen made changes -
          Description Realtime search with transactional semantics.

          Possible future directions:
            * Optimistic concurrency
            * Replication

          Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.

          I think this issue can hold realtime benchmarks which include indexing and searching concurrently.
          Enable near realtime (NRT) search in Lucene without external
          dependencies. When RAM NRT is enabled, the implementation adds a
          RAMDirectory to IndexWriter. Flushes go to the RAM directory
          unless it has no available space. Merges are likewise completed
          in the RAM directory until no more RAM is available.

          IW.optimize and IW.commit flush the RAM directory to the primary
          directory; all other operations try to keep segments in RAM
          until there is no more space.
          Jason Rutherglen made changes -
          Link This issue blocks LUCENE-1667 [ LUCENE-1667 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12409931 ]
          Michael McCandless made changes -
          Fix Version/s 3.1 [ 12314025 ]
          Fix Version/s 2.9 [ 12312682 ]
          Jason Rutherglen made changes -
          Summary | Realtime Search | Near Realtime Search
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12411017 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12411152 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12411466 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12412209 ]
          Jason Rutherglen made changes -
          Summary | Near Realtime Search | Near Realtime Search (using a built in RAMDirectory)
          Jason Rutherglen made changes -
          Link This issue blocks LUCENE-1738 [ LUCENE-1738 ]
          Jason Rutherglen made changes -
          Link This issue blocks LUCENE-1738 [ LUCENE-1738 ]
          Jason Rutherglen made changes -
          Link This issue blocks SOLR-1278 [ SOLR-1278 ]
          Jason Rutherglen made changes -
          Link This issue relates to LUCENE-1577 [ LUCENE-1577 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12420342 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12420343 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12423842 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12423869 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424050 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424086 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424087 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424106 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424147 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424156 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12424392 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12426035 ]
          Jason Rutherglen made changes -
          Attachment LUCENE-1313.patch [ 12429415 ]
          Jingkei Ly made changes -
          Attachment TestLuceneNRT.java [ 12429459 ]
          Jason Rutherglen made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Resolution Won't Fix [ 2 ]
          Mark Thomas made changes -
          Workflow jira [ 12433818 ] Default workflow, editable Closed status [ 12563984 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12563984 ] jira [ 12585458 ]
          Gavin made changes -
          Link This issue blocks SOLR-1278 [ SOLR-1278 ]
          Gavin made changes -
          Link This issue is depended upon by SOLR-1278 [ SOLR-1278 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Jason Rutherglen
            • Votes:
              2
              Watchers:
              16
