Lucene - Core / LUCENE-1336

Distributed Lucene using Hadoop RPC based RMI with dynamic classloading

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 2.3.1
    • Fix Version/s: None
    • Component/s: modules/other
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Hadoop RPC based RMI system for use with Lucene Searchable. It keeps the application logic on the client side, which removes the need to deploy that logic to the Lucene servers and to provision new code to potentially hundreds of servers for every application logic change.

      The use case is any deployment that runs Lucene on many servers. The system has the added advantage of allowing custom Query and Filter classes (or other classes) to be defined on, for example, a development machine and executed on the server without first deploying them to the servers. This can save a lot of time and effort spent on provisioning code and restarting processes. In the future this patch will include an IndexWriterService interface that enables document indexing, which will allow subclasses of Analyzer to be dynamically loaded onto a server as the client adds documents.

      Hadoop RPC is more scalable than Sun's RMI implementation because it uses non-blocking sockets. It is also far easier to understand and customize if needed, as it is embodied in two main classes: org.apache.hadoop.ipc.Client and org.apache.hadoop.ipc.Server.

      Features include automatic dynamic classloading, which lets newly compiled client classes that extend core classes such as Query or Filter be used to query the server without first deploying the code to the server.
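
      A minimal sketch of the server-side half of this idea, assuming a class loader that asks a callback for the bytes of any class it cannot find locally. The name RemoteFetchClassLoader and the fetcher callback are hypothetical and are not taken from the patch; in a real deployment the callback would presumably be backed by an RPC call to the client.

```java
import java.util.function.Function;

/**
 * Hypothetical class loader (not from the patch): classes the parent loader
 * cannot find are requested from a fetcher, which stands in for a network
 * call back to the client that owns the class.
 */
final class RemoteFetchClassLoader extends ClassLoader {
    private final Function<String, byte[]> fetcher;

    RemoteFetchClassLoader(ClassLoader parent, Function<String, byte[]> fetcher) {
        super(parent);
        this.fetcher = fetcher;
    }

    /** Called by loadClass only after the parent loader has failed to find the class. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = fetcher.apply(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // Turn the received bytes into a Class object inside this loader.
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

      Because loadClass delegates to the parent first, core classes such as Query would still come from the server's own classpath, and only the unknown subclasses would be fetched.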

      Standard RMI dynamic classloading is rarely used in practice because it is hard to set up: the new code must be placed in JAR files on a web server on the client side, and it then requires custom system properties as well as Java security manager configuration.
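
      For comparison, the standard RMI setup described above can be sketched roughly as follows. The system property names are the standard RMI and security ones; the class name RmiCodebaseSetup, its method names, and any URL or path passed in are illustrative assumptions.

```java
/**
 * Illustrative only: the JVM-level configuration that standard RMI dynamic
 * classloading traditionally needs, before any application code runs.
 */
final class RmiCodebaseSetup {
    private RmiCodebaseSetup() {}

    /** On the JVM whose classes others must download: advertise where the JAR is served. */
    static void configureSender(String codebaseUrl) {
        System.setProperty("java.rmi.server.codebase", codebaseUrl);
    }

    /** On the JVM that downloads classes: allow remote codebases and name a policy file. */
    static void configureReceiver(String policyFile) {
        // Modern JDKs default this to true, which disables remote codebase loading.
        System.setProperty("java.rmi.server.useCodebaseOnly", "false");
        System.setProperty("java.security.policy", policyFile);
        // Historically a security manager also had to be installed here via
        // System.setSecurityManager(...); that API is deprecated for removal
        // since Java 17, so it is deliberately not called in this sketch.
    }
}
```

      Every one of these steps, plus running the web server that hosts the JAR, has to be repeated correctly on each machine, which is the setup burden the description refers to.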

      The dynamic classloading in Hadoop RMI for Lucene uses its own RMI calls to load the classes. Custom serialization and deserialization manage the classes and the class versions on both the server and the client. New class files are automatically detected and loaded using ClassLoader.getResourceAsStream, so the system does not require creating a JAR file. Classes are loaded over the same networking system that carries the remote method invocations, which removes the need for a separate web server dedicated to the task and reduces deployment to a few lines of code.
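
      A rough sketch of the client-side half, assuming class bytes are read with ClassLoader.getResourceAsStream and that a content hash serves as the class version. The ClassBytes helper and the choice of SHA-256 are assumptions made for illustration; the patch may track versions differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not from the patch) for shipping compiled classes without a JAR. */
final class ClassBytes {
    private ClassBytes() {}

    /** Reads the compiled bytes of a class from wherever its class loader finds them. */
    static byte[] read(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            // Bootstrap classes report a null loader; fall back to the system loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("No class file found for " + type.getName());
            }
            return in.readAllBytes(); // requires Java 9 or later
        }
    }

    /** Derives a version tag from the bytes, so a recompiled class gets a new tag. */
    static String version(byte[] classBytes) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(classBytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }
}
```

      With a tag like this, a server could compare the version it already holds against the one the client reports and request the bytes only when they differ.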

      1. lucene-1336.patch
        54 kB
        Jason Rutherglen
      2. lucene-1336.patch
        139 kB
        Jason Rutherglen
      3. lucene-1336.patch
        177 kB
        Jason Rutherglen

        Activity

        Mark Thomas made changes -
        Workflow Default workflow, editable Closed status [ 12564691 ] jira [ 12585028 ]
        Mark Thomas made changes -
        Workflow jira [ 12435342 ] Default workflow, editable Closed status [ 12564691 ]
        Jason Rutherglen made changes -
        Status Open [ 1 ] Closed [ 6 ]
        Resolution Won't Fix [ 2 ]
        Jason Rutherglen made changes -
        Attachment lucene-1336.patch [ 12386545 ]
        Jason Rutherglen made changes -
        Attachment lucene-1336.patch [ 12386424 ]
        Jason Rutherglen made changes -
        Field Original Value New Value
        Attachment lucene-1336.patch [ 12386081 ]
        Jason Rutherglen created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Jason Rutherglen
          • Votes:
            1
            Watchers:
            5
