Solr / SOLR-6398

Add IterativeMergeStrategy to support running Parallel Iterative Algorithms inside of Solr

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.0
    • Component/s: None
    • Labels: None

      Description

      This ticket builds on the existing AnalyticsQuery / MergeStrategy framework by adding the abstract class IterativeMergeStrategy, which has built-in support for callbacks to the shards. IterativeMergeStrategy is designed to support the execution of parallel iterative algorithms, such as gradient descent, inside of Solr.
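To make "parallel iterative algorithm" concrete, here is a minimal sketch of gradient descent in that style, written in plain Java with no Solr dependencies. The names `ShardedGradientDescent`, `Shard`, and `fit` are hypothetical and not part of this ticket; each `Shard` object stands in for a Solr shard holding a slice of the data, and the coordinator loop plays the role of the merge step that calls back to every shard on each iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ShardedGradientDescent {

    /** Hypothetical stand-in for one shard: holds a slice of (x, y) points. */
    static final class Shard {
        final double[] xs;
        final double[] ys;

        Shard(double[] xs, double[] ys) {
            this.xs = xs;
            this.ys = ys;
        }

        /** Partial gradient of sum((w*x - y)^2) with respect to w over this slice. */
        double partialGradient(double w) {
            double g = 0.0;
            for (int i = 0; i < xs.length; i++) {
                g += 2.0 * (w * xs[i] - ys[i]) * xs[i];
            }
            return g;
        }
    }

    /** Fits y = w * x by fanning out to all shards each iteration and merging their gradients. */
    static double fit(List<Shard> shards, int totalPoints, double learningRate, int iterations)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            double w = 0.0;
            for (int iter = 0; iter < iterations; iter++) {
                final double current = w;
                // One task per shard, all run in parallel with the current weight.
                List<Callable<Double>> tasks = new ArrayList<>();
                for (Shard shard : shards) {
                    tasks.add(() -> shard.partialGradient(current));
                }
                // Merge step: sum the partial gradients from every shard.
                double merged = 0.0;
                for (Future<Double> future : pool.invokeAll(tasks)) {
                    merged += future.get();
                }
                // Update the model; the next iteration sends it back to the shards.
                w -= learningRate * merged / totalPoints;
            }
            return w;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Data generated from y = 3x, split over two shards.
        List<Shard> shards = List.of(
            new Shard(new double[] {1, 2}, new double[] {3, 6}),
            new Shard(new double[] {3, 4}, new double[] {9, 12}));
        System.out.println("fitted w = " + fit(shards, 4, 0.05, 200));
    }
}
```

The point of the sketch is the shape of the loop: fan out, merge, update, repeat. In Solr the fan-out would be a request to each shard rather than a local task.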

      To use IterativeMergeStrategy, you extend it and implement process(). This gives you access to the callBack() method, which makes it easy to make repeated calls to all the shards and run algorithms that require iteration.

      Below is an example of a class that extends IterativeMergeStrategy. In this example, process() first sums the counts returned by the shards. It then calls back to the same shards, running the {!count} AnalyticsQuery with the merged count passed as its base parameter, and merges the counts from that second round into the final response.

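      import java.util.List;
      import java.util.concurrent.Future;

      import org.apache.solr.client.solrj.request.QueryRequest;
      import org.apache.solr.client.solrj.response.QueryResponse;
      import org.apache.solr.common.params.ModifiableSolrParams;
      import org.apache.solr.common.util.NamedList;
      import org.apache.solr.handler.component.IterativeMergeStrategy;
      import org.apache.solr.handler.component.ResponseBuilder;
      import org.apache.solr.handler.component.ShardRequest;
      import org.apache.solr.handler.component.ShardResponse;

      class TestIterative extends IterativeMergeStrategy {

        @Override
        public void process(ResponseBuilder rb, ShardRequest sreq) throws Exception {
          // First pass: sum the counts already returned by the shards.
          int count = 0;
          for (ShardResponse shardResponse : sreq.responses) {
            NamedList<?> response = shardResponse.getSolrResponse().getResponse();
            NamedList<?> analytics = (NamedList<?>) response.get("analytics");
            Integer c = (Integer) analytics.get("mycount");
            count += c.intValue();
          }

          // Build the callback request, passing the merged count as "base".
          ModifiableSolrParams params = new ModifiableSolrParams();
          params.add("distrib", "false");
          params.add("fq", "{!count base=" + count + "}");
          params.add("q", "*:*");

          // Call back to all the shards in the response and process the result.
          QueryRequest request = new QueryRequest(params);
          List<Future<CallBack>> futures = callBack(sreq.responses, request);

          // Second pass: sum the counts returned by the callbacks.
          int nextCount = 0;
          for (Future<CallBack> future : futures) {
            QueryResponse response = future.get().getResponse();
            NamedList<?> analytics = (NamedList<?>) response.getResponse().get("analytics");
            Integer c = (Integer) analytics.get("mycount");
            nextCount += c.intValue();
          }

          // Publish the merged result in the final response.
          NamedList<Object> merged = new NamedList<>();
          merged.add("mycount", nextCount);
          rb.rsp.add("analytics", merged);
        }
      }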
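The two-round flow in the example above can also be traced without a Solr cluster. The following is a rough simulation in plain Java. `TwoRoundCount`, `shardCount`, and `run` are hypothetical names, and the shard-side behaviour (returning the local count plus the `base` value) is an assumption made for illustration, not something this ticket specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TwoRoundCount {

    /**
     * Assumed shard-side behaviour: report the local count plus the
     * "base" value supplied by the coordinator.
     */
    static int shardCount(int localCount, int base) {
        return localCount + base;
    }

    /** Merges round one, calls back to every shard with the merged count, merges again. */
    static int run(int[] localCounts) throws Exception {
        // Round one: merge the counts the shards returned with no base.
        int count = 0;
        for (int c : localCounts) {
            count += shardCount(c, 0);
        }

        ExecutorService pool = Executors.newFixedThreadPool(localCounts.length);
        try {
            // Round two: call back to every shard in parallel, sending the merged count.
            final int base = count;
            List<Future<Integer>> futures = new ArrayList<>();
            for (int c : localCounts) {
                Callable<Integer> task = () -> shardCount(c, base);
                futures.add(pool.submit(task));
            }

            // Block on each future and merge the second-round counts.
            int nextCount = 0;
            for (Future<Integer> future : futures) {
                nextCount += future.get();
            }
            return nextCount;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(new int[] {3, 4, 5}));
    }
}
```

With local counts 3, 4, and 5, round one merges to 12, and under the stated assumption each shard then reports its own count plus 12.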
      
      
      1. SOLR-6398.patch (10 kB, Joel Bernstein)
      2. SOLR-6398.patch (10 kB, Joel Bernstein)
      3. SOLR-6398.patch (10 kB, Joel Bernstein)
      4. SOLR-6398.patch (11 kB, Joel Bernstein)

        Activity

        joel.bernstein Joel Bernstein added a comment -

        Initial implementation.

        joel.bernstein Joel Bernstein added a comment -

        New patch with a simplified callBack mechanism. Will also provide more granular callBack support.

        joel.bernstein Joel Bernstein added a comment -

        Updated patch that works with latest trunk.

        joel.bernstein Joel Bernstein added a comment -

        This looks ready to commit to trunk, I believe. I'll be experimenting with this framework over the next couple of weeks with gradient descent and logistic regression modeling.

        jira-bot ASF subversion and git services added a comment -

        Commit 1720422 from Joel Bernstein in branch 'dev/trunk'
        [ https://svn.apache.org/r1720422 ]

        SOLR-6398: Add IterativeMergeStrategy to support running Parallel Iterative Algorithms inside of Solr


          People

          • Assignee: Unassigned
          • Reporter: Joel Bernstein
          • Votes: 0
          • Watchers: 4
