SOLR-1120

Simplify EntityProcessor API



    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • None


      Writing an EntityProcessor is deceptively complex. There are so many gotchas.

      I propose the following:

      1. Extract the Transformer application logic from EntityProcessor and move it to DocBuilder. EntityProcessor implementations then do not need to call applyTransformer or know about rowIterator and getFromRowCache().
      2. Change the meaning of EntityProcessor#destroy so that it is called at the end of each parent row. Right now init is called once per parent row, but destroy actually means the end of the import. In fact, there is currently no correct way for an entity processor to clean up. Most clean up when returning null (end of data), but with the introduction of $skipDoc, a transformer can return $skipDoc and the entity processor never gets a chance to clean up for the current init.
      3. EntityProcessor will use the EventListener API to listen for the end of the import. An EntityProcessor should use this to do its final cleanup.
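
      The lifecycle proposed in the three points above could look roughly like the sketch below. It is only an illustration under stated assumptions: SketchEntityProcessor, SketchTransformer, ListProcessor and SketchDocBuilder are hypothetical names, not the real DataImportHandler classes, and a transformer returning null stands in for $skipDoc. The builder applies the transformer, calls destroy once per parent row even when the row is abandoned, and signals import end exactly once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified interfaces; NOT the real DataImportHandler API.
interface SketchEntityProcessor {
    void init(String parentRow);      // called once per parent row
    Map<String, Object> nextRow();    // null signals end of data
    void destroy();                   // proposed meaning: end of the current parent row
    void onImportEnd();               // stands in for an import-end event listener
}

interface SketchTransformer {
    // Returns the transformed row, or null as a stand-in for $skipDoc.
    Map<String, Object> transform(Map<String, Object> row);
}

// A toy processor that emits ids 1, 2, 3 for every parent row and logs its lifecycle.
class ListProcessor implements SketchEntityProcessor {
    final List<String> log = new ArrayList<>();
    private Iterator<Integer> rows;

    public void init(String parentRow) {
        log.add("init:" + parentRow);
        rows = Arrays.asList(1, 2, 3).iterator();
    }

    public Map<String, Object> nextRow() {
        if (!rows.hasNext()) {
            return null;
        }
        Map<String, Object> row = new HashMap<>();
        row.put("id", rows.next());
        return row;
    }

    public void destroy() {
        log.add("destroy");
    }

    public void onImportEnd() {
        log.add("importEnd");
    }
}

// The builder, not the processor, applies the transformer and drives cleanup.
class SketchDocBuilder {
    static List<Map<String, Object>> build(List<String> parentRows,
                                           SketchEntityProcessor ep,
                                           SketchTransformer transformer) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (String parent : parentRows) {
            ep.init(parent);
            try {
                Map<String, Object> row;
                while ((row = ep.nextRow()) != null) {
                    Map<String, Object> out = transformer.transform(row);
                    if (out == null) {
                        break; // skip signal: abandon the rest of this parent row
                    }
                    docs.add(out);
                }
            } finally {
                ep.destroy(); // runs even if the processor never returned null
            }
        }
        ep.onImportEnd(); // final cleanup hook, fired once per import
        return docs;
    }
}
```

      With this shape the processor only has to produce rows. It no longer needs to know about transformers, and per-row cleanup cannot be missed when a row is skipped.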


        1. SOLR-1120.patch
          27 kB
          Noble Paul
        2. SOLR-1120.patch
          32 kB
          Noble Paul
        3. SOLR-1120.patch
          38 kB
          Shalin Shekhar Mangar
        4. SOLR-1120.patch
          40 kB
          Shalin Shekhar Mangar
        5. SOLR-1120.patch
          12 kB
          Noble Paul
        6. SOLR-1120.patch
          11 kB
          Noble Paul
        7. SOLR-1120.patch
          9 kB
          Shalin Shekhar Mangar
        8. SOLR-1120.patch
          1 kB
          Noble Paul
        9. SOLR-1120.patch
          2 kB
          Noble Paul



            Assignee: Shalin Shekhar Mangar
            Reporter: Shalin Shekhar Mangar