Solr
SOLR-828

A RequestProcessor to support updates

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.

        Issue Links

          Activity

          Noble Paul created issue -
          Noble Paul made changes -
          Field Original Value New Value
          Link This issue is related to SOLR-139 [ SOLR-139 ]
          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniquekeyField>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} maintains two separate Lucene indexes for the backup:
           * *temp.backup.index* : stores (without indexing) all the fields in the document, except the uniquekey, which is stored and indexed
           * *backup.index* : stores (without indexing) the fields which are not stored in the actual schema and the fields which are targets of copyField, except the uniquekey, which is stored and indexed
          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the document to *temp.backup.index* and calls the next {{UpdateProcessor}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from *backup.index*. If it is a delete by id, it deletes the document with that id from *temp.backup.index*. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          {{UpdateableIndexProcessor}} calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and reads its documents one by one. If a document is present in the main index it is copied to *backup.index*. Finally it commits *backup.index*; *temp.backup.index* is destroyed after that.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and checks for the document there first. If it is present, the document is read. If it is not present, *backup.index* is checked; if the document is present there, a searcher is obtained from the main index, the missing fields are read from it, and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*, searches *temp.backup.index* first for the id, and if the document is absent it checks *backup.index* and returns the document.
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniquekeyField>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} maintains two separate Lucene indexes for the backup:
           * *temp.backup.index* : stores (without indexing) all the fields in the document, except the uniquekey, which is stored and indexed
           * *backup.index* : stores (without indexing) the fields which are not stored in the actual schema and the fields which are targets of copyField, except the uniquekey, which is stored and indexed
          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the document to *temp.backup.index*, then calls the next {{UpdateProcessor}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from *backup.index*. If it is a delete by id, it deletes the document with that id from *temp.backup.index*. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and reads its documents one by one. If a document is present in the main index it is copied to *backup.index*; otherwise it is thrown away, because a deleteByQuery would have deleted it. Finally it commits *backup.index*; *temp.backup.index* is destroyed after that.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and checks for the document there first. If it is present, the document is read. If it is not present, *backup.index* is checked; if the document is present there, a searcher is obtained from the main index, the missing fields are read from it, and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*, searches *temp.backup.index* first for the id, and if the document is absent it checks *backup.index* and returns the document.


          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} maintains two separate Lucene indexes for the backup:
           * *temp.backup.index* : stores (without indexing) all the fields in the document, except the uniquekey, which is stored and indexed
           * *backup.index* : stores (without indexing) the fields which are not stored in the main index and the fields which are targets of copyField, except the uniquekey, which is stored and indexed
          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the document to *temp.backup.index*, then calls the next {{UpdateProcessor}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from *backup.index*. If it is a delete by id, it deletes the document with that id from *temp.backup.index*. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and reads its documents one by one. If a document is present in the main index it is copied to *backup.index*; otherwise it is thrown away, because a deleteByQuery would have deleted it. Finally it commits *backup.index*; *temp.backup.index* is destroyed after that. A new *temp.backup.index* is recreated when new documents are added.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and checks for the document there first. If it is present, the document is read. If it is not present, *backup.index* is checked; if the document is present there, a searcher is obtained from the main index, the missing fields are read from it, and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*. It first searches *temp.backup.index* with the id; if the document is not found, it then searches *backup.index*. If it finds the document(s), they are returned.


          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added (see the sketch after this list).
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.
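
          For illustration only, a minimal sketch of how such a processor could sit in front of {{RunUpdateProcessor}} using the stock {{UpdateRequestProcessorFactory}}/{{UpdateRequestProcessor}} API; the class names, the {{writeToTempBackupIndex}} helper and the proposed {{append}} flag are assumptions taken from this issue, not existing Solr code.

          {code:java}
// Hypothetical sketch only -- UpdateableIndexProcessor does not exist in Solr.
// writeToTempBackupIndex() and the proposed AddUpdateCommand.append flag are
// assumptions taken from this issue.
import java.io.IOException;

import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class UpdateableIndexProcessorFactory extends UpdateRequestProcessorFactory {
  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                                            SolrQueryResponse rsp,
                                            UpdateRequestProcessor next) {
    // "next" is normally RunUpdateProcessor, so this processor sits just before it
    return new UpdateableIndexProcessor(next);
  }

  static class UpdateableIndexProcessor extends UpdateRequestProcessor {
    UpdateableIndexProcessor(UpdateRequestProcessor next) {
      super(next);
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {
      SolrInputDocument doc = cmd.getSolrInputDocument();
      writeToTempBackupIndex(doc);   // back up the full document first
      super.processAdd(cmd);         // then let RunUpdateProcessor index it
    }

    private void writeToTempBackupIndex(SolrInputDocument doc) {
      // a store-only document goes into temp.backup.index (see the index layout below)
    }
  }
}
          {code}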

          h1.Implementation
          {{UpdateableIndexProcessor}} maintains two separate Lucene indexes for the backup (a small sketch of the store-only documents follows this list):
           * *temp.backup.index* : stores (without indexing) all the fields in the document, except the uniquekey, which is stored and indexed
           * *backup.index* : stores (without indexing) the fields which are not stored in the main index and the fields which are targets of copyField, except the uniquekey, which is stored and indexed
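
          A hedged sketch, assuming the Lucene 3.x-era field API, of how the store-only backup documents could be built; {{BackupDocBuilder}} and its method name are hypothetical.

          {code:java}
// Sketch only (Lucene 3.x-era field API): a document for temp.backup.index --
// every field stored but not indexed, except the unique key.
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

import org.apache.solr.common.SolrInputDocument;

public class BackupDocBuilder {
  /** uniqueKeyField is whatever <uniqueKey> names in schema.xml, e.g. "id". */
  public static Document toBackupDoc(SolrInputDocument src, String uniqueKeyField) {
    Document backup = new Document();
    for (String name : src.getFieldNames()) {
      for (Object value : src.getFieldValues(name)) {
        if (name.equals(uniqueKeyField)) {
          // the unique key is both stored and indexed so it can be looked up later
          backup.add(new Field(name, value.toString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        } else {
          // everything else is stored only -- the backup index is never searched on these
          backup.add(new Field(name, value.toString(), Field.Store.YES, Field.Index.NO));
        }
      }
    }
    return backup;
  }
}
          {code}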
          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the document to *temp.backup.index*, then calls the next {{UpdateProcessor}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from *backup.index*. If it is a delete by id, it deletes the document with that id from *temp.backup.index*. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and reads its documents one by one. If a document is present in the main index it is copied to *backup.index*; otherwise it is thrown away, because a deleteByQuery would have deleted it. Finally it commits *backup.index*; *temp.backup.index* is destroyed after that. A new *temp.backup.index* is recreated when new documents are added.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} commits *temp.backup.index* and checks for the document there first. If it is present, the document is read. If it is not present, *backup.index* is checked; if the document is present there, a searcher is obtained from the main index, the missing fields are read from it, and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.
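
          To make those merge semantics concrete, here is a hedged sketch over {{SolrInputDocument}}; the {{UpdateMerger}} class and the {{multiValuedFields}} parameter (which would really come from the {{IndexSchema}}) are assumptions, not part of the proposal.

          {code:java}
// Sketch of the proposed merge semantics only -- not existing Solr code.
import java.util.Collection;

import org.apache.solr.common.SolrInputDocument;

public class UpdateMerger {
  /**
   * Combines the incoming partial document with the backup document.
   * @param append the proposed AddUpdateCommand.append flag
   */
  public static SolrInputDocument merge(SolrInputDocument incoming,
                                        SolrInputDocument backup,
                                        boolean append,
                                        Collection<String> multiValuedFields) {
    SolrInputDocument merged = new SolrInputDocument();
    // start from the incoming document: its single-valued fields always win
    for (String name : incoming.getFieldNames()) {
      merged.setField(name, incoming.getFieldValues(name));
    }
    for (String name : backup.getFieldNames()) {
      boolean multiValued = multiValuedFields.contains(name);
      if (!merged.containsKey(name)) {
        // field missing from the incoming doc: fill it from the backup
        merged.setField(name, backup.getFieldValues(name));
      } else if (multiValued && append) {
        // append=true: add the backed-up values on top of the incoming ones
        for (Object v : backup.getFieldValues(name)) {
          merged.addField(name, v);
        }
      }
      // otherwise the incoming values replace the backed-up ones
    }
    return merged;
  }
}
          {code}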

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*. It first searches *temp.backup.index* with the id; if the document is not found, it then searches *backup.index*. If it finds the document(s), they are returned.


          Noble Paul made changes -
          Issue Type Improvement [ 4 ] New Feature [ 2 ]
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} uses a DB (JDBC / Berkeley DB Java?) to store the data. Each document will be a row in the DB. The uniqueKey of the document will be used as the primary key, and the data will be written as a BLOB into a DB column. The format will be the NamedListCodec serialized format. NamedListCodec in its current form is inefficient, but it is possible to enhance it (SOLR-810).

          The schema of the table would be:
          * DATA : LONGVARBINARY : NamedListCodec serialized data
          * COMMITTED : BOOL
          * BOOST : DOUBLE
          * FIELD_BOOSTS : VARBINARY : NamedListCodec serialized boosts for each field

          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the serialized document to the DB (COMMITTED=false), then calls the next {{UpdateProcessor#add()}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from the data table. If it is a delete by id, it deletes the document with that id from the data table. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} gets all the documents from the data table that have COMMITTED=false. If a document is present in the main index it is marked COMMITTED=true; otherwise it is deleted, because a deleteByQuery would have removed it.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} first checks for the document in the data table. If it is present, the document is read. If it is not present, the missing fields are read from the main index and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          {{processAdd()}} is called on the next {{UpdateProcessor}}.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*. It first searches *temp.backup.index* with the id; if the document is not found, it then searches *backup.index*. If it finds the document(s), they are returned.

          h2.Next steps
          The datastore can be optimized by not storing the stored fields in the DB. That can be done in another iteration.

          Noble Paul made changes -
          Link This issue is blocked by SOLR-810 [ SOLR-810 ]
          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} uses a DB (JDBC / Berkeley DB Java?) to store the data. Each document will be a row in the DB. The uniqueKey of the document will be used as the primary key, and the data will be written as a BLOB into a DB column. The format will be the {{javabin}} serialized format. The {{javabin}} format in its current form is inefficient, but it is possible to enhance it (SOLR-810).

          The schema of the table would be:
          * DATA : LONGVARBINARY : NamedListCodec serialized data
          * COMMITTED : BOOL
          * BOOST : DOUBLE
          * FIELD_BOOSTS : VARBINARY : {{javabin}} serialized data with the boosts for each field

          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the serialized document to the DB (COMMITTED=false), then calls the next {{UpdateProcessor#add()}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from the data table. If it is a delete by id, it deletes the document with that id from the data table. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} gets all the documents from the data table that have COMMITTED=false. If a document is present in the main index it is marked COMMITTED=true; otherwise it is deleted, because a deleteByQuery would have removed it.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} first checks for the document in the data table. If it is present, the document is read. If it is not present, the missing fields are read from the main index and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          {{processAdd()}} is called on the next {{UpdateProcessor}}.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*. It first searches *temp.backup.index* with the id; if the document is not found, it then searches *backup.index*. If it finds the document(s), they are returned.

          h2.Next steps
          The datastore can be optimized by not storing the stored fields in the DB. This means that on {{postCommit/postOptimize}} we must read back the data, remove the already-stored fields, and store it back. That can be done in another iteration.

          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} uses a DB (JDBC / Berkeley DB Java?) to store the data. Each document will be a row in the DB. The uniqueKey of the document will be used as the primary key, and the data will be written as a BLOB into a DB column. The format will be the {{javabin}} serialized format. The {{javabin}} format in its current form is inefficient, but it is possible to enhance it (SOLR-810).

          The schema of the table would be:
          * DATA : LONGVARBINARY : a {{javabin}} serialized SolrInputDocument
          * COMMITTED : BOOL
          * BOOST : DOUBLE
          * FIELD_BOOSTS : VARBINARY : {{javabin}} serialized data with the boosts for each field

          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the serialized document to the DB (COMMITTED=false), then calls the next {{UpdateProcessor#add()}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from the data table. If it is a delete by id, it deletes the document with that id from the data table. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} gets all the documents from the data table that have COMMITTED=false. If a document is present in the main index it is marked COMMITTED=true; otherwise it is deleted, because a deleteByQuery would have removed it.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} first checks for the document in the data table. If it is present, the document is read. If it is not present, the missing fields are read from the main index and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          {{processAdd()}} is called on the next {{UpdateProcessor}}.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*. It first searches *temp.backup.index* with the id; if the document is not found, it then searches *backup.index*. If it finds the document(s), they are returned.

          h2.Next steps
          The datastore can be optimized by not storing the stored fields in the DB. This means that on {{postCommit/postOptimize}} we must read back the data, remove the already-stored fields, and store it back. That can be done in another iteration.

          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} uses a DB (JDBC / Berkeley DB Java?) to store the data. Each document will be a row in the DB. The uniqueKey of the document will be used as the primary key, and the data will be written as a BLOB into a DB column. The format will be the {{javabin}} serialized format. The {{javabin}} format in its current form is inefficient, but it is possible to enhance it (SOLR-810).

          The schema of the table would be:
          * ID : VARCHAR : the primary key of the document as a string
          * DATA : LONGVARBINARY : a {{javabin}} serialized SolrInputDocument
          * COMMITTED : BOOL
          * BOOST : DOUBLE
          * FIELD_BOOSTS : VARBINARY : {{javabin}} serialized data with the boosts for each field

          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the serialized document to the DB (COMMITTED=false), then calls the next {{UpdateProcessor#add()}}.

          h2.{{processDelete()}}
          {{UpdateableIndexProcessor}} gets the Searcher from the core, finds the documents that match the query, and deletes them from the data table. If it is a delete by id, it deletes the document with that id from the data table. It then calls the next {{UpdateProcessor}}.

          h2.{{processCommit()}}
          Calls the next {{UpdateProcessor}}.

          h2.on {{postCommit/postOptimize}}
          {{UpdateableIndexProcessor}} gets all the documents from the data table that have COMMITTED=false. If a document is present in the main index it is marked COMMITTED=true; otherwise it is deleted, because a deleteByQuery would have removed it.

          h2.{{processUpdate()}}
          {{UpdateableIndexProcessor}} first checks for the document in the data table. If it is present, the document is read. If it is not present, the missing fields are read from the main index and the backup document is prepared.

          The single-valued fields are used from the incoming document (if present); the others are filled from the backup document. If append=true, all the multivalued values from the backup document are added to the incoming document; otherwise the values from the backup document are not used for fields that are also present in the incoming document.

          {{processAdd()}} is called on the next {{UpdateProcessor}}.

          h2. New {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
          This exposes the data present in the backup indexes. The user must be able to get any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. id=1&id=2&id=4). This helps the user query the backup index and construct the new doc if he wishes to do so. The {{BackupIndexRequestHandler}} does a commit on *temp.backup.index*. It first searches *temp.backup.index* with the id; if the document is not found, it then searches *backup.index*. If it finds the document(s), they are returned.

          h2.Next steps
          The datastore can be optimized by not storing the stored fields in the DB. This means that on {{postCommit/postOptimize}} we must read back the data, remove the already-stored fields, and store it back. That can be done in another iteration.

          Noble Paul made changes -
          Description
          This is the same as SOLR-139. A new issue has been opened so that the UpdateProcessor approach is highlighted and we can focus on that solution more easily.


          The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be inserted before {{RunUpdateProcessor}}.

          * The {{UpdateProcessor}} must add an update method.
          * The {{AddUpdateCommand}} has a new boolean field, append. If append=true, multivalued fields are appended; otherwise the old values are removed and the new ones are added.
          * The schema must have a {{<uniqueKey>}}
          * {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

          h1.Implementation
          {{UpdateableIndexProcessor}} uses a DB (JDBC / Berkeley DB Java?) to store the data. Each document will be a row in the DB. The uniqueKey of the document will be used as the primary key, and the data will be written as a BLOB into a DB column. The format will be the {{javabin}} serialized format. The {{javabin}} format in its current form is inefficient, but it is possible to enhance it (SOLR-810).

          The schema of the table would be (a JDBC sketch follows this list):
          * ID : VARCHAR : the primary key of the document as a string
          * DATA : LONGVARBINARY : a {{javabin}} serialized SolrInputDocument
          * COMMITTED : BOOL
          * BOOST : DOUBLE
          * FIELD_BOOSTS : VARBINARY : {{javabin}} serialized data with the boosts for each field
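
          For illustration, a minimal JDBC sketch of such a table; the table name {{backup_docs}}, the embedded-DB URL and the column sizes are assumptions, not part of the proposal.

          {code:java}
// Hedged sketch only: creates the proposed backup table via plain JDBC.
// Table/column names beyond those listed above are assumptions.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class CreateBackupTable {
  public static void main(String[] args) throws SQLException {
    // any JDBC URL would do; an embedded HSQLDB file is just an example
    try (Connection con = DriverManager.getConnection("jdbc:hsqldb:file:backupdb", "sa", "");
         Statement st = con.createStatement()) {
      st.executeUpdate(
          "CREATE TABLE backup_docs ("
        + "  ID           VARCHAR(256) PRIMARY KEY,"   // uniqueKey of the document
        + "  DATA         LONGVARBINARY,"              // javabin-serialized SolrInputDocument
        + "  COMMITTED    BOOLEAN,"
        + "  BOOST        DOUBLE,"
        + "  FIELD_BOOSTS VARBINARY(4096)"             // javabin-serialized per-field boosts
        + ")");
    }
  }
}
          {code}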

          h1.Implementation of various methods

          h2.{{processAdd()}}
          {{UpdateableIndexProcessor}} writes the serialized document to the DB (COMMITTED=false), then calls the next {{UpdateProcessor#add()}}. A hedged sketch of the serialization follows.
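
          A sketch of what that write could look like, assuming SolrJ's {{JavaBinCodec}} and the hypothetical {{backup_docs}} table above; the delete-then-insert upsert is just one possible strategy.

          {code:java}
// Sketch only: serialize a SolrInputDocument with javabin and upsert it.
// JavaBinCodec is the codec behind the "javabin" format; everything else
// (table name, upsert strategy) is an assumption for illustration.
import java.io.ByteArrayOutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;

import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.util.JavaBinCodec;

public class BackupWriter {
  public static void writeBackup(Connection con, String id, SolrInputDocument doc)
      throws Exception {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    new JavaBinCodec().marshal(doc, bos);          // javabin-serialize the document

    // naive delete-then-insert "upsert" keyed on the uniqueKey
    try (PreparedStatement del = con.prepareStatement("DELETE FROM backup_docs WHERE ID = ?")) {
      del.setString(1, id);
      del.executeUpdate();
    }
    try (PreparedStatement ins = con.prepareStatement(
        "INSERT INTO backup_docs (ID, DATA, COMMITTED) VALUES (?, ?, ?)")) {
      ins.setString(1, id);
      ins.setBytes(2, bos.toByteArray());
      ins.setBoolean(3, false);                    // not yet seen in the main index
      ins.executeUpdate();
    }
  }
}
          {code}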

          h2.{{processDelete()}}
{{UpdateableIndexProcessor}} gets a Searcher from the core, finds the documents that match the delete query, and deletes the corresponding rows from the data table. If it is a delete by id, it deletes the row with that id from the data table. It then calls the next {{UpdateProcessor}}.
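A rough sketch of the delete-by-query half, assuming the delete query has already been parsed into a Lucene {{Query}} and reusing the assumed {{backup_docs}} table; a real implementation would stream every match with a Collector instead of capping the hit count:

{code:java}
import org.apache.lucene.document.Document;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

import java.sql.Connection;
import java.sql.PreparedStatement;

class DataTableDeletes {
  /** Remove the data-table rows of every document matched by the delete query. */
  static void deleteByQuery(IndexSearcher mainSearcher, Query deleteQuery,
                            String uniqueKeyField, Connection con) throws Exception {
    // Capped for illustration; a Collector would be needed to visit all matches.
    TopDocs hits = mainSearcher.search(deleteQuery, 1000);
    try (PreparedStatement ps =
             con.prepareStatement("DELETE FROM backup_docs WHERE ID = ?")) {
      for (ScoreDoc sd : hits.scoreDocs) {
        Document doc = mainSearcher.doc(sd.doc);
        ps.setString(1, doc.get(uniqueKeyField));
        ps.executeUpdate();
      }
    }
  }
}
{code}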

          h2.{{processCommit()}}
{{UpdateableIndexProcessor}} simply calls the next {{UpdateProcessor}}.

h2.on {{postCommit/postOptimize}}
{{UpdateableIndexProcessor}} gets all the documents from the data table where COMMITTED=false. If a document is present in the main index it is marked COMMITTED=true; otherwise it is deleted, because a deleteByQuery must have removed it.
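A sketch of the work such a listener could do, again against the assumed {{backup_docs}} table and an assumed {{id}} uniqueKey field:

{code:java}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

class BackupReconciler {
  /** The work a postCommit/postOptimize listener could do against the data table. */
  static void reconcile(IndexSearcher mainSearcher, Connection con) throws Exception {
    try (Statement st = con.createStatement();
         ResultSet rs = st.executeQuery("SELECT ID FROM backup_docs WHERE COMMITTED = FALSE");
         PreparedStatement markCommitted =
             con.prepareStatement("UPDATE backup_docs SET COMMITTED = TRUE WHERE ID = ?");
         PreparedStatement drop =
             con.prepareStatement("DELETE FROM backup_docs WHERE ID = ?")) {
      while (rs.next()) {
        String id = rs.getString(1);
        // Present in the main index? Then the add went through: mark the row committed.
        if (mainSearcher.count(new TermQuery(new Term("id", id))) > 0) {
          markCommitted.setString(1, id);
          markCommitted.executeUpdate();
        } else {
          // A deleteByQuery must have removed it from the index; drop the row as well.
          drop.setString(1, id);
          drop.executeUpdate();
        }
      }
    }
  }
}
{code}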

          h2.{{processUpdate()}}
{{UpdateableIndexProcessor}} first checks the data table for the document. If it is present, the stored document is read back. If it is not present, the missing fields are read from the main index (through a Searcher), and the backup document is prepared from them.

Single-valued fields are taken from the incoming document when present; the rest are filled from the backup document. If append=true, the multi-valued values from the backup document are appended to the incoming document; otherwise the values from the backup document are not used for any field that is also present in the incoming document.
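A sketch of that merge rule; in practice the set of multi-valued field names would come from the {{IndexSchema}}, here it is simply passed in:

{code:java}
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.SolrInputField;

import java.util.Set;

class DocumentMerger {
  /** Merge the incoming (partial) document with the backup document. */
  static SolrInputDocument merge(SolrInputDocument incoming, SolrInputDocument backup,
                                 Set<String> multiValuedFields, boolean append) {
    SolrInputDocument merged = new SolrInputDocument();
    // Start from the backup copy so fields missing from the update are preserved.
    for (SolrInputField f : backup) {
      merged.setField(f.getName(), f.getValue());
    }
    for (SolrInputField f : incoming) {
      String name = f.getName();
      if (append && multiValuedFields.contains(name)) {
        // append=true: keep the backed-up values and add the incoming ones.
        for (Object v : f.getValues()) {
          merged.addField(name, v);
        }
      } else {
        // Single-valued field, or append=false: the incoming value replaces the backup.
        merged.setField(name, f.getValue());
      }
    }
    return merged;
  }
}
{code}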

Finally, {{processAdd()}} is called on the next {{UpdateProcessor}} with the merged document.

          h2. new {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
This exposes the data present in the backup store. The user must be able to fetch any document by id by invoking {{/backup?id=<value>}} (multiple id values can be sent, e.g. {{id=1&id=2&id=4}}). This lets the user query the backup store and construct a new document if they wish to do so.
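A bare-bones sketch of such a handler (ignoring how it would be registered and initialized; the {{BackupLookup}} interface is hypothetical):

{code:java}
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;

import java.util.ArrayList;
import java.util.List;

public class BackupIndexRequestHandler extends RequestHandlerBase {

  /** Hypothetical lookup into the backup store, keyed by uniqueKey. */
  public interface BackupLookup {
    SolrInputDocument findById(String id) throws Exception;
  }

  private final BackupLookup lookup;

  public BackupIndexRequestHandler(BackupLookup lookup) {
    this.lookup = lookup;
  }

  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    List<SolrInputDocument> docs = new ArrayList<>();
    String[] ids = req.getParams().getParams("id"); // /backup?id=1&id=2&id=4
    if (ids != null) {
      for (String id : ids) {
        SolrInputDocument doc = lookup.findById(id);
        if (doc != null) {
          docs.add(doc);
        }
      }
    }
    rsp.add("docs", docs);
  }

  @Override
  public String getDescription() {
    return "Returns backed-up documents by uniqueKey";
  }
}
{code}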

          h2.Next steps
The datastore can be optimized by not duplicating fields that are already stored in the index. This means that on {{postCommit/postOptimize}} we must read the data back, strip the fields the index already stores, and write the row again. That can be another iteration.


          Shalin Shekhar Mangar made changes -
          Fix Version/s 1.5 [ 12313566 ]
          Fix Version/s 1.4 [ 12313351 ]
          Hoss Man made changes -
          Fix Version/s Next [ 12315093 ]
          Fix Version/s 1.5 [ 12313566 ]
          Hoss Man made changes -
          Fix Version/s 3.2 [ 12316172 ]
          Fix Version/s Next [ 12315093 ]
          Robert Muir made changes -
          Fix Version/s 3.3 [ 12316471 ]
          Fix Version/s 3.2 [ 12316172 ]
          Robert Muir made changes -
          Fix Version/s 3.3 [ 12316471 ]
          Fix Version/s 3.4 [ 12316683 ]
          Fix Version/s 4.0 [ 12314992 ]
          Robert Muir made changes -
          Fix Version/s 3.5 [ 12317876 ]
          Fix Version/s 3.4 [ 12316683 ]
          Simon Willnauer made changes -
          Fix Version/s 3.6 [ 12319065 ]
          Fix Version/s 3.5 [ 12317876 ]
          Hoss Man made changes -
          Fix Version/s 3.6 [ 12319065 ]
          Robert Muir made changes -
          Fix Version/s 4.1 [ 12321141 ]
          Fix Version/s 4.0 [ 12314992 ]
          Mark Miller made changes -
          Fix Version/s 4.2 [ 12323893 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.1 [ 12321141 ]
          Robert Muir made changes -
          Fix Version/s 4.3 [ 12324128 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.2 [ 12323893 ]
          Uwe Schindler made changes -
          Fix Version/s 4.4 [ 12324324 ]
          Fix Version/s 4.3 [ 12324128 ]
          Steve Rowe made changes -
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.5 [ 12324743 ]
          Fix Version/s 4.4 [ 12324324 ]
          Adrien Grand made changes -
          Fix Version/s 4.6 [ 12325000 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.5 [ 12324743 ]
          Uwe Schindler made changes -
          Fix Version/s 4.7 [ 12325573 ]
          Fix Version/s 4.6 [ 12325000 ]
          David Smiley made changes -
          Fix Version/s 4.8 [ 12326254 ]
          Fix Version/s 4.7 [ 12325573 ]
          Shalin Shekhar Mangar made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 4.8 [ 12326254 ]
          Resolution Won't Fix [ 2 ]

            People

• Assignee: Unassigned
• Reporter: Noble Paul
• Votes: 1
• Watchers: 7
