Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      IndexTool verification generates an expected list of index mutations from the data table rows and uses this list to check whether the index table rows are consistent with the data table. It does so in the following steps:

      1. The data table rows are scanned with a raw scan that is configured to read all versions of each row.
      2. For each scanned row, the scanned cells are split into two sets: the put set, which holds the put cells, and the delete set, which holds the delete cells.
      3. The put and delete sets of a given row are further grouped by timestamp into put and delete mutations, so that all the cells in a mutation have the same timestamp.
      4. The put and delete mutations are then merged into a single list, sorted in ascending order of timestamp.
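
      Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration in plain Java, not the IndexTool code: SimpleCell, SimpleMutation and buildMutations are hypothetical names, and the cells are simplified stand-ins for HBase cells. The order of a put and a delete that share a timestamp is not stated above; the sketch places the put first, which is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ExpectedMutationSketch {

    /** Simplified stand-in for an HBase cell (hypothetical, not the HBase Cell API). */
    static final class SimpleCell {
        final String qualifier;
        final long timestamp;
        final boolean isDelete;

        SimpleCell(String qualifier, long timestamp, boolean isDelete) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.isDelete = isDelete;
        }
    }

    /** Cells of one kind (put or delete) that share a single timestamp. */
    static final class SimpleMutation {
        final long timestamp;
        final boolean isDelete;
        final List<SimpleCell> cells = new ArrayList<>();

        SimpleMutation(long timestamp, boolean isDelete) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
        }
    }

    /** Builds the timestamp-ordered mutation list for the cells of one data table row. */
    static List<SimpleMutation> buildMutations(List<SimpleCell> rowCells) {
        // Steps 2 and 3: split the cells into put and delete sets, grouped by timestamp.
        Map<Long, SimpleMutation> puts = new TreeMap<>();
        Map<Long, SimpleMutation> deletes = new TreeMap<>();
        for (SimpleCell cell : rowCells) {
            Map<Long, SimpleMutation> target = cell.isDelete ? deletes : puts;
            target.computeIfAbsent(cell.timestamp,
                    ts -> new SimpleMutation(ts, cell.isDelete)).cells.add(cell);
        }
        // Step 4: merge both kinds into one list sorted by ascending timestamp.
        // The sort is stable, so a put precedes a delete with the same timestamp
        // (an assumption of this sketch).
        List<SimpleMutation> merged = new ArrayList<>(puts.values());
        merged.addAll(deletes.values());
        merged.sort(Comparator.comparingLong((SimpleMutation m) -> m.timestamp));
        return merged;
    }

    public static void main(String[] args) {
        List<SimpleCell> cells = new ArrayList<>();
        cells.add(new SimpleCell("A", 20L, false));
        cells.add(new SimpleCell("B", 10L, false));
        cells.add(new SimpleCell("A", 10L, false));
        cells.add(new SimpleCell("A", 15L, true));
        for (SimpleMutation m : buildMutations(cells)) {
            System.out.println((m.isDelete ? "DELETE" : "PUT")
                    + " ts=" + m.timestamp + " cells=" + m.cells.size());
        }
    }
}
```

      In the sketch, the raw scan of step 1 is replaced by a hand-built list of cells, so it says nothing about how the scan itself is configured.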

      The above process assumes that for each data table update, the index table will be updated with the correct index row key. However, this assumption does not hold in the presence of concurrent updates.

      From the perspective of the consistent indexing design (PHOENIX-5156), two or more pending updates from different batches on the same data row are concurrent if and only if, for every one of these updates, the data table row state has been read from HBase under the row lock and the row lock has not yet been acquired a second time to update the data table. In other words, all of them are in the first update phase at the same time. For concurrent updates, the first two update phases are carried out but the last update phase is skipped. The data table row is therefore updated by these updates, but the corresponding index table rows are left with the unverified status. The read repair process then repairs these unverified index rows during scans.
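
      The definition above can be illustrated with a small single-threaded toy model. It is only a sketch under stated assumptions: ConcurrentUpdateSketch, PendingUpdate and the phase methods are hypothetical names, not Phoenix classes; the model ignores batches and real row locks; and it assumes that the first phase writes the index row as unverified and that the third phase is the one that marks it verified.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConcurrentUpdateSketch {

    enum IndexRowStatus { UNVERIFIED, VERIFIED }

    /** One pending update on a data table row (hypothetical model class). */
    static final class PendingUpdate {
        final String rowKey;
        boolean concurrent;
        IndexRowStatus indexRowStatus;

        PendingUpdate(String rowKey) {
            this.rowKey = rowKey;
        }
    }

    // Per row: updates that have read the row state but have not yet
    // reacquired the row lock to update the data table.
    private final Map<String, List<PendingUpdate>> inFirstPhase = new HashMap<>();

    /** Phase one: the row state is read; the index row is assumed to be written as unverified. */
    PendingUpdate firstPhase(String rowKey) {
        PendingUpdate update = new PendingUpdate(rowKey);
        List<PendingUpdate> pending =
                inFirstPhase.computeIfAbsent(rowKey, k -> new ArrayList<>());
        if (!pending.isEmpty()) {
            // Another update on this row is still in its first phase,
            // so all of them are concurrent with each other.
            update.concurrent = true;
            for (PendingUpdate other : pending) {
                other.concurrent = true;
            }
        }
        pending.add(update);
        update.indexRowStatus = IndexRowStatus.UNVERIFIED;
        return update;
    }

    /** Phase two: the row lock is acquired again and the data table row is updated. */
    void secondPhase(PendingUpdate update) {
        inFirstPhase.get(update.rowKey).remove(update);
    }

    /** Phase three: skipped for concurrent updates, so their index rows stay unverified. */
    void thirdPhase(PendingUpdate update) {
        if (!update.concurrent) {
            update.indexRowStatus = IndexRowStatus.VERIFIED;
        }
    }

    public static void main(String[] args) {
        ConcurrentUpdateSketch model = new ConcurrentUpdateSketch();
        PendingUpdate a = model.firstPhase("row1");
        PendingUpdate b = model.firstPhase("row1"); // overlaps with a in phase one
        model.secondPhase(a);
        model.thirdPhase(a);
        model.secondPhase(b);
        model.thirdPhase(b);
        System.out.println("a: " + a.indexRowStatus + ", b: " + b.indexRowStatus);
    }
}
```

      In this model, an update that starts only after the earlier ones have left their first phase is not concurrent with them, and its index row ends up verified.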

      Since the expected index mutations are derived from the data table row after these concurrent mutations have been applied, the expected list would not match the actual list of index mutations.

       

      Attachments

        1. PHOENIX-5791.4.x-HBase-1.5.001.patch
          252 kB
          Kadir OZDEMIR
        2. PHOENIX-5791.4.x-HBase-1.5.002.patch
          253 kB
          Kadir OZDEMIR

            People

              Assignee: Kadir OZDEMIR (kozdemir)
              Reporter: Kadir OZDEMIR (kozdemir)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 5h 50m