Accumulo / ACCUMULO-2949

Write explicit "close" markers for WALs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: logger, replication
    • Labels: None

    Description

      To ensure that WALs are not left in a dangling "open" state with respect to replication, the garbage collector scans the tablets and builds a view of the WALs that are currently in use. It consults that view to determine which WALs can move to a "closed" replication state.

      This isn't entirely correct because a WAL can "come back" after being removed from a tablet. Consider the following:

      1. Table has one tablet hosted on one tserver
      2. Tablet gets some mutations
      3. Tablet is minor compacted (MinC)
      4. Tablet removes its WAL entry as part of the MinC
      5. WAL is "closed" with respect to replication
      6. Tablet receives more mutations, starts using the same WAL
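
      The sequence above can be sketched as a small in-memory model. This is only an illustration, not Accumulo code: the class InferredWalClosure and all of its methods are hypothetical, and the "GC pass" is reduced to the rule described in this issue (a WAL that no tablet references is treated as closed).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model (not Accumulo code): "closed" is inferred from tablet references.
class InferredWalClosure {
    // WALs currently referenced by each tablet, i.e. what a scan of the tablets would see.
    private final Map<String, Set<String>> walsByTablet = new HashMap<>();
    // WALs that a GC pass has moved to the "closed" replication state.
    private final Set<String> closedForReplication = new HashSet<>();

    // Steps 2 and 6: a tablet receives mutations and references the WAL it writes to.
    void mutate(String tablet, String wal) {
        walsByTablet.computeIfAbsent(tablet, t -> new HashSet<>()).add(wal);
    }

    // Steps 3 and 4: a minor compaction removes the tablet's WAL entries.
    void minorCompact(String tablet) {
        walsByTablet.remove(tablet);
    }

    // Step 5: any known WAL that no tablet references is inferred to be closed.
    void gcPass(Set<String> knownWals) {
        Set<String> inUse = new HashSet<>();
        for (Set<String> wals : walsByTablet.values()) {
            inUse.addAll(wals);
        }
        for (String wal : knownWals) {
            if (!inUse.contains(wal)) {
                closedForReplication.add(wal);
            }
        }
    }

    boolean isClosed(String wal) {
        return closedForReplication.contains(wal);
    }

    boolean isReferenced(String wal) {
        for (Set<String> wals : walsByTablet.values()) {
            if (wals.contains(wal)) {
                return true;
            }
        }
        return false;
    }
}
```

      Running steps 1 through 6 against this model leaves the WAL both marked closed and referenced by the tablet again, which is the inconsistent state this issue describes.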

      This could present itself in a couple of ways, each of which would result in re-replication of data we may already have sent once. On an active system I don't think this is a big concern, and because we already don't guarantee a "once and only once" replication contract, it isn't critical. The combiner set on the replication table will also mitigate most of the re-replication concerns, since those records persist until the entire file is replicated (which should outlast the WAL's use on the local system).

      ecn suggested recording a "closed" marker for a WAL as part of TabletServerLogger.close(), which would remove the need to "guess" when a WAL will no longer be used.
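
      A rough sketch of what that suggestion might look like, under stated assumptions: the class ExplicitWalClose and its methods are hypothetical stand-ins, the marker store is a plain in-memory set rather than the replication table, and only the idea of writing the marker from close() comes from this issue. It also assumes that a WAL is never reopened once its logger has been closed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch (not the real TabletServerLogger): the writer records
// the "closed" marker itself, so nothing has to infer closure from tablets.
class ExplicitWalClose {
    // Stand-in for wherever the marker would be persisted.
    private final Set<String> closedMarkers = new HashSet<>();
    // The WAL this logger is currently writing to, or null if none is open.
    private String currentWal;

    // Start writing to a WAL; a WAL with a close marker may not be reused (assumption).
    void open(String wal) {
        if (closedMarkers.contains(wal)) {
            throw new IllegalStateException("WAL already closed: " + wal);
        }
        currentWal = wal;
    }

    // Closing the logger is the only place where the marker is written.
    void close() {
        if (currentWal != null) {
            closedMarkers.add(currentWal);
            currentWal = null;
        }
    }

    // Replication would consult the marker instead of a view built from tablet scans.
    boolean isClosed(String wal) {
        return closedMarkers.contains(wal);
    }
}
```

      With this shape, a minor compaction that drops the tablet's WAL entry has no effect on the WAL's replication state; only close() does.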

      If we want to move to explicit "end" tracking (see ACCUMULO-2835), we will need this implemented.

            People

              Assignee: Unassigned
              Reporter: Josh Elser (elserj)