Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: M3
    • Fix Version/s: 1.0.0
    • Component/s: tablet
    • Labels: None
    • Target Version/s:

Description

Now that UNDOs are in, we need to garbage collect row history somehow. Right now we're keeping everything forever.

Activity

David Alves added a comment -

Should we do something super simple about this? E.g., on compaction, GC UNDOs that are older than 1 week?

Todd Lipcon added a comment -

Seems reasonable, though we still wouldn't really have anything that would schedule such GCs, right?

David Alves added a comment -

Well, we do have regular block/delta compactions. You're right that if we're perfectly compacted then stuff wouldn't get GC'd, but then again, maybe that means it wasn't that big of a problem in the first place. That is, a tablet under update load is when we need this the most, and that's also when compactions would be happening. What do you think?

Todd Lipcon added a comment -

Seems reasonable. If you do add this, please make sure there's a flag to disable it, so that if people start hitting issues we can back it out easily.
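
A kill switch along those lines might look like the following gflags sketch (Kudu configures itself with gflags, but these flag names are hypothetical, not Kudu's actual flags):

    #include <gflags/gflags.h>

    // Hypothetical kill switch: disable UNDO GC entirely if it misbehaves.
    DEFINE_bool(enable_undo_gc, true,
                "Whether to garbage-collect ancient UNDO deltas during "
                "flushes and compactions.");

    // Hypothetical retention window, defaulting to 7 days.
    DEFINE_int32(history_retention_secs, 7 * 24 * 60 * 60,
                 "How far back (in seconds) UNDO history is retained.");

    bool ShouldGcUndos() { return FLAGS_enable_undo_gc; }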

David Alves added a comment -

Sounds good.

Todd Lipcon added a comment -

Mike Percy mentioned he's working on this – want to reassign to yourself?

Mike Percy added a comment -

I was thinking along the same lines as David's suggestion above for implementing this. Essentially, make sure that we can apply UNDO history back to the ancient_history_mark (AHM) time, which we can default to 7 days ago. So by default, if an UNDO is more than 7 days old, we delete it on compaction.
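
As a rough illustration of that idea (a minimal sketch with invented names, not Kudu's actual code):

    #include <cstdint>
    #include <ctime>
    #include <vector>

    constexpr int64_t kHistoryRetentionSecs = 7 * 24 * 60 * 60;  // 7 days

    struct UndoDelta {
      int64_t commit_timestamp;  // when the undone mutation was committed
    };

    // The ancient history mark: anything older is eligible for GC.
    int64_t AncientHistoryMark() {
      return static_cast<int64_t>(std::time(nullptr)) - kHistoryRetentionSecs;
    }

    // During a compaction, keep only UNDOs at or after the AHM.
    std::vector<UndoDelta> GcAncientUndos(const std::vector<UndoDelta>& undos) {
      const int64_t ahm = AncientHistoryMark();
      std::vector<UndoDelta> kept;
      for (const UndoDelta& u : undos) {
        if (u.commit_timestamp >= ahm) kept.push_back(u);
      }
      return kept;
    }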

Todd Lipcon added a comment -

Do we also deny any scanners from reading at a snapshot older than the AHM?

David Alves added a comment -

We should, since the snapshot might not be repeatable, i.e. some of the data might be missing.

Mike Percy added a comment -

I think that ideally the behavior would be to deny opening new scanners at snapshots before the AHM, but to allow already-open scanners to keep those snapshots open and available while they scan, POSIX-file-style (the way an unlinked file stays readable through existing handles). The log block manager has similar semantics, so I think it should be achievable, but I am not sure yet.
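
A minimal sketch of those unlink-style semantics using shared ownership (illustrative only; not how the log block manager actually works):

    #include <cstdint>
    #include <map>
    #include <memory>
    #include <string>

    struct Block {
      std::string data;
    };

    class BlockIndex {
     public:
      // A scanner takes shared ownership; the block outlives Delete() for
      // as long as any such handle is alive.
      std::shared_ptr<Block> Open(int64_t id) {
        auto it = blocks_.find(id);
        return it == blocks_.end() ? nullptr : it->second;
      }

      // GC removes the index entry ("unlink"); the storage is reclaimed
      // only when the last reader releases its handle.
      void Delete(int64_t id) { blocks_.erase(id); }

     private:
      std::map<int64_t, std::shared_ptr<Block>> blocks_;
    };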

Todd Lipcon added a comment -

How would you manage that in a distributed setting? Scanners would have to register themselves with the master or some other central authority and propagate the lower bound of all current scanners.

Mike Percy added a comment -

I was thinking that AHM validation would be a local notion; we can leave time synchronization up to NTP. Validation when opening a scanner could be along the lines of:

    if (snapshot_ts < Now() - history_retention_time) { return INVALID_SNAPSHOT; }

This would result in a ragged AHM when considered cluster-wide. For a scan right on top of that ragged edge, some of the scanners would return an error, resulting in an error for the whole scan.
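
Expanding that one-liner into a self-contained sketch (names invented; each tablet server would evaluate this against its own local clock):

    #include <cstdint>
    #include <ctime>

    enum class ScanStatus { OK, INVALID_SNAPSHOT };

    constexpr int64_t kHistoryRetentionSecs = 7 * 24 * 60 * 60;

    // Purely local check: because each server uses its own clock, the
    // effective AHM is ragged across the cluster, and a snapshot near the
    // edge can pass on one server while failing on another.
    ScanStatus ValidateSnapshot(int64_t snapshot_ts) {
      const int64_t ahm =
          static_cast<int64_t>(std::time(nullptr)) - kHistoryRetentionSecs;
      return snapshot_ts < ahm ? ScanStatus::INVALID_SNAPSHOT : ScanStatus::OK;
    }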

Mike Percy added a comment -

Following up on that: if people want to be able to scan back 7 days, we would recommend they give themselves some slack and retain history for 8 days, for example.

Mike Percy added a comment -

Committed in the following revs: be719edc3581802e094c3af6a88d67acba44ba71, 89b54c9f9e6789a8f6b758127625029aafa17589, e2f5e250f50b983f7ab10dac134246fc46870f32

Mike Percy added a comment - edited

The above changes implement this feature for rowsets involved in flushes and compactions. Existing UNDO files on rowsets not affected by compactions will be handled in a follow-up task as part of KUDU-1601.


People

    • Assignee: Mike Percy
    • Reporter: David Alves
    • Votes: 0
    • Watchers: 7

Dates

    • Created:
    • Updated:
    • Resolved:
