HBase / HBASE-2376

Add special SnapshotScanner which presents view of all data at some time in the past

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.20.3
    • Fix Version/s: None
    • Component/s: Client, regionserver
    • Labels:
      None

      Description

      In order to support a particular kind of database "snapshot" feature which doesn't require copying data, we came up with the idea for a special SnapshotScanner that would present a view of your data at some point in the past. The primary use case for this would be to be able to recover particular data/rows (but not all data, like a global rollback) should they have somehow been messed up (application fault, application bug, user error, etc.).

        Issue Links

          Activity

          Jonathan Gray added a comment -

          The initial idea is that we'd introduce a new per-family configuration, something like TTKAV (Time To Keep All Versions). This setting would define how far back you would be able to perform this snapshot. In practice, it might be something like 1 day or 1 week, but could be configured. This setting would ensure that no versions are deleted until outside the TTKAV. Once outside that range, the original TTL and MaxVersions settings would be enforced (on major compactions).

          The second piece would be a special SnapshotScanner that took a timestamp and returned data as it was at that timestamp. This stamp would of course have to be less than now() and greater than now() - TTKAV. It would have the smarts to basically toss out anything with a newer stamp, including deletes.

          The reason for introducing a new parameter is that you must keep all versions of everything in order to perform the back-in-time snapshot, but you don't want to force constraints on how you would use the normal maxVersions/TTL parameters to clear out multiple versions of stuff during major compactions (once outside the snapshotable time window).
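
A minimal sketch of the scanner behaviour described above, written as a small Python model rather than HBase code. The `Cell` type, the `snapshot_scan` function and its parameter names are assumptions made for illustration; a delete is modelled as a cell whose value is `None` and which masks everything at or below its timestamp for that row.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    row: str
    ts: int                # cell timestamp
    value: Optional[str]   # None marks a delete (hypothetical encoding)


def snapshot_scan(cells, as_of, now, ttkav):
    """Return {row: value} as the data looked at time `as_of`."""
    # The stamp must be less than now() and greater than now() - TTKAV.
    if not (now - ttkav < as_of < now):
        raise ValueError("as_of must fall inside the TTKAV window")
    view = {}
    # Toss out anything with a newer stamp, deletes included, then replay
    # the rest oldest first. On equal timestamps the delete is applied last.
    survivors = (c for c in cells if c.ts <= as_of)
    for cell in sorted(survivors, key=lambda c: (c.ts, c.value is None)):
        if cell.value is None:
            view.pop(cell.row, None)
        else:
            view[cell.row] = cell.value
    return view
```

With a put at 10, a put at 20 and a delete at 30 on the same row, a scan as of 25 returns the value written at 20, because the later delete is ignored.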

          dhruba borthakur added a comment -

          Awesome. This will be really helpful to mitigate application logic bugs where the app mistakenly removes a bunch of records from HBase, then the user/admin realises the mistake and wants to refetch those records from a previous point in time.

          Andrew Purtell added a comment -

          I assume the default TTKAV would be something like 0 (no-op)?

          ryan rawson added a comment -

          This would cause challenges when users set timestamps, and also when people use the unadorned cell-delete command, which does an (internal) timestamp set. We might have to add a new timestamp that would allow us to discern the age of a put regardless of the actual 'user visible' timestamp.

          dhruba borthakur added a comment -

          @andrew: Yes, the TTKAV should default to 0.

          @ryan: What if we explain that when a user sets timestamps, the TTKAV applies to those timestamps? If the user does not set timestamps, the system will assign a timestamp to every record and will use it to avoid deleting records within the TTKAV. Do you see any confusion here?

          ryan rawson added a comment -

          dhruba: but if a user does a delete (cell) then the system will pick the timestamp of the cell to apply it to and enter it in thusly. so you would never have a point in time where that cell was not deleted. there would be 2 keyvalues, one put ts=X and the other delete ts=X, even if the delete was executed at X+L where L might be days, years, centuries (hey with longs it would allow it).
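
The problem described here can be illustrated with a small Python model (not HBase code). The `written_at` field is a hypothetical server-side write time, along the lines of the "new timestamp" suggested earlier in the thread; `KeyValue` and `visible_as_of` are names invented for this sketch.

```python
from typing import NamedTuple


class KeyValue(NamedTuple):
    kind: str        # "put" or "delete"
    ts: int          # user-visible timestamp
    written_at: int  # hypothetical server-side write time


def visible_as_of(kvs, as_of, use_write_time):
    """Is the cell visible in a view of the data as of `as_of`?"""
    clock = (lambda kv: kv.written_at) if use_write_time else (lambda kv: kv.ts)
    in_view = [kv for kv in kvs if clock(kv) <= as_of]
    puts = [kv.ts for kv in in_view if kv.kind == "put"]
    deletes = [kv.ts for kv in in_view if kv.kind == "delete"]
    # A delete masks every put at or below its user-visible timestamp.
    return any(all(p > d for d in deletes) for p in puts)


# Put at ts=X=100; cell delete executed at X+L=150 but stamped ts=X.
kvs = [KeyValue("put", 100, 100), KeyValue("delete", 100, 150)]
```

Filtering on `ts` alone, `visible_as_of(kvs, 120, False)` is `False`: there is no point in time at which the cell appears undeleted. Filtering on the separate write time, `visible_as_of(kvs, 120, True)` is `True`, since the delete had not yet been written at 120.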

          Lars Hofhansl added a comment - edited

          Is this addressed with HBASE-4536?

          Pritam Damania added a comment -

          Here are the relevant patches for this feature in the 89fb branch:
          http://svn.apache.org/viewvc?view=revision&revision=r1395032
          http://svn.apache.org/viewvc?view=revision&revision=r1406789
          http://svn.apache.org/viewvc?view=revision&revision=r1410118

          Lars Hofhansl added a comment -

          Is this addressed with HBASE-4536 together with HBASE-4071?

          Pritam Damania added a comment -

          I don't think those two JIRAs give us that flexibility. For example, suppose TTL is 5 days and max versions is 3. If we want to support a query that gives us a view up to 7 days in the past, then we need to retain versions up to 7 + 5 days (the effective TTL for compactions and flushes is 12 days, not 5), and we need to retain a maximum of 3 versions from before 7 days ago, so that a query for data as of 7 days in the past has 3 versions to surface, if present.

          The basic idea is that if we support queries up to 'x' days in the past, the compactions and flushes should behave as if they were happening 'x' days in the past.

          Simply retaining deletes or retaining just one version is not sufficient.
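
One way to read the retention rule above is the following Python model (not the 89fb patch itself). The function name, its parameters and the choice to keep every version inside the flashback window are assumptions for illustration; times are in days.

```python
def retained_versions(timestamps, now, ttl, max_versions, flashback):
    """Timestamps of one column that a compaction would keep, newest first."""
    horizon = now - flashback   # oldest instant a flashback query may target
    newest_first = sorted(timestamps, reverse=True)
    # Inside the flashback window every version is kept.
    inside = [ts for ts in newest_first if ts > horizon]
    # Older versions are treated as if the compaction ran at `horizon`:
    # TTL is measured back from the horizon, then max versions applies.
    older = [ts for ts in newest_first if horizon - ttl < ts <= horizon]
    return inside + older[:max_versions]


# TTL = 5 days, max versions = 3, flashback = 7 days, today is day 20.
kept = retained_versions([19, 17, 14, 13, 12, 11, 10, 9, 2],
                         now=20, ttl=5, max_versions=3, flashback=7)
```

Here `kept` is `[19, 17, 14, 13, 12, 11]`: nothing older than 7 + 5 = 12 days survives, and a query as of day 13 still has three versions to surface. With `flashback=0` the same function reduces to plain TTL plus max versions.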

          Lars Hofhansl added a comment -

          HBASE-4071 is a bit misnamed. It introduces MIN_VERSIONS. So setting TTL and MIN_VERSIONS does give you exactly what you describe above (I think).

          Pritam Damania added a comment -

          I'm not sure how you would achieve it with TTL and MIN_VERSIONS, but let's take an example. Suppose the current time in milliseconds is 60 and the table has max versions set to 3.

          Suppose we want to support FlashBackQueries for up to 10 ms in the past. What you want is at most 3 versions for the time between t=0 and t=50. How does MIN_VERSIONS achieve keeping at most 3 versions in that time range? Does MIN_VERSIONS apply only to expired KVs?

          Also, an issue with TTL and MIN_VERSIONS is that you cannot support something like a TTL of 6 days with a FlashBack of up to 8 days. The FlashBack and TTL times have to be the same, which some applications might not want. Some applications might want to keep all their other parameters the same and just specify that they want to do a read back in time for 'x' days. Changing the TTL value for an application to provide this functionality would also change what a scan returns: although you are pushing the TTL back to retain enough data to do a read in the past, your queries in the current time are also affected, since they will surface all KVs which are within the TTL.

          Lars Hofhansl added a comment -

          MIN_VERSIONS affects expired KVs. It means: "You can expire KVs after this TTL, but keep at least MIN_VERSIONS versions around".

          I was going by Jonathan's initial description:

          TTKAV (Time To Keep All Versions) This setting would define how far back you would be able to perform this snapshot.

          You'd do that by setting MAX_VERSIONS to MAX_LONG, TTL to the flashback time, and MIN_VERSIONS to the number of versions you want to keep around. Now within the TTL you'd keep all versions; outside of the TTL, MIN_VERSIONS versions are kept.

          But I see now. So the snapshot scanner is special in that only through this specific scanner you can look further back than the TTL.

          This does seem a pretty esoteric feature then, though.
          Flashback only makes sense together with TTL (otherwise you could set the TTL). You have a TTL and a sort of a super TTL for which you can only use a special scanner.
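
For comparison, the TTL plus MIN_VERSIONS rule quoted above ("expire KVs after this TTL, but keep at least MIN_VERSIONS versions around") can be modelled as below. This is a Python sketch of one reading of that rule, not HBase code; the function and parameter names are assumptions, and `max_versions=None` stands in for an effectively unlimited MAX_VERSIONS.

```python
def retained_with_min_versions(timestamps, now, ttl, min_versions,
                               max_versions=None):
    """Timestamps of one column kept under TTL + MIN_VERSIONS, newest first."""
    newest_first = sorted(timestamps, reverse=True)[:max_versions]
    kept = []
    for ts in newest_first:
        unexpired = ts > now - ttl
        # An expired version survives only while fewer than
        # min_versions versions have been kept so far.
        if unexpired or len(kept) < min_versions:
            kept.append(ts)
    return kept
```

With `now=20`, `ttl=7` and `min_versions=2`, the versions `[19, 17, 14, 12, 9, 2]` reduce to `[19, 17, 14]`, while a column whose versions are all expired, `[12, 9, 2]`, keeps `[12, 9]`. In this model the window in which all versions are kept and the TTL are the same setting, which is the coupling debated in this thread.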

          Kannan Muthukkaruppan added a comment -

          Lars wrote: <<<Flashback queries only makes sense with TTL>>>. This is not true. A simple CF with VERSIONS=1 & no TTL (i.e. TTL of infinity) can also benefit from the ability to do FlashBack queries. Flash back is simply an ability to query the DB as of a previous point in time. Why should we overload that functionality with versions, TTL, etc.?

          I think it is useful to think of FlashBack as completely independent of other settings like TTL, MAXVERSIONS, MINVERSIONS, etc. The latter should be picked at schema design time based on the application requirements. For example, you may have many tables in your system with different TTL, VERSIONS requirements. Maybe you have different CFs within a table, with differing TTL & VERSION requirements.

          But on top of all those, suppose across all my tables I want to be able to query the entire DB as of a previous point in time. From a user's point of view, the only setting they need to worry about is the "time period" (back in time) up to which flash back queries are supported.

          For example, you might have one CF, with VERSIONS=1, where you are keeping hourly rollup data that you want to retain for 1 month (TTL), and another CF where you keep daily rollup data, also with VERSIONS=1, that you want to retain for 3 years. But separately, you want the ability to do flash back queries up to, say, 7 days back. This "7 days" should be a completely different setting, and there seems to be no reason to confuse it with TTL & Versions.

          Now, API wise, we need the ability to say that we are doing a flashback query i.e. "Scan @ T" instead of regular "Scan". In Oracle DB too, for instance, flash back queries have this special syntax:

          SELECT * FROM employee
          AS OF TIMESTAMP <TS>
          WHERE name = 'JOHN';

          Regarding <<< So the snapshot scanner is special in that only through this specific scanner you can look further back than the TTL.>>>: I think that is by design. Note: Scan @ T (flash back query) is different from doing a Scan with setTimeRange(0, T). A delete of a key done at T+1 is immaterial for a Scan @ T query, whereas for a Scan with setTimeRange(0, T) you will still see the effect of the delete done at T+1.
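
The distinction can be sketched with a small Python model (not HBase code). `Cell`, `scan_time_range` and `scan_as_of` are names invented for this illustration; a delete is modelled as a cell with value `None` that masks puts at or below its timestamp, and the time range is treated as half-open, `[0, t_max)`.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    row: str
    ts: int                # non-negative timestamp
    value: Optional[str]   # None marks a delete (hypothetical encoding)


def scan_time_range(cells, t_max):
    """Model of Scan + setTimeRange(0, t_max): all deletes still apply."""
    newest_delete = {}
    for c in cells:
        if c.value is None:
            newest_delete[c.row] = max(newest_delete.get(c.row, -1), c.ts)
    view = {}
    for c in sorted(cells, key=lambda c: c.ts):
        masked = c.ts <= newest_delete.get(c.row, -1)
        if c.value is not None and c.ts < t_max and not masked:
            view[c.row] = c.value
    return view


def scan_as_of(cells, t):
    """Model of Scan @ t: everything newer than t is ignored, deletes too."""
    return scan_time_range([c for c in cells if c.ts <= t], t + 1)


# Put at ts=5, delete at T+1 = 11, query time T = 10.
cells = [Cell("k", 5, "v"), Cell("k", 11, None)]
```

In this model `scan_time_range(cells, 10)` returns `{}` because the delete at 11 still masks the put, while `scan_as_of(cells, 10)` returns `{"k": "v"}`.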


          In summary, we should not confuse our users by forcing them to change their schema design (i.e. choice of VERSIONS, TTL, etc.) to support flashback queries. Flashback support should be configured using a simple extra knob that can be set at a system, table or CF level. We should NOT overload that knob with TTL and Versions.



            People

            • Assignee:
              Pritam Damania
              Reporter:
              Jonathan Gray
            • Votes:
              0
              Watchers:
              9
