Details
Type: New Feature
Status: Closed
Priority: Minor
Resolution: Fixed
Description
I would like to see an option to set a TTL on columns in HBase. This feature could be helpful for removing stale data from large datasets without having to do a full scan of the dataset and then issue deletes.
Examples
Say I am crawling pages and only refreshing them based on a set score. If some pages do not get updated for X days, the old version of the page gets removed from the dataset.
Say I am stripping links out of HTML and storing them. If a link is removed from a page, I would need to issue a delete to remove that link from the dataset. With a TTL, the link data would remove itself if not updated within X seconds. These are just examples based on crawling, as in Nutch, but I can foresee many apps using this option.
This is a feature in Bigtable, where it is handled when Bigtable does garbage collection.
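To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is only an illustration, not HBase code: the class TtlColumnStore and all of its methods are hypothetical names made up for this example, and time is passed in explicitly so the behaviour is easy to follow. A cell older than the TTL is hidden from reads right away, and a separate garbage-collection pass removes it physically later, so no scan-and-delete is needed from the client.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical illustration only; this is not HBase's implementation or API.
final class TtlColumnStore {

    // A stored value plus the time (in ms) at which it was last written.
    private static final class Cell {
        final String value;
        final long writtenAtMillis;

        Cell(String value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<String, Cell> cells = new HashMap<>();

    TtlColumnStore(long ttlSeconds) {
        this.ttlMillis = ttlSeconds * 1000L;
    }

    // Writing (or re-writing) a key refreshes its timestamp.
    void put(String key, String value, long nowMillis) {
        cells.put(key, new Cell(value, nowMillis));
    }

    // Expired cells are hidden from reads even before they are physically removed.
    String get(String key, long nowMillis) {
        Cell cell = cells.get(key);
        if (cell == null || isExpired(cell, nowMillis)) {
            return null;
        }
        return cell.value;
    }

    // Stand-in for a garbage-collection pass: physically drop expired cells
    // and report how many were removed.
    int collectGarbage(long nowMillis) {
        int removed = 0;
        Iterator<Cell> it = cells.values().iterator();
        while (it.hasNext()) {
            if (isExpired(it.next(), nowMillis)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Number of cells still physically stored, expired or not.
    int physicalSize() {
        return cells.size();
    }

    // A cell is expired once it is strictly older than the TTL.
    private boolean isExpired(Cell cell, long nowMillis) {
        return nowMillis - cell.writtenAtMillis > ttlMillis;
    }
}
```

In the link-crawling example, each link would be written with put() whenever it is seen on a page. A link that disappears from the page simply stops being refreshed and drops out of reads once the TTL passes, without the crawler issuing a delete.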
Attachments
Issue Links
- incorporates: HBASE-612 HColumnDescriptor's readFields() method is version aware but its write() method is not (Closed)
- is related to: HBASE-2893 Table metacolumns (Closed)