HBase / HBASE-47

Option to set TTL for columns in HBase

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: regionserver, util
    • Labels: None

    Description

      I would like to see an option to set a TTL on columns in HBase. This feature could be helpful for removing stale data from large datasets without having to do a full scan of the dataset and then issue deletes.

      Examples
      Say I am crawling pages and only refreshing them based on a set score. If some pages do not get updated within X days, the old version of the page gets removed from the dataset.

      Say I am stripping links out of HTML and storing them. If a link is removed from a page, I would need to issue a delete to remove that link from the dataset. With a TTL, the link data would remove itself if not updated within X seconds. These are just examples based on crawling, as with Nutch, but I can foresee many apps using this option.

      This is a feature in Bigtable, which handles it during garbage collection.
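
To make the request concrete, here is a minimal sketch of the expiry rule I have in mind. It is not HBase code: the `Cell` type and the `is_expired` and `drop_expired` helpers are hypothetical names used only for illustration, and it assumes each stored value carries a write timestamp in seconds. A value counts as stale once its age exceeds the column's TTL, and stale values are skipped without any explicit delete.

```python
import time
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one stored value (not an HBase class)."""
    row: str
    column: str
    timestamp: float  # write time, in seconds since the epoch
    value: bytes


def is_expired(cell: Cell, ttl_seconds: float, now: float) -> bool:
    """A cell is stale once its age is strictly greater than the TTL."""
    return now - cell.timestamp > ttl_seconds


def drop_expired(cells: Iterable[Cell], ttl_seconds: float,
                 now: Optional[float] = None) -> List[Cell]:
    """Return only the cells still within their TTL.

    A store could apply the same filter while rewriting its files, which is
    roughly what dropping expired data during garbage collection would mean.
    """
    if now is None:
        now = time.time()
    return [c for c in cells if not is_expired(c, ttl_seconds, now)]
```

With a 60 second TTL, a link written 100 seconds ago would be dropped while one refreshed 10 seconds ago would be kept, so a link that the crawler stops re-writing simply ages out.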

      Attachments

        1. hbase-ttl-0.2-r652919.patch
          85 kB
          Andrew Kyle Purtell
        2. hbase-ttl-0.2-r652725.patch
          84 kB
          Andrew Kyle Purtell
        3. hbase-ttl-0.2-r652401.patch
          83 kB
          Andrew Kyle Purtell

            People

              Assignee: Andrew Kyle Purtell (apurtell)
              Reporter: Billy Pearson (viper799)
