  1. Parquet
  2. PARQUET-1792

Add 'mask' command to parquet-tools/parquet-cli

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.12.0
    • Component/s: parquet-mr
    • Labels: None

    Description

      Some personal data columns need to be masked rather than pruned (PARQUET-1791). We need a tool that replaces the raw values in those columns with masked values. The masked value could be a hash, null, a redacted placeholder, etc. Unchanged columns should be moved as a whole, as the 'merge' and 'prune' commands in parquet-tools do.

       

      Implementing this feature at the file-format level is 10x faster than rewriting the table data in the query engine.
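
To make the requested behavior concrete, here is a minimal sketch of the three mask modes named above (hash, null, redact). It is not the parquet-tools implementation and does not touch Parquet files: a plain dict of column name to list of values stands in for a file, and the names `mask_columns`, `MASKS`, and the `"REDACTED"` placeholder are assumptions made for illustration. Columns that are not masked are passed through as the same list object, mirroring the idea that unchanged columns are moved as a whole.

```python
import hashlib


def _hash(value):
    # SHA-256 of the value's string form; the choice of hash is an assumption.
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def _null(value):
    # Replace every value with null.
    return None


def _redact(value):
    # Replace every value with a fixed placeholder (hypothetical choice).
    return "REDACTED"


# Hypothetical mode names, taken from the wording of the description.
MASKS = {"hash": _hash, "null": _null, "redact": _redact}


def mask_columns(table, column_masks):
    """Return a new table in which the listed columns are masked.

    table: dict mapping column name -> list of values (stand-in for a file).
    column_masks: dict mapping column name -> one of the keys of MASKS.
    """
    unknown_columns = set(column_masks) - set(table)
    if unknown_columns:
        raise KeyError(f"columns not in table: {sorted(unknown_columns)}")

    masked = {}
    for name, values in table.items():
        mode = column_masks.get(name)
        if mode is None:
            # Unchanged column: hand over the same object without touching it.
            masked[name] = values
            continue
        if mode not in MASKS:
            raise ValueError(f"unknown mask mode: {mode!r}")
        mask = MASKS[mode]
        # Existing nulls stay null so they are not turned into hashes.
        masked[name] = [None if v is None else mask(v) for v in values]
    return masked


table = {
    "id": [1, 2],
    "email": ["a@example.com", None],
    "ssn": ["123", "456"],
}
masked = mask_columns(table, {"email": "hash", "ssn": "redact"})
```

In this sketch `masked["id"]` is the very same list as `table["id"]`, while `email` holds hex digests and `ssn` holds the placeholder. A real command would work on column chunks inside the file rather than on in-memory lists.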

          People

            Assignee: Xinli Shang (shangx@uber.com)
            Reporter: Xinli Shang (shangx@uber.com)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated: