Apache Hudi / HUDI-1295

Implement: Metadata based bloom index - write path


Details

    Description

      The idea here is to maintain our bloom filters outside of the parquet files, for speedier access from the bloom index.

       

      • Design and implement the migration of bloom filters to the metadata table.

      Design:

      Schema for the payload:

      key: partitionName_fileName

      payload schema:

      isDeleted: boolean (true/false)

      bloom_type: short

      ser_bloom: byte[], the serialized bloom filter
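
      The record layout above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the actual Hudi implementation: the class BloomFilterMetadataSketch, its methods, and the in-memory map standing in for the metadata table are all hypothetical. Only the key format and the three payload fields come from the design above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: an in-memory stand-in for the metadata table partition
// that would hold one bloom filter record per data file.
class BloomFilterMetadataSketch {

    // Mirrors the payload schema from the design: isDeleted, bloom_type, ser_bloom.
    static final class BloomFilterRecord {
        final boolean isDeleted;
        final short bloomType;
        final byte[] serBloom;

        BloomFilterRecord(boolean isDeleted, short bloomType, byte[] serBloom) {
            this.isDeleted = isDeleted;
            this.bloomType = bloomType;
            this.serBloom = serBloom;
        }
    }

    // Assumption: a plain map plays the role of the metadata table.
    private final Map<String, BloomFilterRecord> table = new HashMap<>();

    // Key format from the design: partitionName_fileName.
    static String key(String partitionName, String fileName) {
        return partitionName + "_" + fileName;
    }

    // Write path: store the serialized bloom filter of a newly written file.
    void upsert(String partitionName, String fileName, short bloomType, byte[] serBloom) {
        table.put(key(partitionName, fileName),
                new BloomFilterRecord(false, bloomType, serBloom.clone()));
    }

    // Write path: record a tombstone (isDeleted = true) when a file is removed.
    void markDeleted(String partitionName, String fileName) {
        table.put(key(partitionName, fileName),
                new BloomFilterRecord(true, (short) 0, new byte[0]));
    }

    // Read side: fetch the serialized filter without opening the parquet file.
    Optional<byte[]> lookup(String partitionName, String fileName) {
        BloomFilterRecord record = table.get(key(partitionName, fileName));
        if (record == null || record.isDeleted) {
            return Optional.empty();
        }
        return Optional.of(record.serBloom.clone());
    }

    public static void main(String[] args) {
        BloomFilterMetadataSketch metadata = new BloomFilterMetadataSketch();
        metadata.upsert("2022/01/10", "file1.parquet", (short) 0, new byte[] {1, 2, 3});
        System.out.println(metadata.lookup("2022/01/10", "file1.parquet").isPresent());
        metadata.markDeleted("2022/01/10", "file1.parquet");
        System.out.println(metadata.lookup("2022/01/10", "file1.parquet").isPresent());
    }
}
```

      One point the sketch makes visible: because the key is a plain concatenation with "_", a partition or file name that itself contains "_" cannot be split back unambiguously, so a reader of the key would need to carry the partition and file name separately or use a different encoding.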

       

       

       


          People

            manojg Manoj Govindassamy
            vinoth Vinoth Chandar
            Vinoth Chandar
            Votes: 0
            Watchers: 5


              Agile

                Completed Sprints:
                Hudi-Sprint-Jan-3 ended 11/Jan/22
                Hudi-Sprint-Jan-10 ended 19/Jan/22
                Hudi-Sprint-Jan-18 ended 25/Jan/22
                Hudi-Sprint-Jan-24 ended 01/Feb/22
                Hudi-Sprint-Jan-31 ended 08/Feb/22