HBase / HBASE-26349

Improve recent change to IntegrationTestLoadCommonCrawl

Details

    • Reviewed

    Description

      Using BufferedMutator was not a good idea. The test assigns client-side timestamps, and the store loop is fast enough that, on rare occasions, two temporally adjacent URLs in the set of WARCs are identical and the timestamp does not advance between them. This later leads to a rare false-positive CORRUPT finding.
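
      The collision described above can be illustrated with a small, self-contained sketch. It does not use the HBase client and is not the code from the patch: ToyStore is a hypothetical stand-in for a store that keeps one value per (row, timestamp) coordinate, and MonotonicClock is a hypothetical guard that forces the timestamp to advance. It only shows why two writes for the same URL in the same millisecond collapse into one cell, and one possible way to keep them distinct.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public final class TimestampCollisionSketch {

    /** Hypothetical toy store: one value per (row, timestamp) coordinate. */
    static final class ToyStore {
        private final Map<String, String> cells = new HashMap<>();

        void put(String row, long timestamp, String value) {
            // Same row and same timestamp: the later value replaces the earlier one.
            cells.put(row + "@" + timestamp, value);
        }

        int size() {
            return cells.size();
        }
    }

    /** Hypothetical guard (an assumption, not the patch): never hands out the same timestamp twice. */
    static final class MonotonicClock {
        private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

        long next(long wallClockMillis) {
            // Use the wall clock when it has advanced, otherwise previous + 1.
            return last.updateAndGet(prev -> Math.max(wallClockMillis, prev + 1));
        }
    }

    public static void main(String[] args) {
        long sameMillis = 1_000L; // both records are stored within one millisecond
        String row = "example.com/page"; // illustrative row key for a repeated URL

        // Client-assigned timestamps that do not advance: the first record is lost.
        ToyStore naive = new ToyStore();
        naive.put(row, sameMillis, "first crawl record");
        naive.put(row, sameMillis, "second crawl record");
        System.out.println("naive cells: " + naive.size()); // prints "naive cells: 1"

        // Timestamps forced to advance: both records survive.
        MonotonicClock clock = new MonotonicClock();
        ToyStore guarded = new ToyStore();
        guarded.put(row, clock.next(sameMillis), "first crawl record");
        guarded.put(row, clock.next(sameMillis), "second crawl record");
        System.out.println("guarded cells: " + guarded.size()); // prints "guarded cells: 2"
    }
}
```

      A verifier that expects one cell per loaded record would flag the naive case as missing data even though nothing was corrupted, which matches the false positive described here.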

      While making these changes, also support direct S3N paths as input paths on the command line.

            People

              apurtell Andrew Kyle Purtell