  1. Hadoop HDFS
  2. HDFS-7240 Scaling HDFS
  3. HDFS-12213

Ozone: Corona: Support for online mode


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ozone

    Description

      This JIRA adds support for an online mode in Corona.
      In online mode, Common Crawl data from AWS is used to populate Ozone. The default source is CC-MAIN-2017-17/warc.paths.gz (it contains the paths to the actual data segments); the user can override it with -source.
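
A minimal sketch of how the source selection and the paths file could be handled, assuming the paths file is a gzip-compressed text file with one data-segment path per line. The function names `resolve_source` and `read_segment_paths` are hypothetical and are not taken from the Corona code.

```python
import gzip

# Default source named in this issue; a user-supplied -source value replaces it.
DEFAULT_SOURCE = "CC-MAIN-2017-17/warc.paths.gz"


def resolve_source(source=None):
    """Return the user-supplied source if given, otherwise the default."""
    return source or DEFAULT_SOURCE


def read_segment_paths(gz_bytes):
    """Decompress a paths file and return its non-empty lines.

    Assumption: each line holds the path of one data segment.
    """
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Downloading the paths file and the segments themselves is left out here, since the base URL to prepend is not stated in this issue.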
      The following values are derived from the URL of the Common Crawl data:

      • Domain will be used as Volume
      • URL will be used as Bucket
      • FileName will be used as Key
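
The mapping above can be sketched as follows. This is an illustration only, not Corona's implementation: the function name `derive_ozone_names` is hypothetical, "FileName" is assumed to mean the last path component of the URL, and the sketch ignores any naming restrictions Ozone may place on volume and bucket names.

```python
import posixpath
from urllib.parse import urlsplit


def derive_ozone_names(url):
    """Split a URL into (volume, bucket, key) following the mapping above."""
    parts = urlsplit(url)
    volume = parts.hostname or ""         # Domain   -> Volume
    bucket = url                          # URL      -> Bucket
    key = posixpath.basename(parts.path)  # FileName -> Key (assumed: last path component)
    return volume, bucket, key


if __name__ == "__main__":
    print(derive_ozone_names("http://example.com/docs/index.html"))
```

A URL whose path ends in a slash yields an empty key in this sketch; a real tool would need a fallback for that case.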

          People

            Assignee: nanda Nandakumar
            Reporter: nanda Nandakumar
            Votes: 0
            Watchers: 4
