Mahout / MAHOUT-1775

FileNotFoundException caused by aborting the process of downloading Wikipedia dataset


Details

    • Type: Bug
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.1
    • Component/s: classic
    • Labels: None

    Description

      When examples/bin/classify-wikipedia.sh is run for the first time, it creates a wikixml folder and starts fetching data via curl. If the download is aborted, later runs of the script skip extraction of the .bz2 file, because extraction is guarded by a check that wikixml does not exist. The script then goes straight to running Mahout, which fails with a FileNotFoundException.
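
      One possible way to avoid this, sketched below, is to key each step on the files themselves rather than on whether the wikixml folder exists, and to download under a temporary name so that an aborted transfer never looks complete. This is only an illustration, not the fix that was committed: the function names next_step and prepare_dataset, their arguments, and the ".part" suffix are hypothetical and are not taken from classify-wikipedia.sh.

```shell
#!/bin/sh
# Hypothetical helpers; names and layout are assumptions for illustration.

# next_step DIR ARCHIVE EXTRACTED
# Prints "ready", "extract", or "download" depending on which
# non-empty files are present in DIR.
next_step() {
  dir=$1; archive=$2; extracted=$3
  if [ -s "$dir/$extracted" ]; then
    echo ready
  elif [ -s "$dir/$archive" ]; then
    echo extract
  else
    echo download
  fi
}

# prepare_dataset DIR URL ARCHIVE EXTRACTED
# Downloads and extracts only what is missing. Both steps write to a
# ".part" file first and rename on success, so an interrupted run
# leaves nothing that a later run would mistake for a finished file.
prepare_dataset() {
  dir=$1; url=$2; archive=$3; extracted=$4
  mkdir -p "$dir" || return 1

  if [ "$(next_step "$dir" "$archive" "$extracted")" = download ]; then
    curl -fL "$url" -o "$dir/$archive.part" &&
      mv "$dir/$archive.part" "$dir/$archive" || return 1
  fi

  if [ "$(next_step "$dir" "$archive" "$extracted")" = extract ]; then
    bzip2 -dc "$dir/$archive" > "$dir/$extracted.part" &&
      mv "$dir/$extracted.part" "$dir/$extracted" || return 1
  fi
}
```

      A caller could then run something like prepare_dataset wikixml "$WIKI_URL" enwiki.xml.bz2 enwiki.xml before starting Mahout, where WIKI_URL and the two file names are placeholders. With this shape, an empty wikixml folder left behind by an aborted run simply triggers the download again.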

          People

            Assignee: Suneel Marthi (smarthi)
            Reporter: Bowei Zhang (psyclaudeZ)
            Votes: 0
            Watchers: 4
