Author: Taras Bobrovytsky <firstname.lastname@example.org>
Date: Thu Mar 30 13:08:21 2017 -0700
IMPALA-5181: Extract PYPI metadata from a webpage
There were some build failures due to a failure to download a JSON file
containing package metadata from PYPI. We need to switch to downloading
this from a PYPI mirror. To do that, we need to be able to extract the
data from a web page, because PYPI mirrors do not always have a JSON
interface.
We implement a regex-based HTML parser in this patch. Also, we increase
the number of download attempts and randomly vary the amount of time
between each attempt.
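For illustration only, the two ideas above (pulling package links out of
an index page with a regex, and retrying with a randomized delay) might
be sketched as follows. This is not the code from this patch: the names
LINK_RE, parse_index_page and retry, the assumed link markup, and the
default attempt count and delays are all hypothetical.

```python
import random
import re
import time

# Assumption: each package file appears on the index page as
#   <a href="URL#ALGO=HEXDIGEST">FILENAME</a>
# where the "#ALGO=HEXDIGEST" fragment is optional.
LINK_RE = re.compile(
    r'<a\s+[^>]*href="(?P<url>[^"#]+)'
    r'(?:#(?P<algo>\w+)=(?P<digest>[0-9a-f]+))?"[^>]*>'
    r'(?P<name>[^<]+)</a>')


def parse_index_page(html):
    """Return (filename, url, algo, digest) for every link found in html."""
    return [(m.group('name').strip(), m.group('url'),
             m.group('algo'), m.group('digest'))
            for m in LINK_RE.finditer(html)]


def retry(func, attempts=10, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds, up to `attempts` times.

    The wait between attempts has a random component that grows with the
    attempt number, so parallel builds do not all retry at the same time.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay + random.random() * base_delay * attempt)
```

A caller would wrap the download in retry() and pass the fetched page
text to parse_index_page(). The sleep parameter exists only so the delay
can be stubbed out in tests.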
- Tested locally against PYPI and a PYPI mirror.
- Ran a private build that passed (which used a PYPI mirror).
Reviewed-on: http:
Reviewed-by: Tim Armstrong <email@example.com>
Tested-by: Impala Public Jenkins