Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Impala 2.6.0
Description
The test_scratch_disk tests are currently skipped when run against S3 in Jenkins. With the fix for IMPALA-3614 (review https://gerrit.cloudera.org/#/c/3265/ ), the tests now run, but they fail.
They fail because the query that is expected to spill (and fail) ends up succeeding.
It seems that on S3, the expected-to-spill query needs less than 200M, the mem limit we set to make it spill.
Should such a discrepancy exist?
I collected profiles of both the "expected to spill" and "expected not to spill" queries used in the test, both for HDFS and S3. I'll attach them here.
It's possible to run the tests locally against S3. You need an S3 bucket and keys. What worked for me was to set the environment variables accordingly, re-source impala-config.sh, and proceed:
$ cat ~/.awsrc-REDACTED
export S3_BUCKET="your bucket name"
export AWS_ACCESS_KEY_ID="your aws access key"
export AWS_SECRET_ACCESS_KEY="your aws secret key"
export TARGET_FILESYSTEM="s3"
export FILESYSTEM_PREFIX="s3a://${S3_BUCKET}"
$ source ~/.awsrc-REDACTED
$ source bin/impala-config.sh
$ ./buildall.sh
I think there is a bucket already loaded with data that you can use without having to reformat. If you get stuck on this part and I can't help, sailesh might be able to.
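Since the local S3 setup depends on five environment variables being exported before impala-config.sh is re-sourced, a quick sanity check can catch a missing one early. The following is only a sketch: check_s3_env is a hypothetical helper name, not something that exists in the Impala repository, and it only checks that the variables listed above are non-empty, not that the bucket or keys are valid.

```shell
# Hypothetical helper (not part of Impala): report which of the S3 test
# variables from the setup above are unset or empty.
# The body runs in a subshell, so its temporary variables do not leak.
check_s3_env() (
  missing=0
  for name in S3_BUCKET AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
              TARGET_FILESYSTEM FILESYSTEM_PREFIX; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  # Exit status 0 only if every variable was set.
  [ "${missing}" -eq 0 ]
)
```

One way to use it would be to run check_s3_env right after sourcing the variables file and only continue with bin/impala-config.sh and ./buildall.sh if it succeeds.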