Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
When an HTTP package path is specified for a YARN job, the job fails with the following error:
14/03/05 16:28:40 WARN security.UserGroupInformation: PriviledgedActionException as:samza (auth:SIMPLE) cause:java.io.IOException: No FileSystem for scheme: http
14/03/05 16:28:40 INFO localizer.ResourceLocalizationService: DEBUG: FAILED
, No FileSystem for scheme: http
14/03/05 16:28:40 INFO localizer.LocalizedResource: Resource http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz transitioned from DOWNLOADING to FAILED
14/03/05 16:28:40 INFO container.Container: Container container_1394035672475_0003_02_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
14/03/05 16:28:40 INFO localizer.LocalResourcesTrackerImpl: Container container_1394035672475_0003_02_000001 sent RELEASE event on a resource request
not present in cache.
The job configuration sets:
yarn.package.path=http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz
It looks like some work has already been done to support this feature by configuring "fs.http.impl". I also noticed that this configuration was updated in SAMZA-63.
hConfig.set("fs.http.impl", classOf[HttpFileSystem].getName)
However, my understanding is that the job package itself contains the HttpFileSystem class needed to load HTTP packages, and YARN does not support this configuration out of the box, so I'm at a loss as to how to load a remote package over HTTP.
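To illustrate what I suspect is happening, here is a minimal sketch of a scheme lookup keyed on "fs.<scheme>.impl". It is a simplified model, not Hadoop's or YARN's actual code: `SchemeLookupSketch`, `resolve`, `clientConf`, and `nodeManagerConf` are hypothetical names, the string "HttpFileSystem" stands in for `classOf[HttpFileSystem].getName`, and the empty NodeManager configuration is an assumption on my part.

```scala
import java.net.URI

// Simplified model of resolving a URI scheme to a FileSystem implementation
// through "fs.<scheme>.impl" keys. Illustrative only; not Hadoop's real code.
object SchemeLookupSketch {
  // Returns the configured implementation name, or an error message shaped
  // like the one in the log above when no mapping exists for the scheme.
  def resolve(conf: Map[String, String], uri: URI): Either[String, String] = {
    val scheme = uri.getScheme
    conf.get(s"fs.$scheme.impl").toRight(s"No FileSystem for scheme: $scheme")
  }

  val packageUri =
    new URI("http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz")

  // Configuration as seen by the job client, which sets fs.http.impl itself.
  // "HttpFileSystem" is a placeholder for classOf[HttpFileSystem].getName.
  val clientConf: Map[String, String] = Map("fs.http.impl" -> "HttpFileSystem")

  // Assumption: the NodeManager that localizes the package has no such mapping.
  val nodeManagerConf: Map[String, String] = Map.empty

  def main(args: Array[String]): Unit = {
    println(resolve(clientConf, packageUri))      // mapping present
    println(resolve(nodeManagerConf, packageUri)) // mapping absent
  }
}
```

If this model is roughly right, setting the property on the client side alone would not help, because the lookup that fails happens in the process that downloads the package.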
Issue Links
- duplicates: SAMZA-208 Write multi-node YARN tutorial for hello-samza (Resolved)