Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0-incubating
    • Fix Version/s: 0.10.0
    • Component/s: yarn
    • Labels:
      None

      Description

      Currently when launching an application, two new jars are always created locally, one for AM (appMaster.jar) and one for Container (container.jar) and copied to HDFS before submitting the application. The jar files could potentially be big and if it doesn't changed, it should require copying to HDFS again.

      The general approach is better jar files management and to cache and reuse jar files created through
      class dependency tracing. The changes are further broken down as follows:

      1. Refactor jars generation
        • One jar containing the TwillLauncher (launcher.jar), created through dependency tracing.
          • This jar is the same for all applications.
        • One jar containing all twill classes (twill.jar), created through dependency tracing.
          • This jar is the same for all applications.
        • One jar containing the application class, created through dependency tracing.
          • This jar is generated based on the application being launched. It is reusable when launching the same app multiple times.
        • One jar containing user resources setup through TwillPreparer.
          • This jar is not reused between apps.
        • One jar containing runtime config needed by Twill
          • logback.xml, jvm opts, environment, classpaths, ... etc
      2. Let YARN to expand jars when localizing to containers instead of expanding it programatically
        • This save time in jar expansion when multiple containers are running on the same host
      3. Introduce a new configuration "twill.location.cache.dir" to enable jar caching and reuse
        • Currently only the launcher.jar, twill.jar and application jar will be cached and reuse when possible
        • Cache cleanup logic is also in place to remove files in cache directory that is no longer used by application
      4. The ApplicationBundler is improved to allow more flexible usage

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/twill/pull/21

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/twill/pull/21
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on the issue:

          https://github.com/apache/twill/pull/21

          Also rebased on `master` branch to resolve conflicts in `YarnTwillController`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on the issue: https://github.com/apache/twill/pull/21 Also rebased on `master` branch to resolve conflicts in `YarnTwillController`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/21#discussion_r95052541

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -277,9 +281,16 @@ public TwillPreparer prepare(TwillApplication application) {
          Preconditions.checkState(serviceDelegate.isRunning(), "Service not start. Please call start() first.");
          final TwillSpecification twillSpec = application.configure();
          final String appName = twillSpec.getName();
          + RunId runId = RunIds.generate();
          + Location appLocation = locationFactory.create(String.format("/%s/%s", twillSpec.getName(), runId.getId()));
          + LocationCache locationCache = this.locationCache;
          + if (locationCache == null) {
          + locationCache = new NoCachingLocationCache(appLocation);
          — End diff –

          fixed

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/21#discussion_r95052541 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -277,9 +281,16 @@ public TwillPreparer prepare(TwillApplication application) { Preconditions.checkState(serviceDelegate.isRunning(), "Service not start. Please call start() first."); final TwillSpecification twillSpec = application.configure(); final String appName = twillSpec.getName(); + RunId runId = RunIds.generate(); + Location appLocation = locationFactory.create(String.format("/%s/%s", twillSpec.getName(), runId.getId())); + LocationCache locationCache = this.locationCache; + if (locationCache == null) { + locationCache = new NoCachingLocationCache(appLocation); — End diff – fixed
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/21#discussion_r95052538

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -97,10 +76,10 @@ public ApplicationBundler(Iterable<String> excludePackages) {
          }

          /**

          • * Constructs a ApplicationBundler.
            + * Constructs an ApplicationBundler.
            *
          • @param excludePackages Class packages to exclude
          • * @param includePackages Class packages that should be included. Anything in this list will override the
            + * @param includePackages Class packages that should be includwwed. Anything in this list will override the
              • End diff –

          fixed

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/21#discussion_r95052538 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -97,10 +76,10 @@ public ApplicationBundler(Iterable<String> excludePackages) { } /** * Constructs a ApplicationBundler. + * Constructs an ApplicationBundler. * @param excludePackages Class packages to exclude * @param includePackages Class packages that should be included. Anything in this list will override the + * @param includePackages Class packages that should be includwwed. Anything in this list will override the End diff – fixed
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/twill/pull/21#discussion_r95052416

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/LocationCacheCleaner.java —
          @@ -0,0 +1,206 @@
          +/*
          + * Licensed to the Apache Software Foundation (ASF) under one
          + * or more contributor license agreements. See the NOTICE file
          + * distributed with this work for additional information
          + * regarding copyright ownership. The ASF licenses this file
          + * to you under the Apache License, Version 2.0 (the
          + * "License"); you may not use this file except in compliance
          + * with the License. You may obtain a copy of the License at
          + *
          + * http://www.apache.org/licenses/LICENSE-2.0
          + *
          + * Unless required by applicable law or agreed to in writing, software
          + * distributed under the License is distributed on an "AS IS" BASIS,
          + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          + * See the License for the specific language governing permissions and
          + * limitations under the License.
          + */
          +package org.apache.twill.yarn;
          +
          +import com.google.common.annotations.VisibleForTesting;
          +import com.google.common.base.Predicate;
          +import com.google.common.util.concurrent.AbstractIdleService;
          +import com.google.common.util.concurrent.Futures;
          +import org.apache.hadoop.conf.Configuration;
          +import org.apache.twill.api.Configs;
          +import org.apache.twill.common.Threads;
          +import org.apache.twill.filesystem.Location;
          +import org.apache.twill.internal.io.LocationCache;
          +import org.slf4j.Logger;
          +import org.slf4j.LoggerFactory;
          +
          +import java.io.IOException;
          +import java.util.HashSet;
          +import java.util.Iterator;
          +import java.util.Objects;
          +import java.util.Set;
          +import java.util.concurrent.Executors;
          +import java.util.concurrent.ScheduledExecutorService;
          +import java.util.concurrent.TimeUnit;
          +
          +/**
          + * Responsible for cleanup of

          {@link LocationCache}

          .
          + */
          +final class LocationCacheCleaner extends AbstractIdleService {
          +
          + private static final Logger LOG = LoggerFactory.getLogger(LocationCacheCleaner.class);
          +
          + private final Location cacheBaseLocation;
          + private final String sessionId;
          + private final long expiry;
          + private final long antiqueExpiry;
          + private final Predicate<Location> cleanupPredicate;
          + private final Set<PendingCleanup> pendingCleanups;
          + private ScheduledExecutorService scheduler;
          +
          + LocationCacheCleaner(Configuration config, Location cacheBaseLocation,
          + String sessionId, Predicate<Location> cleanupPredicate)

          { + this.cacheBaseLocation = cacheBaseLocation; + this.sessionId = sessionId; + this.expiry = config.getLong(Configs.Keys.LOCATION_CACHE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_EXPIRY_MS); + this.antiqueExpiry = config.getLong(Configs.Keys.LOCATION_CACHE_ANTIQUE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_ANTIQUE_EXPIRY_MS); + this.cleanupPredicate = cleanupPredicate; + this.pendingCleanups = new HashSet<>(); + }

          +
          + @Override
          + protected void startUp() throws Exception {
          + scheduler = Executors.newSingleThreadScheduledExecutor(Threads.createDaemonThreadFactory("location-cache-cleanup"));
          + scheduler.execute(new Runnable() {
          + @Override
          + public void run() {
          + long currentTime = System.currentTimeMillis();
          + cleanup(currentTime);
          +
          + // By default, run the cleanup at half of the expiry
          + long scheduleDelay = expiry / 2;
          + for (PendingCleanup pendingCleanup : pendingCleanups) {
          + // If there is any pending cleanup that needs to be cleanup early, schedule the run earlier.
          + if (pendingCleanup.getExpireTime() - currentTime < scheduleDelay)

          { + scheduleDelay = pendingCleanup.getExpireTime() - currentTime; + }

          + }
          + scheduler.schedule(this, scheduleDelay, TimeUnit.MILLISECONDS);
          + }
          + });
          + }
          +
          + @Override
          + protected void shutDown() throws Exception

          { + scheduler.shutdownNow(); + }

          +
          + @VisibleForTesting
          + void forceCleanup(final long currentTime) {
          + Futures.getUnchecked(scheduler.submit(new Runnable() {
          + @Override
          + public void run()

          { + cleanup(currentTime); + }

          + }));
          + }
          +
          + /**
          + * Performs cleanup based on the given time.
          + */
          + private void cleanup(long currentTime) {
          + // First go through the pending cleanup list and remove those that can be removed
          + Iterator<PendingCleanup> iterator = pendingCleanups.iterator();
          + while (iterator.hasNext()) {
          + PendingCleanup pendingCleanup = iterator.next();
          +
          + // If rejected by the predicate, it means it is being used, hence remove it from the pending cleanup list.
          + if (!cleanupPredicate.apply(pendingCleanup.getLocation()))

          { + iterator.remove(); + }

          else {
          + try {
          + // If time is up for the pending entry, the location will be deleted,
          + // hence can be removed from the pending cleanup list.
          + // Otherwise retain it for the next cycle.
          + if (pendingCleanup.deleteIfExpired(currentTime))

          { + iterator.remove(); + }

          + } catch (IOException e) {
          + // Log and retain the entry so that another attempt on deletion will be made in next cleanup cycle
          + LOG.warn("Failed to delete {}", pendingCleanup.getLocation(), e);
          + }
          + }
          + }
          +
          + // Then collects the next set of locations to be removed
          + try {
          + for (Location cacheDir : cacheBaseLocation.list()) {
          + try {
          + for (Location location : cacheDir.list()) {
          + if (cleanupPredicate.apply(location)) {
          + long expireTime = currentTime;
          + if (cacheDir.getName().equals(sessionId))

          { + expireTime += expiry; + }

          else

          { + // If the cache entry is from different YarnTwillRunnerService session, use the anti expiry time. + expireTime += antiqueExpiry; + }

          + // If the location is already pending for cleanup, this won't update the expire time as
          + // the comparison of PendingCleanup is only by location.
          + pendingCleanups.add(new PendingCleanup(location, expireTime));
          + }
          + }
          + } catch (IOException e) {
          + LOG.warn("Failed to list cache content from {}", cacheDir, e);
          + }
          + }
          + } catch (IOException e) {
          + LOG.warn("Failed to list cache directories from {}", cacheBaseLocation, e);
          + }
          + }
          +
          + /**
          + * Class for holding information about cache location that is pending to be removed.
          + * The equality and hash code is only based on the location.
          + */
          + private static final class PendingCleanup {
          + private final Location location;
          + private final long expireTime;
          +
          + PendingCleanup(Location location, long expireTime)

          { + this.location = location; + this.expireTime = expireTime; + }

          +
          + Location getLocation()

          { + return location; + }

          +
          + long getExpireTime()

          { + return expireTime; + }

          +
          + boolean deleteIfExpired(long currentTime) throws IOException {
          + if (currentTime < expireTime)

          { + return false; + }

          + location.delete();
          — End diff –

          It is intentional to ignore the result of `delete()` call. When this method returns true, it means it is expired and delete was attempt so that it can be removed from the `pendingCleanups` set. Logically we don't care if the location actually be deleted successfully (if the location is already gone, we don't care. If fail to delete, it will get pick up again in the next cycle).

          I'll add a log and also a comment to make this clear.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/21#discussion_r95052416 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/LocationCacheCleaner.java — @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.twill.yarn; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Predicate; +import com.google.common.util.concurrent.AbstractIdleService; +import com.google.common.util.concurrent.Futures; +import org.apache.hadoop.conf.Configuration; +import org.apache.twill.api.Configs; +import org.apache.twill.common.Threads; +import org.apache.twill.filesystem.Location; +import org.apache.twill.internal.io.LocationCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.HashSet; +import java.util.Iterator; +import java.util.Objects; +import java.util.Set; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * Responsible for cleanup of {@link LocationCache} . + */ +final class LocationCacheCleaner extends AbstractIdleService { + + private static final Logger LOG = LoggerFactory.getLogger(LocationCacheCleaner.class); + + private final Location cacheBaseLocation; + private final String sessionId; + private final long expiry; + private final long antiqueExpiry; + private final Predicate<Location> cleanupPredicate; + private final Set<PendingCleanup> pendingCleanups; + private ScheduledExecutorService scheduler; + + LocationCacheCleaner(Configuration config, Location cacheBaseLocation, + String sessionId, Predicate<Location> cleanupPredicate) { + this.cacheBaseLocation = cacheBaseLocation; + this.sessionId = sessionId; + this.expiry = config.getLong(Configs.Keys.LOCATION_CACHE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_EXPIRY_MS); + this.antiqueExpiry = config.getLong(Configs.Keys.LOCATION_CACHE_ANTIQUE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_ANTIQUE_EXPIRY_MS); + this.cleanupPredicate = cleanupPredicate; + this.pendingCleanups = new HashSet<>(); + } + + @Override + protected void startUp() throws Exception { + scheduler = Executors.newSingleThreadScheduledExecutor(Threads.createDaemonThreadFactory("location-cache-cleanup")); + scheduler.execute(new Runnable() { + @Override + public void run() { + long currentTime = System.currentTimeMillis(); + cleanup(currentTime); + + // By default, run the cleanup at half of the expiry + long scheduleDelay = expiry / 2; + for (PendingCleanup pendingCleanup : pendingCleanups) { + // If there is any pending cleanup that needs to be cleanup early, schedule the run earlier. + if (pendingCleanup.getExpireTime() - currentTime < scheduleDelay) { + scheduleDelay = pendingCleanup.getExpireTime() - currentTime; + } + } + scheduler.schedule(this, scheduleDelay, TimeUnit.MILLISECONDS); + } + }); + } + + @Override + protected void shutDown() throws Exception { + scheduler.shutdownNow(); + } + + @VisibleForTesting + void forceCleanup(final long currentTime) { + Futures.getUnchecked(scheduler.submit(new Runnable() { + @Override + public void run() { + cleanup(currentTime); + } + })); + } + + /** + * Performs cleanup based on the given time. + */ + private void cleanup(long currentTime) { + // First go through the pending cleanup list and remove those that can be removed + Iterator<PendingCleanup> iterator = pendingCleanups.iterator(); + while (iterator.hasNext()) { + PendingCleanup pendingCleanup = iterator.next(); + + // If rejected by the predicate, it means it is being used, hence remove it from the pending cleanup list. + if (!cleanupPredicate.apply(pendingCleanup.getLocation())) { + iterator.remove(); + } else { + try { + // If time is up for the pending entry, the location will be deleted, + // hence can be removed from the pending cleanup list. + // Otherwise retain it for the next cycle. + if (pendingCleanup.deleteIfExpired(currentTime)) { + iterator.remove(); + } + } catch (IOException e) { + // Log and retain the entry so that another attempt on deletion will be made in next cleanup cycle + LOG.warn("Failed to delete {}", pendingCleanup.getLocation(), e); + } + } + } + + // Then collects the next set of locations to be removed + try { + for (Location cacheDir : cacheBaseLocation.list()) { + try { + for (Location location : cacheDir.list()) { + if (cleanupPredicate.apply(location)) { + long expireTime = currentTime; + if (cacheDir.getName().equals(sessionId)) { + expireTime += expiry; + } else { + // If the cache entry is from different YarnTwillRunnerService session, use the anti expiry time. + expireTime += antiqueExpiry; + } + // If the location is already pending for cleanup, this won't update the expire time as + // the comparison of PendingCleanup is only by location. + pendingCleanups.add(new PendingCleanup(location, expireTime)); + } + } + } catch (IOException e) { + LOG.warn("Failed to list cache content from {}", cacheDir, e); + } + } + } catch (IOException e) { + LOG.warn("Failed to list cache directories from {}", cacheBaseLocation, e); + } + } + + /** + * Class for holding information about cache location that is pending to be removed. + * The equality and hash code is only based on the location. + */ + private static final class PendingCleanup { + private final Location location; + private final long expireTime; + + PendingCleanup(Location location, long expireTime) { + this.location = location; + this.expireTime = expireTime; + } + + Location getLocation() { + return location; + } + + long getExpireTime() { + return expireTime; + } + + boolean deleteIfExpired(long currentTime) throws IOException { + if (currentTime < expireTime) { + return false; + } + location.delete(); — End diff – It is intentional to ignore the result of `delete()` call. When this method returns true, it means it is expired and delete was attempt so that it can be removed from the `pendingCleanups` set. Logically we don't care if the location actually be deleted successfully (if the location is already gone, we don't care. If fail to delete, it will get pick up again in the next cycle). I'll add a log and also a comment to make this clear.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user albertshau commented on a diff in the pull request:

          https://github.com/apache/twill/pull/21#discussion_r95048004

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -97,10 +76,10 @@ public ApplicationBundler(Iterable<String> excludePackages) {
          }

          /**

          • * Constructs a ApplicationBundler.
            + * Constructs an ApplicationBundler.
            *
          • @param excludePackages Class packages to exclude
          • * @param includePackages Class packages that should be included. Anything in this list will override the
            + * @param includePackages Class packages that should be includwwed. Anything in this list will override the
              • End diff –

          typo

          Show
          githubbot ASF GitHub Bot added a comment - Github user albertshau commented on a diff in the pull request: https://github.com/apache/twill/pull/21#discussion_r95048004 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -97,10 +76,10 @@ public ApplicationBundler(Iterable<String> excludePackages) { } /** * Constructs a ApplicationBundler. + * Constructs an ApplicationBundler. * @param excludePackages Class packages to exclude * @param includePackages Class packages that should be included. Anything in this list will override the + * @param includePackages Class packages that should be includwwed. Anything in this list will override the End diff – typo
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user albertshau commented on a diff in the pull request:

          https://github.com/apache/twill/pull/21#discussion_r95050605

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -277,9 +281,16 @@ public TwillPreparer prepare(TwillApplication application) {
          Preconditions.checkState(serviceDelegate.isRunning(), "Service not start. Please call start() first.");
          final TwillSpecification twillSpec = application.configure();
          final String appName = twillSpec.getName();
          + RunId runId = RunIds.generate();
          + Location appLocation = locationFactory.create(String.format("/%s/%s", twillSpec.getName(), runId.getId()));
          + LocationCache locationCache = this.locationCache;
          + if (locationCache == null) {
          + locationCache = new NoCachingLocationCache(appLocation);
          — End diff –

          nit: extra space

          Show
          githubbot ASF GitHub Bot added a comment - Github user albertshau commented on a diff in the pull request: https://github.com/apache/twill/pull/21#discussion_r95050605 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -277,9 +281,16 @@ public TwillPreparer prepare(TwillApplication application) { Preconditions.checkState(serviceDelegate.isRunning(), "Service not start. Please call start() first."); final TwillSpecification twillSpec = application.configure(); final String appName = twillSpec.getName(); + RunId runId = RunIds.generate(); + Location appLocation = locationFactory.create(String.format("/%s/%s", twillSpec.getName(), runId.getId())); + LocationCache locationCache = this.locationCache; + if (locationCache == null) { + locationCache = new NoCachingLocationCache(appLocation); — End diff – nit: extra space
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user albertshau commented on a diff in the pull request:

          https://github.com/apache/twill/pull/21#discussion_r95050388

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/LocationCacheCleaner.java —
          @@ -0,0 +1,206 @@
          +/*
          + * Licensed to the Apache Software Foundation (ASF) under one
          + * or more contributor license agreements. See the NOTICE file
          + * distributed with this work for additional information
          + * regarding copyright ownership. The ASF licenses this file
          + * to you under the Apache License, Version 2.0 (the
          + * "License"); you may not use this file except in compliance
          + * with the License. You may obtain a copy of the License at
          + *
          + * http://www.apache.org/licenses/LICENSE-2.0
          + *
          + * Unless required by applicable law or agreed to in writing, software
          + * distributed under the License is distributed on an "AS IS" BASIS,
          + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          + * See the License for the specific language governing permissions and
          + * limitations under the License.
          + */
          +package org.apache.twill.yarn;
          +
          +import com.google.common.annotations.VisibleForTesting;
          +import com.google.common.base.Predicate;
          +import com.google.common.util.concurrent.AbstractIdleService;
          +import com.google.common.util.concurrent.Futures;
          +import org.apache.hadoop.conf.Configuration;
          +import org.apache.twill.api.Configs;
          +import org.apache.twill.common.Threads;
          +import org.apache.twill.filesystem.Location;
          +import org.apache.twill.internal.io.LocationCache;
          +import org.slf4j.Logger;
          +import org.slf4j.LoggerFactory;
          +
          +import java.io.IOException;
          +import java.util.HashSet;
          +import java.util.Iterator;
          +import java.util.Objects;
          +import java.util.Set;
          +import java.util.concurrent.Executors;
          +import java.util.concurrent.ScheduledExecutorService;
          +import java.util.concurrent.TimeUnit;
          +
          +/**
          + * Responsible for cleanup of

          {@link LocationCache}

          .
          + */
          +final class LocationCacheCleaner extends AbstractIdleService {
          +
          + private static final Logger LOG = LoggerFactory.getLogger(LocationCacheCleaner.class);
          +
          + private final Location cacheBaseLocation;
          + private final String sessionId;
          + private final long expiry;
          + private final long antiqueExpiry;
          + private final Predicate<Location> cleanupPredicate;
          + private final Set<PendingCleanup> pendingCleanups;
          + private ScheduledExecutorService scheduler;
          +
          + LocationCacheCleaner(Configuration config, Location cacheBaseLocation,
          + String sessionId, Predicate<Location> cleanupPredicate)

          { + this.cacheBaseLocation = cacheBaseLocation; + this.sessionId = sessionId; + this.expiry = config.getLong(Configs.Keys.LOCATION_CACHE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_EXPIRY_MS); + this.antiqueExpiry = config.getLong(Configs.Keys.LOCATION_CACHE_ANTIQUE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_ANTIQUE_EXPIRY_MS); + this.cleanupPredicate = cleanupPredicate; + this.pendingCleanups = new HashSet<>(); + }

          +
          + @Override
          + protected void startUp() throws Exception {
          + scheduler = Executors.newSingleThreadScheduledExecutor(Threads.createDaemonThreadFactory("location-cache-cleanup"));
          + scheduler.execute(new Runnable() {
          + @Override
          + public void run() {
          + long currentTime = System.currentTimeMillis();
          + cleanup(currentTime);
          +
          + // By default, run the cleanup at half of the expiry
          + long scheduleDelay = expiry / 2;
          + for (PendingCleanup pendingCleanup : pendingCleanups) {
          + // If there is any pending cleanup that needs to be cleanup early, schedule the run earlier.
          + if (pendingCleanup.getExpireTime() - currentTime < scheduleDelay)

          { + scheduleDelay = pendingCleanup.getExpireTime() - currentTime; + }

          + }
          + scheduler.schedule(this, scheduleDelay, TimeUnit.MILLISECONDS);
          + }
          + });
          + }
          +
          + @Override
          + protected void shutDown() throws Exception

          { + scheduler.shutdownNow(); + }

          +
          + @VisibleForTesting
          + void forceCleanup(final long currentTime) {
          + Futures.getUnchecked(scheduler.submit(new Runnable() {
          + @Override
          + public void run()

          { + cleanup(currentTime); + }

          + }));
          + }
          +
          + /**
          + * Performs cleanup based on the given time.
          + */
          + private void cleanup(long currentTime) {
          + // First go through the pending cleanup list and remove those that can be removed
          + Iterator<PendingCleanup> iterator = pendingCleanups.iterator();
          + while (iterator.hasNext()) {
          + PendingCleanup pendingCleanup = iterator.next();
          +
          + // If rejected by the predicate, it means it is being used, hence remove it from the pending cleanup list.
          + if (!cleanupPredicate.apply(pendingCleanup.getLocation()))

          { + iterator.remove(); + }

          else {
          + try {
          + // If time is up for the pending entry, the location will be deleted,
          + // hence can be removed from the pending cleanup list.
          + // Otherwise retain it for the next cycle.
          + if (pendingCleanup.deleteIfExpired(currentTime))

          { + iterator.remove(); + }

          + } catch (IOException e) {
          + // Log and retain the entry so that another attempt on deletion will be made in next cleanup cycle
          + LOG.warn("Failed to delete {}", pendingCleanup.getLocation(), e);
          + }
          + }
          + }
          +
          + // Then collects the next set of locations to be removed
          + try {
          + for (Location cacheDir : cacheBaseLocation.list()) {
          + try {
          + for (Location location : cacheDir.list()) {
          + if (cleanupPredicate.apply(location)) {
          + long expireTime = currentTime;
          + if (cacheDir.getName().equals(sessionId))

          { + expireTime += expiry; + }

          else

          { + // If the cache entry is from different YarnTwillRunnerService session, use the anti expiry time. + expireTime += antiqueExpiry; + }

          + // If the location is already pending for cleanup, this won't update the expire time as
          + // the comparison of PendingCleanup is only by location.
          + pendingCleanups.add(new PendingCleanup(location, expireTime));
          + }
          + }
          + } catch (IOException e) {
          + LOG.warn("Failed to list cache content from {}", cacheDir, e);
          + }
          + }
          + } catch (IOException e) {
          + LOG.warn("Failed to list cache directories from {}", cacheBaseLocation, e);
          + }
          + }
          +
          + /**
          + * Class for holding information about cache location that is pending to be removed.
          + * The equality and hash code is only based on the location.
          + */
          + private static final class PendingCleanup {
          + private final Location location;
          + private final long expireTime;
          +
          + PendingCleanup(Location location, long expireTime)

          { + this.location = location; + this.expireTime = expireTime; + }

          +
          + Location getLocation()

          { + return location; + }

          +
          + long getExpireTime()

          { + return expireTime; + }

          +
          + boolean deleteIfExpired(long currentTime) throws IOException {
          + if (currentTime < expireTime)

          { + return false; + }

          + location.delete();
          — End diff –

          shouldn't this return location.delete()?

          Show
          githubbot ASF GitHub Bot added a comment - Github user albertshau commented on a diff in the pull request: https://github.com/apache/twill/pull/21#discussion_r95050388 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/LocationCacheCleaner.java — @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.twill.yarn; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Predicate; +import com.google.common.util.concurrent.AbstractIdleService; +import com.google.common.util.concurrent.Futures; +import org.apache.hadoop.conf.Configuration; +import org.apache.twill.api.Configs; +import org.apache.twill.common.Threads; +import org.apache.twill.filesystem.Location; +import org.apache.twill.internal.io.LocationCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.HashSet; +import java.util.Iterator; +import java.util.Objects; +import java.util.Set; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * Responsible for cleanup of {@link LocationCache} . + */ +final class LocationCacheCleaner extends AbstractIdleService { + + private static final Logger LOG = LoggerFactory.getLogger(LocationCacheCleaner.class); + + private final Location cacheBaseLocation; + private final String sessionId; + private final long expiry; + private final long antiqueExpiry; + private final Predicate<Location> cleanupPredicate; + private final Set<PendingCleanup> pendingCleanups; + private ScheduledExecutorService scheduler; + + LocationCacheCleaner(Configuration config, Location cacheBaseLocation, + String sessionId, Predicate<Location> cleanupPredicate) { + this.cacheBaseLocation = cacheBaseLocation; + this.sessionId = sessionId; + this.expiry = config.getLong(Configs.Keys.LOCATION_CACHE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_EXPIRY_MS); + this.antiqueExpiry = config.getLong(Configs.Keys.LOCATION_CACHE_ANTIQUE_EXPIRY_MS, + Configs.Defaults.LOCATION_CACHE_ANTIQUE_EXPIRY_MS); + this.cleanupPredicate = cleanupPredicate; + this.pendingCleanups = new HashSet<>(); + } + + @Override + protected void startUp() throws Exception { + scheduler = Executors.newSingleThreadScheduledExecutor(Threads.createDaemonThreadFactory("location-cache-cleanup")); + scheduler.execute(new Runnable() { + @Override + public void run() { + long currentTime = System.currentTimeMillis(); + cleanup(currentTime); + + // By default, run the cleanup at half of the expiry + long scheduleDelay = expiry / 2; + for (PendingCleanup pendingCleanup : pendingCleanups) { + // If there is any pending cleanup that needs to be cleanup early, schedule the run earlier. + if (pendingCleanup.getExpireTime() - currentTime < scheduleDelay) { + scheduleDelay = pendingCleanup.getExpireTime() - currentTime; + } + } + scheduler.schedule(this, scheduleDelay, TimeUnit.MILLISECONDS); + } + }); + } + + @Override + protected void shutDown() throws Exception { + scheduler.shutdownNow(); + } + + @VisibleForTesting + void forceCleanup(final long currentTime) { + Futures.getUnchecked(scheduler.submit(new Runnable() { + @Override + public void run() { + cleanup(currentTime); + } + })); + } + + /** + * Performs cleanup based on the given time. + */ + private void cleanup(long currentTime) { + // First go through the pending cleanup list and remove those that can be removed + Iterator<PendingCleanup> iterator = pendingCleanups.iterator(); + while (iterator.hasNext()) { + PendingCleanup pendingCleanup = iterator.next(); + + // If rejected by the predicate, it means it is being used, hence remove it from the pending cleanup list. + if (!cleanupPredicate.apply(pendingCleanup.getLocation())) { + iterator.remove(); + } else { + try { + // If time is up for the pending entry, the location will be deleted, + // hence can be removed from the pending cleanup list. + // Otherwise retain it for the next cycle. + if (pendingCleanup.deleteIfExpired(currentTime)) { + iterator.remove(); + } + } catch (IOException e) { + // Log and retain the entry so that another attempt on deletion will be made in next cleanup cycle + LOG.warn("Failed to delete {}", pendingCleanup.getLocation(), e); + } + } + } + + // Then collects the next set of locations to be removed + try { + for (Location cacheDir : cacheBaseLocation.list()) { + try { + for (Location location : cacheDir.list()) { + if (cleanupPredicate.apply(location)) { + long expireTime = currentTime; + if (cacheDir.getName().equals(sessionId)) { + expireTime += expiry; + } else { + // If the cache entry is from different YarnTwillRunnerService session, use the anti expiry time. + expireTime += antiqueExpiry; + } + // If the location is already pending for cleanup, this won't update the expire time as + // the comparison of PendingCleanup is only by location. + pendingCleanups.add(new PendingCleanup(location, expireTime)); + } + } + } catch (IOException e) { + LOG.warn("Failed to list cache content from {}", cacheDir, e); + } + } + } catch (IOException e) { + LOG.warn("Failed to list cache directories from {}", cacheBaseLocation, e); + } + } + + /** + * Class for holding information about cache location that is pending to be removed. + * The equality and hash code is only based on the location. + */ + private static final class PendingCleanup { + private final Location location; + private final long expireTime; + + PendingCleanup(Location location, long expireTime) { + this.location = location; + this.expireTime = expireTime; + } + + Location getLocation() { + return location; + } + + long getExpireTime() { + return expireTime; + } + + boolean deleteIfExpired(long currentTime) throws IOException { + if (currentTime < expireTime) { + return false; + } + location.delete(); — End diff – shouldn't this return location.delete()?
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user chtyim opened a pull request:

          https://github.com/apache/twill/pull/21

          (TWILL-63) Speed up application launch time

          The general approach is better jar files management and to cache and reuse jar files created through
          class dependency tracing. The changes are further broken down as follows:

          1. Refactor jars generation

          • One jar containing the TwillLauncher (launcher.jar), created through dependency tracing.
          • This jar is the same for all applications.
          • One jar containing all twill classes (twill.jar), created through dependency tracing.
          • This jar is the same for all applications.
          • One jar containing the application class, created through dependency tracing.
          • This jar is generated based on the application being launched. It is reusable when launching the same app multiple times.
          • One jar containing user resources setup through TwillPreparer.
          • This jar is not reused between apps.
          • One jar containing runtime config needed by Twill
          • logback.xml, jvm opts, environment, classpaths, ... etc
            2. Let YARN to expand jars when localizing to containers instead of expanding it programatically
          • This save time in jar expansion when multiple containers are running on the same host
            3. Introduce a new configuration "twill.location.cache.dir" to enable jar caching and reuse
          • Currently only the launcher.jar, twill.jar and application jar will be cached and reuse when possible
          • Cache cleanup logic is also in place to remove files in cache directory that is no longer used by application
            4. The ApplicationBundler is improved to allow more flexible usage

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/chtyim/twill feature/twill-63

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/twill/pull/21.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #21


          commit 786f3c6e075c15d929b43e322f8a869963a95b81
          Author: Terence Yim <chtyim@apache.org>
          Date: 2016-12-07T01:05:11Z

          (TWILL-63) Speed up application launch time

          The general approach is better jar files management and to cache and reuse jar files created through
          class dependency tracing. The changes are further broken down as follows:

          1. Refactor jars generation

          • One jar containing the TwillLauncher (launcher.jar), created through dependency tracing.
          • This jar is the same for all applications.
          • One jar containing all twill classes (twill.jar), created through dependency tracing.
          • This jar is the same for all applications.
          • One jar containing the application class, created through dependency tracing.
          • This jar is generated based on the application being launched. It is reusable when launching the same app multiple times.
          • One jar containing user resources setup through TwillPreparer.
          • This jar is not reused between apps.
          • One jar containing runtime config needed by Twill
          • logback.xml, jvm opts, environment, classpaths, ... etc
            2. Let YARN to expand jars when localizing to containers instead of expanding it programatically
          • This save time in jar expansion when multiple containers are running on the same host
            3. Introduce a new configuration "twill.location.cache.dir" to enable jar caching and reuse
          • Currently only the launcher.jar, twill.jar and application jar will be cached and reuse when possible
          • Cache cleanup logic is also in place to remove files in cache directory that is no longer used by application
            4. The ApplicationBundler is improved to allow more flexible usage

          Show
          githubbot ASF GitHub Bot added a comment - GitHub user chtyim opened a pull request: https://github.com/apache/twill/pull/21 ( TWILL-63 ) Speed up application launch time The general approach is better jar files management and to cache and reuse jar files created through class dependency tracing. The changes are further broken down as follows: 1. Refactor jars generation One jar containing the TwillLauncher (launcher.jar), created through dependency tracing. This jar is the same for all applications. One jar containing all twill classes (twill.jar), created through dependency tracing. This jar is the same for all applications. One jar containing the application class, created through dependency tracing. This jar is generated based on the application being launched. It is reusable when launching the same app multiple times. One jar containing user resources setup through TwillPreparer. This jar is not reused between apps. One jar containing runtime config needed by Twill logback.xml, jvm opts, environment, classpaths, ... etc 2. Let YARN to expand jars when localizing to containers instead of expanding it programatically This save time in jar expansion when multiple containers are running on the same host 3. Introduce a new configuration "twill.location.cache.dir" to enable jar caching and reuse Currently only the launcher.jar, twill.jar and application jar will be cached and reuse when possible Cache cleanup logic is also in place to remove files in cache directory that is no longer used by application 4. The ApplicationBundler is improved to allow more flexible usage You can merge this pull request into a Git repository by running: $ git pull https://github.com/chtyim/twill feature/twill-63 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/twill/pull/21.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21 commit 786f3c6e075c15d929b43e322f8a869963a95b81 Author: Terence Yim <chtyim@apache.org> Date: 2016-12-07T01:05:11Z ( TWILL-63 ) Speed up application launch time The general approach is better jar files management and to cache and reuse jar files created through class dependency tracing. The changes are further broken down as follows: 1. Refactor jars generation One jar containing the TwillLauncher (launcher.jar), created through dependency tracing. This jar is the same for all applications. One jar containing all twill classes (twill.jar), created through dependency tracing. This jar is the same for all applications. One jar containing the application class, created through dependency tracing. This jar is generated based on the application being launched. It is reusable when launching the same app multiple times. One jar containing user resources setup through TwillPreparer. This jar is not reused between apps. One jar containing runtime config needed by Twill logback.xml, jvm opts, environment, classpaths, ... etc 2. Let YARN to expand jars when localizing to containers instead of expanding it programatically This save time in jar expansion when multiple containers are running on the same host 3. Introduce a new configuration "twill.location.cache.dir" to enable jar caching and reuse Currently only the launcher.jar, twill.jar and application jar will be cached and reuse when possible Cache cleanup logic is also in place to remove files in cache directory that is no longer used by application 4. The ApplicationBundler is improved to allow more flexible usage
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user hsaputra commented on the issue:

          https://github.com/apache/incubator-twill/pull/66

          Could you kindly move the PR to the new Twill repo: https://github.com/apache/twill

          Thanks

          Show
          githubbot ASF GitHub Bot added a comment - Github user hsaputra commented on the issue: https://github.com/apache/incubator-twill/pull/66 Could you kindly move the PR to the new Twill repo: https://github.com/apache/twill Thanks
          Hide
          chtyim Terence Yim added a comment -

          There are still quite a lot of unaddressed comments. Moving this out to next release.

          Show
          chtyim Terence Yim added a comment - There are still quite a lot of unaddressed comments. Moving this out to next release.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user gsps1 commented on the pull request:

          https://github.com/apache/incubator-twill/pull/66#issuecomment-160762781

          thanks for the review @chtyim , @hsaputra i will be addressing the comments soon, got busy with other things. will update soon. Thanks for following up!

          Show
          githubbot ASF GitHub Bot added a comment - Github user gsps1 commented on the pull request: https://github.com/apache/incubator-twill/pull/66#issuecomment-160762781 thanks for the review @chtyim , @hsaputra i will be addressing the comments soon, got busy with other things. will update soon. Thanks for following up!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user hsaputra commented on the pull request:

          https://github.com/apache/incubator-twill/pull/66#issuecomment-160559379

          Any update on this PR?

          Show
          githubbot ASF GitHub Bot added a comment - Github user hsaputra commented on the pull request: https://github.com/apache/incubator-twill/pull/66#issuecomment-160559379 Any update on this PR?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44348453

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro

          • @throws IOException
            */
            public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException {
          • LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName());
            + createBundleAndGetClassPathUrls(target, classes, resources);
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + }

            +
            + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath,
            + OutputStream targetOutputStream,
            + Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException {
            + LOG.debug("start creating bundle {}. building a temporary file locally at first", targetName);
            // Write the jar to local tmp file first

          • File tmpJar = File.createTempFile(target.getName(), ".tmp");
            + File tmpJar = File.createTempFile(targetName, ".tmp");
            + Set<URL> classPathUrls;
            try {
            Set<String> entries = Sets.newHashSet();
            try (JarOutputStream jarOut = new JarOutputStream(new FileOutputStream(tmpJar))) {
            // Find class dependencies
          • findDependencies(classes, entries, jarOut);
            + classPathUrls = findDependenciesAndGetClassPathUrls(classes, entries, jarOut);

          // Add extra resources
          for (URI resource : resources)

          { copyResource(resource, entries, jarOut); }

          }

          • LOG.debug("copying temporary bundle to destination {} ({} bytes)", target, tmpJar.length());
            + LOG.debug("copying temporary bundle to destination {} ({} bytes)", targetPath, tmpJar.length());
            // Copy the tmp jar into destination.
            try {
          • OutputStream os = new BufferedOutputStream(target.getOutputStream());
            + OutputStream os = new BufferedOutputStream(targetOutputStream);
            try { Files.copy(tmpJar, os); }

            finally

            { Closeables.closeQuietly(os); }

            } catch (IOException e)

            { - throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + target, e); + throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + targetPath, e); }
          • LOG.debug("finished creating bundle at {}", target);
            + LOG.debug("finished creating bundle at {}", targetPath);
            } finally {
            tmpJar.delete();
            LOG.debug("cleaned up local temporary for bundle {}", tmpJar.toURI());
            }
            + return classPathUrls;
            +
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(File target, Iterable<Class<?>> classes,
              • End diff –

          no doc.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44348453 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro @throws IOException */ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException { LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName()); + createBundleAndGetClassPathUrls(target, classes, resources); + } + + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + } + + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath, + OutputStream targetOutputStream, + Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + LOG.debug("start creating bundle {}. building a temporary file locally at first", targetName); // Write the jar to local tmp file first File tmpJar = File.createTempFile(target.getName(), ".tmp"); + File tmpJar = File.createTempFile(targetName, ".tmp"); + Set<URL> classPathUrls; try { Set<String> entries = Sets.newHashSet(); try (JarOutputStream jarOut = new JarOutputStream(new FileOutputStream(tmpJar))) { // Find class dependencies findDependencies(classes, entries, jarOut); + classPathUrls = findDependenciesAndGetClassPathUrls(classes, entries, jarOut); // Add extra resources for (URI resource : resources) { copyResource(resource, entries, jarOut); } } LOG.debug("copying temporary bundle to destination {} ({} bytes)", target, tmpJar.length()); + LOG.debug("copying temporary bundle to destination {} ({} bytes)", targetPath, tmpJar.length()); // Copy the tmp jar into destination. try { OutputStream os = new BufferedOutputStream(target.getOutputStream()); + OutputStream os = new BufferedOutputStream(targetOutputStream); try { Files.copy(tmpJar, os); } finally { Closeables.closeQuietly(os); } } catch (IOException e) { - throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + target, e); + throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + targetPath, e); } LOG.debug("finished creating bundle at {}", target); + LOG.debug("finished creating bundle at {}", targetPath); } finally { tmpJar.delete(); LOG.debug("cleaned up local temporary for bundle {}", tmpJar.toURI()); } + return classPathUrls; + + } + + public Set<URL> createBundleAndGetClassPathUrls(File target, Iterable<Class<?>> classes, End diff – no doc.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44348039

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -175,6 +189,27 @@ protected void shutDown() throws Exception {
          };
          }

          + private File createTwillJar() throws IOException {
          + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor());
          + List<Class<?>> classes = Lists.newArrayList();
          + classes.add(ApplicationMasterMain.class);
          + classes.add(TwillContainerMain.class);
          + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version
          + classes.add(yarnAppClient.getClass());
          + String parentDir = yarnConfig.get(Constants.TWILL_JAR_LOCATION, Constants.DEFAULT_TWILL_JAR_LOCATION);
          + File twillJarFile = new File(parentDir + Constants.Files.TWILL_JAR);
          +
          + Set<URL> twillClassPathURLs = applicationBundler.createBundleAndGetClassPathUrls(twillJarFile,
          + classes, ImmutableList.<URI>of());
          + List<URL> normalizeTwillURLs = new ArrayList<>();
          + for (URL url : twillClassPathURLs)

          { + normalizeTwillURLs.add(new File(url.getPath()).toURI().toURL()); + }

          +
          + twillClassLoader = new URLClassLoader(normalizeTwillURLs.toArray(new URL[normalizeTwillURLs.size()]), null);
          — End diff –

          Oh, seems like it will get reused. That's ok then.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44348039 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -175,6 +189,27 @@ protected void shutDown() throws Exception { }; } + private File createTwillJar() throws IOException { + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor()); + List<Class<?>> classes = Lists.newArrayList(); + classes.add(ApplicationMasterMain.class); + classes.add(TwillContainerMain.class); + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version + classes.add(yarnAppClient.getClass()); + String parentDir = yarnConfig.get(Constants.TWILL_JAR_LOCATION, Constants.DEFAULT_TWILL_JAR_LOCATION); + File twillJarFile = new File(parentDir + Constants.Files.TWILL_JAR); + + Set<URL> twillClassPathURLs = applicationBundler.createBundleAndGetClassPathUrls(twillJarFile, + classes, ImmutableList.<URI>of()); + List<URL> normalizeTwillURLs = new ArrayList<>(); + for (URL url : twillClassPathURLs) { + normalizeTwillURLs.add(new File(url.getPath()).toURI().toURL()); + } + + twillClassLoader = new URLClassLoader(normalizeTwillURLs.toArray(new URL [normalizeTwillURLs.size()] ), null); — End diff – Oh, seems like it will get reused. That's ok then.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44347994

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -175,6 +189,27 @@ protected void shutDown() throws Exception {
          };
          }

          + private File createTwillJar() throws IOException {
          + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor());
          + List<Class<?>> classes = Lists.newArrayList();
          + classes.add(ApplicationMasterMain.class);
          + classes.add(TwillContainerMain.class);
          + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version
          + classes.add(yarnAppClient.getClass());
          + String parentDir = yarnConfig.get(Constants.TWILL_JAR_LOCATION, Constants.DEFAULT_TWILL_JAR_LOCATION);
          + File twillJarFile = new File(parentDir + Constants.Files.TWILL_JAR);
          +
          + Set<URL> twillClassPathURLs = applicationBundler.createBundleAndGetClassPathUrls(twillJarFile,
          + classes, ImmutableList.<URI>of());
          + List<URL> normalizeTwillURLs = new ArrayList<>();
          + for (URL url : twillClassPathURLs)

          { + normalizeTwillURLs.add(new File(url.getPath()).toURI().toURL()); + }

          +
          + twillClassLoader = new URLClassLoader(normalizeTwillURLs.toArray(new URL[normalizeTwillURLs.size()]), null);
          — End diff –

          Who will close this ClassLoader?

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44347994 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -175,6 +189,27 @@ protected void shutDown() throws Exception { }; } + private File createTwillJar() throws IOException { + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor()); + List<Class<?>> classes = Lists.newArrayList(); + classes.add(ApplicationMasterMain.class); + classes.add(TwillContainerMain.class); + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version + classes.add(yarnAppClient.getClass()); + String parentDir = yarnConfig.get(Constants.TWILL_JAR_LOCATION, Constants.DEFAULT_TWILL_JAR_LOCATION); + File twillJarFile = new File(parentDir + Constants.Files.TWILL_JAR); + + Set<URL> twillClassPathURLs = applicationBundler.createBundleAndGetClassPathUrls(twillJarFile, + classes, ImmutableList.<URI>of()); + List<URL> normalizeTwillURLs = new ArrayList<>(); + for (URL url : twillClassPathURLs) { + normalizeTwillURLs.add(new File(url.getPath()).toURI().toURL()); + } + + twillClassLoader = new URLClassLoader(normalizeTwillURLs.toArray(new URL [normalizeTwillURLs.size()] ), null); — End diff – Who will close this ClassLoader?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44346230

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro

          • @throws IOException
            */
            public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException {
          • LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName());
            + createBundleAndGetClassPathUrls(target, classes, resources);
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + }

            +
            + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath,
            + OutputStream targetOutputStream,
            + Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException {
            + LOG.debug("start creating bundle {}. building a temporary file locally at first", targetName);
            // Write the jar to local tmp file first

          • File tmpJar = File.createTempFile(target.getName(), ".tmp");
            + File tmpJar = File.createTempFile(targetName, ".tmp");
            + Set<URL> classPathUrls;
            try {
            Set<String> entries = Sets.newHashSet();
            try (JarOutputStream jarOut = new JarOutputStream(new FileOutputStream(tmpJar))) {
            // Find class dependencies
          • findDependencies(classes, entries, jarOut);
            + classPathUrls = findDependenciesAndGetClassPathUrls(classes, entries, jarOut);

          // Add extra resources
          for (URI resource : resources)

          { copyResource(resource, entries, jarOut); }

          }

          • LOG.debug("copying temporary bundle to destination {} ({} bytes)", target, tmpJar.length());
            + LOG.debug("copying temporary bundle to destination {} ({} bytes)", targetPath, tmpJar.length());
            // Copy the tmp jar into destination.
            try {
          • OutputStream os = new BufferedOutputStream(target.getOutputStream());
            + OutputStream os = new BufferedOutputStream(targetOutputStream);
            try { Files.copy(tmpJar, os); }

            finally

            { Closeables.closeQuietly(os); }

            } catch (IOException e)

            { - throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + target, e); + throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + targetPath, e); }
          • LOG.debug("finished creating bundle at {}", target);
            + LOG.debug("finished creating bundle at {}", targetPath);
            } finally {
            tmpJar.delete();
            LOG.debug("cleaned up local temporary for bundle {}", tmpJar.toURI());
            }
            + return classPathUrls;
            +
              • End diff –

          remove extra new line.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44346230 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro @throws IOException */ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException { LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName()); + createBundleAndGetClassPathUrls(target, classes, resources); + } + + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + } + + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath, + OutputStream targetOutputStream, + Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + LOG.debug("start creating bundle {}. building a temporary file locally at first", targetName); // Write the jar to local tmp file first File tmpJar = File.createTempFile(target.getName(), ".tmp"); + File tmpJar = File.createTempFile(targetName, ".tmp"); + Set<URL> classPathUrls; try { Set<String> entries = Sets.newHashSet(); try (JarOutputStream jarOut = new JarOutputStream(new FileOutputStream(tmpJar))) { // Find class dependencies findDependencies(classes, entries, jarOut); + classPathUrls = findDependenciesAndGetClassPathUrls(classes, entries, jarOut); // Add extra resources for (URI resource : resources) { copyResource(resource, entries, jarOut); } } LOG.debug("copying temporary bundle to destination {} ({} bytes)", target, tmpJar.length()); + LOG.debug("copying temporary bundle to destination {} ({} bytes)", targetPath, tmpJar.length()); // Copy the tmp jar into destination. try { OutputStream os = new BufferedOutputStream(target.getOutputStream()); + OutputStream os = new BufferedOutputStream(targetOutputStream); try { Files.copy(tmpJar, os); } finally { Closeables.closeQuietly(os); } } catch (IOException e) { - throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + target, e); + throw new IOException("failed to copy bundle from " + tmpJar.toURI() + " to " + targetPath, e); } LOG.debug("finished creating bundle at {}", target); + LOG.debug("finished creating bundle at {}", targetPath); } finally { tmpJar.delete(); LOG.debug("cleaned up local temporary for bundle {}", tmpJar.toURI()); } + return classPathUrls; + End diff – remove extra new line.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44346146

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro

          • @throws IOException
            */
            public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException {
          • LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName());
            + createBundleAndGetClassPathUrls(target, classes, resources);
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + }

            +
            + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath,
            + OutputStream targetOutputStream,
            + Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException {
            + LOG.debug("start creating bundle {}. building a temporary file locally at first", targetName);
            // Write the jar to local tmp file first

          • File tmpJar = File.createTempFile(target.getName(), ".tmp");
            + File tmpJar = File.createTempFile(targetName, ".tmp");
              • End diff –

          So this will be using the system temp directory. Maybe better use the one that user configurated

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44346146 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro @throws IOException */ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException { LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName()); + createBundleAndGetClassPathUrls(target, classes, resources); + } + + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + } + + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath, + OutputStream targetOutputStream, + Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + LOG.debug("start creating bundle {}. building a temporary file locally at first", targetName); // Write the jar to local tmp file first File tmpJar = File.createTempFile(target.getName(), ".tmp"); + File tmpJar = File.createTempFile(targetName, ".tmp"); End diff – So this will be using the system temp directory. Maybe better use the one that user configurated
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44344057

          — Diff: twill-common/src/main/java/org/apache/twill/internal/Constants.java —
          @@ -51,6 +51,10 @@
          public static final String RESTART_ALL_RUNNABLE_INSTANCES = "restartAllRunnableInstances";
          public static final String RESTART_RUNNABLES_INSTANCES = "restartRunnablesInstances";

          + /** Twill Jar location configuration name and the default value for that config */
          + public static final String TWILL_JAR_LOCATION = "twill.jar.location";
          — End diff –

          What is the location? Is it a local directory or a directory based on the `LocationFactory`? Also, better make it clear that it's a directory.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44344057 — Diff: twill-common/src/main/java/org/apache/twill/internal/Constants.java — @@ -51,6 +51,10 @@ public static final String RESTART_ALL_RUNNABLE_INSTANCES = "restartAllRunnableInstances"; public static final String RESTART_RUNNABLES_INSTANCES = "restartRunnablesInstances"; + /** Twill Jar location configuration name and the default value for that config */ + public static final String TWILL_JAR_LOCATION = "twill.jar.location"; — End diff – What is the location? Is it a local directory or a directory based on the `LocationFactory`? Also, better make it clear that it's a directory.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44343897

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro

          • @throws IOException
            */
            public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException {
          • LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName());
            + createBundleAndGetClassPathUrls(target, classes, resources);
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + }

            +
            + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath,
            + OutputStream targetOutputStream,

              • End diff –

          If it takes output stream, then why need to take name and path? You either take a `Location` or `File`, then create an output stream and manage and closing it in this method, or you take a OutputStream and write to it, without caring where does it actually output to.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44343897 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro @throws IOException */ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException { LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName()); + createBundleAndGetClassPathUrls(target, classes, resources); + } + + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); + } + + private Set<URL> createBundleAndGetClassPathUrls(String targetName, String targetPath, + OutputStream targetOutputStream, End diff – If it takes output stream, then why need to take name and path? You either take a `Location` or `File`, then create an output stream and manage and closing it in this method, or you take a OutputStream and write to it, without caring where does it actually output to.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44343712

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro

          • @throws IOException
            */
            public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException {
          • LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName());
            + createBundleAndGetClassPathUrls(target, classes, resources);
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes,
            + Iterable<URI> resources) throws IOException {
            + return createBundleAndGetClassPathUrls(target.getName(), target.toString(),
            + target.getOutputStream(), classes, resources);
              • End diff –

          Also, this leaks the output stream resource.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44343712 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro @throws IOException */ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException { LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName()); + createBundleAndGetClassPathUrls(target, classes, resources); + } + + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes, + Iterable<URI> resources) throws IOException { + return createBundleAndGetClassPathUrls(target.getName(), target.toString(), + target.getOutputStream(), classes, resources); End diff – Also, this leaks the output stream resource.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44343615

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro

          • @throws IOException
            */
            public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException {
          • LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName());
            + createBundleAndGetClassPathUrls(target, classes, resources);
            + }
            +
            + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes,
              • End diff –

          Who call this? Seems like it's only called in this class (from above method). Maybe the private one can be called directly?

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44343615 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -145,41 +145,63 @@ public void createBundle(Location target, Class<?> clz, Class<?>...classes) thro @throws IOException */ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<URI> resources) throws IOException { LOG.debug("start creating bundle {}. building a temporary file locally at first", target.getName()); + createBundleAndGetClassPathUrls(target, classes, resources); + } + + public Set<URL> createBundleAndGetClassPathUrls(Location target, Iterable<Class<?>> classes, End diff – Who call this? Seems like it's only called in this class (from above method). Maybe the private one can be called directly?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44343302

          — Diff: twill-common/src/main/java/org/apache/twill/internal/Constants.java —
          @@ -64,8 +68,8 @@
          public static final class Files {

          public static final String LAUNCHER_JAR = "launcher.jar";

          • public static final String APP_MASTER_JAR = "appMaster.jar";
          • public static final String CONTAINER_JAR = "container.jar";
            + public static final String TWILL_JAR = "twill.jar";
            + public static final String PROGRAM_JAR = "program.jar";
              • End diff –

          Better call it `application.jar`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44343302 — Diff: twill-common/src/main/java/org/apache/twill/internal/Constants.java — @@ -64,8 +68,8 @@ public static final class Files { public static final String LAUNCHER_JAR = "launcher.jar"; public static final String APP_MASTER_JAR = "appMaster.jar"; public static final String CONTAINER_JAR = "container.jar"; + public static final String TWILL_JAR = "twill.jar"; + public static final String PROGRAM_JAR = "program.jar"; End diff – Better call it `application.jar`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r44342963

          — Diff: twill-common/src/main/java/org/apache/twill/internal/Constants.java —
          @@ -51,6 +51,10 @@
          public static final String RESTART_ALL_RUNNABLE_INSTANCES = "restartAllRunnableInstances";
          public static final String RESTART_RUNNABLES_INSTANCES = "restartRunnablesInstances";

          + /** Twill Jar location configuration name and the default value for that config */
          + public static final String TWILL_JAR_LOCATION = "twill.jar.location";
          + public static final String DEFAULT_TWILL_JAR_LOCATION = "/tmp/";
          — End diff –

          Default should be the system tmp, which may not be `/tmp`. You can get it through `System.getProperty("java.io.tmpdir")`

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r44342963 — Diff: twill-common/src/main/java/org/apache/twill/internal/Constants.java — @@ -51,6 +51,10 @@ public static final String RESTART_ALL_RUNNABLE_INSTANCES = "restartAllRunnableInstances"; public static final String RESTART_RUNNABLES_INSTANCES = "restartRunnablesInstances"; + /** Twill Jar location configuration name and the default value for that config */ + public static final String TWILL_JAR_LOCATION = "twill.jar.location"; + public static final String DEFAULT_TWILL_JAR_LOCATION = "/tmp/"; — End diff – Default should be the system tmp, which may not be `/tmp`. You can get it through `System.getProperty("java.io.tmpdir")`
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user gsps1 commented on the pull request:

          https://github.com/apache/incubator-twill/pull/66#issuecomment-151940449

          @chtyim have addressed your comments, please review, thanks

          Show
          githubbot ASF GitHub Bot added a comment - Github user gsps1 commented on the pull request: https://github.com/apache/incubator-twill/pull/66#issuecomment-151940449 @chtyim have addressed your comments, please review, thanks
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41661046

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -175,6 +187,36 @@ protected void shutDown() throws Exception {
          };
          }

          + public Location createTwillJar() throws IOException

          { + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor()); + List<Class<?>> classes = Lists.newArrayList(); + classes.add(ApplicationMasterMain.class); + classes.add(TwillContainerMain.class); + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version + classes.add(yarnAppClient.getClass()); + + twillDependencyClasses = getTwillDependencyClasses(classes); + File tempFile = File.createTempFile("twill", ".jar"); + Location twillJar = new LocalLocationFactory().create(tempFile.toURI()); + applicationBundler.createBundle(twillJar, classes); + return twillJar; + }

          +
          + private Set<String> getTwillDependencyClasses(List<Class<?>> classes) throws IOException {
          + Iterable<String> classNames = Iterables.transform(classes, new Function<Class<?>, String>() {
          + @Override
          + public String apply(Class<?> input)

          { + return input.getName(); + }

          + });
          +
          + ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
          + if (classLoader == null)

          { + classLoader = getClass().getClassLoader(); + }

          + return Dependencies.getClassDependencies(classLoader, new ClassAcceptor(), classNames);
          — End diff –

          The `getClassDependencies` doesn't give you all twill classes. It only gives you all classes in the dependencies. I think the better way is using all classPathURL encountered during the creation of the twill jar to create a `URLClassLoader`. To check if a class is inside the twill.jar, you can use `ClassLoader.getResource()` on the `URLClassLoader`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41661046 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -175,6 +187,36 @@ protected void shutDown() throws Exception { }; } + public Location createTwillJar() throws IOException { + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor()); + List<Class<?>> classes = Lists.newArrayList(); + classes.add(ApplicationMasterMain.class); + classes.add(TwillContainerMain.class); + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version + classes.add(yarnAppClient.getClass()); + + twillDependencyClasses = getTwillDependencyClasses(classes); + File tempFile = File.createTempFile("twill", ".jar"); + Location twillJar = new LocalLocationFactory().create(tempFile.toURI()); + applicationBundler.createBundle(twillJar, classes); + return twillJar; + } + + private Set<String> getTwillDependencyClasses(List<Class<?>> classes) throws IOException { + Iterable<String> classNames = Iterables.transform(classes, new Function<Class<?>, String>() { + @Override + public String apply(Class<?> input) { + return input.getName(); + } + }); + + ClassLoader classLoader = Thread.currentThread().getContextClassLoader(); + if (classLoader == null) { + classLoader = getClass().getClassLoader(); + } + return Dependencies.getClassDependencies(classLoader, new ClassAcceptor(), classNames); — End diff – The `getClassDependencies` doesn't give you all twill classes. It only gives you all classes in the dependencies. I think the better way is using all classPathURL encountered during the creation of the twill jar to create a `URLClassLoader`. To check if a class is inside the twill.jar, you can use `ClassLoader.getResource()` on the `URLClassLoader`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41659626

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -175,6 +187,36 @@ protected void shutDown() throws Exception {
          };
          }

          + public Location createTwillJar() throws IOException {
          + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor());
          + List<Class<?>> classes = Lists.newArrayList();
          + classes.add(ApplicationMasterMain.class);
          + classes.add(TwillContainerMain.class);
          + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version
          + classes.add(yarnAppClient.getClass());
          +
          + twillDependencyClasses = getTwillDependencyClasses(classes);
          + File tempFile = File.createTempFile("twill", ".jar");
          + Location twillJar = new LocalLocationFactory().create(tempFile.toURI());
          — End diff –

          Why need to wrap it to location? Can't it be just `File`?

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41659626 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -175,6 +187,36 @@ protected void shutDown() throws Exception { }; } + public Location createTwillJar() throws IOException { + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor()); + List<Class<?>> classes = Lists.newArrayList(); + classes.add(ApplicationMasterMain.class); + classes.add(TwillContainerMain.class); + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version + classes.add(yarnAppClient.getClass()); + + twillDependencyClasses = getTwillDependencyClasses(classes); + File tempFile = File.createTempFile("twill", ".jar"); + Location twillJar = new LocalLocationFactory().create(tempFile.toURI()); — End diff – Why need to wrap it to location? Can't it be just `File`?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41659573

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -175,6 +187,36 @@ protected void shutDown() throws Exception {
          };
          }

          + public Location createTwillJar() throws IOException {
          + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor());
          + List<Class<?>> classes = Lists.newArrayList();
          + classes.add(ApplicationMasterMain.class);
          + classes.add(TwillContainerMain.class);
          + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version
          + classes.add(yarnAppClient.getClass());
          +
          + twillDependencyClasses = getTwillDependencyClasses(classes);
          + File tempFile = File.createTempFile("twill", ".jar");
          — End diff –

          Potentially this file can be deleted by the system when the runner process runs for a long time.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41659573 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -175,6 +187,36 @@ protected void shutDown() throws Exception { }; } + public Location createTwillJar() throws IOException { + ApplicationBundler applicationBundler = new ApplicationBundler(new ClassAcceptor()); + List<Class<?>> classes = Lists.newArrayList(); + classes.add(ApplicationMasterMain.class); + classes.add(TwillContainerMain.class); + // Stuck in the yarnAppClient class to make bundler being able to pickup the right yarn-client version + classes.add(yarnAppClient.getClass()); + + twillDependencyClasses = getTwillDependencyClasses(classes); + File tempFile = File.createTempFile("twill", ".jar"); — End diff – Potentially this file can be deleted by the system when the runner process runs for a long time.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41659508

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java —
          @@ -175,6 +187,36 @@ protected void shutDown() throws Exception {
          };
          }

          + public Location createTwillJar() throws IOException {
          — End diff –

          Why public?

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41659508 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillRunnerService.java — @@ -175,6 +187,36 @@ protected void shutDown() throws Exception { }; } + public Location createTwillJar() throws IOException { — End diff – Why public?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user gsps1 commented on the pull request:

          https://github.com/apache/incubator-twill/pull/66#issuecomment-146699555

          Thanks for the review @chtyim , i have addressed your comments, please review again, thanks!

          Show
          githubbot ASF GitHub Bot added a comment - Github user gsps1 commented on the pull request: https://github.com/apache/incubator-twill/pull/66#issuecomment-146699555 Thanks for the review @chtyim , i have addressed your comments, please review again, thanks!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user gsps1 commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41460247

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -122,15 +122,16 @@ public boolean accept(String className, URL classUrl, URL classPathUrl) {
          });
          }

          • public void createBundle(Location target, Iterable<Class<?>> classes) throws IOException {
          • createBundle(target, classes, ImmutableList.<URI>of());
            + public void createBundle(Location target, Iterable<Class<?>> classes, boolean skipTwillClasses) throws IOException {
              • End diff –

          yeah, i should use that.

          Show
          githubbot ASF GitHub Bot added a comment - Github user gsps1 commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41460247 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -122,15 +122,16 @@ public boolean accept(String className, URL classUrl, URL classPathUrl) { }); } public void createBundle(Location target, Iterable<Class<?>> classes) throws IOException { createBundle(target, classes, ImmutableList.<URI>of()); + public void createBundle(Location target, Iterable<Class<?>> classes, boolean skipTwillClasses) throws IOException { End diff – yeah, i should use that.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41459679

          — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillPreparer.java —
          @@ -137,7 +139,7 @@
          YarnTwillPreparer(YarnConfiguration yarnConfig, TwillSpecification twillSpec,
          YarnAppClient yarnAppClient, ZKClient zkClient,
          LocationFactory locationFactory, String extraOptions, LogEntry.Level logLevel,

          • YarnTwillControllerFactory controllerFactory) {
            + YarnTwillControllerFactory controllerFactory, byte[] twillJarContent) {
              • End diff –

          Shouldn't be `byte[]`. Better be a `InputSupplier<? extends InputStream>`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41459679 — Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillPreparer.java — @@ -137,7 +139,7 @@ YarnTwillPreparer(YarnConfiguration yarnConfig, TwillSpecification twillSpec, YarnAppClient yarnAppClient, ZKClient zkClient, LocationFactory locationFactory, String extraOptions, LogEntry.Level logLevel, YarnTwillControllerFactory controllerFactory) { + YarnTwillControllerFactory controllerFactory, byte[] twillJarContent) { End diff – Shouldn't be `byte[]`. Better be a `InputSupplier<? extends InputStream>`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41459500

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -178,8 +181,42 @@ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<U
          }
          }

          + /**
          + * Creates a

          {@link ByteArrayOutputStream} which includes all the given classes and
          + * all the classes that they depended on.
          + * The {@link ByteArrayOutputStream}

          + * will also include all classes and resources under the packages as given as include packages
          + * in the constructor.
          + *
          + * @param resources Extra resources to put into the jar file. If resource is a jar file, it'll be put under
          + * lib/ entry, otherwise under the resources/ entry.
          + * @param classes Set of classes to start the dependency traversal.
          + * @return ByteArrayOutputStream
          + * @throws IOException
          + */
          + public ByteArrayOutputStream getBundleAsStream(Iterable<Class<?>> classes,
          — End diff –

          Also, why this method is in ApplicationBundler? Seems totally unrelated to what the role of the class is.

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41459500 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -178,8 +181,42 @@ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<U } } + /** + * Creates a {@link ByteArrayOutputStream} which includes all the given classes and + * all the classes that they depended on. + * The {@link ByteArrayOutputStream} + * will also include all classes and resources under the packages as given as include packages + * in the constructor. + * + * @param resources Extra resources to put into the jar file. If resource is a jar file, it'll be put under + * lib/ entry, otherwise under the resources/ entry. + * @param classes Set of classes to start the dependency traversal. + * @return ByteArrayOutputStream + * @throws IOException + */ + public ByteArrayOutputStream getBundleAsStream(Iterable<Class<?>> classes, — End diff – Also, why this method is in ApplicationBundler? Seems totally unrelated to what the role of the class is.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41459425

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -178,8 +181,42 @@ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<U
          }
          }

          + /**
          + * Creates a

          {@link ByteArrayOutputStream} which includes all the given classes and
          + * all the classes that they depended on.
          + * The {@link ByteArrayOutputStream}

          + * will also include all classes and resources under the packages as given as include packages
          + * in the constructor.
          + *
          + * @param resources Extra resources to put into the jar file. If resource is a jar file, it'll be put under
          + * lib/ entry, otherwise under the resources/ entry.
          + * @param classes Set of classes to start the dependency traversal.
          + * @return ByteArrayOutputStream
          + * @throws IOException
          + */
          + public ByteArrayOutputStream getBundleAsStream(Iterable<Class<?>> classes,
          — End diff –

          So this could end up using a lot of heap space (and double it when `ByteArrayOutputStream.toByteArray()` is being called). Why it needs to be kept all in memory?

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41459425 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -178,8 +181,42 @@ public void createBundle(Location target, Iterable<Class<?>> classes, Iterable<U } } + /** + * Creates a {@link ByteArrayOutputStream} which includes all the given classes and + * all the classes that they depended on. + * The {@link ByteArrayOutputStream} + * will also include all classes and resources under the packages as given as include packages + * in the constructor. + * + * @param resources Extra resources to put into the jar file. If resource is a jar file, it'll be put under + * lib/ entry, otherwise under the resources/ entry. + * @param classes Set of classes to start the dependency traversal. + * @return ByteArrayOutputStream + * @throws IOException + */ + public ByteArrayOutputStream getBundleAsStream(Iterable<Class<?>> classes, — End diff – So this could end up using a lot of heap space (and double it when `ByteArrayOutputStream.toByteArray()` is being called). Why it needs to be kept all in memory?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user chtyim commented on a diff in the pull request:

          https://github.com/apache/incubator-twill/pull/66#discussion_r41459295

          — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java —
          @@ -122,15 +122,16 @@ public boolean accept(String className, URL classUrl, URL classPathUrl) {
          });
          }

          • public void createBundle(Location target, Iterable<Class<?>> classes) throws IOException {
          • createBundle(target, classes, ImmutableList.<URI>of());
            + public void createBundle(Location target, Iterable<Class<?>> classes, boolean skipTwillClasses) throws IOException {
              • End diff –

          Why need to introduce this specific boolean? Can't we use the constructor that takes `ClassAcceptor`?

          Show
          githubbot ASF GitHub Bot added a comment - Github user chtyim commented on a diff in the pull request: https://github.com/apache/incubator-twill/pull/66#discussion_r41459295 — Diff: twill-core/src/main/java/org/apache/twill/internal/ApplicationBundler.java — @@ -122,15 +122,16 @@ public boolean accept(String className, URL classUrl, URL classPathUrl) { }); } public void createBundle(Location target, Iterable<Class<?>> classes) throws IOException { createBundle(target, classes, ImmutableList.<URI>of()); + public void createBundle(Location target, Iterable<Class<?>> classes, boolean skipTwillClasses) throws IOException { End diff – Why need to introduce this specific boolean? Can't we use the constructor that takes `ClassAcceptor`?
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user gsps1 opened a pull request:

          https://github.com/apache/incubator-twill/pull/66

          TWILL-63, combining twill classes into a twill.jar and having other u…

          …ser classes into a program.jar, also caching the twill.jar in YarnTwillRunnerService and the content is reused across TwillApplications started by the runner

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/gsps1/incubator-twill feature/twill-63-separating-twill-jars

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/incubator-twill/pull/66.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #66


          commit 5d9bc9ff2df636f6881e4a091d0798cbca09254d
          Author: shankar <shankar@cask.co>
          Date: 2015-10-07T21:59:35Z

          TWILL-63, combining twill classes into a twill.jar and having other user classes into a program.jar, also caching the twill.jar in YarnTwillRunnerService and the content is reused across TwillApplications started by the runner


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user gsps1 opened a pull request: https://github.com/apache/incubator-twill/pull/66 TWILL-63 , combining twill classes into a twill.jar and having other u… …ser classes into a program.jar, also caching the twill.jar in YarnTwillRunnerService and the content is reused across TwillApplications started by the runner You can merge this pull request into a Git repository by running: $ git pull https://github.com/gsps1/incubator-twill feature/twill-63-separating-twill-jars Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-twill/pull/66.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #66 commit 5d9bc9ff2df636f6881e4a091d0798cbca09254d Author: shankar <shankar@cask.co> Date: 2015-10-07T21:59:35Z TWILL-63 , combining twill classes into a twill.jar and having other user classes into a program.jar, also caching the twill.jar in YarnTwillRunnerService and the content is reused across TwillApplications started by the runner

            People

            • Assignee:
              chtyim Terence Yim
              Reporter:
              chtyim Terence Yim
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development