Spark / SPARK-1184

Update the distribution tar.gz to include spark-assembly jar

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.9.0
    • Component/s: Build
    • Labels: None

    Description

      This JIRA tracks 3 things:
      1. Something in our assembly generation logic is producing two assembly jars instead of one.
      Something like:

      spark-assembly_2.10-1.0.0-SNAPSHOT.jar

      and

      spark-assembly_2.10-1.0.0-SNAPSHOT-hadoop2.0.5-alpha.jar

      The former is bogus: it doesn't contain any class files and should be removed. The latter contains all the good stuff; it is essentially the uber jar generated by the maven-shade-plugin.
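A quick way to tell the two jars apart is to count the `.class` entries in each. The following is a minimal sketch, not part of the Spark build: the helper name `count_class_files` is hypothetical, and it relies only on the fact that a jar is a zip archive.

```python
import zipfile


def count_class_files(jar_path):
    """Return the number of .class entries in a jar (a jar is a zip archive)."""
    with zipfile.ZipFile(jar_path) as jar:
        return sum(1 for name in jar.namelist() if name.endswith(".class"))
```

Calling it on each jar in the build output directory should return 0 for the bogus jar, if it really contains no class files as described above, and a large count for the shaded uber jar.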

      2. The current bigtop-dist profile, which builds the maven assembly (a .tar.gz file) using the maven-assembly-plugin, includes the bogus jar and not the legit spark-assembly jar. We should remove the bogus jar from this assembly (which would happen once we fix #1) and put the legit uber jar in it.

      3. Also, the bigtop-dist profile is meant to exclude the Hadoop-related jars from the distribution. It does so for the org.apache.hadoop jars but misses the avro and zookeeper jars, which are also provided by the Hadoop installation.
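To check a built distribution against points 2 and 3, one could list the jars inside the .tar.gz and flag the relevant ones. This is only a sketch under stated assumptions: `audit_distribution` and `PROVIDED_PREFIXES` are hypothetical names, and the file-name prefixes are guesses based on the description above rather than the actual artifact names in a given build.

```python
import posixpath
import tarfile

# Assumed file-name prefixes of jars that the Hadoop installation provides.
# These follow the issue description; real artifact names may differ.
PROVIDED_PREFIXES = ("hadoop-", "avro-", "zookeeper-")


def audit_distribution(tarball_path):
    """Return (assembly_jars, provided_jars) found in a distribution tar.gz."""
    assembly_jars, provided_jars = [], []
    with tarfile.open(tarball_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Tar member names always use forward slashes.
            name = posixpath.basename(member.name)
            if not (member.isfile() and name.endswith(".jar")):
                continue
            if name.startswith("spark-assembly"):
                assembly_jars.append(member.name)
            elif name.startswith(PROVIDED_PREFIXES):
                provided_jars.append(member.name)
    return assembly_jars, provided_jars
```

With the fixes described in this issue, the first list would be expected to contain only the uber jar and the second list would be expected to be empty.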

          People

            Assignee: Mark Grover (mgrover)
            Reporter: Mark Grover (mgrover)
            Votes: 0
            Watchers: 2
