Zeppelin / ZEPPELIN-323

Paragraph execution is very slow for remote Spark

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Core
    • Labels:
    • Environment: EC2

      Description

      I built Zeppelin from GitHub on an EC2 instance. My Spark cluster is also running on EC2.

      I configured Zeppelin to use the Spark cluster on EC2 by setting the Spark master URL in the Spark interpreter settings.
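
For readers reproducing this setup, here is a minimal sketch of pointing Zeppelin at a standalone Spark master. It assumes the environment-file route (conf/zeppelin-env.sh) rather than the interpreter settings page used in this report. The host name and install path are placeholders, and 7077 is only the default standalone master port; none of these values come from the report.

```shell
# conf/zeppelin-env.sh (sketch; all values are placeholders, not taken from this report)
# Hypothetical master host; 7077 is the default Spark standalone master port.
export MASTER="spark://spark-master.example.com:7077"
# Assumed install path of a local Spark distribution used to launch the interpreter.
export SPARK_HOME="/opt/spark"
```

Zeppelin reads this file at startup, so a restart (bin/zeppelin-daemon.sh restart) is needed before changes take effect.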

      I loaded the sample "bank" table. The SQL query below now takes at least 10 minutes to complete:

      %sql
      select age, count(1) value
      from bank
      where age < ${maxAge=30}
      group by age
      order by age

      With local[*] Spark, by contrast, the same query takes only a few seconds.

      Do I need any additional configuration in Zeppelin to make paragraphs run faster against an external Spark cluster?

        Attachments

        1. logs.zip
          680 kB
          Samuel Alexander

            People

            • Assignee: Unassigned
            • Reporter: Samuel Alexander (samalexg)
            • Votes: 1
            • Watchers: 4
