Zeppelin / ZEPPELIN-323

Paragraph execution is very slow for remote Spark

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 0.6.0
    • None
    • Core
    • EC2

    Description

      I've built Zeppelin from GitHub on an EC2 instance. My Spark cluster is also running on EC2.

      I've configured Zeppelin to use the Spark cluster on EC2 by providing the Spark master URL in the Spark interpreter settings.

      I loaded the sample "bank" table. The SQL paragraph below now takes at least 10 minutes to complete:

      %sql
      select age, count(1) value
      from bank
      where age < ${maxAge=30}
      group by age
      order by age

      By contrast, when I use local[*] as the Spark master, the same paragraph takes only a few seconds.

      Do I need any additional configuration in Zeppelin to make the paragraph execute faster against an external Spark cluster?

      Attachments

        1. logs.zip
          680 kB
          Samuel Alexander

          People

            Assignee: Unassigned
            Reporter: Samuel Alexander (samalexg)
            Votes: 1
            Watchers: 4
