SPARK-11009

RowNumber in HiveContext returns negative values in cluster mode

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.5.1
    • Fix Version/s: 1.5.2, 1.6.0
    • Component/s: Spark Core
    • Labels: None
    • Environment: Standalone cluster mode. No Hadoop or Hive is present in the environment (no hive-site.xml); only HiveContext is used. Spark built with Hadoop 2.6.0. Default Spark configuration variables. The cluster has 4 nodes, but the issue occurs with any number of nodes.

    Description

      This issue happens when submitting the job to a standalone cluster. I have not tried YARN or Mesos. Repartitioning df into a single partition or setting the default parallelism to 1 does not fix the issue. I also tried a cluster with only one node, with the same result. Other shuffle configuration changes do not alter the results either.

      The issue does NOT happen in --master local[*].

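      import org.apache.spark.sql.expressions.Window
      import org.apache.spark.sql.functions.rowNumber

      // Number the rows within each client_id partition, ordered by date.
      val ws = Window.partitionBy("client_id").orderBy("date")

      val nm = "repeatMe"
      // select returns a new DataFrame, so keep the result; df itself is unchanged.
      val numbered = df.select(df.col("*"), rowNumber().over(ws).as(nm))

      numbered.filter(numbered(nm).isNotNull).orderBy(nm).take(50).foreach(println(_))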

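      Output (column types: Long, DateType, Int; the last column is the row number, which is negative and identical for every row of a partition):
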
      [219483904822,2006-06-01,-1863462909]
      [219483904822,2006-09-01,-1863462909]
      [219483904822,2007-01-01,-1863462909]
      [219483904822,2007-08-01,-1863462909]
      [219483904822,2007-07-01,-1863462909]
      [192489238423,2007-07-01,-1863462774]
      [192489238423,2007-02-01,-1863462774]
      [192489238423,2006-11-01,-1863462774]
      [192489238423,2006-08-01,-1863462774]
      [192489238423,2007-08-01,-1863462774]
      [192489238423,2006-09-01,-1863462774]
      [192489238423,2007-03-01,-1863462774]
      [192489238423,2006-10-01,-1863462774]
      [192489238423,2007-05-01,-1863462774]
      [192489238423,2006-06-01,-1863462774]
      [192489238423,2006-12-01,-1863462774]
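
      For comparison, a row number over this window is expected to start at 1 and increase by 1 within each client_id, in date order. The sketch below emulates that numbering with plain Scala collections on a few (client_id, date) pairs from the output above. It does not use Spark; Visit, withRowNumber and RowNumberSketch are hypothetical names introduced only for illustration, and dates are held as ISO-8601 strings so that string order matches date order.

```scala
object RowNumberSketch {
  // Hypothetical stand-in for one row of df: client_id plus an ISO-8601 date string.
  final case class Visit(clientId: Long, date: String)

  // Emulates rowNumber().over(Window.partitionBy("client_id").orderBy("date")):
  // group rows by client, sort each group by date, then number from 1.
  def withRowNumber(rows: Seq[Visit]): Seq[(Visit, Int)] =
    rows.groupBy(_.clientId).values.toSeq.flatMap { group =>
      group.sortBy(_.date).zipWithIndex.map { case (v, i) => (v, i + 1) }
    }

  def main(args: Array[String]): Unit = {
    // A few (client_id, date) pairs copied from the reported output.
    val visits = Seq(
      Visit(219483904822L, "2006-09-01"),
      Visit(219483904822L, "2006-06-01"),
      Visit(192489238423L, "2007-02-01"),
      Visit(192489238423L, "2006-08-01"),
      Visit(192489238423L, "2006-11-01")
    )
    // Print each row with its expected row number, grouped by client.
    withRowNumber(visits)
      .sortBy { case (v, n) => (v.clientId, n) }
      .foreach(println)
  }
}
```

      Every number produced this way is positive and unique within its client_id, which is the behaviour the reported output violates.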

            People

              Assignee: Davies Liu
              Reporter: Saif Addin Ellafi
              Votes: 0
              Watchers: 3
