Spark / SPARK-10533

DataFrame filter is not handling float/double with Scientific Notation 'e' / 'E'


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.6.0
    • Component/s: SQL

    Description

      In the DataFrame filter operation, a float/double comparison constant written in scientific notation (for example 2.0e2) is not converted to the expected value (200.0 in this case).
      For example, with the constant 2.0e1 (expected value 20.0):

      val df = sqlContext.createDataFrame(Seq(("a",1.0),("b",2.0),("c",3.0)))
      df.filter("_2 < 2.0e1").show()
      
      +--+---+
      |_1| _2|
      +--+---+
      | a|1.0|
      +--+---+ 
      

      The filter should return all three records from the DataFrame, but it returns only the record whose value is less than 2.0.
      It appears to compare against the mantissa/coefficient only and to ignore the exponent.
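
The two interpretations of the literal can be illustrated without Spark. The sketch below is plain Scala and does not reproduce Spark's expression parser; `mantissaOnly` is a hypothetical helper that only mimics the suspected behaviour (dropping everything from the exponent marker onward), so the difference in the filtered rows is easy to see.

```scala
object ScientificLiteralDemo {
  // Hypothetical helper: keeps only the text before 'e' / 'E',
  // mimicking the suspected "mantissa only" interpretation.
  def mantissaOnly(literal: String): Double =
    literal.takeWhile(c => c != 'e' && c != 'E').toDouble

  // Full interpretation: mantissa scaled by the power of ten.
  def fullValue(literal: String): Double = literal.toDouble

  // Same rows as in the report.
  val rows: Seq[(String, Double)] = Seq(("a", 1.0), ("b", 2.0), ("c", 3.0))

  // Keep the rows whose second field is strictly below the threshold.
  def below(threshold: Double): Seq[(String, Double)] =
    rows.filter(_._2 < threshold)

  def main(args: Array[String]): Unit = {
    // Threshold read as 2.0: only row "a" passes.
    println(below(mantissaOnly("2.0e1")))
    // Threshold read as 20.0: all three rows pass.
    println(below(fullValue("2.0e1")))
  }
}
```

The first call matches the output observed from `df.filter`, the second matches the output expected by the reporter.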

      On the other hand, the same predicate run through sqlContext.sql handles the literal correctly and gives the desired output.

      df.registerTempTable("df")
      sqlContext.sql("select * from df where `_2` < 2.0e1").show()
      
      +--+---+
      |_1| _2|
      +--+---+
      | a|1.0|
      | b|2.0|
      | c|3.0|
      +--+---+ 
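
On affected versions, one possible way to sidestep the string predicate is to build the condition with the Column API, so that the Scala compiler rather than Spark's expression parser reads the literal. This is a minimal sketch, assuming a Spark 1.x dependency on the classpath and a local master; the object name `FilterWorkaround` and the helper `belowThreshold` are hypothetical and not part of the report.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object FilterWorkaround {
  // Hypothetical helper: the threshold is already a Scala Double,
  // so no SQL text has to be parsed for the comparison constant.
  def belowThreshold(df: DataFrame, threshold: Double): DataFrame =
    df.filter(df("_2") < threshold)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode is enough for this small reproduction.
    val conf = new SparkConf().setAppName("SPARK-10533-workaround").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    val df = sqlContext.createDataFrame(Seq(("a", 1.0), ("b", 2.0), ("c", 3.0)))

    // 2.0e1 is compiled by Scala as the Double 20.0 before Spark sees it.
    belowThreshold(df, 2.0e1).show()

    sc.stop()
  }
}
```

Registering a temp table and using `sqlContext.sql`, as shown above, remains the alternative when the predicate has to stay in string form.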
      


          People

            Assignee: Adrian Wang (adrian-wang)
            Reporter: Rishabh Bhardwaj (rishabhbhardwaj)
            Votes: 0
            Watchers: 6
