Spark / SPARK-32515

Distinct Function Weird Bug

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.4.6
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: Windows 10 and Mac; both show the same issue.

      Scala version 2.11.12

      Python 3.6.10

      Java version "1.8.0_261"

    Description

      A weird Spark display and counting error. When I load my CSV file into Spark and try to check all the distinct values of a column in a DataFrame, everything I try in Spark gives a wrong answer. But if I convert my Spark DataFrame into a pandas DataFrame, it works. Please help. This bug only happens with this one CSV file; all my other CSV files work properly. Here are the pictures.
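      The issue records only the resolution "Not A Problem", not a root cause. One possible explanation, offered here purely as an assumption, is that this CSV file contains quoted fields with embedded newlines. Spark's CSV reader treats each physical line as a record unless the `multiLine` option is enabled, while pandas handles quoted newlines by default. The sketch below uses only the Python standard library, with hypothetical sample data and helper names, to show how the two parsing strategies can disagree on a column's distinct values.

```python
import csv
import io

# Hypothetical sample data (not from the reporter's file): the second record
# has a quoted "comment" field that contains an embedded newline.
SAMPLE = 'id,comment,team\n1,"fine",LAL\n2,"line one\nline two",BOS\n3,"ok",LAL\n'


def distinct_line_based(text, column):
    """Distinct values when every physical line is treated as one record."""
    lines = text.splitlines()
    header = lines[0].split(",")
    idx = header.index(column)
    values = set()
    for line in lines[1:]:
        fields = line.split(",")
        # Rows that are too short yield None, similar to a permissive
        # parser padding missing columns with nulls.
        values.add(fields[idx] if idx < len(fields) else None)
    return values


def distinct_quote_aware(text, column):
    """Distinct values when quoted newlines are kept inside their field."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[column] for row in reader}


line_based = distinct_line_based(SAMPLE, "team")
quote_aware = distinct_quote_aware(SAMPLE, "team")

print("line-based distinct values: ", line_based)
print("quote-aware distinct values:", quote_aware)
```

      If this assumption holds for the file in question, reading it in Spark with the option enabled, for example `spark.read.option("header", True).option("multiLine", True).csv(path)` with `path` as a placeholder, would be the first thing to try.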

       

      Attachments

        1. unknown2.png (82 kB, Jayce Jiang)
        2. unknown1.png (128 kB, Jayce Jiang)
        3. unknown.png (234 kB, Jayce Jiang)
        4. Capture.PNG (19 kB, Jayce Jiang)
        5. Capture1.png (7 kB, Jayce Jiang)
        6. Capture2.PNG (24 kB, Jayce Jiang)
        7. image-2020-08-03-07-03-55-716.png (37 kB, JinxinTang)
        8. Screen_Shot_2020-08-05_at_2.46.42_PM.png (87 kB, Jayce Jiang)

          People

            Assignee: Unassigned
            Reporter: Jayce Jiang (tigaiii123)
            Votes: 0
            Watchers: 3
