SPARK-19701

The `in` operator in PySpark is broken

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: PySpark
    • Labels: None

    Description

      >>> textFile = spark.read.text("/Users/cloud/dev/spark/README.md")
      >>> linesWithSpark = textFile.filter("Spark" in textFile.value)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/Users/cloud/product/spark/python/pyspark/sql/column.py", line 426, in __nonzero__
          raise ValueError("Cannot convert column into bool: please use '&' for 'and', '|' for 'or', "
      ValueError: Cannot convert column into bool: please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.
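
The traceback above ends in `__nonzero__`, which suggests the failure comes from how Python evaluates `in`: the operator calls `__contains__` on the right-hand operand and then coerces whatever comes back to a boolean. A minimal sketch of that mechanism follows. `ToyColumn` is a hypothetical stand-in written for illustration only; it is not PySpark's `Column` class, and its error message is shortened.

```python
class ToyColumn:
    """Hypothetical lazy column expression (an assumption, not PySpark's Column)."""

    def __init__(self, expr):
        self.expr = expr

    def contains(self, other):
        # Build a new lazy expression instead of evaluating anything.
        return ToyColumn("contains({}, {!r})".format(self.expr, other))

    # Routing `in` to the same expression builder is what makes it misbehave.
    __contains__ = contains

    def __bool__(self):
        # A lazy expression has no truth value until it is evaluated on data.
        raise ValueError("Cannot convert column into bool")

    __nonzero__ = __bool__  # Python 2 spelling, as seen in the traceback


value = ToyColumn("value")

# Explicit method call: returns an expression object a filter could consume.
predicate = value.contains("Spark")

# `in` calls __contains__ and then coerces the result to bool, which raises.
try:
    "Spark" in value
    in_error = None
except ValueError as exc:
    in_error = str(exc)
```

In PySpark itself, the explicit form of the filter from the report would likely be `textFile.filter(textFile.value.contains("Spark"))`, which uses the existing `Column.contains` method and never asks Python for a boolean.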
      

          People

            Assignee: Hyukjin Kwon (hyukjin.kwon)
            Reporter: Wenchen Fan (cloud_fan)
            Votes: 0
            Watchers: 3