  1. Spark
  2. SPARK-4348

pyspark.mllib.random conflicts with random module


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.1.0, 1.2.0
    • Fix Version/s: 1.1.0, 1.2.0
    • Component/s: MLlib, PySpark
    • Labels:
      None
    • Target Version/s:

      Description

      There are conflicts in two cases:

      1. The random module is used by pyspark.mllib.feature. If the first entry of sys.path is not '', the hack in pyspark/__init__.py fails to fix the conflict.

      2. When running the tests in mllib/xxx.py, the '' entry should be popped from sys.path before anything is imported, or the tests will fail.
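
The shadowing behind both cases can be reproduced outside Spark. The following is a minimal sketch, assuming a temporary directory stands in for the pyspark/mllib directory; the helper `resolve_random_origin` and the `SHADOW` marker are hypothetical names used only for illustration. It shows that when a directory containing a `random.py` comes first on `sys.path`, `import random` resolves to that file rather than to the standard library.

```python
import importlib
import os
import sys
import tempfile


def resolve_random_origin(first_path_entry):
    """Return the file that `import random` resolves to when
    `first_path_entry` is placed at the front of sys.path."""
    saved_path = list(sys.path)
    # Forget any cached copy so the import system searches sys.path again.
    saved_module = sys.modules.pop("random", None)
    try:
        sys.path.insert(0, first_path_entry)
        importlib.invalidate_caches()
        module = importlib.import_module("random")
        return os.path.realpath(module.__file__)
    finally:
        # Restore the interpreter state so later imports see the real module.
        sys.path[:] = saved_path
        sys.modules.pop("random", None)
        if saved_module is not None:
            sys.modules["random"] = saved_module


# Hypothetical stand-in for a package directory that ships its own random.py.
shadow_dir = tempfile.mkdtemp()
shadow_file = os.path.join(shadow_dir, "random.py")
with open(shadow_file, "w") as f:
    f.write("SHADOW = True\n")

origin = resolve_random_origin(shadow_dir)
print(origin == os.path.realpath(shadow_file))
```

Any code that runs `import random` while such a directory is first on the path receives the shadowing module, which is why the order of `sys.path` entries matters here.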

      The first case is not fully fixed for users; it still causes problems in some situations, such as:

      >>> import sys
      >>> sys.path.insert(0, PATH_OF_MODULE)
      >>> import pyspark
      >>> # using Word2Vec will fail
      

      I'd like to rename mllib/random.py to mllib/_random.py, then in mllib/__init__.py:

      import pyspark.mllib._random as random
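
A rough sketch of how the proposed rename could behave, assuming a throwaway package named `demo_mllib` in place of pyspark.mllib (the package name and the `normal_rdd` stub are hypothetical, and this is not necessarily what was merged). The module lives in `_random.py`, the package's `__init__.py` re-exports it under the name `random`, and the package directory no longer contains anything that a top-level `import random` could pick up.

```python
import importlib
import importlib.machinery
import os
import sys
import tempfile

# Build a hypothetical package on disk: demo_mllib/_random.py + __init__.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_mllib")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "_random.py"), "w") as f:
    f.write("def normal_rdd():\n    return 'stub'\n")

# The alias proposed in the issue, applied to the stand-in package.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("import demo_mllib._random as random\n")

sys.path.insert(0, root)
try:
    importlib.invalidate_caches()
    demo_mllib = importlib.import_module("demo_mllib")
finally:
    sys.path.remove(root)

# Users still reach the module through the familiar attribute name.
print(demo_mllib.random.normal_rdd())

# Searching only the package directory finds no top-level "random" module,
# so having that directory on sys.path can no longer shadow the stdlib.
shadow_spec = importlib.machinery.PathFinder.find_spec("random", [pkg_dir])
print(shadow_spec)
```

One limitation of this sketch: the alias is only an attribute on the package, so `from demo_mllib import random` works, while `import demo_mllib.random` as a submodule import would need additional handling.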
      

      cc Xiangrui Meng Doris Xin

              People

              • Assignee:
                Davies Liu (davies)
              • Reporter:
                Davies Liu (davies)
