Spark / SPARK-13947

The error message from using an invalid table reference is not clear


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      import numpy as np
      import pandas as pd
      
      df = pd.DataFrame({'foo': np.random.randn(1000),
                         'bar': np.random.randn(1000)})
      
      df2 = pd.DataFrame({'foo': np.random.randn(1000),
                          'bar': np.random.randn(1000)})
      
      
      sdf = sqlContext.createDataFrame(df)
      sdf2 = sqlContext.createDataFrame(df2)
      
      # Filter sdf with a condition built from a column of sdf2 (the invalid reference)
      sdf[sdf2.foo > 0]
      

      Produces this error message:

      AnalysisException: u'resolved attribute(s) foo#91 missing from bar#87,foo#88 in operator !Filter (foo#91 > cast(0 as double));'
      

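      The message could make it clearer what the user did wrong: the filter condition references a column of sdf2 (foo#91), which is not among the columns of sdf (bar#87, foo#88).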
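To illustrate what the message is reporting, here is a minimal sketch in plain Python. It is a toy model, not Spark's analyzer: `ToyFrame`, `Attribute`, and `check_filter` are hypothetical names introduced only for this example, and the hint appended to the error is just one possible wording. The sketch assumes that each column is identified by a name plus a unique ID (the `#NN` suffix in the message), so two DataFrames with identically named columns still hold different attributes. In the reproduction above, the call that was presumably intended is `sdf[sdf.foo > 0]`.

```python
from dataclasses import dataclass
from itertools import count

# Arbitrary starting point, chosen only to resemble the IDs in the report.
_next_id = count(87)


@dataclass(frozen=True)
class Attribute:
    """Hypothetical model of a column reference: a name plus a unique ID."""
    name: str
    expr_id: int

    def __str__(self):
        return f"{self.name}#{self.expr_id}"


class ToyFrame:
    """Stand-in for a DataFrame that only tracks its output attributes."""

    def __init__(self, label, column_names):
        self.label = label
        self.output = [Attribute(n, next(_next_id)) for n in column_names]

    def col(self, name):
        # Return this frame's attribute with the given name.
        return next(a for a in self.output if a.name == name)

    def check_filter(self, referenced):
        """Raise if the condition references attributes this frame does not produce."""
        missing = [a for a in referenced if a not in self.output]
        if not missing:
            return
        msg = (
            f"resolved attribute(s) {','.join(map(str, missing))} missing from "
            f"{','.join(map(str, self.output))} in filter on {self.label}"
        )
        # Same name, different ID: the condition probably came from another frame.
        same_name = sorted({a.name for a in missing} & {a.name for a in self.output})
        if same_name:
            msg += (
                f"; column(s) named {', '.join(same_name)} exist here under a "
                "different ID, so the condition may reference a different DataFrame"
            )
        raise ValueError(msg)


sdf = ToyFrame("sdf", ["bar", "foo"])
sdf2 = ToyFrame("sdf2", ["bar", "foo"])

sdf.check_filter([sdf.col("foo")])  # accepted: the attribute belongs to sdf

try:
    sdf.check_filter([sdf2.col("foo")])  # mirrors sdf[sdf2.foo > 0]
except ValueError as err:
    print(err)
```

The name-collision hint is the design point here: when a missing attribute shares a name with one the frame does produce, the likeliest cause is a column taken from a different DataFrame, and saying so points the user at the mistake directly.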

          People

            Assignee: Ruben Berenguel (RBerenguel)
            Reporter: Wes McKinney (wesm)