  1. Spark
  2. SPARK-44207

Where Clause throwing Resolved attribute(s) _metadata#398 missing from ... error


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Flags: Important

    Description

      I have two DataFrames, lt and rt, with the same schema and only one row each, generated separately by our own curation logic. All columns are String, Boolean, or Timestamp. To compare them, I join the two like this:

      var joinedDF = lt.join(rt, "Id")

      After that, I compare them first by schema and then column by column, to find what percentage of rows match.

      The code is roughly like this:

      for (column <- lt.schema) {
        if (rt.columns.contains(column.name) &&
            column.dataType == rt.schema(column.name).dataType) {

          var matchCount = joinedCount
          if (column.dataType.typeName == "string") {
            matchCount = joinedDF.where(lt(column.name) <=> rt(column.name)).count
          } else
          .....

       

      The where clause on the last line throws an AnalysisException: Resolved attribute(s) _metadata#398 missing from .... I do not have a _metadata column anywhere in my DataFrames.

      I searched online, and people say this is a problem with the join. I tried renaming the columns in rt and in joinedDF; neither worked, and the same error is still thrown. Can anybody help?
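      For reference, here is a minimal, self-contained sketch of the comparison described above. Instead of looking columns up on the original DataFrames with lt(...) and rt(...), it aliases both sides and resolves every column through the alias, which is a commonly suggested workaround for attribute-resolution errors after a join. Whether it avoids this particular _metadata error is not confirmed. The object name ColumnMatchSketch, the helper matchCounts, and the sample columns name and active are hypothetical and not part of the original code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColumnMatchSketch {

  // Hypothetical helper: for every column that exists in both DataFrames
  // with the same data type, count the joined rows whose values are
  // null-safe equal (<=>).
  def matchCounts(lt: DataFrame, rt: DataFrame, key: String): Map[String, Long] = {
    // Alias both sides so columns are resolved by name against the join
    // output rather than by attribute references held by lt and rt.
    val l = lt.alias("l")
    val r = rt.alias("r")
    val joined = l.join(r, col(s"l.$key") === col(s"r.$key"))

    lt.schema.fields
      .filter(f => rt.columns.contains(f.name) &&
                   f.dataType == rt.schema(f.name).dataType)
      .map { f =>
        f.name -> joined.where(col(s"l.${f.name}") <=> col(s"r.${f.name}")).count()
      }
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("column-match-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up single-row inputs standing in for the curated DataFrames.
    val lt = Seq(("1", "a", true)).toDF("Id", "name", "active")
    val rt = Seq(("1", "b", true)).toDF("Id", "name", "active")

    matchCounts(lt, rt, "Id").foreach { case (name, n) => println(s"$name: $n") }
    spark.stop()
  }
}
```

      The sketch assumes column names without dots or other special characters; such names would need backtick quoting inside col(...).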

          People

            Assignee: Unassigned
            Reporter: huizhong xu (davidxuhz)
            Votes: 0
            Watchers: 1
