SPARK-5752 (sub-task of SPARK-5166: Stabilize Spark SQL APIs)

Don't implicitly convert RDDs directly to DataFrames


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: SQL
    • Labels: None

      Description

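      DataFrame is a rich API with a large number of functions, so implicitly converting an RDD directly to a DataFrame makes all of those functions available on the RDD and invites ambiguous implicit conversions. It would be safer to implicitly convert RDDs to a DataFrameHolder that exposes only two functions: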

      - toDataFrame()
      - toDataFrame(String*)
      

      This makes ambiguous implicit conversions highly unlikely, at the cost of requiring users to always call toDataFrame before they can use DataFrame functions.
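
      A minimal sketch of the holder pattern described above, in plain Scala. SimpleRDD, SimpleDataFrame, Implicits and HolderDemo are hypothetical stand-ins invented for illustration; they are not Spark's classes, and the actual change may differ in naming and detail. The point is only that the implicit conversion targets a type with two methods, so nothing else leaks onto the RDD.

```scala
import scala.language.implicitConversions

// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
final case class SimpleRDD[A](rows: Seq[A])

final case class SimpleDataFrame(columns: Seq[String], rows: Seq[Seq[Any]]) {
  // Stands in for one of the many functions a rich DataFrame API exposes.
  def select(names: String*): SimpleDataFrame = {
    val indices = names.map { n =>
      val i = columns.indexOf(n)
      require(i >= 0, s"unknown column: $n")
      i
    }
    SimpleDataFrame(names, rows.map(row => indices.map(i => row(i))))
  }
}

// The holder exposes only the two conversion functions from the description.
final class DataFrameHolder(df: SimpleDataFrame) {
  def toDataFrame(): SimpleDataFrame = df

  def toDataFrame(colNames: String*): SimpleDataFrame = {
    require(colNames.length == df.columns.length, "column count mismatch")
    df.copy(columns = colNames)
  }
}

object Implicits {
  // The only implicit conversion: RDD of products -> holder, never RDD -> DataFrame.
  implicit def rddToDataFrameHolder[A <: Product](rdd: SimpleRDD[A]): DataFrameHolder = {
    val rows = rdd.rows.map(_.productIterator.toList)
    val arity = rows.headOption.map(_.length).getOrElse(0)
    // Default column names (_1, _2, ...) are an assumption of this sketch.
    val defaultNames = (1 to arity).map(i => s"_$i")
    new DataFrameHolder(SimpleDataFrame(defaultNames, rows))
  }
}

object HolderDemo {
  import Implicits._

  def run(): SimpleDataFrame = {
    val people = SimpleRDD(Seq(("alice", 30), ("bob", 25)))
    // DataFrame functions are reachable only after the explicit toDataFrame call.
    people.toDataFrame("name", "age").select("name")
  }
}
```

      In this sketch, calling people.select("name") directly does not compile, because the only implicit view on SimpleRDD leads to DataFrameHolder, which has no select method.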

            People

            • Assignee: Reynold Xin (rxin)
            • Reporter: Reynold Xin (rxin)
            • Votes: 0
            • Watchers: 3