SPARK-5166: Stabilize Spark SQL APIs

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: SQL
    • Labels:
    • Target Version/s:

      Description

      Before we take Spark SQL out of alpha, we need to audit the APIs and stabilize them.

      As a general rule, everything under org.apache.spark.sql.catalyst should not be exposed.
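The general rule above could be checked mechanically during an API audit. The following is only a minimal sketch, assuming the audit starts from a plain list of fully qualified names that are publicly visible. The helper names (`is_internal`, `find_exposed_internals`) and the sample input are hypothetical and are not part of Spark or of this ticket.

```python
# Hypothetical audit helper: flags names that live under the internal
# catalyst package, which the description says should not be exposed.
INTERNAL_PREFIX = "org.apache.spark.sql.catalyst"


def is_internal(qualified_name: str) -> bool:
    """Return True if the name is the catalyst package itself or lives beneath it."""
    # Appending "." avoids false matches on sibling packages that merely
    # share the same leading characters (e.g. "...sql.catalystlike").
    return (qualified_name == INTERNAL_PREFIX
            or qualified_name.startswith(INTERNAL_PREFIX + "."))


def find_exposed_internals(public_names):
    """Return, sorted, the public names that violate the rule."""
    return sorted(name for name in public_names if is_internal(name))


# Illustrative input only; this is not an actual listing of Spark's public API.
sample_public_names = [
    "org.apache.spark.sql.SQLContext",
    "org.apache.spark.sql.catalyst.expressions.Row",
    "org.apache.spark.sql.catalystlike.Example",
]
violations = find_exposed_internals(sample_public_names)
```

In this sketch only the name under the `catalyst` package is reported. How the list of public names is produced (for example from generated API docs) is left open here.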

          Issue Links

          1. Stabilize Spark SQL data type API (Sub-task, Resolved, Reynold Xin)
          2. Move Row into sql package and make it usable for Java (Sub-task, Resolved, Reynold Xin)
          3. Adding data frame APIs to SchemaRDD (Sub-task, Resolved, Reynold Xin)
          4. Make SQLConf a field rather than mixin in SQLContext (Sub-task, Resolved, Reynold Xin)
          5. Native Date type for SQL92 Date (Sub-task, Resolved, Adrian Wang)
          6. [SQL] Public API in SQLContext to list tables (Sub-task, Resolved, Bill Bejeck)
          7. Make Spark SQL API usable in Java and remove the Java-specific API (Sub-task, Resolved, Reynold Xin)
          8. Move Decimal from types.decimal to types package (Sub-task, Resolved, Adrian Wang)
          9. Enable javadoc/scaladoc for public classes in catalyst project (Sub-task, Resolved, Michael Armbrust)
          10. Determine serializability of SQLContext (Sub-task, Resolved, Reynold Xin)
          11. Clean up exposed classes in sql.hive package (Sub-task, Resolved, Reynold Xin)
          12. Stabilize UDFRegistration API (Sub-task, Resolved, Reynold Xin)
          13. Use java.math.BigDecimal as the exposed Decimal type (Sub-task, Resolved, Reynold Xin)
          14. Update SQL programming guide for 1.3 (Sub-task, Resolved, Unassigned)
          15. Row shouldn't extend Seq (Sub-task, Resolved, Reynold Xin)
          16. Add defaultSizeOf to every data type (Sub-task, Resolved, Yin Huai)
          17. Cross-language load/store functions for creating and saving DataFrames (Sub-task, Resolved, Yin Huai)
          18. Make sure DataFrame expressions are usable in Java (Sub-task, Resolved, Reynold Xin)
          19. Replace reference to SchemaRDD with DataFrame (Sub-task, Resolved, Reynold Xin)
          20. Make CacheManager a concrete class and field in SQLContext (Sub-task, Resolved, Reynold Xin)
          21. Remove Python LocalHiveContext (Sub-task, Resolved, Reynold Xin)
          22. Break sql.py into multiple files (Sub-task, Resolved, Davies Liu)
          23. SQLContext.createDataFrame shouldn't be an implicit function (Sub-task, Closed, Reynold Xin)
          24. collect should call executeCollect (Sub-task, Resolved, Reynold Xin)
          25. Error messages for plans with invalid AttributeReferences (Sub-task, Resolved, Michael Armbrust)
          26. Create type alias for SchemaRDD for source backward compatibility (Sub-task, Resolved, Reynold Xin)
          27. Support explode in DataFrame DSL (Sub-task, Resolved, Michael Armbrust)
          28. Add more tests and docs for DataFrame Python API (Sub-task, Resolved, Davies Liu)
          29. Create a convenient way for Python users to register SQL UDFs (Sub-task, Resolved, Davies Liu)
          30. Provide a convenient way for Scala users to use UDFs (Sub-task, Resolved, Reynold Xin)
          31. Provide support for project using SQL expression (Sub-task, Resolved, Reynold Xin)
          32. support select/filter by SQL expression for Python DataFrame (Sub-task, Resolved, Davies Liu)
          33. Better support for creating DataFrame from local data collection (Sub-task, Resolved, Reynold Xin)
          34. Allow using String to specify column name in DSL aggregate functions (Sub-task, Resolved, Reynold Xin)
          35. Move DataFrame implicit functions into SQLContext.implicits (Sub-task, Resolved, Reynold Xin)
          36. Add a config flag to disable eager analysis of DataFrames (Sub-task, Resolved, Reynold Xin)
          37. Support DataFrame.renameColumn (Sub-task, Resolved, Reynold Xin)
          38. Add a show method to print the content of a DataFrame in columnar format (Sub-task, Resolved, Reynold Xin)
          39. XyzType companion object should subclass XyzType (Sub-task, Resolved, Reynold Xin)
          40. Python DataFrame API remaining tasks (Sub-task, Resolved, Davies Liu)
          41. DataFrame.to_pandas (Sub-task, Resolved, Davies Liu)
          42. Allow short names for built-in data sources (Sub-task, Resolved, Reynold Xin)
          43. createDataFrame replace applySchema/inferSchema (Sub-task, Resolved, Davies Liu)
          44. Allow creating a DataFrame from local Python data (Sub-task, Resolved, Davies Liu)
          45. Don't implicitly convert RDDs directly to DataFrames (Sub-task, Resolved, Reynold Xin)
          46. Schema support in Row (Sub-task, Resolved, Reynold Xin)
          47. fix Data Frame Python API (Sub-task, Resolved, Davies Liu)
          48. DataFrame methods with varargs do not work in Java (Sub-task, Resolved, Reynold Xin)
          49. sortBy -> orderBy in Python (Sub-task, Resolved, Reynold Xin)
          50. Python DataFrame documentation fixes (Sub-task, Resolved, Davies Liu)

              People

              • Assignee: Reynold Xin (rxin)
              • Reporter: Reynold Xin (rxin)
              • Votes: 0
              • Watchers: 7
