SPARK-5180: Data source API improvement (Spark 1.5)


Details

    • Type: Story
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: SQL
    • Labels: None
    • Sprint: Spark 1.5 release

    Attachments

      Issue Links

      1. Adding support for defining schema in foreign DDL commands (Sub-task, Resolved, Fei Wang)
      2. Persistent data source tables (Sub-task, Resolved, Michael Armbrust)
      3. Partitioning support for tables created by the data source API (Sub-task, Resolved, Cheng Lian)
      4. Improve the performance of metadata operations (Sub-task, Resolved, Unassigned)
      5. Document data source API (Sub-task, Resolved, Michael Armbrust)
      6. Write support for the data source API (Sub-task, Resolved, Yin Huai) - see the usage sketch after this list
      7. Python API for the write support of the data source API (Sub-task, Resolved, Yin Huai)
      8. In memory data cache should be invalidated after insert into/overwrite (Sub-task, Resolved, Yin Huai)
      9. Preinsert casting and renaming rule is needed in the Analyzer (Sub-task, Resolved, Yin Huai)
      10. Finalize DDL and write support APIs (Sub-task, Resolved, Yin Huai)
      11. Add common string filters to data sources (Sub-task, Resolved, Reynold Xin)
      12. FSBasedRelation interface tweaks (Sub-task, Resolved, Cheng Lian)
      13. Do not use FloatType in partition column inference (Sub-task, Resolved, Reynold Xin)
      14. Replace the hash map in DynamicPartitionWriterContainer.outputWriterForRow with java.util.HashMap (Sub-task, Resolved, Reynold Xin)
      15. Reduce memory consumption for dynamic partition insert (Sub-task, Resolved, Michael Armbrust)
      16. Move all internal data source related classes out of sources package (Sub-task, Resolved, Reynold Xin)
      17. Speed up path construction in DynamicPartitionWriterContainer.outputWriterForRow (Sub-task, Resolved, Cheng Lian)
      18. DataFrame partitionBy memory pressure scales extremely poorly (Sub-task, Closed, Unassigned)
      19. Conversion is applied twice on partitioned data sources (Sub-task, Resolved, Cheng Lian)
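
      The write and partitioning items above (notably sub-tasks 3, 6, and 10) surface to users through the DataFrameWriter path. The block below is a minimal sketch of that flow, not code from this issue; the local master setting and the /tmp input and output paths are illustrative assumptions.

      // Minimal sketch (assumption, not from this issue): read through a data
      // source, then write back out with dynamic partitioning.
      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.sql.{SQLContext, SaveMode}

      object DataSourceWriteSketch {
        def main(args: Array[String]): Unit = {
          val sc = new SparkContext(
            new SparkConf().setAppName("DataSourceWriteSketch").setMaster("local[*]"))
          val sqlContext = new SQLContext(sc)

          // Load a DataFrame through a built-in data source (JSON here).
          val people = sqlContext.read.format("json").load("/tmp/people.json")

          // Write through the data source write support: choose a format,
          // a save mode, and partition columns (dynamic partition insert).
          people.write
            .format("parquet")
            .mode(SaveMode.Overwrite)
            .partitionBy("age")
            .save("/tmp/people_parquet")

          sc.stop()
        }
      }

      The partitionBy/save call above exercises the code path that sub-tasks 14, 15, and 17 tune (DynamicPartitionWriterContainer.outputWriterForRow) and that sub-task 18 reports memory pressure against.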

      Activity


        People

          Assignee: Cheng Lian (lian cheng)
          Reporter: Yin Huai (yhuai)
          Votes: 0
          Watchers: 6

          Dates

            Created:
            Updated:
            Resolved:

            Agile

              Completed Sprint:
              Spark 1.5 release (ended 14/Aug/15)
