  1. Spark
  2. SPARK-17861 Store data source partitions in metastore and push partition pruning into metastore
  3. SPARK-18661

Creating a partitioned datasource table should not scan all files for table

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      In 2.1, creating a partitioned datasource table does not populate the partition data by default (that happens only once the user issues MSCK REPAIR TABLE), yet it seems we still scan the filesystem at creation time for no good reason.

      We should avoid doing this when the user specifies a schema.
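
      For illustration, the scenario described above can be sketched as the two SQL statements a user would issue: a CREATE TABLE with an explicit schema, followed by MSCK REPAIR TABLE to register partitions. This is only a sketch, not code from the fix. The helper functions, the table name `events`, its columns, and the path `/data/events` are all hypothetical. The statement layout assumes the Spark 2.1 datasource syntax (USING, then OPTIONS, then PARTITIONED BY).

```python
def create_partitioned_table_sql(table, columns, partition_cols, path, fmt="parquet"):
    """Build a CREATE TABLE statement that carries an explicit schema.

    columns: list of (name, type) pairs, partition columns included.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    parts = ", ".join(partition_cols)
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"USING {fmt} "
        f"OPTIONS (path '{path}') "
        f"PARTITIONED BY ({parts})"
    )


def repair_table_sql(table):
    """Build the statement that registers partitions found under the table path."""
    return f"MSCK REPAIR TABLE {table}"


# Hypothetical table, columns, and path, for illustration only.
statements = [
    create_partitioned_table_sql(
        "events", [("id", "BIGINT"), ("day", "STRING")], ["day"], "/data/events"
    ),
    repair_table_sql("events"),
]

for stmt in statements:
    print(stmt)
    # With a live SparkSession named `spark`, each would run as: spark.sql(stmt)
```

      Because the schema and partition columns are given up front in the first statement, nothing needs to be inferred from the files at creation time. Partition discovery is deferred to the second statement.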

            People

            • Assignee:
              Eric Liang (ekhliang)
            • Reporter:
              Eric Liang (ekhliang)
            • Votes:
              0
            • Watchers:
              2