Spark / SPARK-13664 Simplify and Speedup HadoopFSRelation / SPARK-14369

Implement preferredLocations() for FileScanRDD


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels: None

    Description

      Implement FileScanRDD.preferredLocations() to add locality support for HadoopFsRelation-based data sources.

      We should avoid the extra RPC cost of fetching block locations for S3, which doesn't provide valid locality information.
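
      The idea above can be illustrated with a minimal sketch, written in Python rather than Spark's Scala and not taken from the Spark code base. It assumes each file slice in a scan partition already carries the hosts that store it, and ranks hosts by how many bytes of the partition they hold. The names FileSlice, preferred_locations, hosts_for, and NO_LOCALITY_SCHEMES are hypothetical, as are the choice of three hosts, the "localhost" placeholder filter, and the list of S3 URI schemes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple
from urllib.parse import urlparse

# Assumption: URI schemes whose block locations carry no real locality.
NO_LOCALITY_SCHEMES = {"s3", "s3a", "s3n"}


@dataclass(frozen=True)
class FileSlice:
    """A byte range of one file assigned to a scan partition (hypothetical type)."""
    path: str
    length: int
    hosts: Tuple[str, ...]  # hosts storing this slice; empty if unknown


def hosts_for(path: str, fetch_hosts: Callable[[str], Iterable[str]]) -> Tuple[str, ...]:
    """Look up hosts for a path, skipping the lookup for stores without locality."""
    if urlparse(path).scheme in NO_LOCALITY_SCHEMES:
        # No lookup is issued at all, so no block-location RPC is paid.
        return ()
    return tuple(fetch_hosts(path))


def preferred_locations(slices: List[FileSlice], max_hosts: int = 3) -> List[str]:
    """Return up to max_hosts hosts, ranked by bytes of the partition they hold."""
    bytes_per_host: Dict[str, int] = defaultdict(int)
    for s in slices:
        for host in s.hosts:
            # Assumption: "localhost" is a placeholder, not a real location.
            if host != "localhost":
                bytes_per_host[host] += s.length
    # Most bytes first; ties broken by host name so the result is deterministic.
    ranked = sorted(bytes_per_host.items(), key=lambda kv: (-kv[1], kv[0]))
    return [host for host, _ in ranked[:max_hosts]]
```

      Under these assumptions, a partition holding a 100-byte slice on hosts h1 and h2 and a 50-byte slice on hosts h2 and h3 would prefer h2 first, then h1, then h3. A partition whose slices come only from S3 would report no preferred hosts, leaving the scheduler free to place the task anywhere.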


          People

            Assignee: Cheng Lian
            Reporter: Cheng Lian
            Votes: 0
            Watchers: 3
