Apache Drill / DRILL-4706

Fragment planning causes Drillbits to read remote chunks when local copies are available

Details

    Description

      When a table (data size: 70 GB) of 160 Parquet files (each containing a single row group and fitting within one chunk) is stored on a 10-node cluster with replication=3, a pure data-scan query reads about 2% of the data remotely.
      Even after the metadata cache has been created, the planner selects a sub-optimal assignment of SCAN fragments, so some of the data is served from a remote server although a local copy is available.
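
The scenario above can be explored with a toy model. This is a minimal sketch, not Drill's planner: it assumes random replica placement, treats each file as one unit of scan work, and uses a greedy local-first policy with a per-node cap. All function names are hypothetical, and `cap * len(nodes)` must be at least the number of files. It only shows how the remote-read fraction of a given assignment can be measured; it makes no claim about why Drill produces the 2% reported here.

```python
import random


def place_replicas(num_files, nodes, replication, seed=0):
    """Toy stand-in for replica placement: put each file's copies
    on `replication` distinct nodes chosen at random."""
    rng = random.Random(seed)
    return {f: set(rng.sample(nodes, replication)) for f in range(num_files)}


def assign_local_first(replicas, nodes, cap):
    """Greedy assignment (an assumption, not Drill's algorithm).

    Each file goes to the least-loaded node that holds a replica and
    still has room. If every replica holder is full, the file goes to
    the least-loaded node overall, which forces a remote read.
    """
    load = {n: 0 for n in nodes}
    assignment = {}
    for f, hosts in replicas.items():
        local = [n for n in hosts if load[n] < cap]
        pool = local or [n for n in nodes if load[n] < cap]
        chosen = min(pool, key=lambda n: (load[n], n))
        assignment[f] = chosen
        load[chosen] += 1
    return assignment


def remote_fraction(assignment, replicas):
    """Fraction of files scanned on a node that holds no replica of them."""
    remote = sum(1 for f, n in assignment.items() if n not in replicas[f])
    return remote / len(assignment)


# Same shape as the report: 160 files, 10 nodes, replication=3,
# and an even split of 16 files per node.
nodes = [f"node{i}" for i in range(10)]
replicas = place_replicas(160, nodes, 3)
plan = assign_local_first(replicas, nodes, cap=16)
print(f"remote reads: {remote_fraction(plan, replicas):.1%}")
```

Comparing the greedy result with an assignment that is known to be fully local (for example, one found with a bipartite matching) would show whether the remote reads are unavoidable under the cap or an artifact of the assignment policy.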

          People

            Assignee: Unassigned
            Reporter: Kunal Khatua (kkhatua)
            Votes: 0
            Watchers: 5
