Details
- Task
- Status: Closed
- Major
- Resolution: Won't Fix
- 2.1
- None
Description
Ignite lacks a cost-based optimizer, which prevents us from building efficient execution plans. Let's start moving in this direction.
The ticket is about creating local statistics for tables. In the first phase they will not be shared between nodes, nor will they participate in query optimization. The ultimate goal of this ticket is to start gathering some info in the background and to provide the necessary internal infrastructure and APIs for that.
1. API
Let's start with a single method GridQueryProcessor.rebuildStatistics(), which will build stats for all existing tables.
Then implement the ANALYZE command [1].
2. Infrastructure
- Statistics are transient, not persisted
- We need a background worker which will rebuild them on a regular basis and replace the old statistics with the new ones using a copy-on-write approach
- Statistics are created for indexed (i.e. sorted) columns
- Sampling should be used to avoid a full table scan
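The background rebuild with copy-on-write replacement could look roughly like the sketch below. It is only an illustration: `LocalStatsHolder`, its methods, and the `collector` supplier are hypothetical names and not part of Ignite. The worker builds a fresh map off to the side and publishes it with a single reference swap, so readers always see either the old snapshot or the new one, never a partially built one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical holder for transient, node-local statistics (not an Ignite class). */
final class LocalStatsHolder<S> {
    /** Current immutable snapshot: table name -> statistics. Nothing is persisted. */
    private final AtomicReference<Map<String, S>> snapshot =
        new AtomicReference<>(Collections.<String, S>emptyMap());

    /** Single daemon thread that periodically rebuilds the statistics. */
    private final ScheduledExecutorService worker =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stats-rebuild-worker");
            t.setDaemon(true);
            return t;
        });

    /** Copy-on-write step: collect into a new map, then swap the reference. */
    void rebuild(Supplier<Map<String, S>> collector) {
        Map<String, S> fresh =
            Collections.unmodifiableMap(new HashMap<>(collector.get()));
        snapshot.set(fresh);
    }

    /** Schedules periodic rebuilds; the collector is assumed to sample the tables. */
    void start(Supplier<Map<String, S>> collector, long periodMillis) {
        worker.scheduleWithFixedDelay(() -> {
            try {
                rebuild(collector);
            }
            catch (RuntimeException e) {
                // Keep serving the previous snapshot; an uncaught exception
                // would cancel all further scheduled runs.
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Lock-free read of the latest published statistics, or null if none. */
    S statsFor(String table) {
        return snapshot.get().get(table);
    }

    void stop() {
        worker.shutdownNow();
    }
}
```

Readers only ever dereference the current snapshot, so no locking is needed on the query path; the cost is that each rebuild allocates a complete new map.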
3. Statistics types
- Height-based: the whole value range is split into N pieces so that each piece holds M/N entries, i.e. exactly M/N entries are located between boundary X and boundary X+1, where M is the number of records
One statistics type should be enough in the first iteration.
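As a rough illustration of how sampling and a height-based histogram fit together, the sketch below draws a fixed-size sample with reservoir sampling and then picks N+1 boundaries from the sorted sample. `EquiHeightSketch` and both method names are hypothetical, the column is assumed to hold `long` values, and the choice of reservoir sampling is an assumption made for the example rather than something the ticket prescribes. Because the boundaries come from a sample, each piece holds only approximately M/N of the real records.

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;
import java.util.Random;

/** Hypothetical helper, not an Ignite class. */
final class EquiHeightSketch {
    /**
     * Reservoir sampling: keeps at most k values chosen uniformly from a
     * stream whose length is not known in advance (one pass, no full copy).
     */
    static long[] reservoirSample(PrimitiveIterator.OfLong values, int k, Random rnd) {
        long[] reservoir = new long[k];
        long seen = 0;
        while (values.hasNext()) {
            long v = values.nextLong();
            if (seen < k) {
                reservoir[(int) seen] = v;
            }
            else {
                // Pick a slot in [0, seen]; the value is kept only if the slot is inside the reservoir.
                long j = (long) (rnd.nextDouble() * (seen + 1));
                if (j < k)
                    reservoir[(int) j] = v;
            }
            seen++;
        }
        // Fewer than k values were available: return only the filled part.
        return seen < k ? Arrays.copyOf(reservoir, (int) seen) : reservoir;
    }

    /**
     * Returns n + 1 boundaries taken at evenly spaced ranks of the sorted sample,
     * so that roughly sample.length / n sampled values fall into each piece.
     */
    static long[] boundaries(long[] sample, int n) {
        if (sample.length == 0 || n <= 0)
            throw new IllegalArgumentException("need a non-empty sample and n > 0");
        long[] sorted = sample.clone();
        Arrays.sort(sorted);
        long[] bounds = new long[n + 1];
        for (int i = 0; i <= n; i++) {
            int idx = (int) ((long) i * (sorted.length - 1) / n);
            bounds[i] = sorted[idx];
        }
        return bounds;
    }
}
```

For example, a sample containing the values 1 through 100 split into 4 pieces yields the boundaries 1, 25, 50, 75, 100. An optimizer could later estimate the selectivity of a range predicate by counting how many pieces the range covers.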
Issue Links
- is depended upon by: IGNITE-6089 SQL: Improve query parallelism architecture (Open)
- is duplicated by: IGNITE-3171 Support column selectivity calculation for SQL (Closed)
- relates to: IGNITE-7761 EXPLAIN with run time statistics (Closed)