[HIVE-24911] Metastore: Create index on SDS.CD_ID for Postgres - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 4.0.0-alpha-1
Component/s: None
Labels:
- pull-request-available

Description

While investigating HIVE-24870, we found that during a long incremental replication, an index on SDS.CD_ID can improve performance.
It was tested on Postgres as follows:

CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID");
EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
DROP INDEX IF EXISTS "SDS_N50";
EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;

Further results can be found in: command-output.txt
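
The placeholder THE_MOST_FREQUENTLY_USED_CD_ID_HERE in the test above has to be replaced with a real value. One way to pick it, as a minimal sketch assuming the same Postgres metastore schema (the `sds_rows` alias is only illustrative and not part of the original test), is to group "SDS" by "CD_ID" and take the largest group:

```sql
-- Sketch: find the CD_ID that is referenced by the most rows in "SDS".
-- Assumes the quoted identifiers used in the test above;
-- sds_rows is a hypothetical alias introduced for this example.
SELECT "CD_ID", count(*) AS sds_rows
FROM "SDS"
WHERE "CD_ID" IS NOT NULL
GROUP BY "CD_ID"
ORDER BY sds_rows DESC
LIMIT 1;
```

Using the most frequent value gives a pessimistic case for the index, since the lookup has to return the largest number of matching rows.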

After some investigation, I found that this index has been part of the schemas for the other databases for a very long time:
Oracle: ~~HIVE-2928~~
MySQL: ~~HIVE-2246~~
MSSQL: ~~HIVE-6862~~ (or earlier)

...except Postgres.

Attachments

- command-output.txt, 19/Mar/21 13:42, 3 kB, László Bodor

Issue Links

is related to

- HIVE-6862 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server (Closed)
- HIVE-2928 Support for Oracle-backed Hive-Metastore ("longvarchar" to "clob" in package.jdo) (Closed)
- HIVE-2246 Dedupe tables' column schemas from partitions in the metastore db (Closed)

links to

GitHub Pull Request #2090

People

Assignee: László Bodor
Reporter: László Bodor
Votes: 0
Watchers: 1

Dates

Created: 19/Mar/21 13:39
Updated: 17/Nov/22 08:47
Resolved: 20/May/21 14:31

Time Tracking

Estimated: Not Specified
Remaining:
Logged: 40m