Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Implemented
Description
We have a >1000 node physical cluster at our disposal for a short time, before it'll be handed off to its intended use.
Loaded a bunch of data (TPC's LINEITEM table, among others) and ran a bunch of queries. Most tables are between 100G and 500G (uncompressed) and hold between 600 million and 2 billion rows.
The good news is that many things just worked. We sorted >400G in <5s with HBase and Phoenix. Scans work. Joins work (as long as one side is kept under 1 million rows or so).
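For reference, a minimal sketch of the kind of queries this covers, run through the Phoenix JDBC driver (assumes the phoenix-client jar is on the classpath; the ZooKeeper quorum, the SMALL_DIM table, and its columns are hypothetical placeholders, while the LINEITEM columns follow the TPC-H schema):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PhoenixScaleQueries {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper quorum; replace with the cluster's own.
        String url = "jdbc:phoenix:zk-host1,zk-host2,zk-host3:2181";

        try (Connection conn = DriverManager.getConnection(url)) {
            // Global sort over a large table: the kind of ORDER BY exercised above.
            try (PreparedStatement ps = conn.prepareStatement(
                     "SELECT * FROM LINEITEM ORDER BY L_SHIPDATE LIMIT 100");
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // consume rows
                }
            }

            // Join where the right-hand side is kept small. Phoenix's default
            // hash join builds that side in region-server memory, which is
            // presumably why keeping one side under ~1m rows matters here.
            try (PreparedStatement ps = conn.prepareStatement(
                     "SELECT l.L_ORDERKEY, s.NAME "
                   + "FROM LINEITEM l JOIN SMALL_DIM s ON l.L_SUPPKEY = s.SUPPKEY");
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // consume rows
                }
            }
        }
    }
}
{code}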
For the issues we observed, I'll file sub-JIRAs under this.
I'm going to write a blog post about this and attach a link here.
Sub-Tasks
1. Distinct Queries are slower than expected at scale. | Resolved | Unassigned
2. Phoenix should have an option to fail a query that would ship large amounts of data to the client. | Resolved | Unassigned
3. OFFSET is very slow at scale | Resolved | Unassigned
4. Index tables should not be configured with a custom/smaller MAX_FILESIZE | Closed | Lars Hofhansl
5. Support approximate COUNT(*) by using stats. | Resolved | Unassigned