[KYLIN-4035] Calculate column cardinality by using spark engine - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: v3.0.0-alpha2
Component/s: Spark Engine
Labels:
None
Environment:
kylin: master/3.0.0-alpha
spark: 2.4.3
hadoop: 2.6.5

Description

Kylin calculates column cardinality when a Hive table is loaded. This stage is currently supported only by the MR engine, not by Spark. I think the Spark engine should also be available for this stage, for the following reasons:

1) Kylin users can choose which engine to use when calculating column cardinality;

2) Useful Spark features (e.g. dynamic resource allocation) can be used;

3) The code written in Spark is simple.
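For readers unfamiliar with the term, column cardinality is the number of distinct values in each column of a table. The following is a minimal pure-Python sketch of that computation, not the Kylin or Spark implementation; the function name `column_cardinality` and the sample data are hypothetical, and a real engine may use an approximate estimator rather than exact sets.

```python
from typing import Dict, Hashable, Iterable, List, Sequence, Set


def column_cardinality(
    columns: Sequence[str], rows: Iterable[Sequence[Hashable]]
) -> Dict[str, int]:
    """Return the exact number of distinct values seen in each column.

    Hypothetical helper for illustration only: it keeps one in-memory set
    per column, which a distributed engine would not do for large tables.
    """
    seen: List[Set[Hashable]] = [set() for _ in columns]
    for row in rows:
        # Pair each value with the set that tracks its column.
        for values, value in zip(seen, row):
            values.add(value)
    return {name: len(values) for name, values in zip(columns, seen)}


if __name__ == "__main__":
    sample_rows = [
        ("cn", "2019-06-08", 1),
        ("us", "2019-06-08", 2),
        ("cn", "2019-06-09", 3),
    ]
    print(column_cardinality(["country", "day", "id"], sample_rows))
```

In a distributed engine the same idea would be applied per partition and the partial results merged, which is where features such as dynamic resource allocation become relevant.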

I have finished this work and tested it successfully. To enable it, "kylin.engine.spark-cardinality=true" must be added to kylin.properties (the default is false). I look forward to suggestions.
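The switch described above can be illustrated with a small sketch of how a boolean property with a default of false might select the engine. This is an assumption-laden illustration, not Kylin's actual code: only the property name `kylin.engine.spark-cardinality` and its default come from this issue, while `parse_properties`, `cardinality_engine`, and the returned labels are hypothetical.

```python
from typing import Dict

# Property name and default value as stated in this issue.
SPARK_CARDINALITY_KEY = "kylin.engine.spark-cardinality"


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            props[key.strip()] = value.strip()
    return props


def cardinality_engine(props: Dict[str, str]) -> str:
    """Return 'spark' only when the flag is explicitly true, else 'mr'."""
    flag = props.get(SPARK_CARDINALITY_KEY, "false")
    return "spark" if flag.lower() == "true" else "mr"


if __name__ == "__main__":
    config = parse_properties("kylin.engine.spark-cardinality=true\n")
    print(cardinality_engine(config))
```

Defaulting to the MR path when the key is absent keeps existing deployments unchanged unless the flag is set explicitly.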

Best regards.

Issue Links

links to

GitHub Pull Request #680

People

Assignee: Jack

Reporter: Jack

Votes: 0

Watchers: 5

Dates

Created: 08/Jun/19 02:41

Updated: 21/Jan/20 07:36

Resolved: 25/Jul/19 02:45