[KYLIN-5205] Query returns an empty result after the cube segment has already been built on the job node - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: v4.0.1
Fix Version/s: None
Component/s: Client - CLI
Labels: None
Skill Level: Committer(Medium) - This is for regular contributors/committers

Description

All of our cube segments have been built, but the data for the 28th and 29th cannot be queried. The same problem occurred earlier for the 25th, 26th and 27th. Reloading metadata does not fix it; the data only becomes visible after the query node is restarted.

The log is as follows:

==========================[QUERY]===============================
Query Id: 0c546efd-33b7-acc3-438f-7ece519969fd
SQL: select count(*) from adm_***_ds where biz_date='2022-06-27'
User: ADMIN
Success: true
Duration: 1.104
Project: tr***adm
Realization Names: [CUBE[name=adm_***_ds]]
Cuboid Ids: [97]
Is Exactly Matched: [false]
Total scan count: 0
Total scan files: 23
Total metadata time: 0ms
Total spark scan time: 577ms
Total scan bytes: 64217
Result row count: 1
Storage cache used: false
Is Query Push-Down: false
Is Prepare: false
Used Spark pool: vip_tasks
Trace URL: null
Message: null
Time consuming for each query stage: -----------------
SQL_TRANSFORMATION : 15ms
SQL_PARSE_AND_OPTIMIZE : 52ms
CUBE_MATCHING : 29ms
PREPARE_AND_SUBMIT_JOB : 201ms
WAIT_FOR_EXECUTION : 0ms
EXECUTION : 0ms
FETCH_RESULT : 0ms
Time consuming for each query stage: -----------------
==========================[QUERY]===============================

2022-06-29 04:00:59,439 INFO [Query 0fd68ce7-eb6a-7b1e-68a1-019017b69652-1902] service.QueryService:402 :
==========================[QUERY]===============================
Query Id: 0fd68ce7-eb6a-7b1e-68a1-019017b69652
SQL: select count(*) from adm_***_ds where biz_date='2022-06-28'
User: ADMIN
Success: true
Duration: 0.47
Project: tr***adm
Realization Names: [CUBE[name=adm_***_ds]]
Cuboid Ids: [97]
Is Exactly Matched: [false]
Total scan count: 0
Total scan files: 23
Total metadata time: 0ms
Total spark scan time: 199ms
Total scan bytes: 64217
Result row count: 1
Storage cache used: false
Is Query Push-Down: false
Is Prepare: false
Used Spark pool: vip_tasks
Trace URL: null
Message: null
Time consuming for each query stage: -----------------
SQL_TRANSFORMATION : 9ms
SQL_PARSE_AND_OPTIMIZE : 36ms
CUBE_MATCHING : 13ms
PREPARE_AND_SUBMIT_JOB : 132ms
WAIT_FOR_EXECUTION : 0ms
EXECUTION : 0ms
FETCH_RESULT : 0ms
Time consuming for each query stage: -----------------
==========================[QUERY]===============================

2022-06-29 04:01:00,146 DEBUG [BadQueryDetector] service.BadQueryDetector:148 : Detect bad query.

2022-06-29 04:02:02,437 INFO [Query 9cf55d16-6cde-c88b-32d7-1a8ceed2dd5b-1902] service.QueryService:402 :
==========================[QUERY]===============================
Query Id: 9cf55d16-6cde-c88b-32d7-1a8ceed2dd5b
SQL: select count(*) from adm_***_ds where biz_date='2022-06-26'
User: ADMIN
Success: true
Duration: 0.725
Project: tr***adm
Realization Names: [CUBE[name=adm_***_ds]]
Cuboid Ids: [97]
Is Exactly Matched: [false]
Total scan count: 10
Total scan files: 23
Total metadata time: 0ms
Total spark scan time: 418ms
Total scan bytes: 62329
Result row count: 1
Storage cache used: false
Is Query Push-Down: false
Is Prepare: false
Used Spark pool: vip_tasks
Trace URL: null
Message: null
Time consuming for each query stage: -----------------
SQL_TRANSFORMATION : 21ms
SQL_PARSE_AND_OPTIMIZE : 52ms
CUBE_MATCHING : 11ms
PREPARE_AND_SUBMIT_JOB : 146ms
WAIT_FOR_EXECUTION : 0ms
EXECUTION : 0ms
FETCH_RESULT : 0ms
Time consuming for each query stage: -----------------
==========================[QUERY]===============================

2022-06-29 04:12:42,117 INFO [Query 333916da-00b2-2e47-2176-dd0264ecc942-1919] service.QueryService:402 :
==========================[QUERY]===============================
Query Id: 333916da-00b2-2e47-2176-dd0264ecc942
SQL: select count(*) from adm_***_ds where biz_date='2022-06-25'
User: ADMIN
Success: true
Duration: 0.701
Project: tr***adm
Realization Names: [CUBE[name=adm_***_ds]]
Cuboid Ids: [97]
Is Exactly Matched: [false]
Total scan count: 10
Total scan files: 23
Total metadata time: 0ms
Total spark scan time: 513ms
Total scan bytes: 62329
Result row count: 1
Storage cache used: false
Is Query Push-Down: false
Is Prepare: false
Used Spark pool: vip_tasks
Trace URL: null
Message: null
Time consuming for each query stage: -----------------
SQL_TRANSFORMATION : 6ms
SQL_PARSE_AND_OPTIMIZE : 12ms
CUBE_MATCHING : 8ms
PREPARE_AND_SUBMIT_JOB : 85ms
WAIT_FOR_EXECUTION : 0ms
EXECUTION : 0ms
FETCH_RESULT : 0ms
Time consuming for each query stage: -----------------
==========================[QUERY]===============================
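
In the four [QUERY] entries above, "Total scan count" is 0 for biz_date 2022-06-27 and 2022-06-28 but 10 for 2022-06-25 and 2022-06-26, while "Total scan files" is 23 in every entry. The following is a minimal Python sketch for pulling that comparison out of a longer log. The function name scan_counts and the shortened sample text are illustrative assumptions, not part of Kylin; the SQL lines in the sample are abbreviated with "...".

```python
import re

# Shortened copies of two [QUERY] entries from the log in this report.
# The SQL text is abbreviated with "..."; only the fields used below are kept.
SAMPLE_LOG = """\
==========================[QUERY]===============================
Query Id: 0fd68ce7-eb6a-7b1e-68a1-019017b69652
SQL: ... where biz_date='2022-06-28'
Total scan count: 0
Total scan files: 23
==========================[QUERY]===============================
==========================[QUERY]===============================
Query Id: 9cf55d16-6cde-c88b-32d7-1a8ceed2dd5b
SQL: ... where biz_date='2022-06-26'
Total scan count: 10
Total scan files: 23
==========================[QUERY]===============================
"""


def scan_counts(log_text):
    """Map each queried biz_date to the 'Total scan count' logged for it."""
    results = {}
    current_date = None
    for line in log_text.splitlines():
        date_match = re.search(r"biz_date='(\d{4}-\d{2}-\d{2})'", line)
        if date_match:
            current_date = date_match.group(1)
            continue
        count_match = re.match(r"Total scan count:\s*(\d+)", line)
        if count_match and current_date is not None:
            results[current_date] = int(count_match.group(1))
            current_date = None  # wait for the next SQL line
    return results


counts = scan_counts(SAMPLE_LOG)
for date, count in sorted(counts.items()):
    flag = "  <-- no rows scanned" if count == 0 else ""
    print(f"{date}: scan count {count}{flag}")
```

Dates flagged with a zero scan count are the ones for which the query node appears not to see the newly built segment data.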

The query node and build node are configured as follows:

Query node:
kylin.metadata.url=kylin_metadata@jdbc,url=jdbc:mysql://**.rds.amazonaws.com:3306/kylin4,username=kylin4,password=**,maxActive=10,maxIdle=10
kylin.env.hdfs-working-dir=hdfs:///kylin
kylin.env=PROD
kylin.env.zookeeper-connect-string=ip-1:2181,ip-2:2181,ip-3:2181
kylin.server.mode=query
kylin.server.cluster-servers=job-1:7070,job-2:7070,query-1:7070:query-2:7070
kylin.web.timezone=GMT+0
kylin.storage.clean-after-delete-operation=true
kylin.job.retry=2
kylin.job.max-concurrent-jobs=30
kylin.job.scheduler.default=100
kylin.cube.cubeplanner.enabled=true
kylin.cube.cubeplanner.enabled-for-existing-cube=true
kylin.cube.cubeplanner.expansion-threshold=2.5
kylin.query.cache-enabled=true
kylin.query.badquery-alerting-seconds=10
kylin.query.timeout-seconds=120
kylin.query.max-return-rows=50000000
kylin.query.statement-cache-max-num=100000
kylin.query.statement-cache-max-num-per-key=500
kylin.query.enable-dict-enumerator=false
kylin.query.enable-dynamic-column=false
kylin.query.lazy-query-enabled=true
kylin.query.cache-signature-enabled=true
kylin.query.segment-cache-enabled=false
kylin.query.cache-threshold-scan-count=200
kylin.query.cache-threshold-scan-duration=2000
kylin.query.cache-threshold-scan-bytes=1024
kylin.cache.memcached.hosts=***.cache.amazonaws.com:11211
kylin.env.hadoop-conf-dir=/etc/hadoop/conf
kylin.engine.spark-conf.spark.master=yarn
kylin.engine.spark-conf.spark.submit.deployMode=client
kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10
kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=100
kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
kylin.engine.spark-conf.spark.yarn.queue=default
kylin.engine.spark-conf.spark.executor.cores=2
kylin.engine.spark-conf.spark.executor.memory=10G
kylin.engine.spark-conf.spark.executor.instances=10
kylin.engine.spark-conf.spark.executor.memoryOverhead=1024M
kylin.engine.spark-conf.spark.driver.cores=2
kylin.engine.spark-conf.spark.driver.memory=4G
kylin.engine.spark-conf.spark.driver.memoryOverhead=256M
kylin.engine.spark-conf.spark.network.timeout=600
kylin.engine.spark-conf.spark.shuffle.service.enabled=true
kylin.engine.spark-conf.spark.memory.fraction=0.5
kylin.engine.spark-conf.spark.storage.memoryFraction=0.5
kylin.engine.spark-conf.spark.eventLog.enabled=true
kylin.engine.spark.rdd-partition-cut-mb=100
kylin.engine.spark.speculation=false
kylin.engine.spark-conf.spark.speculation=false
kylin.engine.spark-conf.spark.eventLog.dir=hdfs:///var/log/spark/apps
kylin.engine.spark-conf.spark.history.fs.logDirectory=hdfs:///var/log/spark/apps
kylin.engine.spark-conf.spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec
kylin.query.auto-sparder-context-enabled=true
kylin.query.spark-conf.spark.master=yarn
kylin.query.spark-conf.spark.driver.cores=2
kylin.query.spark-conf.spark.driver.memory=8G
kylin.query.spark-conf.spark.driver.memoryOverhead=1G
kylin.query.spark-conf.spark.executor.cores=4
kylin.query.spark-conf.spark.executor.instances=15
kylin.query.spark-conf.spark.executor.memory=16G
kylin.query.spark-conf.spark.executor.memoryOverhead=1G
kylin.query.spark-conf.spark.sql.adaptive.enabled=true
kylin.server.query-metrics2-enabled=false
kylin.metrics.monitor-enabled=false
kylin.metrics.reporter-query-enabled=false
kylin.metrics.reporter-job-enabled=false
kylin.metrics.query-cache.expire-seconds=300
kylin.metrics.query-cache.max-entries=10000
kylin.server.query-metrics-enabled=false
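
One detail in the configuration above may be worth checking: the kylin.server.cluster-servers value separates query-1:7070 and query-2:7070 with a colon rather than a comma. Kylin documents this property as the list of servers that an instance syncs with, so a malformed entry could plausibly stop metadata changes from reaching a query node. Whether that is the cause of this issue is an assumption and is not confirmed anywhere in this report. The sketch below is a small Python check for such a list; check_cluster_servers is a hypothetical helper, not a Kylin API.

```python
def check_cluster_servers(value):
    """Split a comma-separated host:port list into valid and malformed entries."""
    valid, malformed = [], []
    for entry in value.split(","):
        entry = entry.strip()
        host, sep, port = entry.partition(":")
        # A well-formed entry has exactly one host and a numeric port.
        if host and sep and port.isdigit():
            valid.append(entry)
        else:
            malformed.append(entry)
    return valid, malformed


# Value copied from the configuration in this report.
value = "job-1:7070,job-2:7070,query-1:7070:query-2:7070"
valid, malformed = check_cluster_servers(value)
print("valid:    ", valid)
print("malformed:", malformed)
```

With the value from this report, the last entry is reported as malformed because everything after its first colon is not a plain port number.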

Job node:
kylin.metadata.url=kylin_metadata@jdbc,url=jdbc:mysql://**.rds.amazonaws.com:3306/kylin4,username=kylin4,password=**,maxActive=10,maxIdle=10
kylin.env.hdfs-working-dir=hdfs:///kylin
kylin.env=PROD
kylin.env.zookeeper-connect-string=ip-1:2181,ip-2:2181,ip-3:2181
kylin.server.mode=job
kylin.server.cluster-servers=job-1:7070,job-2:7070,query-1:7070:query-2:7070
kylin.web.timezone=GMT+0
kylin.storage.clean-after-delete-operation=true
kylin.job.retry=2
kylin.job.max-concurrent-jobs=30
kylin.job.scheduler.default=100
kylin.cube.cubeplanner.enabled=true
kylin.cube.cubeplanner.enabled-for-existing-cube=true
kylin.cube.cubeplanner.expansion-threshold=2.5
kylin.query.cache-enabled=true
kylin.query.badquery-alerting-seconds=10
kylin.query.timeout-seconds=120
kylin.query.max-return-rows=50000000
kylin.query.statement-cache-max-num=100000
kylin.query.statement-cache-max-num-per-key=500
kylin.query.enable-dict-enumerator=true
kylin.query.enable-dynamic-column=true
kylin.query.lazy-query-enabled=true
kylin.query.cache-signature-enabled=true
kylin.query.segment-cache-enabled=true
kylin.query.cache-threshold-scan-count=200
kylin.query.cache-threshold-scan-duration=2000
kylin.query.cache-threshold-scan-bytes=1024
kylin.cache.memcached.hosts=***.cache.amazonaws.com:11211
kylin.env.hadoop-conf-dir=/etc/hadoop/conf
kylin.engine.spark-conf.spark.master=yarn
kylin.engine.spark-conf.spark.submit.deployMode=client
kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10
kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=100
kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
kylin.engine.spark-conf.spark.yarn.queue=default
kylin.engine.spark-conf.spark.executor.cores=2
kylin.engine.spark-conf.spark.executor.memory=10G
kylin.engine.spark-conf.spark.executor.instances=10
kylin.engine.spark-conf.spark.executor.memoryOverhead=1024M
kylin.engine.spark-conf.spark.driver.cores=2
kylin.engine.spark-conf.spark.driver.memory=4G
kylin.engine.spark-conf.spark.driver.memoryOverhead=256M
kylin.engine.spark-conf.spark.network.timeout=600
kylin.engine.spark-conf.spark.shuffle.service.enabled=true
kylin.engine.spark-conf.spark.memory.fraction=0.5
kylin.engine.spark-conf.spark.storage.memoryFraction=0.5
kylin.engine.spark-conf.spark.eventLog.enabled=true
kylin.engine.spark.rdd-partition-cut-mb=100
kylin.engine.spark.speculation=false
kylin.engine.spark-conf.spark.speculation=false
kylin.engine.spark-conf.spark.eventLog.dir=hdfs:///var/log/spark/apps
kylin.engine.spark-conf.spark.history.fs.logDirectory=hdfs:///var/log/spark/apps
kylin.engine.spark-conf.spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec
kylin.query.auto-sparder-context-enabled=true
kylin.query.spark-conf.spark.master=yarn
kylin.query.spark-conf.spark.driver.cores=1
kylin.query.spark-conf.spark.driver.memory=4G
kylin.query.spark-conf.spark.driver.memoryOverhead=512M
kylin.query.spark-conf.spark.executor.cores=1
kylin.query.spark-conf.spark.executor.instances=15
kylin.query.spark-conf.spark.executor.memory=4G
kylin.query.spark-conf.spark.executor.memoryOverhead=512M
kylin.query.spark-conf.spark.sql.adaptive.enabled=true
kylin.server.query-metrics2-enabled=true
kylin.metrics.monitor-enabled=true
kylin.metrics.reporter-query-enabled=true
kylin.metrics.reporter-job-enabled=true
kylin.web.dashboard-enabled=true
kylin.metrics.query-cache.expire-seconds=300
kylin.metrics.query-cache.max-entries=10000
kylin.server.query-metrics-enabled=true
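
Because the two property lists are long and mostly identical, it can help to diff them mechanically. The following is a minimal Python sketch, assuming each node's settings are available as plain key=value text; parse_properties and diff_properties are hypothetical helper names, and the two excerpts contain only a few lines copied from the configurations above.

```python
def parse_properties(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, since values may contain '=' themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def diff_properties(left, right):
    """Return {key: (left_value, right_value)} for keys whose values differ."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# Short excerpts of the query node and job node settings from this report.
QUERY_NODE = """\
kylin.server.mode=query
kylin.query.cache-enabled=true
kylin.query.segment-cache-enabled=false
"""

JOB_NODE = """\
kylin.server.mode=job
kylin.query.cache-enabled=true
kylin.query.segment-cache-enabled=true
kylin.web.dashboard-enabled=true
"""

differences = diff_properties(parse_properties(QUERY_NODE), parse_properties(JOB_NODE))
for key, (query_value, job_value) in differences.items():
    print(f"{key}: query={query_value} job={job_value}")
```

A value of None on one side means the key is set on only one of the two nodes.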

Attachments

- 企业微信20220629-140718@2x.png (185 kB, 29/Jun/22 06:15, xbchao)
- 企业微信20220629-140819@2x.png (191 kB, 29/Jun/22 06:15, xbchao)
- 企业微信20220629-140849@2x.png (190 kB, 29/Jun/22 06:15, xbchao)
- 企业微信20220629-140917@2x.png (192 kB, 29/Jun/22 06:15, xbchao)
- 企业微信截图_712f06a3-2903-4db3-bffb-d522ea3f105a.png (311 kB, 29/Jun/22 06:16, xbchao)
