Details
Type: Improvement
Status: Resolved
Priority: Critical
Resolution: Fixed
Version: Impala 2.12.0
Description
We should improve the logging on the catalogd side during a catalog topic update.
Snippet of current log:
I0307 21:23:41.011883 21342 catalog-server.cc:477] Publishing update: TABLE:functional_avro_def.nulltable original size: 65
I0307 21:23:41.011922 21342 CatalogServiceCatalog.java:426] Collected catalog update: TABLE:functional_avro_def.alltypestiny version: 134
I0307 21:23:41.011927 21342 catalog-server.cc:477] Publishing update: TABLE:functional_avro_def.alltypestiny original size: 68
I0307 21:23:41.011975 21342 CatalogServiceCatalog.java:426] Collected catalog update: TABLE:functional_avro_def.jointbl version: 138
I0307 21:23:41.011986 21342 catalog-server.cc:477] Publishing update: TABLE:functional_avro_def.jointbl original size: 63
I0307 21:23:41.012027 21342 CatalogServiceCatalog.java:426] Collected catalog update: TABLE:functional_avro_def.alltypesaggmultifilesnopart version: 130
I0307 21:23:41.012032 21342 catalog-server.cc:477] Publishing update: TABLE:functional_avro_def.alltypesaggmultifilesnopart original size: 83
I0307 21:23:41.012071 21342 CatalogServiceCatalog.java:426] Collected catalog update: TABLE:functional_avro_def.alltypesaggnonulls version: 131
We should improve the logging as follows:
- Avoid duplicate messages for the same entity
- Be clearer about "publishing" versus "collecting" a topic update ("collecting" is more accurate)
- Add a final log message when a topic update has been fully assembled
The double-logging was introduced in IMPALA-5990. On heavily loaded clusters, that change might double the log volume, which is clearly undesirable.
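As a rough illustration of the requested behavior, the sketch below emits one "Collected" line per entity (carrying both version and size) and a single summary line once the update is fully assembled. It is not Impala's actual code: the class `TopicUpdateLogSketch`, the `Entry` type, the `collect` method, and the exact message wording are all hypothetical, and the log lines are returned as strings rather than written through a real logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; names and message formats are assumptions,
// not taken from catalogd or CatalogServiceCatalog.java.
class TopicUpdateLogSketch {

    // Minimal stand-in for one catalog object in a topic update.
    static final class Entry {
        final String key;      // e.g. "TABLE:functional_avro_def.jointbl"
        final long version;    // catalog version of the object
        final long sizeBytes;  // serialized size of the object

        Entry(String key, long version, long sizeBytes) {
            this.key = key;
            this.version = version;
            this.sizeBytes = sizeBytes;
        }
    }

    // Builds the log lines for one topic update: a single line per entity
    // (instead of separate "Collected" and "Publishing" lines), followed by
    // one final line summarizing the fully assembled update.
    static List<String> collect(List<Entry> entries) {
        List<String> lines = new ArrayList<>();
        long totalBytes = 0;
        for (Entry e : entries) {
            lines.add("Collected update: " + e.key
                    + " version: " + e.version
                    + " size: " + e.sizeBytes);
            totalBytes += e.sizeBytes;
        }
        lines.add("Finished collecting topic update: " + entries.size()
                + " entries, " + totalBytes + " bytes");
        return lines;
    }
}
```

With the two entries `alltypestiny` (version 134, size 68) and `jointbl` (version 138, size 63) from the snippet above, this produces three lines instead of the current four, and the last line marks the end of the update.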