The Hive bolt closes a connection only when the "max_connections" limit is exceeded or the bolt dies. If we open a connection to Hive through the Hive bolt and later stop sending messages to the bolt, the connection is kept alive and its transactions stay open. This becomes a problem when two topologies are running: one holds open transactions while doing nothing, and the other keeps writing messages to Hive. At some point Hive starts a compaction to merge the small delta files generated by the Hive bolt into one base file, but the compaction will not run while there are open transactions.
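The behavior described above can be illustrated with a small toy model. This is only a sketch and not the Hive bolt's actual code: the `ConnectionCache` class, the `idle_timeout` setting, and the `reap_idle` method are hypothetical names introduced for illustration. It shows that a cache which evicts only on capacity keeps an idle connection open indefinitely, and how an idle timeout might release it.

```python
class ConnectionCache:
    """Toy model of a connection cache (hypothetical, not the Hive bolt's code)."""

    def __init__(self, max_connections, idle_timeout=None):
        self.max_connections = max_connections
        self.idle_timeout = idle_timeout  # hypothetical setting, in seconds
        self.last_used = {}  # endpoint -> time of the last write

    def write(self, endpoint, now):
        # A new connection beyond the limit evicts the least recently used one.
        is_new = endpoint not in self.last_used
        if is_new and len(self.last_used) >= self.max_connections:
            oldest = min(self.last_used, key=self.last_used.get)
            del self.last_used[oldest]
        self.last_used[endpoint] = now

    def reap_idle(self, now):
        # Without an idle timeout nothing is closed: the reported behavior.
        if self.idle_timeout is None:
            return []
        stale = [e for e, t in self.last_used.items()
                 if now - t >= self.idle_timeout]
        for endpoint in stale:
            del self.last_used[endpoint]
        return stale

    def open_connections(self):
        return sorted(self.last_used)
```

With no idle timeout, a topology that wrote once at time 0 still holds its connection an hour later, even though `reap_idle` has run. With `idle_timeout=600`, the same call would close it and leave only the active topology's connection open.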