[ARTEMIS-3868] Journal Compactor split logic creating too many files - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.19.0, 2.20.0, 2.21.0, 2.22.0, 2.23.0, 2.23.1
Fix Version/s: 2.24.0
Component/s: Broker
Labels:
None
Environment:

I am marking this as an improvement: the error condition shouldn't really happen in real life, since compacted records will be written first.

I'm accepting the improvement though.

Description

We are having some problems with disk space usage. Basically, we have a lot of queues, and one of them failed to consume its messages (a problem in a microservice that lasted about 10 minutes).
We then use the message redelivery feature to process these messages again after some delay.
The problem occurs during journal compaction: the process creates a lot of files until it fills the disk.

I took some of the files created on the production server and tried to compact them. Disk usage grew a lot, and only after the second compaction did it settle at an acceptable level.

I attached a "test" to reproduce the problem. The test creates one journal file with 2048 records that uses only 2.1 MB. Each record carries a "compact count" field, which I set to 1 and 2.
What I found is that when the compactor processes a record with this field less than 2 (after some record with a value greater than or equal to 2), the JournalCompactor creates a new file.
In this test, when I start the journal it creates a new 10 MB file (default configuration) and keeps the 2.1 MB file.
When I run the compact process, it creates 1024 files and keeps the old 2. 1024 * 10MB = 10GB!!!
If I run the compact process again, it shrinks to only 2 files of 10MB!!!
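
The arithmetic above can be reproduced with a small model of the behaviour as I described it. This is only a sketch, not the actual JournalCompactor code: the class, the method names, and the assumption that the 2048 records alternate between compact counts 1 and 2 (starting with 1) are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactSplitSketch {

    // Threshold taken from the report: a record with a compact count below 2
    // triggers a split once a record with a count of 2 or more has been seen.
    static final int SPLIT_LINE = 2;

    /** Builds the assumed test pattern: compact counts 1, 2, 1, 2, ... */
    static List<Integer> alternating(int records) {
        List<Integer> counts = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            counts.add(i % 2 == 0 ? 1 : 2);
        }
        return counts;
    }

    /** Counts the files a compactor following the reported rule would open. */
    static int countFiles(List<Integer> compactCounts) {
        int files = 1;            // the file the compactor starts writing to
        boolean seenHigh = false; // true once a record with count >= SPLIT_LINE was written
        for (int count : compactCounts) {
            if (count >= SPLIT_LINE) {
                seenHigh = true;
            } else if (seenHigh) {
                files++;          // low count after a high count: open a new file
                seenHigh = false;
            }
        }
        return files;
    }

    public static void main(String[] args) {
        int files = countFiles(alternating(2048));
        long fileSizeMiB = 10; // default journal file size mentioned in the report
        // prints "1024 files, 10240 MiB preallocated"
        System.out.println(files + " files, " + (files * fileSizeMiB) + " MiB preallocated");
    }
}
```

Under these assumptions every second record opens a new preallocated 10 MB file, which is how roughly 2 MB of data can end up occupying about 10 GB.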

The problem seems to be in the checkCompact method of JournalCompactor. I think we can remove this method and let the flow proceed as if it had returned false:
https://github.com/apache/activemq-artemis/blob/main/artemis-journal/src/main/java/org/apache/activemq/artemis/core/journal/impl/JournalCompactor.java#L190-L203
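
To illustrate what the proposal would mean for file counts, here is a rough sketch that assumes a new file is opened only when the current one is full. It is not the real compactor: the class name, the method, and the record size of about 1100 bytes (roughly 2.1 MB divided by 2048 records) are assumptions made for illustration.

```java
public class SizeOnlyRolloverSketch {

    /** Files needed when a new file is opened only because the current one is full. */
    static long filesNeeded(long records, long recordBytes, long fileBytes) {
        long recordsPerFile = fileBytes / recordBytes;          // records never span files in this sketch
        return (records + recordsPerFile - 1) / recordsPerFile; // ceiling division
    }

    public static void main(String[] args) {
        long fileBytes = 10L * 1024 * 1024; // 10 MB journal file, as in the report
        // prints "1"
        System.out.println(filesNeeded(2048, 1100, fileBytes));
    }
}
```

With these numbers all 2048 records fit in a single 10 MB file, which is in line with the small footprint seen after the second compaction.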

This is old code that was introduced in HornetQ:
https://github.com/hornetq/hornetq/commit/93af1cb92f4050e54e83c8daa9c67ce43dbcfead

I also attached an image showing the disk usage of my production server while the problem was occurring.

Attachments


- CompactDiskUsage.java (3 kB, 21/Jun/22 22:36, Fabio Nascimento Brandão)
- image-2022-06-21-19-33-59-276.png (42 kB, 21/Jun/22 22:34, Fabio Nascimento Brandão)

Issue Links

duplicates

ARTEMIS-3545 Artemis primary is filling all the disc when replica is killed

Closed

links to

GitHub Pull Request #4124

People

Assignee: Clebert Suconic

Reporter: Fabio Nascimento Brandão

Votes: 0

Watchers: 4

Dates

Created: 21/Jun/22 22:34

Updated: 21/Jul/22 22:00

Resolved: 21/Jul/22 22:00

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

10m