Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
I was running the Apache ORC benchmarks, which have commons-csv as a dependency, and noticed that the bulk of the running time is spent in commons-csv.
I attached the VisualVM output and here is my test setup:
JVM: OpenJDK 64-Bit Server VM (25.292-b10, mixed mode)
Java: version 1.8.0_292, vendor Private Build
Java Home: /usr/lib/jvm/java-8-openjdk-amd64/jre
JVM Flags: <none>
I suspect this is partly because ExtendedBufferedReader extends BufferedReader. BufferedReader synchronizes its read methods, which means that every call to read() must acquire a lock. Usually this is not an issue, but for commons-csv it adds a lot of overhead because the parser reads one character at a time. So even though the input is buffered, every single-character read has to go through a synchronization step. Each read also has to "jump" into the parent class.
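To make the suspected cost concrete, here is a minimal sketch of the two read patterns. It assumes the JDK 8 behavior reported above, where BufferedReader.read() takes a lock on every call. UnsyncCharReader and ReadPerCharDemo are hypothetical names used only for illustration; they are not part of commons-csv or the JDK, and this is not a proposed patch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical illustration only: a buffered char reader that takes no lock. */
final class UnsyncCharReader {
    private final Reader in;
    private final char[] buf = new char[8192];
    private int pos;   // index of the next char to hand out
    private int limit; // number of valid chars currently in buf

    UnsyncCharReader(Reader in) {
        this.in = in;
    }

    /** Returns the next char, or -1 at end of stream, without synchronization. */
    int read() throws IOException {
        if (pos >= limit) {
            // Refill the buffer with one bulk read from the underlying reader.
            limit = in.read(buf, 0, buf.length);
            pos = 0;
            if (limit <= 0) {
                limit = 0;
                return -1;
            }
        }
        return buf[pos++];
    }
}

final class ReadPerCharDemo {
    /** One BufferedReader.read() call per char; each call acquires the reader's lock. */
    static int countCommasSynchronized(String data) throws IOException {
        try (BufferedReader r = new BufferedReader(new StringReader(data))) {
            int count = 0;
            for (int c = r.read(); c != -1; c = r.read()) {
                if (c == ',') {
                    count++;
                }
            }
            return count;
        }
    }

    /** Same loop, but the per-char call is a plain array access with no lock. */
    static int countCommasUnsynchronized(String data) throws IOException {
        UnsyncCharReader r = new UnsyncCharReader(new StringReader(data));
        int count = 0;
        for (int c = r.read(); c != -1; c = r.read()) {
            if (c == ',') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String data = "a,b,c\n1,2,3\n";
        System.out.println(countCommasSynchronized(data));
        System.out.println(countCommasUnsynchronized(data));
    }
}
```

Both methods return the same count; the difference is only in what each per-character call costs. Whether that accounts for the time seen in the VisualVM output would need to be confirmed with a benchmark on the actual parser.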
Nothing else stands out to me as being "slow."
Attachments
1. Reuse Buffers in Lexer for Delimiter Detection | Resolved | Unassigned
2. Optimize Lexer Delimiter Check for One Character Delimiter | Resolved | Unassigned