If the compressed file size is less than the HDFS block size, the volume information can be used.
I've verified 'mvn clean install' and TPC-H queries 1 and 3.
+1 for the patch.
'mvn clean install' verified successfully.
In this patch, the block size can differ depending on the first BlockStorageLocation.
It would be better to get the block size from the configuration as follows.
This needs the real blockSize.
A user can change it for each file.
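A minimal sketch of the point being made here, with stand-in names of my own (in Hadoop the per-file value would come from FileStatus#getBlockSize() and the cluster default from the dfs.blocksize configuration key; the map below is only a placeholder for that per-file metadata lookup):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch contrasting a configured default block size with a
// per-file block size. Since HDFS users can set the block size per file,
// the file's own value is the safer source than the configured default.
public class BlockSizeSource {
    // Stand-in for a configured default such as dfs.blocksize.
    static final long CONFIGURED_DEFAULT = 64L * 1024 * 1024; // 64 MB

    // Stand-in for per-file metadata (FileStatus#getBlockSize() in Hadoop).
    static final Map<String, Long> perFileBlockSize = new HashMap<>();

    static long blockSizeFor(String path) {
        // Prefer the file's own block size; fall back to the configured default.
        return perFileBlockSize.getOrDefault(path, CONFIGURED_DEFAULT);
    }

    public static void main(String[] args) {
        perFileBlockSize.put("/data/a.gz", 128L * 1024 * 1024); // user set 128 MB for this file
        System.out.println(blockSizeFor("/data/a.gz")); // per-file value wins
        System.out.println(blockSizeFor("/data/b.gz")); // falls back to the default
    }
}
```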
Sorry, it is hard for me to understand.
Could you please add a more detailed explanation of the issue?
Also, although the block size is obtained from the first BlockStorageLocation, it looks like it should be
blockStorageLocations.getLength() - blockStorageLocations.getOffset()
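For illustration only, here is the quantity that suggestion computes; the inner class is a stand-in that merely mirrors the getOffset()/getLength() getters of Hadoop's real BlockStorageLocation, not the actual API:

```java
// Hypothetical sketch of the suggested computation: derive the usable
// length from the first location's length and offset rather than
// assuming a configured block size.
public class BlockLengthSketch {
    // Stand-in for Hadoop's BlockStorageLocation (getters only).
    static class BlockStorageLocation {
        final long offset, length;
        BlockStorageLocation(long offset, long length) { this.offset = offset; this.length = length; }
        long getOffset() { return offset; }
        long getLength() { return length; }
    }

    static long usableLength(BlockStorageLocation first) {
        // The review comment suggests getLength() - getOffset() for the first location.
        return first.getLength() - first.getOffset();
    }

    public static void main(String[] args) {
        BlockStorageLocation first = new BlockStorageLocation(0L, 64L * 1024 * 1024);
        System.out.println(usableLength(first)); // 67108864 (64 MB) when the offset is 0
    }
}
```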
Thank you for the nice catch.
In the current implementation, compressed text files only support non-split reading,
so they can't use disk volume scheduling. But if the compressed file size is less than a block size, we can use volume scheduling. For example:
HDFS block size: 64 MB
disk volumes: 1
file size <= 64 MB
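The condition in the example above can be sketched as follows (this is my own illustrative helper, not Tajo's scheduler code):

```java
// Hypothetical sketch of the condition described above: a non-splittable
// compressed file occupies a single block when its length fits within the
// block size, so disk volume scheduling can still be applied to it.
public class VolumeSchedulingCheck {
    static boolean canUseVolumeScheduling(long compressedFileLength, long blockSize) {
        // One block at most -> the block's volume information is meaningful.
        return compressedFileLength <= blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // 64 MB, as in the example above
        System.out.println(canUseVolumeScheduling(60L * 1024 * 1024, blockSize));  // true: single block
        System.out.println(canUseVolumeScheduling(100L * 1024 * 1024, blockSize)); // false: spans blocks
    }
}
```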
+1 for the latest patch.
Thanks, everyone.
I've just committed it.
SUCCESS: Integrated in Tajo-trunk-postcommit #623 (See https://builds.apache.org/job/Tajo-trunk-postcommit/623/)
TAJO-421: Improve split for compression file. (jinho) (jinossy: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=14160face45894a73f742f528e87b5f8ec2e10b9)