Details
-
Sub-task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Currently, Tajo output committer works as following:
- Each task write output to a temp directory.
- FileTablespace::commitTable renames first successful task's temp directory to final destination.
But above approach will occurs FileNotFoundException because of eventual consistency of S3. To resolve it, we need to implement DirectOutputCommitter.
There may be three different ways for implement it.
First way is changing the name scheme for the files Tajo creates. Instead of part-00000 we should use names like UUID_000000 where all files generated by a single insert into use the same prefix. The prefix is consists of UUID and each query id. It will guarantees that a new insert into will not stomp on data produced by an earlier query. After finishing query successfully, Tajo will delete all files that don't begin with same UUID. Of course, when executing the insert into statement, Tajo never delete existing files. But if query failed or killed, Tajo will delete all file that begin with same UUID. I was inspired by Qubole's slide (http://www.slideshare.net/qubolemarketing/new-york-city-hadoop-meetup-4-232015)
Second way is storing insert file names and existing file names name to tables of CatalogStore or member variables of TaskAttemptContext. Before inserting files, Tajo will store existing file names to some storage. And whenever finishing task attempt, Tajo will store insert file names to some storage. And Tajo will delete or maintain files using stored file names according to query final status.
Other way is writing the data to local disk. This output committer works as follows:
- Each task write output to local disk instead of S3 (in CTAS statement or INERT statement)
- Copies first successful task's temp directory to S3.
For the reference, I was inspired by Netflix integrating spark slide(http://www.slideshare.net/piaozhexiu/netflix-integrating-spark-at-petabyte-scale-53391704).
I wish to implement DirectOutputCommitter with the first way.
Please feel free to comment if you have any questions/ideas.