Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Hi folks,
Currently, there is a pull request to integrate Tajo with Zeppelin as follows.
https://github.com/NFLabs/zeppelin/pull/245
I also need Zeppelin to enable interactive data analysis on Tajo. However, the last CI build of the above patch failed and no progress has been made for five months. So I implemented a new patch using the Tajo JDBC driver. For reference, the existing patch was implemented with TajoCli, which is a command-line interface (CLI) class.
Here is my suggestion.
I implemented the patch with the Tajo JDBC driver. I think it is a better choice than TajoCli for the following reasons:
Light dependencies: TajoCli needs more libraries than the Tajo JDBC driver. In particular, it needs Hadoop libraries such as hadoop-common, hadoop-hdfs, and hadoop-mapreduce-client-core. It also depends on trevni and parquet. Tajo JDBC doesn’t depend on any file system or file format.
Easy maintenance: In the existing patch, Zeppelin has to call the execute method of TajoClient. If the TajoClient method specification changes, it would cause problems in Zeppelin, so Zeppelin would need to keep watching Tajo's change notes. The JDBC specification, however, is unlikely to change in the future. Also, a JDBC interface has already been implemented in Zeppelin's HiveInterpreter. I think most contributors are more familiar with JDBC than with TajoCli, and the code would be simpler. I hope this makes code maintenance more efficient.
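To illustrate the maintenance point, here is a rough sketch of what talking to Tajo through plain java.sql could look like. It is not the code from the PR. The class name TajoJdbcSketch, the helper names, and the URL layout (jdbc:tajo://host:port/database) are my assumptions for illustration, so please check them against the Tajo JDBC documentation. It also assumes the Tajo JDBC driver jar is on the classpath at run time.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Zeppelin or Tajo.
class TajoJdbcSketch {

    // Assumed URL layout "jdbc:tajo://host:port/database"; verify against the Tajo docs.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:tajo://" + host + ":" + port + "/" + database;
    }

    // Runs one query using only standard JDBC interfaces and returns
    // the result as tab-separated text (header row first).
    static String runQuery(String url, String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            return toTable(rs);
        }
    }

    // Formats any ResultSet: column names on the first line, then one line per row.
    static String toTable(ResultSet rs) throws SQLException {
        ResultSetMetaData md = rs.getMetaData();
        int cols = md.getColumnCount();
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= cols; i++) {
            out.append(md.getColumnName(i)).append(i < cols ? '\t' : '\n');
        }
        while (rs.next()) {
            for (int i = 1; i <= cols; i++) {
                out.append(rs.getString(i)).append(i < cols ? '\t' : '\n');
            }
        }
        return out.toString();
    }
}
```

Nothing here imports a Tajo class, so only the URL would need to change if the driver changes, and the same toTable helper would work for any other JDBC source.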
What do you think about my suggestion? For reference, I'll create a PR right now.
P.S. Honestly, I borrowed from Zeppelin's HiveInterpreter, which let me implement this quickly.