I'm creating this JIRA to capture the change we made to the original Spark integration plan, and to have a reference JIRA for 0.7.0.
What we have now is:
- a kuduRDD that uses newAPIHadoopRDD with our input format.
- a default source with a base relation that basically just wraps the kuduRDD, meaning you can run Spark SQL queries and pick your columns, but nothing more.
- unit test infrastructure in Scala
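
For reference, here's a rough sketch of the shape described in the first two items. It is not the actual integration code: TextInputFormat stands in for our input format, and the names SketchRDD, SketchRelation, the "path" option and the two-column schema are all made up for illustration. It also assumes the relation does its column selection through Spark's PrunedScan trait, which may not match what was committed.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.{FileInputFormat, TextInputFormat}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, PrunedScan, RelationProvider}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical stand-in for the kuduRDD: an RDD built with newAPIHadoopRDD
// on top of a Hadoop InputFormat (TextInputFormat here, NOT the Kudu one).
object SketchRDD {
  def lines(sc: SparkContext, path: String): RDD[String] = {
    val conf = new Configuration(sc.hadoopConfiguration)
    conf.set(FileInputFormat.INPUT_DIR, path)
    sc.newAPIHadoopRDD(conf, classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
      // Hadoop reuses Writable instances, so copy each value out right away.
      .map { case (_, text) => text.toString }
  }
}

// Hypothetical base relation that only wraps the RDD. PrunedScan hands us the
// requested column names and nothing else, so column selection is all we get.
class SketchRelation(path: String)(@transient val sqlContext: SQLContext)
    extends BaseRelation with PrunedScan {

  // Made-up schema for the sketch.
  override val schema: StructType = StructType(Seq(
    StructField("line", StringType, nullable = true),
    StructField("upper", StringType, nullable = true)))

  override def buildScan(requiredColumns: Array[String]): RDD[Row] = {
    val columns = requiredColumns.toSeq
    SketchRDD.lines(sqlContext.sparkContext, path).map { line =>
      // Emit values in the order Spark asked for them.
      Row.fromSeq(columns.map {
        case "line"  => line
        case "upper" => line.toUpperCase
      })
    }
  }
}

// Spark looks for a class named DefaultSource in the package passed to format(...).
class DefaultSource extends RelationProvider {
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation =
    new SketchRelation(parameters("path"))(sqlContext)
}
```

With something like this on the classpath, sqlContext.read.format("<package of DefaultSource>").option("path", "<some file>").load().select("line") should only ask buildScan for the "line" column.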
I'm assigning it to Dan since he's the one who took it to the finish line.