I'm currently using the %spark interpreter with a CassandraSQLContext to load data from my Cassandra cluster. The %cassandra interpreter cannot be used because the results have to be post-processed.
The problem is that the %sql interpreter that comes with %spark always uses the sqlContext variable, and there is no way to override it. As a result, tables registered through a CassandraSQLContext are not visible to the default sqlContext that %sql queries.
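To make the problem concrete, here is a rough sketch of two notebook paragraphs that reproduce it. It assumes the Spark 1.x API and the DataStax spark-cassandra-connector on the interpreter classpath. The names cassandraSqlContext, my_keyspace and events are placeholders, not taken from my actual setup.

```scala
// Paragraph 1, run with %spark. `sc` is the SparkContext that Zeppelin injects.
import org.apache.spark.sql.cassandra.CassandraSQLContext

// A second SQL context backed by the Cassandra connector (placeholder variable name).
val cassandraSqlContext = new CassandraSQLContext(sc)

// my_keyspace.events is a placeholder keyspace/table.
val events = cassandraSqlContext.sql("SELECT * FROM my_keyspace.events")

// registerTempTable registers the table in the context that created the
// DataFrame, i.e. in cassandraSqlContext, not in Zeppelin's sqlContext.
events.registerTempTable("events")

// Paragraph 2, run with %sql, would then contain:
//   SELECT count(*) FROM events
// It is executed against sqlContext, which does not know the "events" table.
```

Querying through cassandraSqlContext.sql(...) in a %spark paragraph still works, but then the %sql result display is not available.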
IMHO there are two possible solutions:
- Allow specifying the default SQLContext, as in https://github.com/syepes/zeppelin/commit/3e50c6b513ee93246b14931a6729c48f8b8068ca
- Even better: add an optional parameter to %sql that specifies which context variable to use, e.g. %sql(myCassandraSQLContext). This might come in handy for other problems, too.