I believe PySpark's mllib module should support a GLM feature which also includes defining models using a formula. This is done in a Python package called statsmodels: http://statsmodels.sourceforge.net/devel/example_formulas.html
The formula feature can be implemented using the Python module patsy, which parses formula strings and builds the corresponding design matrices.
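As a rough sketch of what I have in mind, this is how patsy turns a formula into a response vector and a design matrix. The formula, the column names (y, x1, x2) and the data are made up for illustration, and how the resulting arrays would be handed to mllib is left open here.

```python
import numpy as np
from patsy import dmatrices

# Toy data, invented for illustration only.
data = {
    "y": np.array([1.0, 2.0, 3.0, 4.0]),
    "x1": np.array([0.5, 1.5, 2.5, 3.5]),
    "x2": np.array([1.0, 0.0, 1.0, 0.0]),
}

# Parse the formula and build the response vector and the design matrix.
y, X = dmatrices("y ~ x1 + x2", data)

# patsy adds an intercept column by default.
column_names = X.design_info.column_names

# Plain NumPy arrays that a GLM fitting routine could consume.
features = np.asarray(X)
labels = np.asarray(y).ravel()

print(column_names)
print(features.shape, labels.shape)
```

A PySpark implementation would presumably need to apply the same formula to each partition of a distributed dataset instead of a local dict, which is the part that still needs design work.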
Currently, SparkR supports a GLM module with a formula feature.
I can take a shot at implementing the feature.