I found it confusing that a Row with an omitted field is treated differently from a Row where the field is present but its value is missing (None). This was originally a problem for JSON files with varying fields, but it comes down to something like:
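from pyspark.sql import Row

def test(rows):
    ds = sc.parallelize(rows)  # sc: an existing SparkContext
    df = sqlContext.createDataFrame(ds, None, 1)  # schema=None (inferred), samplingRatio=1

test([Row(x=1, y=None), Row(x=2, y='asdf')])  # Works
test([Row(x=1), Row(x=2, y='asdf')])  # Fails with an ArrayIndexOutOfBoundsException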
Maybe more could be said in the documentation for createDataFrame or Row about what's expected. Validation or correction would be helpful, as would a function that creates a well-formed Row from a StructType and a dictionary.
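
As a rough illustration of the kind of helper I mean, here is a minimal sketch in plain Python. The name normalize_record is hypothetical and not part of PySpark. It assumes the field names come from the schema (for example schema.fieldNames() on a StructType), fills omitted fields with None, and rejects fields the schema does not know about.

```python
def normalize_record(field_names, record):
    """Return a dict with exactly the keys in field_names.

    Hypothetical helper, not a PySpark API: fields missing from
    `record` are set to None; fields not in `field_names` raise.
    """
    unknown = set(record) - set(field_names)
    if unknown:
        raise ValueError("unexpected fields: %s" % sorted(unknown))
    return {name: record.get(name) for name in field_names}


# Plain-Python demonstration; no Spark needed for this part.
records = [{"x": 1}, {"x": 2, "y": "asdf"}]
normalized = [normalize_record(["x", "y"], r) for r in records]

# Assumed usage with PySpark (not executed here):
#   from pyspark.sql import Row
#   rows = [Row(**normalize_record(schema.fieldNames(), r)) for r in records]
```

With every record carrying the same set of keys, the resulting Rows would all have the same fields, which is the shape the "Works" case above has. Whether Row keeps the keyword order or sorts the fields by name depends on the Spark version, so this is only a starting point.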