This might be related to ARROW-6548 and other issues dealing with all-NaN columns. When creating a dictionary array, the requested type is not respected if the data contains only NaNs, even when that type is fully specified:
which means that one cannot e.g. serialize batches of categoricals if the possibility of all-NaN batches exists, even when trying to enforce that each batch has the same schema (because the schema is not respected).
I understand that inferring the type in this case would be difficult, but I'd expect a fully specified type to be respected rather than inferred?
In the meantime, is there a workaround to manually create a dictionary array of the desired type containing only NaNs?