Fix Version/s: None
Just to clarify, this isn't a request to increase the default warn threshold for batch sizes. It's a general discussion to clarify how batches behave and perhaps change the current behaviour of size warnings.
When using large batches you get a warning if the batch size exceeds batch_size_warn_threshold_in_kb, but is this warning always necessary?
I know that using batches for performance usually isn't recommended, but in my understanding that advice applies when the batch contains multiple partitions. Does it apply to single-partition batches as well?
If there isn't a problem with large batches on a single partition, then maybe the size warning shouldn't be there (or should use a separate threshold for single partitions?). If there is, then maybe a warning should be added for insert/update as well? I realise that getting to that size for a single insert is harder, but it's still possible (e.g. storing a file in a blob, which I guess is discouraged, so a warning might be good?).
So I guess that depending on what the problem is there are three courses of action:
If the size of a single partition isn't a problem
- Check if the batch involves multiple partitions before warning.
If the size of a single partition isn't as much of a problem as multiple partitions
- Create a separate warn threshold for single partitions.
- Maybe add the same warning for insert/update.
If the size is always a problem
- Add the same warning for insert/update.
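
As a rough sketch of how the three options above differ, the check could look something like the following. This is illustrative Python, not Cassandra's actual code: `Policy`, `should_warn` and `single_partition_warn_threshold_kb` are made-up names, and only `batch_size_warn_threshold_in_kb` corresponds to an existing setting. A plain insert/update is modelled as a statement touching exactly one partition key.

```python
from enum import Enum


class Policy(Enum):
    # Hypothetical names for the three courses of action listed above.
    MULTI_PARTITION_ONLY = 1  # single-partition size isn't a problem
    SEPARATE_THRESHOLDS = 2   # single-partition size is a lesser problem
    ALWAYS = 3                # size is always a problem


def should_warn(size_bytes, partition_keys, policy,
                warn_threshold_kb,
                single_partition_warn_threshold_kb=None):
    """Return True if a size warning would be logged under the given policy.

    warn_threshold_kb stands in for batch_size_warn_threshold_in_kb;
    single_partition_warn_threshold_kb is an assumed new setting.
    """
    multi_partition = len(set(partition_keys)) > 1
    over_default = size_bytes > warn_threshold_kb * 1024

    if policy is Policy.MULTI_PARTITION_ONLY:
        # Only warn when the batch spans more than one partition.
        return multi_partition and over_default

    if policy is Policy.SEPARATE_THRESHOLDS:
        if multi_partition:
            return over_default
        if single_partition_warn_threshold_kb is None:
            return False
        # Single-partition batches (and single inserts/updates) get their
        # own, presumably larger, threshold.
        return size_bytes > single_partition_warn_threshold_kb * 1024

    # Policy.ALWAYS: same threshold for batches and single inserts/updates.
    return over_default
```

For example, with a 5 KB threshold, a 10 KB batch against one partition would warn only under `Policy.ALWAYS`, while the same batch spread over two partitions would warn under all three policies.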