There are several points to clear up or improve across these pages:
- I'd refer to the Hive documentation for how to set compression codecs, instead of documenting Hive's behaviour for file formats Impala cannot write.
- Add an 'Ingesting file formats Impala can't write' section to the 'How Impala Works with Hadoop File Formats' page, and link to that central location from wherever applicable. Unify the recommendation on data loading (LOAD DATA, Hive, or manual copy).
- Add a compatibility matrix for compression codecs and file formats, and clear up the compatibility information on the 'How Impala Works with Hadoop File Formats' page (the page is inconsistent even within itself, e.g. about bzip2).
- Remove references to Impala versions earlier than 2.0.