Creating a table that points to existing data in S3 can take an excessive amount of time.
If the Hive Metastore is configured with "hive.stats.autogather=true", Hive lists the files of each newly created table to populate basic statistics such as file count and file sizes in bytes. This listing operation can be very slow, particularly on S3.
- Reconfigure the Hive Metastore with "hive.stats.autogather=false".
- Note that TBLPROPERTIES("DO_NOT_UPDATE_STATS"="true") does not address the issue because of a bug in Hive.
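
As a minimal sketch of the reconfiguration, the property can be set in the metastore's hive-site.xml. The file's location, and whether the metastore service must be restarted for the change to take effect, depend on the deployment and are assumptions here.

```xml
<!-- hive-site.xml fragment (path and restart procedure vary by deployment) -->
<property>
  <name>hive.stats.autogather</name>
  <!-- false: skip listing table files to gather basic statistics -->
  <value>false</value>
</property>
```

With automatic gathering disabled, basic statistics are no longer populated when a table is created, so anything that relies on them may need statistics computed separately.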