The following are portions of a user group discussion on adding comments to views, and to columns within views, in Apache Drill.
This could be a great way to add contextual information for users. Here are some observations from issuing a describe view_myview:
1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE.
2. Even though the underlying Parquet table has types, the view does not pass those types through. (The type is ANY.)
3. The data for the view is all just a JSON file, which could easily be extended.
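
To make the third observation concrete, here is a minimal sketch of what extending the view's JSON might look like. The key names ("name", "sql", "fields", "type", "isNullable") are assumptions used for illustration, and the "description" keys are the hypothetical extension being proposed, not something Drill reads today.

```python
import json

# Hypothetical view definition, loosely modeled on a JSON view file.
# All key names here are assumptions for illustration.
view_json = """
{
  "name": "view_myview",
  "sql": "SELECT id, name FROM dfs.tmp.`mytable`",
  "fields": [
    {"name": "id", "type": "ANY", "isNullable": true},
    {"name": "name", "type": "ANY", "isNullable": true}
  ]
}
"""


def add_descriptions(view, view_description, column_descriptions):
    """Return a copy of the view definition with description fields added.

    "description" is the proposed (hypothetical) field, at both the view
    level and the column level. Columns without a comment get "".
    """
    extended = dict(view)
    extended["description"] = view_description
    extended["fields"] = [
        {**field, "description": column_descriptions.get(field["name"], "")}
        for field in view["fields"]
    ]
    return extended


view = json.loads(view_json)
extended = add_descriptions(
    view,
    "Customer lookup view",
    {"id": "Primary key", "name": "Display name"},
)
print(json.dumps(extended, indent=2))
```

Because the extra keys are purely additive, a reader that ignores unknown keys would presumably keep working with the extended file, though that depends on how the view file is actually parsed.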
So, a few things would be nice to have:
1. Table comments. When I issue a describe table, if the view has a "Description" field, it would be nice to have that print out as a description for the whole view. This is harder, I think, because it's not just extending the view information.
2. Column comments: a text field that could be added to the view and printed out as another column with the description. This would be very helpful. While Drill being schemaless is awesome, the ability to add information to known data is huge.
3. Ability to use the types from the Parquet files (without manually specifying each type). If we could provide an option at view creation to attempt to infer types, that would be handy. I realize that folks are using LIMIT 0 to get metadata, but describe could do this well too.
4. Ability to update the view column descriptions, and the description for the view itself, using ANSI SQL.
5. I believe Avro has the ability to add this information to the files, so if the data exists outside of views (such as in Avro files), should we present it to the user in describe table output as well?
6. This is not a request for the ability to have comments on all data types that Drill could potentially support. By limiting this feature to views, it stays reliant only on Drill code, thus making things simpler to maintain.
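
As a rough illustration of items 1 and 2 in the wish list, the sketch below builds describe-style output with an extra DESCRIPTION column and a view-level description. The dictionary layout and the describe_rows helper are hypothetical; only the three existing column names come from the observations above.

```python
# Hypothetical extended view definition. The "description" keys are the
# proposed addition; the remaining key names are assumptions for illustration.
view = {
    "name": "view_myview",
    "description": "Customer lookup view",
    "fields": [
        {"name": "id", "type": "ANY", "isNullable": True, "description": "Primary key"},
        {"name": "name", "type": "ANY", "isNullable": True},
    ],
}


def describe_rows(view):
    """Build describe-style rows with a fourth DESCRIPTION column.

    Columns that carry no comment get an empty string, so views created
    before the extension would still describe cleanly.
    """
    header = ("COLUMN_NAME", "DATA_TYPE", "IS_NULLABLE", "DESCRIPTION")
    rows = [
        (
            field["name"],
            field["type"],
            "YES" if field["isNullable"] else "NO",
            field.get("description", ""),
        )
        for field in view["fields"]
    ]
    return [header] + rows


# View-level comment first, then the per-column rows.
print("View: {} ({})".format(view["name"], view.get("description", "")))
for row in describe_rows(view):
    print(" | ".join(row))
```

Keeping the new column last means anything that reads only the first three columns by position would be unaffected.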