The patch adds a new SchemaCheckService and SchemaCheckXCommand. It uses the java Connection to get metadata on the database that Oozie is configured to use. Because it's not using OpenJPA stuff, we have to make a new connection to the database, which means it isn't going to work for Derby. It checks the following:
- The expected tables
- The expected columns, their types, and default values
- The expected indexes and primary keys
To make things easier to maintain in the future when we next change the database schema, a lot of the expected results are generated dynamically using reflection to inspect the Bean classes.
There's a few places where different databases behave differently or want different things (e.g. MySQL and Oracle want uppercase table names but Postgres uses lowercase), so there's some checks for those. Otherwise I tried to keep it as generic as possible.
There's configs for:
- The interval the checker runs at. I set it to run once a week. It's also hardcoded to run 1 hour after Oozie startup
- To ignore extra columns, tables, indexes. I imagine some users, who hopefully know what they're doing, may want to add extra indexes or other things, so turning this on will make the checker not count those as "bad"
The checker also exposes an instrumentation metrics to report the status of the database schema and the last time it was run. The specific problems it finds are logged as ERRORs. There's also a bunch of DEBUG logging for things that are correct.
I've tested this against MySQL, PostgreSQL, and Oracle and it works correctly. Though I'm not a database expert, and my Oracle install was kind of hacky, so any input or additional testing from Rohini Palaniswamy, Purshotam Shah, Mona Chitnis, Ryota Egashira would be helpful there. Oracle was also a bit different in that instead of a catalog (i.e. the database name which the Connection determined from the URL) it used a schema that I think is hardcoded to "OOZIE", but I'm a little iffy on that.
I also don't have access to an MS SQLServer, so I disabled that for now. Bowen Zhang, could you try this against a SQLServer and make any tweaks that it needs? The core code should be correct, but you may have to make minor if statements like I did for the others. We can either do that as a followup JIRA or here.