A few properties in TikaCoreProperties overlap. I think we should minimize ambiguity by consolidating each overlapping group into a single composite property that uses the clearest name, references the most general specification as its primary property, and lists the others, along with the deprecated strings, as its secondaries.
Here's the proposed pseudo-code for the changes:
TikaCoreProperties.KEYWORDS <- DublinCore.SUBJECT,
TikaCoreProperties.CREATION_DATE <- DublinCore.DATE,
TikaCoreProperties.SAVE_DATE <- DublinCore.MODIFIED,
and an example of the Java changes:
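To make the idea concrete, here is a rough, self-contained sketch of the composite pattern rather than the actual patch. The `Property` and `Metadata` classes in it are simplified stand-ins written for this illustration, not the real Tika classes. The key names `dc:subject` and `legacy:keywords` are assumptions, and `legacy:keywords` in particular is a hypothetical placeholder for whichever deprecated key ends up as a secondary.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: Property and Metadata are simplified stand-ins,
// not the real org.apache.tika.metadata classes.
class CompositeSketch {

    /** A named property that may mirror its value into secondary properties. */
    static final class Property {
        final String name;
        final List<Property> secondaries;

        private Property(String name, List<Property> secondaries) {
            this.name = name;
            this.secondaries = secondaries;
        }

        static Property simple(String name) {
            return new Property(name, Collections.<Property>emptyList());
        }

        /** The composite takes the primary's name and remembers the secondaries. */
        static Property composite(Property primary, Property... secondaries) {
            return new Property(primary.name, Arrays.asList(secondaries));
        }
    }

    /** Minimal metadata store keyed by property name. */
    static final class Metadata {
        private final Map<String, String> values = new HashMap<>();

        /** Writes the value under the property's own name and under every secondary. */
        void set(Property property, String value) {
            values.put(property.name, value);
            for (Property secondary : property.secondaries) {
                set(secondary, value);
            }
        }

        String get(String name) {
            return values.get(name);
        }
    }

    // Assumed key for the Dublin Core subject, used here as the primary.
    static final Property DC_SUBJECT = Property.simple("dc:subject");
    // Hypothetical deprecated key kept as a secondary for older consumers.
    static final Property LEGACY_KEYWORDS = Property.simple("legacy:keywords");

    // One consolidated property: clear name, general primary, legacy secondary.
    static final Property KEYWORDS = Property.composite(DC_SUBJECT, LEGACY_KEYWORDS);

    public static void main(String[] args) {
        Metadata metadata = new Metadata();
        metadata.set(KEYWORDS, "tika");
        System.out.println(metadata.get("dc:subject"));      // prints "tika"
        System.out.println(metadata.get("legacy:keywords")); // prints "tika"
    }
}
```

The point of this shape is that parsers would only ever set the consolidated property, while consumers still reading a legacy key keep seeing the same value during the deprecation period.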
Since this would require a bit of refactoring in the parsers that use the properties being removed, I thought it best to get some feedback before working up a full patch.
Does this seem like a reasonable approach?