Right now the connector overwrites the Tika metadata fields "creation_date" and "last_modification_date" for a document. This happens because the Windows share itself provides a creation_date and a last_modification_date (reflecting when the file was created and modified in the share's file system), and these differ from the creation_date and last_modification_date stored in the original file.
The metadata names need to change so that these two layers of dates can be told apart, giving the user the flexibility to pick whichever one he/she wants through a proper mapping.
As a bonus, the dates could be formatted in the Lucene standard date format, so that they follow a well-defined convention.
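
A minimal sketch of the idea, not the connector's actual code: the share-level dates are stored under their own names instead of replacing the Tika ones. The "share_" prefix and the function names are hypothetical, and the sketch assumes that the "Lucene standard" means the ISO 8601 UTC form (for example 2014-01-02T03:04:05Z); if another format is intended, only format_date would change.

```python
from datetime import datetime, timezone

# Hypothetical prefix marking dates that come from the share's file system.
SHARE_PREFIX = "share_"


def format_date(ts_seconds: float) -> str:
    """Format a POSIX timestamp as ISO 8601 in UTC (assumed target format)."""
    moment = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def merge_metadata(tika_metadata: dict, share_created: float,
                   share_modified: float) -> dict:
    """Add the share-level dates without touching the Tika-extracted ones."""
    merged = dict(tika_metadata)  # copy, so the Tika values stay intact
    merged[SHARE_PREFIX + "creation_date"] = format_date(share_created)
    merged[SHARE_PREFIX + "last_modification_date"] = format_date(share_modified)
    return merged
```

With this layout a mapping can point at either "creation_date" (from the file itself) or "share_creation_date" (from the file system), and neither hides the other.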
- URL metadata:
It can be useful to extract the URL and store it in a dedicated metadata field (in addition to using it as the ID of the document). This way it remains the ID, but it can also be used in other mappings without affecting the ID field.
- Parent directory path:
It can be useful to extract the path of the directory that contains the current file. This should be evaluated carefully, since it could turn out to be either redundant or a real improvement.
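
The two items above could be sketched roughly as follows. This is only an illustration under assumptions: it supposes that the document ID is a URL such as smb://server/share/folder/file.docx, and the field names "url" and "parent_directory_path" and the function name are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def location_metadata(document_id: str) -> dict:
    """Derive URL and parent-directory metadata from the document ID.

    Assumes the ID is a URL whose path uses "/" as the separator.
    """
    path = urlsplit(document_id).path
    return {
        # The same value as the ID, but in its own field for other mappings.
        "url": document_id,
        # Path of the directory that contains the file.
        "parent_directory_path": posixpath.dirname(path),
    }
```

Since the parent path can always be derived from the URL, storing it is redundant in a strict sense; its benefit would be to make filtering or faceting by directory possible without parsing the URL at query time.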