Description
In several different corpora I've found HTML files that look like the following:
<html>
  <head>
    <title>Some title</title>
    <meta name="title" content="some other title">
  </head>
  ...
</html>
As a result, the "title" property in the metadata ends up with two values (one from the <title> element and one from the <meta> tag), even though one would expect this field to be single-valued.
Perhaps fields taken from <meta> tags, like this one, should be namespaced so that they cannot collide with properties derived from other elements such as <title>.
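
To make the collision and the namespacing idea concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not the parser this report is about. The `MetadataCollector` class, the `extract` function, and the `meta:` prefix are all hypothetical names chosen for illustration; the actual namespacing scheme would be up to the maintainers.

```python
from html.parser import HTMLParser


class MetadataCollector(HTMLParser):
    """Collects <title> text and <meta name=... content=...> pairs.

    With namespace_meta=False, both sources write to the same key, which
    reproduces the collision described above. With namespace_meta=True,
    names from <meta> tags get a hypothetical "meta:" prefix.
    """

    def __init__(self, namespace_meta=False):
        super().__init__()
        self.namespace_meta = namespace_meta
        self.metadata = {}  # property name -> list of values
        self._in_title = False

    def _add(self, key, value):
        self.metadata.setdefault(key, []).append(value)

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            name, content = attrs.get("name"), attrs.get("content")
            if name and content is not None:
                # Assumption: a simple string prefix is enough to namespace.
                key = "meta:" + name if self.namespace_meta else name
                self._add(key, content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is recorded; other text is ignored.
        if self._in_title and data.strip():
            self._add("title", data.strip())


def extract(html, namespace_meta=False):
    parser = MetadataCollector(namespace_meta=namespace_meta)
    parser.feed(html)
    parser.close()
    return parser.metadata


SAMPLE = (
    "<html> <head> <title>Some title</title> "
    '<meta name="title" content="some other title"> </head> </html>'
)

# Without namespacing, "title" receives both values.
print(extract(SAMPLE))
# With namespacing, the <meta> value lands under "meta:title" instead.
print(extract(SAMPLE, namespace_meta=True))
```

In this sketch the first call yields two values under "title", while the second keeps "title" single-valued and stores the <meta> value under a separate key. Whether every <meta> name should be prefixed, or only those that clash with known properties, is a design decision this example does not try to settle.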