Description
When parsing an RTF file whose TITLE metadata field is empty, the resulting HTML contains a self-closing title tag:
$ java -jar tika-app-1.1.jar -h test.rtf
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Content-Length" content="830468"/>
<meta name="Content-Type" content="application/rtf"/>
<meta name="resourceName" content="test.rtf"/>
<title/>
</head>
[...]
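I believe the self-closing form should not be used here: the XHTML 1.0 HTML compatibility guidelines (http://www.w3.org/TR/xhtml1/#C_3) say not to use the minimized form for an empty instance of an element whose content model is not EMPTY, and title is such an element. (However, no XHTML doctype is generated here, just a namespace.) In any case, browsers that parse the output as HTML, such as Chrome, treat <title/> as an ordinary start tag that is never closed, so the rest of the document is absorbed into the title and a blank page is displayed.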
The expected output is an empty element with explicit start and end tags: <title></title>
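
The difference between the two forms can be reproduced outside Tika with a minimal sketch using the JDK's own JAXP serializer. This is not Tika's code and not a proposed patch: the class name EmptyTitleDemo and the method serialize are hypothetical, and the document built here has no namespace, unlike the output in this report, so a namespaced document may serialize differently. The sketch serializes the same empty title element once with the "xml" output method and once with the "html" output method so the two results can be compared.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of Tika.
public class EmptyTitleDemo {

    // Builds <html><head><title></title></head></html> as a DOM tree
    // (no namespace, an assumption of this sketch) and serializes it
    // with the given JAXP output method ("xml" or "html").
    public static String serialize(String method) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element html = doc.createElement("html");
        Element head = doc.createElement("head");
        Element title = doc.createElement("title"); // deliberately left empty
        doc.appendChild(html);
        html.appendChild(head);
        head.appendChild(title);

        // Identity transform: copies the DOM to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, method);
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("xml method:  " + serialize("xml"));
        System.out.println("html method: " + serialize("html"));
    }
}
```

With the serializer bundled in the JDK, the "xml" method should minimize the empty element while the "html" method should write both the start and end tags, which is the form requested above. The "html" method may also add a META element to head; that is a side effect of the serializer and unrelated to this issue.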