I get a lot of messages in the log like:
WARNING: No Unicode mapping for 7 (7) in font null
Oct 17, 2020 5:00:23 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode
WARNING: No Unicode mapping for 8 (8) in font null
Oct 17, 2020 5:00:23 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode
The extracted document looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html><head><title>https://2brightsparks.onfastspring.com/...unt/order/2BR180831-9532-88246/invoice</title>
<meta http-equiv="Content-Type" content="text/html; charset="UTF-8">
</head>
<body>
<div style="page-break-before:always; page-break-after:always"><div><p>�� 	  				 ����  ! 
</p>
<p>	  				 ����  !  
</p>
<p>"#$%"&'
%()*(+",+-+./01232415674.522.89+
:;<)+"=+>?@+-+A?B+41C+.312
</p>
<p>D*@*(
</p>
<p>E	F G
</p>
<p>�GHGFG
</p>
<p>F GG
</p>
<p>FGIGJKGG
</p>
<p>LF
</p>
<p>	M		 G
</p>
<p>NKOGPG GQL� 
</p>
<p>:(R)?ST+D?UUR(T+
</p>
<p> VVV 		G
</p>
<p>	M 		
</p>
<p>/?W*(
</p>
<p>XYG!GGZG
</p>
<p>[G G
</p>
<p>G
</p>
<p>P\	
etc...etc...