Description
PDFBox reports "Unknown encoding for 'GBK-EUC-H'" when processing Chinese PDF documents. Proposed fix:
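1. Add a method to org.apache.pdfbox.pdmodel.font.PDFont:
public String getEncodingName() {
    COSBase encoding = font.getDictionaryObject(COSName.ENCODING);
    if (encoding != null) {
        if (encoding instanceof COSName) {
            // Body missing from the original text; assumed to return the encoding's name.
            return ((COSName) encoding).getName();
        }
    }
    return null;
}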
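2. Modify the encode method
from
if( retval == null && cmap != null )
{
    // Body missing from the original text; assumed to be the plain CMap lookup.
    retval = cmap.lookup( c, offset, length );
}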
//if we havn't found a value yet and
//we are still on the first byte and
//there is no cmap or the cmap does not have 2 byte mappings then try to encode
//using fallback methods.
to
if( retval == null && cmap != null )
{
    String encodingStr = getEncodingName();
    if (encodingStr != null) {
        EncodingConverter converter = EncodingConversionManager.getConverter(encodingStr);
        if (converter != null) {
            // Statement missing from the original text; assumed to convert via the converter.
            retval = converter.convertBytes( c, offset, length, cmap );
        } else {
            retval = cmap.lookup( c, offset, length );
        }
    } else {
        retval = cmap.lookup( c, offset, length );
    }
}
//if we havn't found a value yet and
//we are still on the first byte and
//there is no cmap or the cmap does not have 2 byte mappings then try to encode
//using fallback methods.
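
For illustration, here is a minimal standalone sketch of what a converter for 'GBK-EUC-H' might do: map the CMap name to a Java charset and decode the raw bytes with it. The class CMapCharsetSketch, its mapping table, and the decode method are hypothetical and are not part of PDFBox; the sketch also assumes the JVM ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: illustrates the idea behind an encoding converter.
class CMapCharsetSketch {
    // Assumed mapping from predefined CMap names to Java charset names.
    private static final Map<String, String> CMAP_TO_CHARSET = new HashMap<>();
    static {
        CMAP_TO_CHARSET.put("GBK-EUC-H", "GBK");
        CMAP_TO_CHARSET.put("GBK-EUC-V", "GBK");
    }

    // Returns the decoded text, or null when no charset is known or available,
    // so that a caller can fall back to the regular CMap lookup.
    static String decode(String cmapName, byte[] c, int offset, int length) {
        String charsetName = CMAP_TO_CHARSET.get(cmapName);
        if (charsetName == null || !Charset.isSupported(charsetName)) {
            return null;
        }
        return new String(c, offset, length, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 0xD6, (byte) 0xD0 }; // GBK bytes for U+4E2D
        System.out.println(decode("GBK-EUC-H", bytes, 0, bytes.length));
    }
}
```

Returning null for an unknown name mirrors the "converter != null" check in the modified encode method: when no converter applies, the existing cmap.lookup path is used unchanged.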