PDFBox / PDFBOX-2083

Some characters overlap other characters, font changed

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
      windows8

      Description

      Hi, please forgive my English.
      I tried to convert a PDF file to images using PDFBox 1.8.4, as bundled in tika-app-1.5.jar.
      The JPEG files I got were not what I expected.
      The content of the images was different from the PDF file.
      Some characters were in different places, and some characters overlapped others.
      The console printed many warnings like these:
      '13:49:07,094 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font
      13:49:07,094 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font
      13:49:07,095 WARN [PDSimpleFont:107] Changing font on <y> from <Courier New Italic> to the default font
      13:49:07,095 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font
      ...'
      Could you tell me how to solve this problem and get correct images?
      Thanks a lot.
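
      The warnings quoted above name both the glyph and the font that was replaced, so tallying them shows which fonts the renderer could not use. Below is a minimal sketch using only the Java standard library. The class `FontWarningTally` and its method `tally` are hypothetical helpers, not part of PDFBox, and the pattern assumes the exact message wording shown in this report.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of PDFBox): counts how often each font
// is reported as replaced by the default font in the console output.
class FontWarningTally {

    // Assumes the message format quoted in the report:
    // "Changing font on <l> from <Courier New Italic> to the default font"
    private static final Pattern WARNING = Pattern.compile(
            "Changing font on <(.*?)> from <(.*?)> to the default font");

    // Returns font name -> number of warnings, in order of first appearance.
    static Map<String, Integer> tally(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = WARNING.matcher(line);
            if (m.find()) {
                counts.merge(m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The four warning lines quoted in the report.
        List<String> lines = Arrays.asList(
            "13:49:07,094 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font",
            "13:49:07,094 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font",
            "13:49:07,095 WARN [PDSimpleFont:107] Changing font on <y> from <Courier New Italic> to the default font",
            "13:49:07,095 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font");
        System.out.println(tally(lines)); // prints {Courier New Italic=4}
    }
}
```

      For the lines quoted here, the only font reported is Courier New Italic, which is consistent with the misplaced and overlapping characters coming from glyphs drawn in a substitute font.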

      I have attached the PDF file and one of the images.
      Here is my code:
      PDDocument doc = PDDocument.load(input + ".pdf");
      List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
      for (int i = 0; i < pages.size(); i++) {
          PDPage page = pages.get(i);
          // Render the page to an image (PDFBox 1.8 API)
          BufferedImage image = page.convertToImage();
          Iterator<ImageWriter> iter = ImageIO.getImageWritersBySuffix("JPG");
          ImageWriter writer = iter.next();
          File outFile = new File(input + i + ".jpg");
          FileOutputStream out = new FileOutputStream(outFile);
          ImageOutputStream outImage = ImageIO.createImageOutputStream(out);
          writer.setOutput(outImage);
          writer.write(new IIOImage(image, null, null));
          writer.dispose();
          // Close the image stream first so buffered data reaches the file
          outImage.close();
          out.close();
      }
      doc.close();
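
      The image-writing part of the loop above can be isolated into a small helper that closes its streams even when writing fails. This is only a sketch of that step using the standard `javax.imageio` API. It does not touch PDFBox and does not address the font substitution reported in the warnings. The class `JpegPageWriter`, the method `writeJpeg`, and the quality value are assumptions made for illustration.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

// Hypothetical helper: writes one rendered page image as a JPEG file.
class JpegPageWriter {

    static void writeJpeg(BufferedImage image, File outFile, float quality)
            throws IOException {
        // JPEG has no alpha channel, so copy onto an opaque white RGB canvas.
        BufferedImage rgb = new BufferedImage(
                image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g.drawImage(image, 0, 0, null);
        } finally {
            g.dispose();
        }

        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpeg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG writer available");
        }
        ImageWriter writer = writers.next();

        // Remove any stale file so old bytes cannot trail the new image.
        Files.deleteIfExists(outFile.toPath());
        try (ImageOutputStream out = ImageIO.createImageOutputStream(outFile)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality); // 0.0 (smallest) to 1.0 (best)
            writer.setOutput(out);
            writer.write(null, new IIOImage(rgb, null, null), param);
        } finally {
            writer.dispose();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a rendered page: a blank 200x100 image.
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_ARGB);
        File out = File.createTempFile("page", ".jpg");
        writeJpeg(page, out, 0.9f);
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

      In the loop above, the call would be `JpegPageWriter.writeJpeg(image, new File(input + i + ".jpg"), 0.9f)`, where 0.9 is an arbitrary quality setting.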

        Attachments

        1. technical-guide.pdf
          468 kB
          Webrtc Go
        2. vgsdmuuhd5ak03orqudq10.jpg
          216 kB
          Webrtc Go

              People

              • Assignee:
                Unassigned
              • Reporter:
                webrtcgo Webrtc Go
              • Votes:
                0
              • Watchers:
                2