The BaseParser#readUntilEndStream(OutputStream) method parses streams incorrectly.
It reads a stream until it reaches the keyword "endstream" and ignores the Length value in the stream dictionary. This implementation breaks nearly every PDF document that has another PDF embedded inside a stream.
Encoders used for compressing streams can be block-based (like FlateDecode, which is the most commonly used). If compressing a block of data would not save space, the encoder leaves that block uncompressed and marks it as such. A stream can therefore contain both compressed and uncompressed parts. If someone embeds a PDF document that itself contains streams inside a stream, the encoder will leave most parts of that document uncompressed. Such parts can contain plain text like "endstream" or other critical keywords that cause the parser to stop.
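The effect can be reproduced with a minimal sketch using java.util.zip.Deflater. To keep the result deterministic, the sketch forces stored (uncompressed) blocks with Deflater.NO_COMPRESSION instead of relying on the encoder's own decision; the class and helper names are made up for illustration and are not part of PDFBox.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class StoredBlockDemo {

    /** Deflates data at the given level and returns the zlib-wrapped bytes. */
    static byte[] deflate(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Returns the index of the first occurrence of needle in haystack, or -1. */
    static int indexOf(byte[] haystack, byte[] needle) {
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a fragment of an embedded PDF (assumed content).
        byte[] inner = "stream\nHello\nendstream\nendobj\n"
                .getBytes(StandardCharsets.US_ASCII);
        // NO_COMPRESSION forces stored blocks, so the input bytes are copied verbatim.
        byte[] encoded = deflate(inner, Deflater.NO_COMPRESSION);
        int pos = indexOf(encoded, "endstream".getBytes(StandardCharsets.US_ASCII));
        System.out.println("\"endstream\" found at encoded offset " + pos);
    }
}
```

A parser that scans the encoded bytes for "endstream" would stop at that offset and hand a truncated stream to the decoder.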
So we need to read the full stream length that was written in the dictionary and not look for the "endstream" keyword until that length has been consumed.
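A rough sketch of length-based reading is shown here. It is not the PDFBox implementation or the announced patch: readStreamBody and its parameters are hypothetical names, and the sketch assumes the Length value from the dictionary is already known and correct.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class LengthBasedStreamReader {

    private static final byte[] KEYWORD =
            "endstream".getBytes(StandardCharsets.US_ASCII);

    /**
     * Reads exactly 'length' bytes of stream data, then checks that the
     * "endstream" keyword follows (after an optional end-of-line marker).
     */
    static byte[] readStreamBody(InputStream in, int length) throws IOException {
        byte[] body = new byte[length];
        // readFully throws EOFException if fewer than 'length' bytes are available.
        new DataInputStream(in).readFully(body);

        // Skip the end-of-line marker that may precede the keyword.
        int c = in.read();
        while (c == '\r' || c == '\n') {
            c = in.read();
        }
        // Only now, after 'length' bytes were consumed, look at the keyword.
        for (int i = 0; i < KEYWORD.length; i++) {
            if (c != KEYWORD[i]) {
                throw new IOException("Expected 'endstream' after " + length + " bytes");
            }
            if (i < KEYWORD.length - 1) {
                c = in.read();
            }
        }
        return body;
    }

    public static void main(String[] args) throws IOException {
        // The body itself contains the text "endstream"; a keyword scan would cut it short.
        String body = "BT endstream ET";
        byte[] data = (body + "\nendstream\n").getBytes(StandardCharsets.US_ASCII);
        byte[] read = readStreamBody(new ByteArrayInputStream(data), body.length());
        System.out.println(new String(read, StandardCharsets.US_ASCII));
    }
}
```

In this example the body holds the keyword as ordinary data, yet all 15 bytes are returned because the keyword is only checked after the declared length has been read.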
The current stream parser causes a ZIPException with the message "Unexpected end of ZLIB input stream".
A sample PDF and a patch are coming soon.
 PDF 32000-1:2008 -> 7.3.8.2 Stream Extent
 PDF 32000-1:2008 -> 7.11.4 Embedded File Streams
Is duplicated by: PDFBOX-1106 PDFMergerUtility corrupts file attachments