Bug 47271 - StyleSheet.createChp stack overflow - parent style description is itself
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.5-dev
Hardware: All Windows XP
Importance: P2 critical with 2 votes
Target Milestone: ---
Assignee: POI Developers List
 
Reported: 2009-05-27 00:34 UTC by Antony Bowesman
Modified: 2010-09-19 06:18 UTC

Description Antony Bowesman 2009-05-27 00:34:12 UTC
I have a document that gets stuck in createChp().  Debugging it reveals that in createChp() the baseIndex is the same as the input parameter istd, and parentCHP is null, so it recurses approximately 1021 times before overflowing the stack.

Is the solution something like this?

       if(baseIndex != NIL_STYLE)
       {
           parentCHP = _styleDescriptions[baseIndex].getCHP();
           if(parentCHP == null && baseIndex != istd)
           {
               createChp(baseIndex);
               parentCHP = _styleDescriptions[baseIndex].getCHP();
           }
       }
Comment 1 Antony Bowesman 2009-05-27 00:58:34 UTC
Of course it's not that solution... parentCHP cannot be null.

The offending StyleDescription is "Footnote Reference" and has the following attributes in the debugger.

_baseLength = 10
_bchUpe = 52
_chp = null
_infoShort = 38
_infoShort2 = 370
_infoShort3 = 369
_infoShort4 = 0
_istd = 0
name = "Footnote Reference"
_pap = null
_upxs = IPX[1]

Any ideas on how to fix this?
Comment 2 Wes Freeman 2009-08-31 06:18:50 UTC
I would much rather have it throw an exception, so I can handle it, instead of recurse until it crashes the thread.
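
One way to get that behaviour is to walk the parent chain before recursing and throw as soon as a style index repeats. The following is only a sketch: StyleChainCheck and assertNoCycle are hypothetical names, not part of POI, and the value 4095 (istdNil in the Word binary format) is assumed to match POI's NIL_STYLE.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of POI: checks a style's parent chain for cycles.
final class StyleChainCheck {
    // "No parent" marker; assumed to match POI's NIL_STYLE.
    static final int NIL_STYLE = 4095;

    /**
     * Walks the parent chain starting at istd, where baseIndexes[i] holds the
     * parent index of style i. Throws if any style is reached twice.
     */
    static void assertNoCycle(int[] baseIndexes, int istd) {
        Set<Integer> seen = new HashSet<>();
        int current = istd;
        while (current != NIL_STYLE) {
            if (!seen.add(current)) {
                throw new IllegalStateException(
                    "Style " + istd + " has a cyclic parent chain at index " + current);
            }
            current = baseIndexes[current];
        }
    }

    public static void main(String[] args) {
        int[] bases = {NIL_STYLE, 0, 2}; // style 2 names itself as its parent
        assertNoCycle(bases, 1);          // passes: 1 -> 0 -> no parent
        try {
            assertNoCycle(bases, 2);      // style 2 is its own parent
        } catch (IllegalStateException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

A caller could catch the exception per document and skip the file, rather than losing the whole thread to a StackOverflowError.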
Comment 3 Wes Freeman 2009-08-31 08:17:46 UTC
Adding baseIndex != istd (as Antony mentioned) stops the infinite recursion. It then throws a NullPointerException later in uncompressCHP, since parentCHP is null, which is easier to handle, even though it doesn't successfully extract the text.

I added this, which made it work for my purposes (text extraction only):
          if(baseIndex != NIL_STYLE)
          {
              parentCHP = _styleDescriptions[baseIndex].getCHP();
              if(parentCHP == null && baseIndex != istd)
              {
                  createChp(baseIndex);
                  parentCHP = _styleDescriptions[baseIndex].getCHP();
              }
          }
          if(parentCHP != null)
          {
              chp = (CharacterProperties)CharacterSprmUncompressor.uncompressCHP(parentCHP, chpx, 0);
          }

Thanks for the useful library, by the way.
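
The effect of the guard plus the null check can be seen in isolation with a toy model. This is a sketch only: ToyStyleSheet is a hypothetical class, a single font size stands in for the character properties, the default of 10 is arbitrary, and the value 4095 is assumed to match POI's NIL_STYLE.

```java
// Toy model, not POI code: each style has a parent index and an optional
// font size override; a resolved size plays the role of the CHP.
final class ToyStyleSheet {
    static final int NIL_STYLE = 4095; // "no parent" marker (assumed value)

    private final int[] baseIndex;        // parent style index per style
    private final Integer[] sizeOverride; // null means "inherit from parent"
    private final Integer[] resolved;     // cache, like getCHP() returning null until built

    ToyStyleSheet(int[] baseIndex, Integer[] sizeOverride) {
        this.baseIndex = baseIndex;
        this.sizeOverride = sizeOverride;
        this.resolved = new Integer[baseIndex.length];
    }

    /** Resolves the size of style istd, or returns null if its parent is unusable. */
    Integer resolve(int istd) {
        if (resolved[istd] != null) {
            return resolved[istd];
        }
        Integer parent = 10; // arbitrary default used when there is no parent
        int base = baseIndex[istd];
        if (base != NIL_STYLE) {
            parent = resolved[base];
            if (parent == null && base != istd) { // guard: never recurse into ourselves
                parent = resolve(base);
            }
        }
        if (parent != null) { // skip styles whose parent could not be resolved
            resolved[istd] = sizeOverride[istd] != null ? sizeOverride[istd] : parent;
        }
        return resolved[istd];
    }
}
```

In this model a self-parented style simply stays unresolved (null) and every other style resolves normally. The guard only covers a style that names itself directly; a longer loop such as A -> B -> A would still recurse without end.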
Comment 4 cbamford 2010-08-02 06:42:11 UTC
Can someone please give me an update on this?  We get this problem in POI 3.6 and 3.7 on Linux and are keen to know whether it is scheduled to be fixed in a particular release.

Thanks

- Chris
Comment 5 Nick Burch 2010-08-03 12:21:17 UTC
We need someone to figure out if the problem is caused by a bug in our chp decoding, or if it's correctly decoded but just not something we currently support.

Once we know that, we can decide if the suggested fix is OK to apply, or if we need to revisit the chp decoding code to avoid getting into this situation in the first place.

HWPF currently lacks a pointman, so if this matters to you, please do investigate and report back!
Comment 6 Wes Freeman 2010-09-15 12:48:03 UTC
From what I remember after looking at this issue, it is actually caused by a bad Word document. The parent of the style was the same as the style, or something, so it would keep recursing into the child/parent style until it got an overflow.

I have no idea how the Word documents that gave us issues got into that state--I was going through and extracting text from thousands of documents and this error only happened on a handful (<20 out of 10k). The main issue for me was that it caused my entire web app to crash because of the recursion, so my quick fix resolved that and it's been working fine since.

Wes
Comment 7 Nick Burch 2010-09-19 06:18:25 UTC
I think I've added a fix in r998625 that will hopefully switch the broken styles to be standalone.

However, as no-one has uploaded a sample broken file, I can't be sure :/

If the problem still remains with the fix, please re-open the bug & upload a file for us to test against!
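
For readers who have not looked at the commit, one plausible reading of "standalone" is that a style naming itself as its parent is treated as having no parent at all. The sketch below illustrates that idea only; it is not the code from r998625, StandaloneStyles and detachSelfParents are hypothetical names, and the value 4095 is assumed to match POI's NIL_STYLE.

```java
// Hypothetical helper illustrating the idea, not the actual POI change.
final class StandaloneStyles {
    static final int NIL_STYLE = 4095; // "no parent" marker (assumed value)

    /**
     * Returns a copy of baseIndexes in which every style that names itself
     * as its parent is given no parent instead.
     */
    static int[] detachSelfParents(int[] baseIndexes) {
        int[] fixed = baseIndexes.clone();
        for (int istd = 0; istd < fixed.length; istd++) {
            if (fixed[istd] == istd) {
                fixed[istd] = NIL_STYLE; // broken style becomes standalone
            }
        }
        return fixed;
    }
}
```

With the parent links normalised this way, the broken style is built from default properties plus its own overrides, so text extraction can continue instead of skipping the style.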