Issue 121073 - [From Symphony]Loading performance for xls file with row banded style is bad
Status: RESOLVED FIXED
Product: Calc
Classification: Application
Component: open-import
Version: 3.4.1
Hardware: All All
Importance: P3 normal
Target Milestone: 4.0.0
Assigned To: Wang Lei
Reported: 2012-09-21 03:35 UTC by Wang Lei
Modified: 2012-10-10 07:06 UTC

Issue Type: ENHANCEMENT


Attachments
Sample xls file with row banded style (210.01 KB, application/octet-stream)
2012-09-21 11:08 UTC, Wang Lei

Description Wang Lei 2012-09-21 03:35:29 UTC
Loading performance for an xls file with row banded style is bad.

Take the attached sample.xls as an example.
Comment 1 Wang Lei 2012-09-21 11:07:19 UTC
The original sample.xls is 1.3 MB, which is larger than the attachment size limit.
Comment 2 Wang Lei 2012-09-21 11:08:36 UTC
Created attachment 79615
Sample xls file with row banded style
Comment 3 Wang Lei 2012-09-21 11:28:23 UTC
1. Root Cause:
The Excel file contains rows whose attributes alternate from one row to the next: row 1 has attribute set A, row 2 has attribute set B, row 3 has attribute set A, row 4 has attribute set B, and so on.
AOO uses ScDocument--ScTable--ScColumn to store cells. Every column has a ScAttrArray that stores the cell attributes. Adjacent cells in one column that have the same attributes share a single ScPatternAttr object. With this file, however, most adjacent cells in a column end up with different ScPatternAttr objects.
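To make the effect of banded rows on a per-column attribute array concrete, here is a minimal sketch. It is not the real ScAttrArray: `AttrRun`, `appendRow`, and `countRuns` are hypothetical names, and the run-based layout is an assumption made only to illustrate "one pattern object per run of adjacent cells".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one attribute array entry: a single pattern
// that covers all rows up to and including endRow.
struct AttrRun {
    int endRow;
    int patternId;
};

// Append one row to the run list. A row with the same pattern as the
// previous row extends the last run; a different pattern starts a new run.
void appendRow(std::vector<AttrRun>& runs, int row, int patternId) {
    assert(runs.empty() || row == runs.back().endRow + 1);
    if (!runs.empty() && runs.back().patternId == patternId) {
        runs.back().endRow = row;
    } else {
        runs.push_back({row, patternId});
    }
}

// Count the runs needed for one column of rowCount rows whose pattern
// switches between two attribute sets every bandHeight rows.
std::size_t countRuns(int rowCount, int bandHeight) {
    assert(bandHeight > 0);
    std::vector<AttrRun> runs;
    for (int row = 0; row < rowCount; ++row) {
        appendRow(runs, row, (row / bandHeight) % 2);
    }
    return runs.size();
}
```

In this model, `countRuns(1000, 1000)` needs a single run for a uniformly formatted column, while `countRuns(1000, 1)` needs 1000 runs when the attributes alternate on every row, so each row triggers its own pattern application.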
The call stack in the Excel 2003 loading procedure is as follows:

    scmi.dll!ScPatternAttr::operator==(...)  Line 178  C++
(3) svlmi.dll!SfxItemPool::Put(...)  Line 811 + 0x13  C++
    scmi.dll!ScDocumentPool::Put(...)  Line 609 + 0x12  C++
(2) svtmi.dll!SfxItemPoolCache::ApplyTo(...)  Line 136 + 0x15  C++
    scmi.dll!ScAttrArray::ApplyCacheArea(...)  Line 755 + 0xf  C++
    scmi.dll!ScColumn::ApplyPatternArea(...)  Line 616  C++
    scmi.dll!ScTable::ApplyPatternArea(...)  Line 1908 + 0x1b  C++
    scmi.dll!ScDocument::ApplyPatternAreaTab(...)  Line 3856  C++
    scfiltmi.dll!XclImpXF::ApplyPattern(...)  Line 1211  C++
(1) scfiltmi.dll!XclImpXFBuffer::ApplyPattern(...)  Line 1440  C++
    scfiltmi.dll!XclImpXFRangeBuffer::Finalize()  Line 1766 + 0x2b  C++
    scfiltmi.dll!XclImpRoot::FinalizeTable()  Line 127  C++
    scfiltmi.dll!ImportExcel::EndSheet()  Line 1158  C++
    scfiltmi.dll!ImportExcel8::EndSheet()  Line 336  C++
    scfiltmi.dll!ImportExcel::Eof()  Line 439  C++
    scfiltmi.dll!ImportExcel8::Read()  Line 1118 + 0xb  C++

(1) After a sheet has been loaded, the cell attributes are applied. Each XclImpXFRange covers a run of adjacent cells in one column that share the same XclImpXF (cell attribute set).
(2) To set a cell attribute, the new ScPatternAttr must first be put into the SfxItemPool, and the pooled item is then used to set the cell's attribute. Because most adjacent cells have different attributes, SfxItemPoolCache::ApplyTo() is called more than 50,000 times.
(3) When an item is put into the SfxItemPool, the pool checks whether an equal item is already there. If the item is already in the pool, this check takes a large number of comparisons and a long time. For this kind of Excel file, most ScPatternAttr objects are already in the pool.
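The cost described in (2) and (3) can be illustrated with a toy model. The sketch below is not the SfxItemPool implementation: `Pattern`, `LinearPool`, and `comparisonsForBandedApply` are hypothetical names, and the linear scan over pooled items is an assumption chosen to show how per-call equality checks add up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a cell pattern; not the real ScPatternAttr.
struct Pattern {
    int fontId;
    int fillId;
};

// Toy pool that looks for an equal item by scanning linearly (an
// assumption made for illustration) and counts every comparison.
struct LinearPool {
    std::vector<Pattern> items;
    std::size_t comparisons = 0;

    // Return the index of the pooled item equal to p, adding p if absent.
    std::size_t put(const Pattern& p) {
        for (std::size_t i = 0; i < items.size(); ++i) {
            ++comparisons;
            if (items[i].fontId == p.fontId && items[i].fillId == p.fillId) {
                return i;  // already pooled
            }
        }
        items.push_back(p);
        return items.size() - 1;
    }
};

// Fill the pool with poolSize distinct patterns, then apply two of them
// alternately applyCalls times and return the comparisons spent applying.
std::size_t comparisonsForBandedApply(std::size_t poolSize, std::size_t applyCalls) {
    assert(poolSize >= 2);
    LinearPool pool;
    for (std::size_t i = 0; i < poolSize; ++i) {
        pool.put(Pattern{0, static_cast<int>(i)});
    }
    pool.comparisons = 0;  // measure only the apply phase
    for (std::size_t call = 0; call < applyCalls; ++call) {
        // Alternate between the two patterns stored last in the pool.
        int fill = static_cast<int>(call % 2 == 0 ? poolSize - 1 : poolSize - 2);
        pool.put(Pattern{0, fill});
    }
    return pool.comparisons;
}
```

With 100 pooled patterns and 50,000 apply calls, `comparisonsForBandedApply(100, 50000)` returns 4,975,000 in this model, even though every pattern is already in the pool and no new item is ever added.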

Profiling with Quantify shows that ScPatternAttr::operator== takes a large share of the time.

2. Resolution:
Cache the pooled ScPatternAttr for every XclImpXF (cell format record), which reduces the number of comparisons.
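The idea can be sketched as follows. This is not the submitted patch: `CellPattern`, `CountingPool`, `FormatRecord`, and `poolLookupsForBandedRows` are hypothetical names, and the sketch only assumes that a format record may keep a handle to its pooled pattern after the first lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a cell pattern; not the real ScPatternAttr.
struct CellPattern {
    int fontId;
    int fillId;
    bool operator==(const CellPattern& other) const {
        return fontId == other.fontId && fillId == other.fillId;
    }
};

// Toy pool that counts how often it is asked to look up a pattern.
struct CountingPool {
    std::vector<CellPattern> items;
    std::size_t putCalls = 0;

    std::size_t put(const CellPattern& p) {
        ++putCalls;
        for (std::size_t i = 0; i < items.size(); ++i) {
            if (items[i] == p) return i;
        }
        items.push_back(p);
        return items.size() - 1;
    }
};

// Hypothetical format record that remembers its pooled pattern index
// after the first lookup, so later uses skip the pool entirely.
struct FormatRecord {
    CellPattern pattern;
    bool hasPooled;
    std::size_t pooledIndex;

    explicit FormatRecord(CellPattern p)
        : pattern(p), hasPooled(false), pooledIndex(0) {}

    std::size_t resolve(CountingPool& pool) {
        if (!hasPooled) {
            pooledIndex = pool.put(pattern);
            hasPooled = true;
        }
        return pooledIndex;
    }
};

// Apply two alternating format records to rowCount rows and return how
// many pool lookups were needed in total.
std::size_t poolLookupsForBandedRows(std::size_t rowCount) {
    CountingPool pool;
    FormatRecord bandA(CellPattern{1, 10});
    FormatRecord bandB(CellPattern{1, 20});
    for (std::size_t row = 0; row < rowCount; ++row) {
        FormatRecord& record = (row % 2 == 0) ? bandA : bandB;
        record.resolve(pool);
    }
    assert(pool.items.size() <= 2);
    return pool.putCalls;
}
```

In this model, `poolLookupsForBandedRows(50000)` performs only 2 pool lookups, one per format record, instead of one per row. A real cache would also have to respect the lifetime of the pooled items, which the sketch ignores.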
Comment 4 Wang Lei 2012-09-21 11:29:44 UTC
Submit in 0
Comment 5 Wang Lei 2012-09-21 12:03:11 UTC
Before the enhancement, AOO needs 15 s to load sample.xls.
After the enhancement, AOO needs only 5 s to load sample.xls.
Comment 6 Shenfeng Liu 2012-10-10 07:06:21 UTC
Set Target Milestone to AOO 3.5.0 for PM purposes.