Issue 65221 - SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria
Summary: SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and =""...
Status: ACCEPTED
Alias: None
Product: Calc
Classification: Application
Component: code (show other issues)
Version: recent-trunk
Hardware: All All
: P3 Trivial with 9 votes (vote)
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
: 11273 50331 65109 81198 85966 97711 105908 110508 123907 (view as issue list)
Depends on:
Blocks:
 
Reported: 2006-05-09 16:49 UTC by ooo
Modified: 2017-05-20 11:01 UTC (History)
11 users (show)

See Also:
Issue Type: TASK
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
testcases (24.00 KB, application/vnd.ms-excel)
2006-05-09 16:50 UTC, ooo
no flags Details
ODFF expect result (20.50 KB, application/vnd.ms-excel)
2012-07-04 09:03 UTC, Wang Lei
no flags Details
Update the ODFF expect result sample file (16.00 KB, application/vnd.ms-excel)
2012-07-05 03:14 UTC, Wang Lei
no flags Details

Note You need to log in before you can comment on or make changes to this issue.
Description ooo 2006-05-09 16:49:13 UTC
This is a consolidation issue of several other issues such as issue 13702, issue
50331 and issue 65109.

Calc interprets criteria arguments to the SUMIF, COUNTIF, COUNTBLANK, DCOUNT,
DSUM, ... functions different from Excel if the criteria is a literal "" empty
string, or a reference to a formula cell returning an empty string. Furthermore,
Excel is not even consistent in how it handles these different forms, nor how it
interprets the same form in different functions like DCOUNT and DSUM, and maybe
other D*() functions. This is a mess. See attached testcase document. Load in
both, Calc and Excel, to have a better comparison of the results and the colored
TRUE/FALSE cells.

We need to come to a conclusion how we'll handle this. Simply mimicking the
Excel behavior may be appealing but might not be the best idea.
Comment 1 ooo 2006-05-09 16:50:09 UTC
Created attachment 36346 [details]
testcases
Comment 2 ooo 2006-05-09 16:51:54 UTC
*** Issue 50331 has been marked as a duplicate of this issue. ***
Comment 3 ooo 2006-05-09 16:54:14 UTC
*** Issue 65109 has been marked as a duplicate of this issue. ***
Comment 4 ooo 2006-05-09 16:57:11 UTC
Accepting.
Comment 5 Rainer Bielefeld 2006-05-09 17:09:50 UTC
Corrected typo in summary
Comment 6 frank 2006-06-16 15:07:10 UTC
*** Issue 11273 has been marked as a duplicate of this issue. ***
Comment 7 frank 2007-09-03 13:14:41 UTC
*** Issue 81198 has been marked as a duplicate of this issue. ***
Comment 8 Regina Henschel 2008-02-08 17:56:40 UTC
*** Issue 85966 has been marked as a duplicate of this issue. ***
Comment 9 dridgway 2009-01-05 14:09:34 UTC
*** Issue 97711 has been marked as a duplicate of this issue. ***
Comment 10 Regina Henschel 2009-01-05 22:10:04 UTC
*** Issue 97711 has been marked as a duplicate of this issue. ***
Comment 11 ztevepowell 2009-01-06 09:36:59 UTC
In the other issues consolidated here there is considerable confusion about what constitutes an empty 
cell and whether OOo should be 'compatible' with XL. 

There is a further issue: in the help it clearly uses the phrase 'empty cell' to mean one with no value set, 
and furthermore to DEFINE equality comparison with "" and 0 to succeed against such cells.

Try, for example, using A1="" and A1<>"" as cell formulae against the empty (unset) cell A1, and you 
will discover the semantics of comparison. 

Independent of the question of compatibility with XL is the necessity for compatibility with OOo.  SUMIF, 
COUNTIF and the like, should apply the same comparison rules as simple comparison formulae -- and 
they don't. 

This note is to request of the developer looking at this to provide some sort of specification, and to 
ensure that self-consistency is preserved.  Is there a place to discuss this?  
Comment 12 Regina Henschel 2009-01-06 10:45:11 UTC
@ztevepowell: You can get the draft spec of ODF from
http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula. You
get most of the relevant places by searching for "empty cell" and "blank".

A place to discuss the problem is the mailing list users@sc.openoffice.org or
maybe  dev@sc.openoffice.org. 
Comment 13 Regina Henschel 2009-10-14 20:52:48 UTC
*** Issue 105908 has been marked as a duplicate of this issue. ***
Comment 14 Regina Henschel 2010-03-31 00:53:15 UTC
*** Issue 110508 has been marked as a duplicate of this issue. ***
Comment 15 Oliver Brinzing 2010-03-31 10:15:33 UTC
,
Comment 16 cjhansen 2010-03-31 16:49:31 UTC
For those of us who develop spreadsheet applications for anonymous users, compatibility is a huge 
concern, even if the solution is not technically perfect.
Comment 17 rockfx01 2010-04-07 21:42:42 UTC
This defect is a major PITA.

Compatibility with Excel aside, this is a MAJOR problem.  Because of the way
that OOo calculates empty cells with these functions, AFAIK there is NO WAY to
accurately calculate the number of empty cells within a range if any of those
cells contain an equation whose output is an empty string, i.e. ="".

In Excel, this can be done one of TWO ways, using either COUNTBLANK(<Range>), or
COUNTIF(<Range>,"").

I started using COUNTBLANK originally so that spreadsheets which have to count
blank/empty cells would be compatible with both Excel and OOo, since
COUNTIF(<Range>,"") doesn't work at all in OOo.  Now with more advanced
spreadsheets which contain equations within these cell ranges, I've found that
OOo is simply incapable of handling such simple equations, and I have been
unable to find a way to make a spreadsheet that will calculate blank/empty cells
correctly in OOo.  So you see it is not simply a matter of OOo being
incompatible with Excel; rather, it is a matter of OOo being absolutely
incapable of performing such a (simple) calculation with its current function set.

It is IMPOSSIBLE to count the number of blank cells (cells containing an empty
string result ="" in addition to the truly empty cells containing absolutely
nothing) using OOo.  I've tried this in OOo 3.0 and 3.1, and there is NO WAY to
do this as far as I can tell.  If there is, please enlighten me.  The earliest
defect I found regarding this issue dates back to 2003. 7 years later it has yet
to be addressed.

COUNTIF(<Range>,"") doesn't work at all in OOo and always returns a result of 0.

COUNTBLANK(<Range>) will only count cells if they have absolutely nothing in
them.  IMO COUNTBLANK() should also count cells where an equation returns an
empty string "".  In fact, it is what I would expect given the fact that by
returning an empty string, you are effectively saying "This cell is empty."

ISBLANK() returns FALSE for a cell containing the equation ="".

I read the Help manual and understand this is the expected behavior of OOo due
to the way it specifies 'blank' cells, but WHY?  I'd like to know what exactly
the usage case is for counting cells containing equations that return an empty
string "" as NOT blank?  The current behavior and specification does not make
any sense whatsoever IMO.

There needs to be some simple way to do this.  Using a function that is already
shared by Excel and OOo would be ideal, because most people use Excel and not
OOo. Also, the behavior of Excel for COUNTBLANK() makes far more sense than
OOo's behavior.
Comment 18 orcmid 2012-07-02 11:37:38 UTC
@rockfx01:

The final OpenFormula specification for ODF 1.2 may resolve the question about empty strings versus empty cells.  

Try =COUNTIF(<range>;"=")

Note that the "=" operation has no value.

To count empty strings, it looks like you need to use 

  =COUNTIF(<range>;"=""""")

The paired "" introduce single " characters into the literal string, so the criterion is '=""'.

If these functions have been updated for ODF 1.2 in Apache OpenOffice 3.4.0, you should be able to tell the difference with these two cases.

PS: Check =COUNTIF(<range>;"") again too.  This should find empty strings, but it might also catch empty cells (which default to empty strings when referenced in search of text values).
Comment 19 Wang Lei 2012-07-04 09:03:30 UTC
Created attachment 78576 [details]
ODFF expect result

@orcmid

For =COUNTIF(<range>;""), I think it should only find empty strings.

I add an attachment. In this xls file, I list the expect result for ODFF. Please comments if there is something wrong.
Comment 20 ztevepowell 2012-07-04 10:28:10 UTC
Hi. My problem was this:

=SUMIF(<refrange>, <>"y", <numberrange>)

only added cells that corresponded to non-empty reference cells that had strings which were not "y".

This omitted cells that had empty reference cells, even though A1<>"y" would have been true for those cells in the <refrange>.

The issue is complicated by the strange reading of 'geometry' of the reference range.  This seems to remove empty cells from the <refrange> geometry before considering which cells to sum, and only then are the corresponding cells in the <numberrange> considered.

This is a disaster for consistency: if I generated a new column, for example, propagate a simple test =IF((A1<>"y"),B1,0) and sum the result, I get a different answer from the "corresponding" SUMIF. This is a serious consistency issue which is nothing to do with XL or ODF, nor much to do with the way empty cells are treated in boolean expressions elsewhere.

This is due to the special treatment of empty cells in reference ranges, which is inconsistent with the other semantics.
Comment 21 Wang Lei 2012-07-05 03:14:44 UTC
Created attachment 78581 [details]
Update the ODFF expect result sample file

Update the ODFF expect result sample file
Comment 22 Regina Henschel 2013-12-27 12:28:27 UTC
*** Issue 123907 has been marked as a duplicate of this issue. ***
Comment 23 Marcus 2017-05-20 11:01:01 UTC
Reset assigne to the default "issues@openoffice.apache.org".