Issue 60110

Summary: import from csv file sometimes strips initial apostrophe in cell
Product: Calc
Reporter: hsorenson <hso>
Component: open-import
Assignee: ooo
Status: ACCEPTED ---
QA Contact:
Severity: trivial
Priority: P3
CC: issues, steve.m.frank
Version: OOo 1.1.3
Keywords: ms_interoperability, oooqa
Target Milestone: ---   
Hardware: All   
OS: All   
URL: http://www.nosneros.net/hso/test.csv
Issue Type: DEFECT
Latest Confirmation on: ---
Developer Difficulty: ---
Attachments:
Description: CSV file with examples of various possible cases of apostrophe imports
Flags: none

Description hsorenson 2006-01-06 06:50:43 UTC
OpenOffice improperly imports a field in a CSV file that contains only a single
quote. I am able to get it to import the field properly if the field contains
two single quotes enclosed in a set of double quotes. The test file at the url
above is an example file that shows this behavior.

Excel doesn't exhibit this behavior.

From what I read of RFC 4180, it looks like OpenOffice is not RFC compliant.
That said, implementing CSV doesn't seem too straightforward either (there seem
to be several interpretations of how a CSV file should be formatted). However,
given that OpenOffice is an office suite and strives to be compatible with
Excel, I think its behavior should be similar to Excel's.
Comment 1 frank 2006-03-03 09:24:55 UTC
Hi,

I'm sorry, but I don't get the point of the file you've mentioned. Please
describe the problem more precisely. A smaller file would also be a great help
in getting the point.

Frank
Comment 2 atdsm 2006-06-30 14:21:43 UTC
Set needmoreinfo keyword.

Perhaps this is related to the use of a single quote to denote a number which
should be displayed as text? Or to other bugs surrounding the use of single
quotes (see for example issue 65510)?

Steve
Comment 3 atdsm 2006-06-30 14:22:14 UTC
BTW, the URL for the csv file is broken.
Comment 4 atdsm 2006-07-06 14:37:45 UTC
The specific problem is that OOo Calc sometimes strips the initial apostrophe in
a cell after a CSV import, even if that apostrophe is enclosed in double quotes
(normally denoting an exact text import). Interestingly, the apostrophe seems to
be stripped in all cases except when followed immediately by a number.

Even more interestingly, the cell contents appear properly in the import preview.
Once the import takes place, however, the error occurs and the apostrophes get
stripped.

I will attach an example.
Comment 5 atdsm 2006-07-06 14:39:25 UTC
Created attachment 37545 [details]
CSV file with examples of various possible cases of apostrophe imports
Comment 6 atdsm 2006-07-06 14:42:36 UTC
The contents of the CSV:

WITHOUT QUOTES
1 apostrophe,'
2 apostrophes,''
3 apostrophes,'''
A number,'3
A word,'word
A misspelled word,'mword

WITH QUOTES
1 apostrophe,"'"
2 apostrophes,"''"
3 apostrophes,"'''"
A number,"'3"
A word,"'word"
A misspelled word,"'mword"

In all these cases except '3 and "'3" the apostrophe is stripped after the
import to Excel. This behavior is very similar to behavior described in issue
65510; I suspect a dependency on 65510 and have marked accordingly.

Also added ms_interoperability keyword because Excel treats CSV imports of
apostrophes differently:
*In Calc, "''" is required to import a single apostrophe due to the stripping of
the initial apostrophe
*In Excel, only "'" is required to import a single apostrophe
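
For comparison, here is a minimal Python sketch (not OOo code) of what a parser
yields when it only removes the enclosing double quotes and takes the field
content as is. It uses Python's standard csv module on some of the WITH QUOTES
rows above; the read_fields helper is a name made up for this illustration.

```python
import csv
import io

# A few of the WITH QUOTES rows from the attached test file.
SAMPLE = """\
1 apostrophe,"'"
2 apostrophes,"''"
3 apostrophes,"'''"
A number,"'3"
A word,"'word"
"""

def read_fields(text):
    """Return the second column of each row (hypothetical helper).

    csv.reader removes the enclosing double quotes and leaves the
    remaining field content, including apostrophes, untouched.
    """
    return [row[1] for row in csv.reader(io.StringIO(text))]

fields = read_fields(SAMPLE)
for field in fields:
    print(repr(field))
```

In this sketch every leading apostrophe survives parsing, so any stripping
would have to happen in a later interpretation step, not in the CSV parsing
itself.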
Comment 7 atdsm 2006-07-06 14:44:33 UTC
> In all these cases except '3 and "'3" the apostrophe is stripped after the
> import to Excel.

Excuse the error; this line actually describes the behavior in CALC, not Excel.
Thus it should read:

"In all these cases except '3 and "'3" the apostrophe is stripped after the
import to Calc."

-SF
Comment 8 atdsm 2006-07-06 19:07:51 UTC
>> From a personal email from hsorenson, posted w/ permission:

 > BTW, the URL for the csv file is broken.

I've put this back in place. My website changed and it wasn't on the new
one.

http://www.nosneros.net/hso/test.csv 

OpenOffice imports CSV differently than Excel. The RFC, unfortunately,
doesn't disambiguate in such a way to say which implementation is
correct.

OpenOffice needs "''" to import an apostrophe (\x27) from CSV.

Excel needs "'" to import an apostrophe (\x27) from CSV.

It would be nice if OpenOffice followed Excel's lead since Excel has a
larger install base. I use both and having this inconsistency is
annoying.

-Holt
Comment 9 frank 2006-07-07 13:04:37 UTC
Hi eike,

please have a look at this one.

Frank
Comment 10 frank 2006-07-07 13:04:49 UTC
Hi eike,

please have a look at this one.

Frank
Comment 11 ooo 2006-07-14 16:34:16 UTC
The CSV import currently interprets field content the same way as if it had been
keyed in as input (with the exception of a single apostrophe as the entire field
content). A leading apostrophe thus forces otherwise numerical content to
textual content and is then discarded. This should be disabled for CSV import
and the field content taken as is.

Btw, it is a common misconception that field content quoted by double quotes
should always be textual content, this is _not_ the case. Double quotes are to
be removed and then it is up to the application to interpret the content.
Otherwise it would be impossible to have numerical values contain the field
separator.
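
The point about quoting can be illustrated with a short Python sketch, again
using the standard csv module rather than anything from OOo. The interpret
function is a hypothetical stand-in for the application's interpretation step,
and treating the comma as a decimal separator is an assumption made only for
this example.

```python
import csv
import io

def interpret(field):
    """Hypothetical application step: number if it parses, else text.

    Assumes a locale where ',' is the decimal separator.
    """
    try:
        return float(field.replace(",", "."))
    except ValueError:
        return field

# The first field is quoted only because it contains the field separator.
line = '"1,5",abc,"7"\n'
row = next(csv.reader(io.StringIO(line)))
cells = [interpret(field) for field in row]

print(row)    # raw field content after removing the double quotes
print(cells)  # content after the application has interpreted it
```

Here the quoted "1,5" still ends up as a number, which would not be possible if
double quotes always meant text.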

This issue is related to issue 65510 with regard to the leading apostrophe
handling, but it doesn't depend on it in the sense that issue 65510 would block
this issue, so I am removing the dependency.

Changing target to 2.x because of desired lossless data import.
Comment 12 Martin Hollmichel 2007-11-09 16:52:46 UTC
Changed target from 2.x to 3.x according to
http://wiki.services.openoffice.org/wiki/Target_3x