Commons CSV / CSV-121

IllegalArgumentException thrown when the header contains duplicate names when the column names are empty.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0
    • Component/s: None
    • Labels:
      None

      Description

      Given a header like a,,c,d, an IllegalArgumentException("The header contains duplicate names: " + Arrays.toString(header)) is thrown, because empty header names are treated like any other header name and therefore count as duplicates of each other. I sent a pull request on GitHub: https://github.com/apache/commons-csv/pull/2
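
      The reported behaviour can be illustrated with a minimal, self-contained sketch. It is not the Commons CSV source: the class HeaderMapSketch and the method buildHeaderMap are hypothetical names, and only the exception message is taken from the report above. The sketch assumes the header names are stored as keys of a name-to-index map, so a second empty name collides with the first.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the header handling described above; not the Commons CSV source.
class HeaderMapSketch {

    // Maps each header name to its column index and rejects any repeated name.
    static Map<String, Integer> buildHeaderMap(String[] header) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            // An empty name is an ordinary key here, so a second "" counts as a duplicate.
            if (map.containsKey(header[i])) {
                throw new IllegalArgumentException(
                        "The header contains duplicate names: " + Arrays.toString(header));
            }
            map.put(header[i], i);
        }
        return map;
    }

    public static void main(String[] args) {
        // The limit -1 keeps the trailing empty field, giving five columns: a, "", c, d, "".
        String[] header = "a,,c,d,".split(",", -1);
        try {
            buildHeaderMap(header);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      A header with a single empty name passes this check; only the second empty name triggers the exception, which is why the message talks about duplicates rather than missing names.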

        Activity

        Gary Gregory added a comment -

        I added the format setting ignoreEmptyHeaders, which defaults to false to keep the IAE as the default behavior.

        commit -m "[CSV-121] Exception that the header contains duplicate names when the column names are empty. Added the setting ignoreEmptyHeaders, defaults to false to keep the IAE as the default behavior." C:/vcs/svn/apache/commons/trunks-proper/csv/src/test/java/org/apache/commons/csv/CSVParserTest.java C:/vcs/svn/apache/commons/trunks-proper/csv/src/main/java/org/apache/commons/csv/CSVFormat.java C:/vcs/svn/apache/commons/trunks-proper/csv/src/main/java/org/apache/commons/csv/CSVParser.java C:/vcs/svn/apache/commons/trunks-proper/csv/src/changes/changes.xml
            Sending        C:/vcs/svn/apache/commons/trunks-proper/csv/src/changes/changes.xml
            Sending        C:/vcs/svn/apache/commons/trunks-proper/csv/src/main/java/org/apache/commons/csv/CSVFormat.java
            Sending        C:/vcs/svn/apache/commons/trunks-proper/csv/src/main/java/org/apache/commons/csv/CSVParser.java
            Sending        C:/vcs/svn/apache/commons/trunks-proper/csv/src/test/java/org/apache/commons/csv/CSVParserTest.java
            Transmitting file data ...
            Committed revision 1602206.
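
        How a setting like this could behave is sketched below in plain Java. This is an illustration under assumptions, not the committed change: the class IgnoreEmptyHeadersSketch and its method are hypothetical, and leaving unnamed columns out of the map is a choice made for this sketch only. The one thing taken from the comment above is that the flag defaults to the strict behaviour, so passing false keeps the IAE.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an "ignore empty headers" switch; not the Commons CSV source.
class IgnoreEmptyHeadersSketch {

    // Builds a name-to-index map. When ignoreEmptyHeaders is true, null or blank
    // names are skipped instead of being checked for duplicates.
    static Map<String, Integer> buildHeaderMap(String[] header, boolean ignoreEmptyHeaders) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            String name = header[i];
            boolean empty = name == null || name.trim().isEmpty();
            if (empty && ignoreEmptyHeaders) {
                continue; // sketch choice: unnamed columns stay reachable by index only
            }
            if (map.containsKey(name)) {
                throw new IllegalArgumentException(
                        "The header contains duplicate names: " + Arrays.toString(header));
            }
            map.put(name, i);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] header = "a,,c,d,".split(",", -1);
        // With the flag on, only the named columns are mapped.
        System.out.println(buildHeaderMap(header, true));
    }
}
```

        Duplicate non-empty names are still rejected with the flag on, so the switch relaxes the check only for the unnamed columns.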
        
        Emmanuel Bourg added a comment -

        Or simply disable the error when duplicate column names are found. Being able to parse malformed or ambiguous input is a plus.

        Gary Gregory added a comment -

        An "allowBlankColumnNames" flag?

        Sebastian Hardt added a comment -

        I use the library in a project where users can upload a CSV file to import data into the system.
        They have a lot of CSV files like this, so this is a problem that exists in the real world.
        I can change the patch so that this behaviour can be configured by a flag in the CSV parser.

        Gary Gregory added a comment -

        Playing devil's advocate here: the only case where I have seen empty column names is in certain SQL result sets where a column value is the result of a computation, for example SELECT 1+COLUMNX FROM SOMETABLE. This result has one column but no column name. In SQL you can usually assign a label to a column, but you do not have to. Therefore, if your CSV file is the result of a database export, it is quite possible to end up with empty column names mixed with real column names.

        Sebb added a comment -

        Is a missing column name really harmless?

        Emmanuel Bourg added a comment -

        I don't know if there is a use case, but turning a harmless case (one empty column name) into an error doesn't seem right.

        Sebb added a comment -

        What is the use case for a missing column name?
        Either one provides names for all the columns, or none.

        Emmanuel Bourg added a comment -

        You would throw an IAE even if the header contains only one empty column name? That sounds a bit restrictive to me.

        Sebb added a comment -

        Two comments:
        1) The patch in the pull request includes lots of spurious changes. Even if the change is agreed, the patch cannot be used as it stands.

        2) The patch ignores columns with no names; I would have thought a more suitable fix would be to report an IAE for an empty column name. In other words, the current behaviour is correct, but the IAE message is a bit confusing and could be changed.
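
        The alternative in point 2 could look roughly like the following sketch, which keeps the strict behaviour but reports an empty name separately from a duplicate one. The class HeaderValidationSketch, the method validate and both message texts are hypothetical wording chosen for illustration; none of them come from Commons CSV.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of stricter validation with clearer messages; not the Commons CSV source.
class HeaderValidationSketch {

    // Rejects empty names and duplicate names with distinct messages.
    static void validate(String[] header) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < header.length; i++) {
            String name = header[i];
            if (name == null || name.trim().isEmpty()) {
                // Reported on the first empty name, even if it is the only one.
                throw new IllegalArgumentException(
                        "The header contains an empty column name at index " + i
                                + ": " + Arrays.toString(header));
            }
            // Set.add returns false when the name was already present.
            if (!seen.add(name)) {
                throw new IllegalArgumentException(
                        "The header contains a duplicate name: \"" + name
                                + "\" in " + Arrays.toString(header));
            }
        }
    }

    public static void main(String[] args) {
        try {
            validate("a,,c,d,".split(",", -1));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

        Unlike the current check, this variant also rejects a header with exactly one empty name, which is the restriction questioned elsewhere in this thread.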

        Gary Gregory added a comment -

        When does it make sense to have a mix of named and unnamed columns? IOW, what is your user story?


          People

          • Assignee: Unassigned
          • Reporter: Sebastian Hardt
          • Votes: 0
          • Watchers: 4
