  1. Commons Net
  2. NET-215

UNIXFTPEntryParser doesn't preserve trailing whitespace in files

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 2.2
    • Component/s: None
    • Labels:
      None

      Description

      From https://bugs.eclipse.org/bugs/show_bug.cgi?id=204740 :

      The Commons Net FTP Entry Parsers do not preserve trailing whitespace on file names. On systems like UNIX that support trailing whitespace, this results in some invalid entries being parsed.

      The bug seems to be in the Jakarta Commons Net UnixFTPEntryParser: in its REGEX,
      the last field ("endtoken") is declared as
      (\s*.*)
      which means any whitespace followed by at least one non-whitespace character.
      That is not the case when the file name has trailing whitespace.

        Activity

        moberhuber Martin Oberhuber added a comment -

        It might actually help to make the "name" token more greedy and have it match until the line terminator, which would be \r?\n, but that would require some unit tests to secure the solution...
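
A minimal sketch of the idea in this comment, assuming a stand-alone pattern rather than the parser's real listing regex: the name group takes every character up to an optional \r followed by \n, so trailing spaces stay inside the captured name. The class name, method name, and pattern are hypothetical and are not part of Commons Net.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameUntilTerminatorSketch {
    // Hypothetical pattern: capture everything that is not a line terminator,
    // then require the terminator itself (\r is optional, \n is mandatory).
    private static final Pattern NAME_THEN_EOL =
            Pattern.compile("([^\\r\\n]*)\\r?\\n");

    // Returns the captured name, or null when the input has no line terminator.
    static String nameUpToTerminator(String rawLine) {
        Matcher m = NAME_THEN_EOL.matcher(rawLine);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Brackets make the preserved trailing spaces visible.
        System.out.println("[" + nameUpToTerminator("zxbox     \r\n") + "]");
        System.out.println("[" + nameUpToTerminator("zxbox     \n") + "]");
    }
}
```

Whether the real parser ever sees the line terminator depends on how the listing is split into lines before parsing, which this sketch does not model.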

        rwinston@eircom.net Rory Winston added a comment -

        This works for me in 2.1.

        public void testTrailingSpaces() {
            FTPFile f = getParser().parseFTPEntry("drwxr-xr-x   2 john smith     group         4096 Mar  2 15:13 zxbox     ");
            assertNotNull(f);
            // JUnit convention: expected value first, actual value second
            assertEquals("zxbox     ", f.getName());
        }
        

        works as expected.

        sebb@apache.org Sebb added a comment -

        Note that the regex

        (\s*.*)

        means 0 or more whitespace, followed by 0 or more other characters, so it does match trailing spaces - as shown by the example.

        However, having two "*" quantifiers next to each other tends to make the parsing slower, as it increases the backtracking that may be required.
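
The first point can be checked in isolation with java.util.regex. The sketch below applies only the final group, (\s*.*), to a few inputs. It is not the full UnixFTPEntryParser regex, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingWhitespaceDemo {
    // Only the last group of the listing regex, isolated for illustration.
    private static final Pattern END_TOKEN = Pattern.compile("(\\s*.*)");

    // Returns what the group captures, or null if the whole input does not match.
    static String capture(String input) {
        Matcher m = END_TOKEN.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Brackets make trailing spaces visible in the output.
        System.out.println("[" + capture("zxbox     ") + "]"); // name with trailing spaces
        System.out.println("[" + capture("     ") + "]");      // whitespace only
        System.out.println("[" + capture("") + "]");           // empty input
    }
}
```

All three inputs match, and the group returns the input unchanged, because both quantifiers accept zero occurrences. The backtracking cost mentioned above is not measured here.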


          People

          • Assignee:
            Unassigned
            Reporter:
            moberhuber Martin Oberhuber