Hadoop Common
  1. Hadoop Common
  2. HADOOP-3003

FileSystem cache key should be updated after a FileSystem object is created

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.2, 0.17.0
    • Fix Version/s: 0.16.2
    • Component/s: fs
    • Labels:
      None

      Description

      In FileSystem.get(uri, conf), it first creates a cache key from the uri and the conf and then lookups the corresponding FileSystem object in the cache. If the object is not found, it initializes a FileSystem object and put it to the cache with the key. However, during FileSystem creation, the conf might be modified. In such case, the key should be updated before putting it to the cache.

      1. 3003_20080311.patch
        5 kB
        Tsz Wo Nicholas Sze
      2. 3003_20080313.patch
        8 kB
        Tsz Wo Nicholas Sze
      3. 3003_20080318.patch
        8 kB
        Tsz Wo Nicholas Sze

        Activity

        Hide
        Tsz Wo Nicholas Sze added a comment -

        3003_20080311.patch: should fix this problem.

        Show
        Tsz Wo Nicholas Sze added a comment - 3003_20080311.patch: should fix this problem.
        Hide
        Doug Cutting added a comment -

        In particular, I think you mean that the user information in the configuration could have been modified by a login process, and that we want the updated user information. Is that right?

        Also, I don't see why FsRef needs a Key, why cache values cannot just be FileSystem instances. The only place where the FsRef's key is used is in closeAll(), where map.entrySet() could be instead used to iterate through all key/value pairs, no?

        Show
        Doug Cutting added a comment - In particular, I think you mean that the user information in the configuration could have been modified by a login process, and that we want the updated user information. Is that right? Also, I don't see why FsRef needs a Key, why cache values cannot just be FileSystem instances. The only place where the FsRef's key is used is in closeAll(), where map.entrySet() could be instead used to iterate through all key/value pairs, no?
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > ... updated user information. Is that right?
        That's is true.

        In HADOOP-2915, there is a discussion about using WeakReference in the cache, so that un-referenced fs could be closed automatically. In that case, the cache values have to be FsRef (extending WeakReference). You are right that we don't need FsRef currently. Do you want me to remove it or keep it for future use?
        See https://issues.apache.org/jira/browse/HADOOP-2915?focusedCommentId=12575175#action_12575175

        Show
        Tsz Wo Nicholas Sze added a comment - > ... updated user information. Is that right? That's is true. In HADOOP-2915 , there is a discussion about using WeakReference in the cache, so that un-referenced fs could be closed automatically. In that case, the cache values have to be FsRef (extending WeakReference). You are right that we don't need FsRef currently. Do you want me to remove it or keep it for future use? See https://issues.apache.org/jira/browse/HADOOP-2915?focusedCommentId=12575175#action_12575175
        Hide
        Tsz Wo Nicholas Sze added a comment -

        It was lucky that this problem does not exist in 0.16.1 since the username in conf remains constant during FileSystem creation.

        Show
        Tsz Wo Nicholas Sze added a comment - It was lucky that this problem does not exist in 0.16.1 since the username in conf remains constant during FileSystem creation.
        Hide
        Doug Cutting added a comment -

        > Do you want me to remove it or keep it for future use?

        Please remove it. We should not keep dead code around for future use, since plans often change and then we end up with just dead code.

        Show
        Doug Cutting added a comment - > Do you want me to remove it or keep it for future use? Please remove it. We should not keep dead code around for future use, since plans often change and then we end up with just dead code.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12377663/3003_20080311.patch
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included +1. The patch appears to include 6 new or modified tests.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit -1. The applied patch generated 192 release audit warnings (more than the trunk's current 0 warnings).

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests -1. The patch failed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/artifact/trunk/build/test/checkstyle-errors.html
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/artifact/trunk/current/releaseAuditDiffWarnings.txt
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12377663/3003_20080311.patch against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 6 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit -1. The applied patch generated 192 release audit warnings (more than the trunk's current 0 warnings). findbugs +1. The patch does not introduce any new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/artifact/trunk/build/test/checkstyle-errors.html Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/artifact/trunk/current/releaseAuditDiffWarnings.txt Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1952/console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        3003_20080313.patch: removed FsRef

        Show
        Tsz Wo Nicholas Sze added a comment - 3003_20080313.patch: removed FsRef
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12377835/3003_20080313.patch
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included +1. The patch appears to include 6 new or modified tests.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests -1. The patch failed core unit tests.

        contrib tests -1. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12377835/3003_20080313.patch against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 6 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1985/console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        3003_20080318.patch: fixed some bugs.

        Show
        Tsz Wo Nicholas Sze added a comment - 3003_20080318.patch: fixed some bugs.
        Hide
        Doug Cutting added a comment -

        +1

        A thought: the filesystem cache keys are now <user,scheme,authority>. Since URIs have a user field, it is tempting to use a URI for the key, instead of defining a new class: scheme://user@host:port/. The only problem would be if the FileSystem URI's authority includes a user, especially if it is a different user. For example, the S3 FileSystem puts a user in the authority, but that should be consistent with the S3 login. Unfortunately, its probably not consistent with the way we're determining the username used by the cache, using UnixUserGroupInformation.login(). The bug is that login() should be per-filesystem, not static.

        Show
        Doug Cutting added a comment - +1 A thought: the filesystem cache keys are now <user,scheme,authority>. Since URIs have a user field, it is tempting to use a URI for the key, instead of defining a new class: scheme://user@host:port/. The only problem would be if the FileSystem URI's authority includes a user, especially if it is a different user. For example, the S3 FileSystem puts a user in the authority, but that should be consistent with the S3 login. Unfortunately, its probably not consistent with the way we're determining the username used by the cache, using UnixUserGroupInformation.login(). The bug is that login() should be per-filesystem, not static.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12378180/3003_20080318.patch
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included +1. The patch appears to include 6 new or modified tests.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12378180/3003_20080318.patch against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 6 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1995/console This message is automatically generated.
        Hide
        dhruba borthakur added a comment -

        I just committed this. Thanks Nicholas!

        Show
        dhruba borthakur added a comment - I just committed this. Thanks Nicholas!
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #434 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/434/ )

          People

          • Assignee:
            Tsz Wo Nicholas Sze
            Reporter:
            Tsz Wo Nicholas Sze
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development