Uploaded image for project: 'Libcloud'
  1. Libcloud
  2. LIBCLOUD-835

Malformed auth token causes fatal exception in Google Storage driver

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      One of my Django instances has started hitting a libcloud error which is causing a fatal exception, bringing down the instance.

      It looks like libcloud is writing invalid JSON into the auth token, which then causes a JSON parse error when it is subsequently read back in.

      Here's the token that's written:

      $ cat /root/.google_libcloud_auth.<project>
      {"access_token": "<redacted>", "token_type": "Bearer", "expire_time": "2016-07-12T16:45:09Z", "expires_in": 3559}09Z", "expires_in": 3537}
      

      Note the two "expires_in" keys, one with a nonsense value of `3559}09Z"`

      Environment:
      Python 3.4.4
      apache-libcloud==1.0.0

        Issue Links

          Activity

          Hide
          kami Tomaz Muraus added a comment -

          Merged, thanks!

          Show
          kami Tomaz Muraus added a comment - Merged, thanks!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/libcloud/pull/844

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/libcloud/pull/844
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 78df34cf8db8706440ee594c571d80de8613433e in libcloud's branch refs/heads/trunk from Paul Tiplady
          [ https://git-wip-us.apache.org/repos/asf?p=libcloud.git;h=78df34c ]

          Fix caching of Google auth tokens

          _write_token_to_file was not zeroing the file before writing
          a new token, causing corruption.

          FIXES: LIBCLOUD-835

          Closes #844

          Signed-off-by: Tomaz Muraus <tomaz@tomaz.me>

          Show
          jira-bot ASF subversion and git services added a comment - Commit 78df34cf8db8706440ee594c571d80de8613433e in libcloud's branch refs/heads/trunk from Paul Tiplady [ https://git-wip-us.apache.org/repos/asf?p=libcloud.git;h=78df34c ] Fix caching of Google auth tokens _write_token_to_file was not zeroing the file before writing a new token, causing corruption. FIXES: LIBCLOUD-835 Closes #844 Signed-off-by: Tomaz Muraus <tomaz@tomaz.me>
          Hide
          erjohnso Eric Johnson added a comment -

          Thanks Paul - Pointed Tom to this too.

          Show
          erjohnso Eric Johnson added a comment - Thanks Paul - Pointed Tom to this too.
          Hide
          paul.tiplady Paul Tiplady added a comment -

          Fix is here: https://github.com/apache/libcloud/pull/844

          I have verified that this resolves my issue. Had a quick stab at tests, but I don't have time to push this across the finish line today, so happy for someone else to take over if you're interested in getting this bugfix in sooner than I can.

          Show
          paul.tiplady Paul Tiplady added a comment - Fix is here: https://github.com/apache/libcloud/pull/844 I have verified that this resolves my issue. Had a quick stab at tests, but I don't have time to push this across the finish line today, so happy for someone else to take over if you're interested in getting this bugfix in sooner than I can.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user paultiplady opened a pull request:

          https://github.com/apache/libcloud/pull/844

          LIBCLOUD-835 Fix caching of Google auth tokens

            1. Fix corruption bug in Google auth token caching
              1. Description

          The `GoogleOAuth2Credential. _write_token_to_file()` method writes a copy of the latest OAuth token to disk. Prior to this fix, the token was being written to disk without truncating the file first, which is fine in the case where the new token has the same number of characters (or more) as the old one. However, in some situations Google OAuth returns a shorter token string, which was causing the library to crash when loading the corrupted token.

              1. Status

          Fixed, needs tests.

              1. Checklist (tick everything that applies)

          _write_token_to_file was not zeroing the file before writing
          a new token, causing corruption.

          FIXES: LIBCLOUD-835

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/qwil/libcloud LIBCLOUD-835_google-token-corruption

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/libcloud/pull/844.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #844


          commit 9d05463aa2faa4733ac0129c2797ee9d043e58f9
          Author: Paul Tiplady <paul@qwil.co>
          Date: 2016-07-22T18:32:27Z

          LIBCLOUD-835 Fix caching of Google auth tokens

          _write_token_to_file was not zeroing the file before writing
          a new token, causing corruption.

          FIXES: LIBCLOUD-835


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user paultiplady opened a pull request: https://github.com/apache/libcloud/pull/844 LIBCLOUD-835 Fix caching of Google auth tokens Fix corruption bug in Google auth token caching Description The `GoogleOAuth2Credential. _write_token_to_file()` method writes a copy of the latest OAuth token to disk. Prior to this fix, the token was being written to disk without truncating the file first, which is fine in the case where the new token has the same number of characters (or more) as the old one. However, in some situations Google OAuth returns a shorter token string, which was causing the library to crash when loading the corrupted token. Status Fixed, needs tests. Checklist (tick everything that applies) [x] [Code linting] ( http://libcloud.readthedocs.org/en/latest/development.html#code-style-guide ) (required, can be done after the PR checks) [ ] Documentation [ ] [Tests] ( http://libcloud.readthedocs.org/en/latest/testing.html ) [ ] [ICLA] ( http://libcloud.readthedocs.org/en/latest/development.html#contributing-bigger-changes ) (required for bigger changes) _write_token_to_file was not zeroing the file before writing a new token, causing corruption. FIXES: LIBCLOUD-835 You can merge this pull request into a Git repository by running: $ git pull https://github.com/qwil/libcloud LIBCLOUD-835 _google-token-corruption Alternatively you can review and apply these changes as the patch at: https://github.com/apache/libcloud/pull/844.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #844 commit 9d05463aa2faa4733ac0129c2797ee9d043e58f9 Author: Paul Tiplady <paul@qwil.co> Date: 2016-07-22T18:32:27Z LIBCLOUD-835 Fix caching of Google auth tokens _write_token_to_file was not zeroing the file before writing a new token, causing corruption. FIXES: LIBCLOUD-835
          Hide
          paul.tiplady Paul Tiplady added a comment -

          Ping Tomaz Muraus, Eric Johnson – think this bug is high priority.

          Show
          paul.tiplady Paul Tiplady added a comment - Ping Tomaz Muraus , Eric Johnson – think this bug is high priority.
          Hide
          paul.tiplady Paul Tiplady added a comment -

          I neglected to update, that this issue didn't actually clear when I upgraded to 1.1.0.

          I added some debug logging to the token generation code, and I have figured out what the problem is.

          Here's a sequence of token writes (W) and reads (R):

          W: {"token_type": "Bearer", "expires_in": 3599, "expire_time": "2016-07-22T07:12:46Z", "access_token": "ya29.<snip 107 chars>"}
          W: {"token_type": "Bearer", "expires_in": 3294, "expire_time": "2016-07-22T18:17:05Z", "access_token": "ya29.<snip 82 chars>"}
          R: {"token_type": "Bearer", "expires_in": 3294, "expire_time": "2016-07-22T18:17:05Z", "access_token": "ya29.<82 chars from new token"}<23 chars from original token>"}
          

          The write_token code is not clearing the old token before writing the new one, resulting in corruption.

          The offending code, in libcloud/common/google.py:

                  with os.fdopen(os.open(filename, os.O_CREAT | os.O_WRONLY,
                                         int('600', 8)), 'w') as f:
                      f.write(data)
          

          In the case where the file exists, O_CREAT is a no-op, so we're just opening the existing file and writing our bytes into it, without first clearing it. Need to set O_TRUNC as well, to get > semantics instead of >>.

          I don't know what causes different token lengths to be returned by the Google APIs; since this issue appeared spontaneously, and only on one of my projects, it may be that the Google auth APIs changed recently, thus triggering this latent bug. Also note that I'm using auth type GoogleAuthType.GCE, which might scope the issue more tightly – but the affected code is used by all Google auth types, so this in principle could be hit on any of them.

          Until it's understood under what circumstances the Google APIs return different token lengths, I think it's safest to assume that this issue can break everybody using this library with Google Cloud Storage, so this looks like an exceptionally critical bug; I propose raising to Blocker level and making an immediate bugfix release with a fix, and backporting to all supported versions.

          I'm going to knock together a quick fix on my fork of Libcloud, since it's not much code. I'll push a fix out without UTs onto my staging environment, and look at tests later. Happy to contribute this fix back. Will you want a test for this fix, even though it's a one-liner?

          Show
          paul.tiplady Paul Tiplady added a comment - I neglected to update, that this issue didn't actually clear when I upgraded to 1.1.0. I added some debug logging to the token generation code, and I have figured out what the problem is. Here's a sequence of token writes (W) and reads (R): W: { "token_type" : "Bearer" , "expires_in" : 3599, "expire_time" : "2016-07-22T07:12:46Z" , "access_token" : "ya29.<snip 107 chars>" } W: { "token_type" : "Bearer" , "expires_in" : 3294, "expire_time" : "2016-07-22T18:17:05Z" , "access_token" : "ya29.<snip 82 chars>" } R: { "token_type" : "Bearer" , "expires_in" : 3294, "expire_time" : "2016-07-22T18:17:05Z" , "access_token" : "ya29.<82 chars from new token" }<23 chars from original token>"} The write_token code is not clearing the old token before writing the new one, resulting in corruption. The offending code, in libcloud/common/google.py: with os.fdopen(os.open(filename, os.O_CREAT | os.O_WRONLY, int ('600', 8)), 'w') as f: f.write(data) In the case where the file exists, O_CREAT is a no-op, so we're just opening the existing file and writing our bytes into it, without first clearing it. Need to set O_TRUNC as well, to get > semantics instead of >>. I don't know what causes different token lengths to be returned by the Google APIs; since this issue appeared spontaneously, and only on one of my projects, it may be that the Google auth APIs changed recently, thus triggering this latent bug. Also note that I'm using auth type GoogleAuthType.GCE, which might scope the issue more tightly – but the affected code is used by all Google auth types, so this in principle could be hit on any of them. Until it's understood under what circumstances the Google APIs return different token lengths, I think it's safest to assume that this issue can break everybody using this library with Google Cloud Storage, so this looks like an exceptionally critical bug; I propose raising to Blocker level and making an immediate bugfix release with a fix, and backporting to all supported versions. I'm going to knock together a quick fix on my fork of Libcloud, since it's not much code. I'll push a fix out without UTs onto my staging environment, and look at tests later. Happy to contribute this fix back. Will you want a test for this fix, even though it's a one-liner?
          Hide
          paul.tiplady Paul Tiplady added a comment -

          I upgraded from 1.0.0 to 1.1.0, and the error has not resurfaced (yet).

          Show
          paul.tiplady Paul Tiplady added a comment - I upgraded from 1.0.0 to 1.1.0, and the error has not resurfaced (yet).
          Hide
          paul.tiplady Paul Tiplady added a comment -

          Worth mentioning that this problem doesn't manifest on the initial connection token exchange; the token is fetched and stored correctly on startup. It's only after <1d of uptime that the problem manifests.

          I've started a new Django container for the instance that's affected, but I have preserved the faulty container in a 'stopped' state, so I can spin it up and poke around if you have further diagnostics that you'd like me to collect.

          Show
          paul.tiplady Paul Tiplady added a comment - Worth mentioning that this problem doesn't manifest on the initial connection token exchange; the token is fetched and stored correctly on startup. It's only after <1d of uptime that the problem manifests. I've started a new Django container for the instance that's affected, but I have preserved the faulty container in a 'stopped' state, so I can spin it up and poke around if you have further diagnostics that you'd like me to collect.
          Hide
          kami Tomaz Muraus added a comment -

          Thanks for the report, we will look into it.

          /cc Eric Johnson

          Show
          kami Tomaz Muraus added a comment - Thanks for the report, we will look into it. /cc Eric Johnson

            People

            • Assignee:
              kami Tomaz Muraus
              Reporter:
              paul.tiplady Paul Tiplady
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development