SPARK-33440

Spark schedules on updating delegation token with 0 interval under some token provider implementation


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.1, 3.1.0
    • Fix Version/s: 3.0.2, 3.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      We received a report from a customer that, under specific circumstances, Spark schedules delegation token updates with an interval of 0, which flooded the logs with messages and sent a massive number of requests to the token handler.

      Investigation showed that the customer had two delegation token identifiers. One of them (IDBS3ATokenIdentifier) reported an "issue date" of 0, whereas the other (DelegationTokenIdentifier) reported a correct value.

      Both provide the expiry time correctly via Token.renew(). Spark assumes the issue date is correct and therefore calculates the token expiry period (logged as the renewal interval) as (the result of Token.renew() - "issue date").

      20/10/13 06:34:19 INFO security.HadoopFSDelegationTokenProvider: Renewal interval is 1603175657000 for token S3ADelegationToken/IDBroker
      20/10/13 06:34:19 INFO security.HadoopFSDelegationTokenProvider: Renewal interval is 86400048 for token HDFS_DELEGATION_TOKEN
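
      The two logged values can be reproduced with simple arithmetic. The following is only a sketch of the calculation described above, not Spark's code: the function name is hypothetical, and the HDFS issue date is an assumed value, since the log shows only the resulting interval.

```python
def renewal_interval(renew_result_ms: int, issue_date_ms: int) -> int:
    """Interval as described above: result of Token.renew() minus the issue date."""
    return renew_result_ms - issue_date_ms


# S3A token: the issue date is reported as 0, so the "interval" ends up
# being the absolute expiry timestamp itself.
s3a_interval = renewal_interval(1_603_175_657_000, 0)

# HDFS token: the issue date below is an assumption for illustration;
# only the interval (86400048) appears in the log.
hdfs_issue_date = 1_602_570_859_000
hdfs_interval = renewal_interval(hdfs_issue_date + 86_400_048, hdfs_issue_date)

# Spark picks the minimal interval across tokens.
min_interval = min(s3a_interval, hdfs_interval)
print(s3a_interval, hdfs_interval, min_interval)
```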
      

      This is still safe at this point, because Spark picks the minimum interval. The problem is that, to calculate the next renewal timestamp, Spark adds the renewal interval to the issue date of every token and picks the minimum. For the token whose issue date is 0 this gives 0 + 86400048, hence "86400048" is picked as the next renewal timestamp.

      This timestamp is earlier than now, so the interval to schedule becomes negative (since we subtract now from it). Spark applies a safeguard that picks the greater of 0 and the interval, so 0 is picked and the token update is scheduled over and over. (Each schedule is one-time, but the calculation always yields a negative value, so it is effectively an immediate reschedule every time.)
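
      The chain from "issue date is 0" to "delay is 0" can be traced with the numbers above. This is a minimal sketch of the described calculation, not the actual Spark implementation: the function names, the value of now, and the HDFS issue date are assumptions for illustration.

```python
def next_renewal_timestamp(issue_dates_ms, interval_ms):
    """Add the interval to every token's issue date and pick the minimum."""
    return min(issue_date + interval_ms for issue_date in issue_dates_ms)


def schedule_delay(next_renewal_ms, now_ms):
    """The existing safeguard: never schedule with a negative delay."""
    return max(0, next_renewal_ms - now_ms)


now = 1_602_570_859_000   # assumed "current time" in epoch milliseconds
issue_dates = [0, now]    # S3A token reports 0; HDFS issue date is assumed
interval = 86_400_048     # minimal renewal interval from the log

next_renewal = next_renewal_timestamp(issue_dates, interval)
delay = schedule_delay(next_renewal, now)

# next_renewal is far in the past, so the delay is clamped to 0 and the
# update would run again immediately, every time.
print(next_renewal, delay)
```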

      We should design a better safeguard, rather than only guarding against a negative schedule interval.
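
      One possible shape for such a safeguard is sketched below. It is purely hypothetical and is not claimed to be the fix that was merged: the idea is to stop trusting an issue date that is non-positive, lies in the future, or would put the next renewal in the past, and to fall back to the current time instead. All names and the value of now are assumptions.

```python
def effective_issue_date(issue_date_ms, interval_ms, now_ms):
    """Hypothetical safeguard: fall back to now when the issue date looks wrong."""
    if issue_date_ms <= 0 or issue_date_ms > now_ms:
        # Unset, negative, or future issue date: not trustworthy.
        return now_ms
    if issue_date_ms + interval_ms < now_ms:
        # The next renewal would already be in the past.
        return now_ms
    return issue_date_ms


now = 1_602_570_859_000   # assumed "current time" in epoch milliseconds
interval = 86_400_048     # minimal renewal interval from the log
issue_dates = [0, now]    # S3A token reports 0; HDFS issue date is assumed

next_renewal = min(
    effective_issue_date(d, interval, now) + interval for d in issue_dates
)
delay = max(0, next_renewal - now)
print(delay)
```

      With the bogus issue date replaced by now, the delay becomes the full renewal interval instead of 0.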


People

    • Assignee: kabhwan Jungtaek Lim
    • Reporter: kabhwan Jungtaek Lim