  1. Hadoop Common
  2. HADOOP-19247

Authentication fails in Azure Kubernetes with HTTP/1.1 chunked transfer encoding


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0, 3.3.4, 3.3.6, 3.5.0
    • Fix Version/s: None
    • Component/s: auth, fs/azure
    • Labels: None
    • Environment: Azure Kubernetes Services, Azure Entra ID, Azure Metadata Service, Spark 3.3

    Description

       

      The problem is related to Azure authentication on Kubernetes.

      When I run my Spark program, I get this error while the pod tries to authenticate:

       

      java.lang.NullPointerException
          at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.consumeInputStream(AzureADAuthenticator.java:340)
          at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenSingleCall(AzureADAuthenticator.java:270)
          at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenCall(AzureADAuthenticator.java:211)
          at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenFromMsi(AzureADAuthenticator.java:137)
          at org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider.refreshToken(MsiTokenProvider.java:45)
          at org.apache.hadoop.fs.azurebfs.oauth2.AccessTokenProvider.getToken(AccessTokenProvider.java:50)
          at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAccessToken(AbfsClient.java:554)
          at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:151)
          at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:125)
          at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:181)
          at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:569)
          at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:536)
          at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:359) 

       

       

      My setup is a Spark driver deployed on Azure Kubernetes with a managed identity.

      I used this method with aad-pod-identity.

       

      Two different scenarios can be observed when authenticating from Kubernetes against the Azure Instance Metadata Service:

      • The returned token is short (fewer than 2048 characters). The response has all the expected headers, explicitly including the "Content-Length" header.

       

      NB: I ran a curl command in the pod, following the Azure documentation, to generate these screenshots.
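For reference, the same header check can be sketched in Java. This is only an illustration: the class `ResponseFraming` and its methods are hypothetical names, not part of Hadoop or any Azure library. The header map is assumed to have the shape returned by `HttpURLConnection.getHeaderFields()`.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class ResponseFraming {

    private ResponseFraming() { }

    /** Returns the first value of a header, matching its name case-insensitively, or null. */
    static String firstValue(Map<String, List<String>> headers, String name) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String key = entry.getKey();
            // HttpURLConnection stores the status line under a null key, so skip it.
            if (key != null && key.equalsIgnoreCase(name) && !entry.getValue().isEmpty()) {
                return entry.getValue().get(0);
            }
        }
        return null;
    }

    /** Reports how the body length is signalled: "chunked", "content-length" or "unknown". */
    static String describe(Map<String, List<String>> headers) {
        String transferEncoding = firstValue(headers, "Transfer-Encoding");
        if (transferEncoding != null
                && transferEncoding.toLowerCase(Locale.ROOT).contains("chunked")) {
            return "chunked";
        }
        if (firstValue(headers, "Content-Length") != null) {
            return "content-length";
        }
        return "unknown";
    }
}
```

Calling `ResponseFraming.describe(conn.getHeaderFields())` on the token response would show which of the two cases a given pod is in.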

      In the GitHub repository, I found the "AzureADAuthenticator.java" that I am using, and this piece of code (see the attached screenshot):

      The "Content-Length" header is mandatory when the returned HTTP code is 200, which is not compatible with HTTP/1.1 chunked transfer encoding: a chunked response does not carry a "Content-Length" header.
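A minimal sketch of a body read that does not depend on "Content-Length" is shown below. It is not the actual Hadoop code nor a proposed patch: `ChunkedSafeBodyReader` and `readFully` are hypothetical names, and the declared length (or -1 when the header is absent, which is what `HttpURLConnection.getContentLengthLong()` returns in that case) is used only as a buffer-size hint.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class ChunkedSafeBodyReader {

    private static final int DEFAULT_BUFFER = 4096;
    private static final long MAX_HINT = 1L << 20; // never pre-allocate more than 1 MiB

    private ChunkedSafeBodyReader() { }

    /**
     * Reads the whole body until end-of-stream.
     *
     * @param in             response body stream
     * @param declaredLength Content-Length value, or -1 when the header is absent
     */
    static String readFully(InputStream in, long declaredLength) throws IOException {
        if (in == null) {
            // Fail with a clear message instead of a NullPointerException.
            throw new IOException("Response has no body stream");
        }
        // The declared length only sizes the buffer; it never bounds the read.
        int initialSize = (declaredLength > 0 && declaredLength < MAX_HINT)
                ? (int) declaredLength
                : DEFAULT_BUFFER;
        ByteArrayOutputStream out = new ByteArrayOutputStream(initialSize);
        byte[] buffer = new byte[DEFAULT_BUFFER];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With `HttpURLConnection`, the stream returned by `getInputStream()` already decodes the chunks, so reading to end-of-stream covers both the short-token and the chunked response.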

      Is it possible to update this authentication code to support this mechanism, which Microsoft has implemented on Kubernetes (and maybe on virtual machines)?

      Attachments

        1. TokenOK.png
          99 kB
          Emeric
        2. TokenKO.png
          68 kB
          Emeric
        3. CodeResponse.png
          26 kB
          Emeric

          People

            Assignee: Unassigned
            Reporter: Emeric (mric78)