[HADOOP-19247] Authentification failed in Azure Kubernetes with HTTP1.1 and Chunked transfer encoding - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.4.0, 3.3.4, 3.3.6, 3.5.0
Fix Version/s: None
Component/s: auth, fs/azure
Labels: None
Environment: Azure Kubernetes Services, Azure Entra ID, Azure Metadata Service, Spark 3.3
Language: java

Description

The problem is related to Azure authentication on Kubernetes.

When I run my Spark program, authenticating the pod fails with this error:

java.lang.NullPointerException
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.consumeInputStream(AzureADAuthenticator.java:340)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenSingleCall(AzureADAuthenticator.java:270)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenCall(AzureADAuthenticator.java:211)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenFromMsi(AzureADAuthenticator.java:137)
    at org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider.refreshToken(MsiTokenProvider.java:45)
    at org.apache.hadoop.fs.azurebfs.oauth2.AccessTokenProvider.getToken(AccessTokenProvider.java:50)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAccessToken(AbfsClient.java:554)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:151)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:125)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:181)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:569)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:536)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:359)

My setup is a Spark driver deployed on Azure Kubernetes with a managed identity, using aad-pod-identity.

When authenticating from Kubernetes against the Azure Instance Metadata Service, we can observe two different scenarios:

- The returned token is short (fewer than 2048 characters). The response carries all the usual headers, including an explicit "Content-Length" header.

- The returned token is long (more than 2048 characters). The response uses HTTP/1.1 chunked transfer encoding (a "Transfer-Encoding" header) and therefore has no "Content-Length" header.
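To illustrate why the second scenario has no "Content-Length" header, here is a minimal sketch of how an HTTP/1.1 chunked body is framed: each chunk is prefixed with its size in hexadecimal, and a zero-size chunk ends the body, so the total length is never announced up front. The class name `ChunkedBodyDecoder` is hypothetical and not part of Hadoop; the sketch ignores trailers and is meant only to show the wire format.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ByteArrayOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only (not Hadoop code).
class ChunkedBodyDecoder {

    // Decodes "<hex size>\r\n<data>\r\n" repeated, terminated by a zero-size chunk.
    public static byte[] decode(InputStream in) {
        try {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            while (true) {
                String sizeLine = readLine(in);
                // Drop optional chunk extensions that may follow ';'.
                int semi = sizeLine.indexOf(';');
                String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
                int size = Integer.parseInt(hex, 16);
                if (size == 0) {
                    break; // last chunk; trailers are not read in this sketch
                }
                byte[] chunk = new byte[size];
                int off = 0;
                while (off < size) {
                    int n = in.read(chunk, off, size - off);
                    if (n < 0) {
                        throw new EOFException("truncated chunk");
                    }
                    off += n;
                }
                body.write(chunk, 0, size);
                readLine(in); // consume the CRLF that follows the chunk data
            }
            return body.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads bytes up to the next '\n', dropping any '\r'.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }
}
```

In practice a client such as `java.net.HttpURLConnection` performs this decoding itself, so application code normally never sees the chunk framing.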

NB: I ran a curl command in the pod, following the Azure documentation, to produce these screenshots.
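A Java equivalent of that curl check might look like the sketch below. It assumes the standard Azure Instance Metadata Service token endpoint (`169.254.169.254`, `api-version=2018-02-01`, `Metadata: true` header) and the storage resource `https://storage.azure.com/`; the class and method names are hypothetical. `describeFraming` only works from inside an Azure VM or pod with a managed identity.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical probe, for illustration only (not Hadoop code).
class ImdsProbe {
    static final String ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token";
    static final String API_VERSION = "2018-02-01";

    // Builds the token request URL for the given resource (assumed: storage).
    public static String tokenUrl(String resource) {
        try {
            return ENDPOINT + "?api-version=" + API_VERSION
                    + "&resource=" + URLEncoder.encode(resource, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Requests a token and reports which framing headers the response carries.
    public static String describeFraming(String resource) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(tokenUrl(resource)).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Metadata", "true");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return "status=" + conn.getResponseCode()
                    + " Content-Length=" + conn.getHeaderField("Content-Length")
                    + " Transfer-Encoding=" + conn.getHeaderField("Transfer-Encoding");
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `ImdsProbe.describeFraming("https://storage.azure.com/")` from the pod should show which of the two scenarios applies to a given identity.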

In a GitHub repository, I found the source of "AzureADAuthenticator.java" and this piece of code:

The "Content-Length" header is required when the returned HTTP status code is 200, which is incompatible with HTTP/1.1 chunked transfer encoding, since a chunked response carries no "Content-Length" header.
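The difference can be sketched as two predicates. This assumes, based only on the description above, that the current check amounts to "status 200 and a positive Content-Length"; it is not a quote from the Hadoop source, and `TokenResponseCheck`, `strict`, and `relaxed` are hypothetical names. A content length of -1 stands for "header absent", which is what `HttpURLConnection.getContentLengthLong()` reports in that case.

```java
import java.net.HttpURLConnection;
import java.util.Locale;

// Hypothetical predicates, for illustration only (not Hadoop code).
class TokenResponseCheck {

    // Assumed shape of the current check: success requires a known, positive length.
    public static boolean strict(int status, long contentLength) {
        return status == HttpURLConnection.HTTP_OK && contentLength > 0;
    }

    // Relaxed variant: also accepts a 200 response that declares chunked encoding.
    public static boolean relaxed(int status, long contentLength, String transferEncoding) {
        boolean chunked = transferEncoding != null
                && transferEncoding.toLowerCase(Locale.ROOT).contains("chunked");
        return status == HttpURLConnection.HTTP_OK && (contentLength > 0 || chunked);
    }
}
```

With these definitions, a long token delivered as a chunked 200 response fails `strict` but passes `relaxed`, while a short token with a "Content-Length" header passes both.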

Would it be possible to update this authentication code to support this mechanism, which Microsoft uses on Kubernetes (and maybe on virtual machines)?
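One possible direction, sketched under assumptions rather than proposed as the actual fix: `HttpURLConnection.getInputStream()` already decodes chunked bodies, so reading the stream until end-of-file works whether or not a "Content-Length" header is present. `BodyReader.readAll` and its `maxBytes` cap are hypothetical; the cap is only there to bound memory use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not Hadoop code).
class BodyReader {

    // Reads the whole body without consulting Content-Length, up to maxBytes.
    public static String readAll(InputStream in, int maxBytes) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                if (out.size() + n > maxBytes) {
                    throw new IOException("response body exceeds " + maxBytes + " bytes");
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned string would then be handed to whatever JSON parsing the token code already performs.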

Attachments

- CodeResponse.png (02/Aug/24 06:51, 26 kB, Emeric)
- TokenKO.png (02/Aug/24 06:51, 68 kB, Emeric)
- TokenOK.png (02/Aug/24 06:51, 99 kB, Emeric)

People

Assignee: Unassigned

Reporter: Emeric

Votes: 0

Watchers: 2

Dates

Created: 02/Aug/24 07:15

Updated: 02/Aug/24 13:18