Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Why
For the WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, the router or namenode needs to find the datanodes where the block is located and then redirect the request to one of them.
However, this chooseDatanode step is much slower in the router than in the namenode, which directly affects the WebHDFS operations above.
With namenode WebHDFS it normally takes tens of milliseconds, while with the router it always takes more than 2 seconds.
How
Cache the datanode report in the router RPC server and actively refresh it at a configured interval. In the router, fetch the datanode report only when necessary.
Fetching the report is a very expensive operation and is where all the time is spent.
The report is only needed when we want to exclude some datanodes or pick a random datanode for CREATE.
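The caching idea can be illustrated with a minimal sketch that uses only java.util.concurrent. It is not the code from the patch: the class name CachedDatanodeReport, the loader supplier, and the refreshIntervalMs parameter are all hypothetical, and a real router would plug in its own call that fetches the datanode report.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: caches the result of an expensive call (such as
 * fetching a datanode report) and refreshes it in the background.
 */
final class CachedDatanodeReport<T> implements AutoCloseable {
    private final Supplier<T> loader;                 // the expensive fetch (assumed)
    private final AtomicReference<T> cached = new AtomicReference<>();
    private final ScheduledExecutorService refresher;

    CachedDatanodeReport(Supplier<T> loader, long refreshIntervalMs) {
        this.loader = loader;
        this.refresher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "dn-report-refresher");
            t.setDaemon(true);                        // do not block JVM shutdown
            return t;
        });
        // Active refresh: reload at the configured interval, off the request path.
        this.refresher.scheduleWithFixedDelay(
            this::refresh, refreshIntervalMs, refreshIntervalMs, TimeUnit.MILLISECONDS);
    }

    private void refresh() {
        try {
            cached.set(loader.get());
        } catch (RuntimeException e) {
            // Keep serving the previous value if a refresh fails.
        }
    }

    /** Returns the cached value; only the first caller pays for the fetch. */
    T get() {
        T value = cached.get();
        if (value == null) {
            // Concurrent first callers may each load once; the first result wins.
            value = loader.get();
            if (!cached.compareAndSet(null, value)) {
                value = cached.get();
            }
        }
        return value;
    }

    @Override
    public void close() {
        refresher.shutdownNow();
    }
}
```

With this shape, request handlers call get() only on the paths that actually need the report (excluding datanodes, or picking a random datanode for CREATE), so the other operations never trigger the fetch.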
Issue Links
- is depended upon by: HDFS-15432 RBF: Move cache datanode reports from NamenodeBeanMetrics to RouterRpcServer (Open)
- is duplicated by: HDFS-15014 RBF: WebHdfs chooseDatanode shouldn't call getDatanodeReport (Resolved)
- links to