Details
Type: Sub-task
Status: Reopened
Priority: Major
Resolution: Unresolved
Description
In testing we have found an issue in the ECWritableContainerProvider.
For EC, a pipeline is used for only one container; when the container gets closed, the pipeline also gets closed. At the moment, the only place in the code that closes EC pipelines which no longer have an open container is inside the ECWritableContainerProvider. It first gets the list of open pipelines and enforces the pipeline limit, then, for all open pipelines, it tries to find one the client can use.
If the client has had problems writing to the pipelines (e.g. it was given a container/pipeline and then the write failed because the container was closed on the DN), the pipelines get added to the exclude list. We can then get into a situation where many pipelines need to be closed on the write path, slowing down block allocation.
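To make the cost concrete, here is a minimal, self-contained model of the behaviour described above. It is not the Ozone code: the `Pipeline`, `Allocation` and `allocate` names are hypothetical stand-ins, and pipeline-limit enforcement is left out. It only shows that pipelines whose container is already closed get cleaned up inside the allocation call itself.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model only; all names here are hypothetical, not Ozone classes.
final class EcWritePathSketch {

  /** Stand-in for an EC pipeline together with its single container. */
  static final class Pipeline {
    final String id;
    boolean open = true;
    boolean containerOpen;

    Pipeline(String id, boolean containerOpen) {
      this.id = id;
      this.containerOpen = containerOpen;
    }
  }

  /** Outcome of one allocation: the chosen pipeline (or null) and the cleanup cost. */
  static final class Allocation {
    final Pipeline chosen;
    final int closedOnWritePath;

    Allocation(Pipeline chosen, int closedOnWritePath) {
      this.chosen = chosen;
      this.closedOnWritePath = closedOnWritePath;
    }
  }

  /** Scans the open pipelines, closing stale ones and picking the first usable one. */
  static Allocation allocate(List<Pipeline> pipelines, Set<String> excluded) {
    int closed = 0;
    Pipeline chosen = null;
    for (Pipeline p : pipelines) {
      if (!p.open) {
        continue;
      }
      if (!p.containerOpen) {
        // The cleanup happens here, while the client waits for a block.
        p.open = false;
        closed++;
        continue;
      }
      if (chosen == null && !excluded.contains(p.id)) {
        chosen = p;
      }
    }
    return new Allocation(chosen, closed);
  }

  public static void main(String[] args) {
    List<Pipeline> pipelines = new ArrayList<>();
    Set<String> excluded = new HashSet<>();
    for (int i = 0; i < 3; i++) {
      pipelines.add(new Pipeline("stale-" + i, false));
      excluded.add("stale-" + i);
    }
    pipelines.add(new Pipeline("healthy", true));

    Allocation result = allocate(pipelines, excluded);
    System.out.println("chosen=" + result.chosen.id
        + ", closedOnWritePath=" + result.closedOnWritePath);
  }
}
```

In this model, the three stale pipelines are all closed during the single `allocate` call, which is the work we would like to move off the write path.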
Ideally, when a container transitions to CLOSING in SCM, if the container is an EC container, we should also close the associated pipeline to avoid it counting toward the limit and to avoid needing to close it during the write (block allocation) path.
This could be achieved relatively simply inside the PipelineManagerImpl.removeContainersFromPipeline() method, which is called as soon as the container transitions to CLOSING via ContainerStateManagerImpl.updateContainerState() when it executes the containerStateChangeActions. Wrapping the container close and pipeline close in a lock inside PipelineManagerImpl ensures we have a consistent "ec container close" flow, and it should avoid the ECWritableContainerProvider needing to close the pipelines internally. However, we can leave that code in place in ECWritableContainerProvider in case some pipelines slip through somehow.
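A rough sketch of the proposed flow follows. It is a toy model under stated assumptions, not a patch: `PipelineManagerSketch` and `removeContainerAndMaybeClosePipeline` are hypothetical names, the real PipelineManagerImpl method has a different signature, and a plain `ReentrantLock` stands in for whatever locking PipelineManagerImpl actually uses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only; all names here are hypothetical, not Ozone classes.
final class EcContainerCloseSketch {

  enum ReplicationType { EC, RATIS }

  enum PipelineState { OPEN, CLOSED }

  static final class PipelineManagerSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<String, PipelineState> pipelines = new HashMap<>();
    private final Map<String, Set<Long>> containersByPipeline = new HashMap<>();

    void addPipeline(String pipelineId) {
      lock.lock();
      try {
        pipelines.put(pipelineId, PipelineState.OPEN);
        containersByPipeline.put(pipelineId, new HashSet<>());
      } finally {
        lock.unlock();
      }
    }

    void addContainer(String pipelineId, long containerId) {
      lock.lock();
      try {
        containersByPipeline.get(pipelineId).add(containerId);
      } finally {
        lock.unlock();
      }
    }

    /** Called when a container moves to CLOSING; also closes the pipeline if it is EC. */
    void removeContainerAndMaybeClosePipeline(
        String pipelineId, long containerId, ReplicationType type) {
      lock.lock();
      try {
        Set<Long> containers = containersByPipeline.get(pipelineId);
        if (containers != null) {
          containers.remove(containerId);
        }
        // An EC pipeline serves exactly one container, so it is done too.
        if (type == ReplicationType.EC
            && pipelines.get(pipelineId) == PipelineState.OPEN) {
          pipelines.put(pipelineId, PipelineState.CLOSED);
        }
      } finally {
        lock.unlock();
      }
    }

    PipelineState stateOf(String pipelineId) {
      lock.lock();
      try {
        return pipelines.get(pipelineId);
      } finally {
        lock.unlock();
      }
    }
  }

  public static void main(String[] args) {
    PipelineManagerSketch manager = new PipelineManagerSketch();
    manager.addPipeline("ec-1");
    manager.addContainer("ec-1", 1L);
    manager.removeContainerAndMaybeClosePipeline("ec-1", 1L, ReplicationType.EC);
    System.out.println("ec-1 state: " + manager.stateOf("ec-1"));
  }
}
```

Because the container removal and the pipeline close happen under the same lock, the write path should no longer see an open EC pipeline whose container has already moved to CLOSING, at least in this simplified model.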