Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.4.0
Fix Version/s: None
Component/s: None
Description
While reviewing DN metrics, I noticed that the sum of Used, Available, and Reserved differs from the actual volume size. I have not searched Jira thoroughly for similar existing issues, so I would appreciate pointers to any that you know of. We experienced this issue in two clusters. Cluster #1 holds a large amount of data and has run into full disks many times.
Example 1: Cluster #1
This cluster consists of 36 nodes. Each node has 36 HDDs of 14 TB each. The expected total capacity of a single node is: 36 bays * 14 TB * 10^12 / 1024^4 = 458 TiB, so the sum of volume_info_metrics_{used,available,reserved} should equal 458 TiB. However, we observe different results.
cluster1.png shows a stacked bar graph. The reported metrics vary between DNs and exceed 458 TiB.
Example 2: Cluster #2
This is another example, in which each node has 12 HDDs of 14 TB each. The expected total capacity of a single node is: 12 bays * 14 TB * 10^12 / 1024^4 = 153 TiB.
cluster2.png shows a stacked bar graph. The reported metrics are almost the same across DNs, but some DNs report totals that exceed the physical capacity.
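The expected per-node figures in both examples can be reproduced with a short sketch. This is only an illustration of the arithmetic: the function names are hypothetical, and it assumes the three metrics are reported in bytes, which the report does not state.

```python
TB = 10 ** 12    # bytes per terabyte (decimal, as used on drive labels)
TIB = 1024 ** 4  # bytes per tebibyte (binary)


def expected_capacity_tib(drives: int, drive_size_tb: float) -> float:
    """Raw per-node capacity in TiB, ignoring any filesystem overhead."""
    return drives * drive_size_tb * TB / TIB


def reported_total_tib(used: int, available: int, reserved: int) -> float:
    """Sum of the three per-node metrics in TiB (inputs assumed to be bytes)."""
    return (used + available + reserved) / TIB


def exceeds_physical(used: int, available: int, reserved: int,
                     drives: int, drive_size_tb: float) -> bool:
    """True when the reported total is larger than the raw drive capacity."""
    reported = reported_total_tib(used, available, reserved)
    return reported > expected_capacity_tib(drives, drive_size_tb)


print(round(expected_capacity_tib(36, 14)))  # Cluster #1: 458
print(round(expected_capacity_tib(12, 14)))  # Cluster #2: 153
```

Feeding a node's actual used, available, and reserved values into `exceeds_physical` would flag the DNs whose totals are above what the installed drives can hold.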