Hadoop Common / HADOOP-2149

Pure name-node benchmarks.



    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None


      Pure name-node benchmark.

      This patch starts a series of name-node benchmarks.
      The intention is to have a separate benchmark for every important name-node operation.
      The purpose of the benchmarks is

      1. to measure the throughput for each name-node operation, and
      2. to evaluate changes in the name-node performance (gain or degradation) when optimization
        or new functionality patches are introduced.

      The benchmarks measure name-node throughput (ops per second) and the average execution time.
      The benchmark does not involve any Hadoop components other than the name-node.
      The name-node server is real; the other components are simulated.
      There is no RPC overhead: each operation is executed by directly calling the respective name-node method.
      The benchmark is multi-threaded, that is, one can start multiple threads competing for
      name-node resources by concurrently executing the same operation with different data.
      See the javadoc for more details.
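A minimal sketch of such a driver loop, with hypothetical names (the Runnable stands in for a direct name-node method call such as a file create; this is not the patch's actual code):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: N threads share a fixed budget of operations, and each
// thread invokes the target operation by a direct method call, so no RPC
// overhead is measured.
public class ThroughputDriver {
  // Returns throughput in ops/sec; assumes totalOps divides evenly by threads.
  public static double measure(int numThreads, int totalOps, Runnable op)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    CountDownLatch done = new CountDownLatch(numThreads);
    int opsPerThread = totalOps / numThreads;  // equal distribution
    long start = System.currentTimeMillis();
    for (int t = 0; t < numThreads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < opsPerThread; i++) op.run();
        done.countDown();
      });
    }
    done.await();
    long elapsedMs = System.currentTimeMillis() - start;
    pool.shutdown();
    return totalOps * 1000.0 / Math.max(elapsedMs, 1);
  }
}
```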

      The patch contains implementations of two name-node operations: file creates and block reports.
      Implementation of other operations will follow.

      File creation benchmark.

      I ran two series of the file create benchmark on the name-node with different numbers of threads.
      The first series was run on the regular name-node, which performs an edits log transaction on every create.
      The transaction includes a synch to disk.
      In the second series the name-node was modified so that the synchs are turned off.
      Each run of the benchmark performs the same number of creates, 10,000, equally distributed between the
      running threads. I used a 4-core 2.8GHz machine.
      The following two tables summarize the results. Time is in milliseconds.

      Table 1: with synch

      threads   time (msec)   ops/sec
            1         13074       764
            2          8883      1125
            4          7319      1366
           10          7094      1409
           20          6785      1473
           40          6776      1475
          100          6899      1449
          200          7131      1402
          400          7084      1411
         1000          7181      1392
      Table 2: no synch

      threads   time (msec)   ops/sec
            1          4559      2193
            2          4979      2008
            4          5617      1780
           10          5679      1760
           20          5550      1801
           40          5804      1722
          100          5871      1703
          200          6037      1656
          400          5855      1707
         1000          6069      1647
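The ops/sec column in both tables is simply the fixed operation count divided by the elapsed time; the single-thread rows, for example, work out as:

```java
public class Throughput {
  // ops/sec as reported in the tables: op count over elapsed milliseconds
  public static long opsPerSec(long ops, long timeMs) {
    return ops * 1000 / timeMs;
  }

  public static void main(String[] args) {
    System.out.println(opsPerSec(10000, 13074)); // 764  (Table 1, 1 thread)
    System.out.println(opsPerSec(10000, 4559));  // 2193 (Table 2, 1 thread)
  }
}
```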

      The results show:

      1. (Table 1) The new synchronization mechanism that batches synch calls from different threads works well.
        For a single thread every synch causes a real disk IO, making it slow. The more threads are used, the more
        synchs are batched, resulting in better performance. Throughput grows up to a certain point and then
        stabilizes at about 1450 ops/sec.
      2. (Table 2) Operations that do not require disk IO are constrained by memory locks.
        Without synchs the single-threaded execution is the fastest, because there are no waits.
        More threads start to interfere with each other and have to wait.
        Again, throughput stabilizes at about 1700 ops/sec and does not degrade further.
      3. Our default of 10 handlers per name-node is not the best choice for either the IO-bound or the pure
        in-memory operations. We should increase the default to 20 handlers; on big clusters 100 handlers
        or more can be used without loss of performance. In fact, with more handlers more operations can be handled
        simultaneously, which prevents the name-node from dropping calls that are close to timeout.
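The batching idea from observation 1 can be sketched as follows. This is an illustrative toy, not the actual FSEditLog implementation: each thread logs a transaction, but a single disk synch can cover all transactions logged so far, so concurrent threads piggyback on each other's synchs instead of each paying for a disk IO.

```java
// Illustrative sketch of batched synching (hypothetical class, not FSEditLog).
public class BatchedSyncLog {
  private long nextTxid = 0;    // id assigned to the most recent transaction
  private long syncedTxid = 0;  // highest id known to be flushed to disk
  private boolean syncing = false;

  public void logAndSync() throws InterruptedException {
    long myTxid;
    synchronized (this) {
      myTxid = ++nextTxid;      // the edit itself would be buffered here
    }
    synchronized (this) {
      // wait while a concurrent synch is in flight
      while (syncing && syncedTxid < myTxid) {
        wait();
      }
      if (syncedTxid >= myTxid) {
        return;                 // a batched synch already covered this txid
      }
      syncing = true;
    }
    long flushUpTo;
    synchronized (this) {
      flushUpTo = nextTxid;     // everything logged so far joins this batch
    }
    // ... the real disk synch would happen here, outside the lock ...
    synchronized (this) {
      syncedTxid = flushUpTo;
      syncing = false;
      notifyAll();              // wake threads whose txids are now covered
    }
  }

  public synchronized long synced() {
    return syncedTxid;
  }
}
```

With one thread every call performs its own synch; with many threads most calls return from the "already covered" branch, which is the behavior Table 1 shows.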

      Block report benchmark.

      In this benchmark each thread pretends it is a data-node and calls blockReport() with the same blocks.
      All blocks are real, that is, they were previously allocated by the name-node and assigned to data-nodes.
      Some reports can contain fake blocks, and some can have missing blocks.
      Each block report consists of 10,000 blocks. The total number of reports sent is 1000.
      The reports are equally divided between the data-nodes so that each of them sends an equal number of reports.
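The setup above can be sketched like this, with hypothetical names (Report.send stands in for a direct blockReport() call with a pre-built 10,000-block list; this is not the patch's actual code):

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: the total reports are split evenly across simulated
// data-nodes, and each "data-node" thread repeatedly submits its report.
public class BlockReportDriver {
  interface Report {
    void send(int datanodeId);
  }

  // Returns elapsed milliseconds; assumes totalReports divides evenly.
  public static long run(int dataNodes, int totalReports, Report report)
      throws InterruptedException {
    int reportsPerNode = totalReports / dataNodes;  // equal division
    CountDownLatch done = new CountDownLatch(dataNodes);
    long start = System.currentTimeMillis();
    for (int d = 0; d < dataNodes; d++) {
      final int id = d;
      new Thread(() -> {
        for (int r = 0; r < reportsPerNode; r++) report.send(id);
        done.countDown();
      }).start();
    }
    done.await();
    return System.currentTimeMillis() - start;
  }
}
```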

      Here is the table with the results.

      data-nodes   time (msec)   ops/sec
               1         42234        24
               2          9412       106
               4         11465        87
              10         15632        64
              20         17623        57
              40         19563        51
             100         24315        41
             200         29789        34
             400         23636        42
             600         39682        26

      I have not had time to analyze these results yet, so comments are welcome.


        Attachments

        1. NNThroughput.patch (28 kB) by Konstantin Shvachko
        2. NNThroughput.patch (29 kB) by Konstantin Shvachko


              Assignee: Konstantin Shvachko (shv)
              Reporter: Konstantin Shvachko (shv)