[HADOOP-13200] Implement customizable and configurable erasure coders - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.0.0-alpha4
Component/s: None
Labels:
- hdfs-ec-3.0-must-do

Target Version/s: 3.0.0-beta1
Hadoop Flags: Incompatible change, Reviewed
Release Note:

CodecRegistry uses ServiceLoader to dynamically load all implementations of RawErasureCoderFactory. Hadoop 3.0 ships several built-in implementations, and users can supply their own implementations together with the corresponding resource files.
For each codec, users can configure the order of the implementations with these configuration keys:
`io.erasurecode.codec.rs.rawcoders` for the default RS codec,
`io.erasurecode.codec.rs-legacy.rawcoders` for the legacy RS codec,
`io.erasurecode.codec.xor.rawcoders` for the XOR codec.
Users can also configure a self-defined codec with a key of the form:
`io.erasurecode.codec.self-defined.rawcoders`.
For each codec, Hadoop tries the implementations in the configured order: if an earlier implementation fails, it falls back to the next one. The order is given as a comma-separated list of coder names. The names of the built-in implementations are:
`rs_native` and `rs_java` for the default RS codec; the former is a native implementation that uses the Intel ISA-L library and is the default, and the latter is a pure Java implementation,
`rs-legacy_java` for the legacy RS codec, a pure Java implementation that is the default,
`xor_native` and `xor_java` for the XOR codec; the former is the Intel ISA-L implementation and is the default, and the latter is a pure Java implementation.
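
The ordered fallback described in this release note can be illustrated with a minimal, self-contained Java sketch. It does not use Hadoop's classes: `CoderFallbackSketch`, `parseCoderNames`, and `createFirstAvailable` are hypothetical names invented for this illustration, and coders are represented as plain strings. The sketch only shows the idea of walking a comma-separated coder list and using the first implementation that can be created.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only; these are not Hadoop's actual classes or methods.
final class CoderFallbackSketch {

    /** Splits a comma-separated coder list, trimming whitespace and skipping empty entries. */
    static List<String> parseCoderNames(String configured) {
        List<String> names = new ArrayList<>();
        for (String part : configured.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /**
     * Tries each named coder in the configured order and returns the first one
     * whose factory succeeds. Unknown names and failing factories are skipped.
     */
    static String createFirstAvailable(List<String> names,
                                       Map<String, Supplier<String>> factories) {
        for (String name : names) {
            Supplier<String> factory = factories.get(name);
            if (factory == null) {
                continue; // no implementation registered under this name
            }
            try {
                return factory.get();
            } catch (RuntimeException | LinkageError e) {
                // This implementation is unusable here; fall back to the next name.
            }
        }
        throw new IllegalStateException("No usable coder among " + names);
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> factories = new LinkedHashMap<>();
        // Assumption for the demo: the native coder cannot be loaded on this machine.
        factories.put("rs_native", () -> {
            throw new UnsupportedOperationException("native library not loaded");
        });
        factories.put("rs_java", () -> "pure-Java RS coder");

        // Mirrors a value such as the one set for io.erasurecode.codec.rs.rawcoders.
        List<String> order = parseCoderNames("rs_native, rs_java");
        System.out.println(createFirstAvailable(order, factories));
    }
}
```

In this sketch, listing `rs_native` first expresses a preference for the native coder while still letting the pure Java coder take over when the native one is unavailable.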


Description

This is a follow-on task for ~~HADOOP-13010~~, as discussed there. As cmccabe suggested, there may be a better way to customize and configure erasure coders than the current raw coder factory approach. The relevant comments will be copied here to continue the discussion.

Attachments


- HADOOP-13200.11.patch (27/Apr/17 03:46, 45 kB, Tim Yao)
- HADOOP-13200.10.patch (26/Apr/17 11:59, 41 kB, Tim Yao)
- HADOOP-13200.09.patch (25/Apr/17 01:40, 41 kB, Tim Yao)
- HADOOP-13200.08.patch (24/Apr/17 09:19, 41 kB, Tim Yao)
- HADOOP-13200.07.patch (24/Apr/17 08:33, 42 kB, Tim Yao)
- HADOOP-13200.06.patch (21/Apr/17 09:00, 40 kB, Tim Yao)
- HADOOP-13200.05.patch (19/Apr/17 11:39, 38 kB, Tim Yao)
- HADOOP-13200.04.patch (19/Apr/17 06:58, 37 kB, Tim Yao)
- HADOOP-13200.03.patch (18/Apr/17 08:27, 33 kB, Tim Yao)
- HADOOP-13200.02.patch (18/Apr/17 07:37, 34 kB, Tim Yao)

Issue Links

Is contained by

HDFS-7337 Configurable and pluggable erasure codec and policy

Resolved

relates to

HADOOP-13665 Erasure Coding codec should support fallback coder

Resolved

People

Assignee: Tim Yao

Reporter: Kai Zheng

Votes: 0

Watchers: 11

Dates

Created: 24/May/16 21:55

Updated: 25/Oct/19 20:27

Resolved: 27/Apr/17 19:50