[MAPREDUCE-775] Add input/output formatters for Vertica clustered ADBMS. - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.21.0
Component/s: contrib/vertica
Labels: None
Hadoop Flags: Reviewed
Release Note: Add native and streaming support for Vertica as an input or output format taking advantage of parallel read and write properties of the DBMS.
Tags: vertica, db, formatter

Description

Add native support for Vertica as an input or output format, taking advantage of the parallel read and write properties of the DBMS.

On the input side, allow parameterized queries (in the style of prepared statements) and create one input split for each combination of parameters. The parameter list can also be generated from a SQL statement. For example, return metrics for all dimensions that meet criteria X, with one input split per dimension. The read can be divided among any number of hosts in the Vertica cluster.
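
The split-per-parameter idea can be sketched roughly as follows. This is only an illustration and is not taken from the attached patch: the class `SplitPlanner`, its `Split` type, the example query, the host names, and the round-robin host assignment are all assumptions made for the example. In a real job the parameter tuples could equally come from running a SQL statement first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: plans one input split per parameter tuple.
// None of these names come from the patch attached to this issue.
public class SplitPlanner {

    /** One unit of work: the query template, its bound parameters, and the host to read from. */
    public static final class Split {
        public final String query;
        public final List<String> params;
        public final String host;

        Split(String query, List<String> params, String host) {
            this.query = query;
            this.params = params;
            this.host = host;
        }
    }

    /**
     * Creates one split per parameter tuple. Hosts are assigned round-robin,
     * which is an assumption of this sketch, so reads spread across the cluster.
     */
    public static List<Split> plan(String query, List<List<String>> paramSets, List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < paramSets.size(); i++) {
            String host = hosts.get(i % hosts.size());
            splits.add(new Split(query, paramSets.get(i), host));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Illustrative parameter tuples: one dimension value per split.
        List<List<String>> dimensions =
                List.of(List.of("us-east"), List.of("us-west"), List.of("eu"));
        List<Split> splits = plan(
                "SELECT metric, value FROM facts WHERE dimension = ?",
                dimensions,
                List.of("vertica01", "vertica02"));
        for (Split s : splits) {
            System.out.println(s.host + " <- " + s.params);
        }
    }
}
```

Each split carries everything a map task would need to bind the parameters and run the query against its assigned host, so the number of splits follows the number of parameter tuples rather than the number of hosts.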

On the output side, support Vertica streaming load to any number of hosts in the Vertica cluster. Output may be to a different cluster than input.

Also includes input and output formatters that support the streaming interface.

The code has been tested and run on live systems under Hadoop 0.19 and 0.20. A patch for 0.21 using the new API will be ready by the end of this week.

Attachments

MAPREDUCE-775.2.patch, 12/Sep/09 12:14, 120 kB, Omer Trajman
MAPREDUCE-775.3.patch, 12/Sep/09 20:35, 120 kB, Omer Trajman
MAPREDUCE-775.4.patch, 13/Sep/09 14:41, 121 kB, Omer Trajman
MAPREDUCE-775.patch, 04/Aug/09 20:52, 115 kB, Omer Trajman

People

Assignee: Omer Trajman

Reporter: Omer Trajman

Votes: 0

Watchers: 15

Dates

Created: 20/Jul/09 23:34

Updated: 24/Aug/10 21:14

Resolved: 18/Sep/09 18:24