[HIVE-467] Scratch data location should be on different filesystems for different types of intermediate data - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.4.0
Component/s: Query Processor
Labels: None
Environment: S3/EC2

Description

Currently Hive uses the same scratch directory/path for all sorts of temporary and intermediate data. This is problematic:

1. The temporary location for DDL output should be a temp file on the local file system. This decouples metadata and browsing operations from a functioning Hadoop cluster.
2. The temporary location for intermediate map-reduce data should be on the default file system (typically the HDFS instance on the compute cluster).
3. The temporary location for data that needs to be 'moved' into tables should be on the same file system as the table's location (which may not be the HDFS instance of the processing cluster).

That is, local storage, map-reduce intermediate storage, and table storage should be distinguished. Without this distinction, using Hive in environments like S3/EC2 causes problems. In such an environment, I would like to be able to:

- do metadata operations without a provisioned Hadoop cluster (using data stored in S3 and a metastore on local disk)
- attach to a provisioned Hadoop cluster and run queries
- store data back in tables that are created over the S3 file system
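
The three-way distinction described above can be illustrated with a minimal sketch. This is not the code from the attached patches; the names `ScratchKind` and `scratch_location`, the `/tmp/hive-scratch` path, and the example URIs are all hypothetical and exist only to show how a scratch location might be chosen per data type.

```python
from enum import Enum
from urllib.parse import urlsplit


class ScratchKind(Enum):
    """Hypothetical categories matching the three cases in the description."""
    LOCAL = "local"              # DDL output and other client-side temp files
    MR_INTERMEDIATE = "mr"       # data passed between map-reduce jobs
    TABLE_STAGING = "table"      # data that will be moved into a table


def scratch_location(kind, query_id, default_fs, table_location=None,
                     scratch_path="/tmp/hive-scratch"):
    """Return a scratch URI on the file system appropriate for `kind`.

    `default_fs` is the cluster's default file system URI (assumed here to
    look like "hdfs://host:port"); `table_location` is the table's URI and
    is only needed for TABLE_STAGING.
    """
    if kind is ScratchKind.LOCAL:
        # Local disk only, so no running cluster is required.
        root = "file://"
    elif kind is ScratchKind.MR_INTERMEDIATE:
        # Intermediate job data stays on the compute cluster's file system.
        root = default_fs.rstrip("/")
    elif kind is ScratchKind.TABLE_STAGING:
        if table_location is None:
            raise ValueError("table staging requires the table's location")
        # Stage on the table's own file system so the final move stays
        # within that file system.
        parts = urlsplit(table_location)
        root = f"{parts.scheme}://{parts.netloc}"
    else:
        raise ValueError(f"unknown scratch kind: {kind!r}")
    return f"{root}{scratch_path}/{query_id}"
```

Under these assumptions, a table stored at `s3://bucket/warehouse/t1` would be staged under `s3://bucket/tmp/hive-scratch/<query_id>` even when the default file system is an HDFS instance, while DDL output would go to a `file://` path regardless of whether a cluster is available.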

Attachments

- hive-467.6.patch, 27/May/09 21:21, 77 kB, Joydeep Sen Sarma
- hive-467.5.patch, 26/May/09 07:24, 77 kB, Joydeep Sen Sarma
- hive-467.4.patch, 16/May/09 04:14, 77 kB, Joydeep Sen Sarma
- hive-467.3.patch, 15/May/09 06:45, 76 kB, Joydeep Sen Sarma
- hive-467.patch.2, 14/May/09 02:32, 74 kB, Joydeep Sen Sarma
- hive-467.patch.1, 07/May/09 04:12, 54 kB, Joydeep Sen Sarma

People

Assignee: Joydeep Sen Sarma
Reporter: Joydeep Sen Sarma
Votes: 0
Watchers: 3

Dates

Created: 02/May/09 03:03
Updated: 17/Dec/11 00:07
Resolved: 28/May/09 19:41
