CASSANDRA-3840

Use java.io.tmpdir as default output location for BulkRecordWriter

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.1.0
    • Component/s: Hadoop
    • Labels:

      Description

      BulkRecordWriter uses the value of the property mapreduce.output.bulkoutputformat.localdir if it is set, and otherwise falls back to the value of mapred.local.dir.

      However, on a typical production system, mapred.local.dir is set to a comma-separated list of directories. BulkOutputFormat treats the whole list as a single directory name, so it ends up writing to nonsensical paths such as

      /dir1/,dir2,/dir3,KeySpaceName/CFName

      This has two effects:

      1) The directory is not removed when the job finishes, leading to disk space management issues.

      2) If a new job is run against the same keyspace and column family, it tries to load the old data along with the new data.
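
      The path problem described above can be reproduced without Hadoop. The following is a minimal sketch, not the BulkRecordWriter code: the class LocalDirPathDemo and the helper outputPath are hypothetical names, and the string "/dir1,/dir2,/dir3" is an assumed example of a multi-directory mapred.local.dir value.

```java
class LocalDirPathDemo {
    // Hypothetical helper: appends keyspace and column family names to a
    // base directory setting, treating the setting as one directory name.
    static String outputPath(String baseDir, String keyspace, String columnFamily) {
        return baseDir + "/" + keyspace + "/" + columnFamily;
    }

    public static void main(String[] args) {
        // Assumed example value: mapred.local.dir holding a comma-separated list.
        String localDirs = "/dir1,/dir2,/dir3";

        // The whole list is used as if it were a single directory, so the
        // commas end up inside the resulting path.
        System.out.println(outputPath(localDirs, "KeySpaceName", "CFName"));
    }
}
```

      Because the commas become part of a directory name, the resulting location falls outside the directories that Hadoop manages for the job, which would explain why it is neither cleaned up nor kept separate between runs.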

      It would be better to default to System.getProperty("java.io.tmpdir"), since Hadoop sets that to an attempt-specific temporary directory that is cleaned up after the job finishes. See http://hadoop.apache.org/common/docs/current/mapred_tutorial.html, under "Directory Structure".
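
      The proposed lookup order can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: a plain Map stands in for the Hadoop job configuration, and the class OutputLocationSketch and method resolveOutputLocation are hypothetical names. Only the property name and the java.io.tmpdir fallback come from this issue.

```java
import java.util.HashMap;
import java.util.Map;

class OutputLocationSketch {
    // Property name taken from the issue description.
    static final String OUTPUT_LOCATION = "mapreduce.output.bulkoutputformat.localdir";

    // Hypothetical resolver: a Map stands in for the job configuration.
    static String resolveOutputLocation(Map<String, String> conf) {
        // Prefer the explicit setting when the user supplied one.
        String dir = conf.get(OUTPUT_LOCATION);
        if (dir == null) {
            // Otherwise fall back to the JVM temporary directory.
            dir = System.getProperty("java.io.tmpdir");
        }
        if (dir == null) {
            throw new IllegalStateException("Output directory not defined. Set " + OUTPUT_LOCATION);
        }
        return dir;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // No explicit setting: resolves to java.io.tmpdir.
        System.out.println(resolveOutputLocation(conf));

        // Explicit setting wins over the fallback ("/data/bulk" is an assumed example path).
        conf.put(OUTPUT_LOCATION, "/data/bulk");
        System.out.println(resolveOutputLocation(conf));
    }
}
```

      With this order, a user who sets mapreduce.output.bulkoutputformat.localdir keeps full control, while the default no longer depends on mapred.local.dir holding exactly one directory.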

      1. java.io.tmpdir.patch
        0.9 kB
        Erik Forsberg

        Activity

        Gavin made changes -
        Workflow patch-available, re-open possible [ 12749093 ] reopen-resolved, no closed status, patch-avail, testing [ 12756820 ]
        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12651657 ] patch-available, re-open possible [ 12749093 ]
        Brandon Williams made changes -
        Assignee Erik Forsberg [ forsberg ]
        Affects Version/s 1.1 [ 12317615 ]
        Brandon Williams made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Reviewer brandon.williams
        Fix Version/s 1.1 [ 12317615 ]
        Resolution Fixed [ 1 ]
        Brandon Williams added a comment -

        Committed, thanks!
        Erik Forsberg made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Erik Forsberg made changes -
        Attachment java.io.tmpdir.patch [ 12512973 ]
        Erik Forsberg made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Erik Forsberg made changes -
        Field Original Value New Value
        Status Open [ 1 ] Patch Available [ 10002 ]
        Erik Forsberg created issue -

          People

          • Assignee:
            Erik Forsberg
            Reporter:
            Erik Forsberg
            Reviewer:
            Brandon Williams
          • Votes:
            0
            Watchers:
            0
