Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0
    • Component/s: Data Processors
    • Labels:
      None
    • Release Note:
      Simple lightweight archiver tool.

      Description

      The current demux-archive plumbing is quite complicated. At Berkeley, we need something much simpler.

          Activity

          asrabkin Ari Rabkin added a comment -

          Simple sink archiver.

          Copies all the .done files out of the sink, runs an archiver MapReduce job, then merges the output of that job into the archive, renaming files to avoid collisions.

          Intended use is to run once every day or two, to empty out sink.
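The three steps described above (collect the .done files, run the archiver job, promote its output with renaming) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: it uses the local filesystem instead of HDFS, the MapReduce job is replaced by an arbitrary callable, and the names `collect_done_files`, `unique_destination`, `promote`, and `run_archiver`, as well as the `name.N.ext` renaming scheme, are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable


def collect_done_files(sink: Path, work_in: Path) -> list[Path]:
    """Move every *.done file out of the sink into a working input directory."""
    work_in.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(sink.glob("*.done")):
        dest = work_in / src.name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


def unique_destination(archive_dir: Path, name: str) -> Path:
    """Return a path under archive_dir that does not collide with an existing file.

    Hypothetical scheme: insert a counter before the extension (a.arc -> a.1.arc).
    """
    candidate = archive_dir / name
    stem, suffix = Path(name).stem, Path(name).suffix
    counter = 1
    while candidate.exists():
        candidate = archive_dir / f"{stem}.{counter}{suffix}"
        counter += 1
    return candidate


def promote(job_output_dir: Path, archive_dir: Path) -> list[Path]:
    """Move job output files into the archive, renaming on collision (no merging)."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    promoted = []
    for src in sorted(job_output_dir.iterdir()):
        if src.is_file():
            dest = unique_destination(archive_dir, src.name)
            shutil.move(str(src), str(dest))
            promoted.append(dest)
    return promoted


def run_archiver(
    sink: Path,
    work: Path,
    archive: Path,
    job: Callable[[Path, Path], None],
) -> list[Path]:
    """One archiver pass: empty the sink, run the job, promote its output."""
    work_in, work_out = work / "input", work / "output"
    collect_done_files(sink, work_in)
    work_out.mkdir(parents=True, exist_ok=True)
    job(work_in, work_out)  # stand-in for the archiver MapReduce job
    return promote(work_out, archive)
```

Files in the sink that do not end in .done are left alone, so a pass like this could be scheduled every day or two without touching files that are still being written.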

          asrabkin Ari Rabkin added a comment -

          A future enhancement, once we have appends, is to actually merge files during promotion, and not just rename to avoid collision.

          tanjiaqi Jiaqi Tan added a comment -

          If there's no Demux, will the purpose of Chukwa then be just to collect logs and store them in a single jumbled mix of all the log record types?

          asrabkin Ari Rabkin added a comment -

          No. By default in this patch, the archiver groups records by cluster, day, and datatype. That is well suited to our use case, which is MapReduce analytics of logs.
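As a rough illustration of grouping by cluster, day, and datatype, the sketch below buckets records by a three-part key. It is not the patch's implementation: the record layout (dicts with `cluster`, `timestamp`, and `datatype` fields), the use of UTC days, and the function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def group_key(record: dict) -> tuple[str, str, str]:
    """Build a (cluster, day, datatype) key; the day is the UTC date of the timestamp."""
    day = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc).date().isoformat()
    return (record["cluster"], day, record["datatype"])


def group_records(records: list[dict]) -> dict[tuple[str, str, str], list[dict]]:
    """Bucket records so each group holds one cluster, one day, and one datatype."""
    groups = defaultdict(list)
    for record in records:
        groups[group_key(record)].append(record)
    return dict(groups)
```

With this kind of key, a later analytics job can read a single datatype for a single day on a single cluster without scanning unrelated records.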

          asrabkin Ari Rabkin added a comment -

          Revised; this fixes a few unit test problems.

          asrabkin Ari Rabkin added a comment -

          Taking silence for consent, I just committed this.


            People

            • Assignee:
              asrabkin Ari Rabkin
              Reporter:
              asrabkin Ari Rabkin
