Details

    • Type: Brainstorming
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.20.6, 0.89.20100924
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Mozilla is currently in the process of trying to migrate our HBase cluster to a new datacenter.

      We have our existing 25 node cluster in our SJC datacenter. It is serving production traffic 24/7. While we can take downtime, it is very costly and difficult to do so for more than a few hours in the evening.

      We have two new 30 node clusters in our PHX datacenter. We want to cut production over to one of these this week.

      The old cluster is running 0.20.6. The new clusters are running CDH3b3 with HBase 0.89.

      We have tried running a pull distcp using hftp URLs. If HBase is running, this causes SAX XML Parsing exceptions when a directory is removed during the scan.
      If HBase is stopped, it takes hours for the directory compare to finish before it even begins copying data.

      We have tried a custom backup MR job. This job uses the map phase to evaluate and copy changed files. It can run while HBase is live, but that results in a dirty copy of the data.

      We have tried running the custom backup job while HBase is shut down as well. When we do this, even on two back to back runs, it still copies over some data and seems to not be an entirely clean copy.

      When we had what we thought was a complete copy on the new cluster, we ran add_table on it, but the resulting HBase table had holes. Investigating the holes revealed directories that were not transferred.

      We had a meeting to brainstorm ideas, and three further suggestions came up:
      1. Build a list of files to transfer on the SJC side, transfer that file list to PHX, and then run distcp on it.
      2. Try a full copy instead of an incremental one, skipping the expensive file compare step.
      3. Evaluate copying from SJC to S3, then from S3 to PHX.

        Activity

        stack added a comment -

        We have tried running a pull distcp using hftp URLs. If HBase is running, this causes SAX XML Parsing exceptions when a directory is removed during the scan.

        Can I see the stack trace? Maybe we need to hack on distcp so it runs over a moved dir? Or change it so it snapshots dirs up front and doesn't do a compare?

        We have tried a custom backup MR job. This job uses the map phase to evaluate and copy changed files. It can run while HBase is live, but that results in a dirty copy of the data.

        This seems right. Then have another job which goes through and does reconciliation after the fact. Run it a few times. Finally, run it when HBase is down so you get a true copy. Is this even possible?
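The reconciliation loop described here could be sketched roughly as follows. This is a hypothetical illustration, not code from the thread: `list_src`, `list_dst`, and `copy` are stand-ins for real HDFS listing and distcp invocations.

```python
# Sketch of iterative reconciliation: re-copy the diff until the two
# listings converge, then do one final pass with HBase down.
# list_src/list_dst return {path: size} snapshots; copy(paths) transfers
# the given paths (in practice: a distcp run over a file list).

def reconcile(list_src, list_dst, copy, max_rounds=5):
    """Copy src->dst repeatedly until no differences remain.

    Returns the number of rounds it took to converge.
    """
    for round_no in range(1, max_rounds + 1):
        src, dst = list_src(), list_dst()
        # files missing on the destination, or present with a different size
        stale = [p for p, size in src.items() if dst.get(p) != size]
        if not stale:
            return round_no  # converged: destination matches source
        copy(stale)
    raise RuntimeError("did not converge; is the source still changing?")
```

With HBase live the loop may never fully converge (the source keeps changing), which is why the final round would still need a shutdown window.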

        As Gary asks in IRC, what's the inter-DC bandwidth like?

        Daniel Einspanjer added a comment -

        30TB full data set. iperf tested at 272 Mbps

        Will get you the stack trace later today.

        The problem with the custom backup job is that it didn't deliver a clean result set even though we ran it twice back to back with HBase down. We don't understand exactly why this didn't work.

        Daniel Einspanjer added a comment -

        Tonight we are going to be trying a modified invocation of distcp:
        1. dfs -lsr /hbase on SJC cluster
        2. dfs -lsr /hbase on PHX cluster
        3. Python script that diffs those two file lists looking for missing, orphaned, or changed files.
        4. Save diff results into a file list on PHX
        5. Invoke distcp with overwrite flag using that file list.

        Anyone see potential gotchas with that?
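Steps 1 through 4 of the plan above could be sketched like this. This is an illustrative guess at the approach, assuming the plain `hadoop dfs -lsr` output format (perms, replication, owner, group, size, date, time, path); the actual script is the pastebin one linked below.

```python
# Diff two saved `dfs -lsr /hbase` listings (SJC vs PHX) and build the
# file list for distcp: paths missing or size-changed on PHX, plus
# orphans present on PHX but gone from SJC.

def parse_lsr(text):
    """Map each file path to its size, skipping directories."""
    files = {}
    for line in text.splitlines():
        parts = line.split(None, 7)
        if len(parts) != 8 or parts[0].startswith("d"):
            continue  # blank line, header, or directory entry
        perms, repl, owner, group, size, date, time, path = parts
        files[path] = int(size)
    return files

def diff_listings(sjc_text, phx_text):
    """Return (to_copy, orphaned) path lists."""
    sjc, phx = parse_lsr(sjc_text), parse_lsr(phx_text)
    to_copy = sorted(p for p, size in sjc.items() if phx.get(p) != size)
    orphaned = sorted(p for p in phx if p not in sjc)
    return to_copy, orphaned
```

Note that a size-only compare will miss same-size rewrites; comparing modification timestamps as well would be safer.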

        Daniel Einspanjer added a comment -
        Diffing python script: http://xstevens.pastebin.mozilla.org/956095
        stack added a comment -

        @Daniel You can run the above multiple times in case SJC changes during the copy to PHX? Regarding the script, apart from what looks like duplicated code reading the files, it seems fine. Have you tried it?

        Daniel Einspanjer added a comment -

        Anyone ever try to distcp hbase data using a file list?

        We got this error:
        11/01/18 21:21:50 INFO tools.DistCp: destPath=hdfs://hp-node70.phx1.mozilla.com:8020/hbase
        org.apache.hadoop.tools.DistCp$DuplicationException: Invalid input, there are duplicated files in the sources: hftp://cm-hadoop-adm03.mozilla.org:50070/hbase/archive/683263177/.regioninfo, hftp://cm-hadoop-adm03.mozilla.org:50070/hbase/crash_counts/2038233953/.regioninfo
        at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1383)
        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1186)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

        It seems that checkDuplication doesn't evaluate the full path, and hence chokes the first time it encounters two .regioninfo files.
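If checkDuplication really does compare only the final path component, a pre-check over the file list would flag colliding basenames before distcp is even invoked, and the list could be split into collision-free batches. This is a hypothetical workaround sketch, not a fix for DistCp itself:

```python
# Find basename collisions in a distcp source file list (every region
# directory has a .regioninfo, so collisions are guaranteed for HBase
# data), and split the list into batches with no repeated basenames.

import posixpath
from collections import defaultdict

def basename_collisions(paths):
    """Return {basename: [paths]} for basenames occurring more than once."""
    groups = defaultdict(list)
    for p in paths:
        groups[posixpath.basename(p)].append(p)
    return {name: ps for name, ps in groups.items() if len(ps) > 1}

def collision_free_batches(paths):
    """Split paths into batches, each holding every basename at most once."""
    batches = []
    for p in sorted(paths):
        name = posixpath.basename(p)
        for batch in batches:
            if name not in batch:
                batch[name] = p
                break
        else:
            batches.append({name: p})
    return [sorted(b.values()) for b in batches]
```

For an HBase tree the batch count would equal the largest collision group (roughly the region count), so patching the duplicate check itself may be the more practical route.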


          People

          • Assignee:
            Unassigned
            Reporter:
            Daniel Einspanjer
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
