Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Won't Fix
Description
Accumulo ships with a set of scripts that create an accumulo user and install init scripts. It would be useful to have concise installation instructions that use those scripts. I put together the instructions below for one of our installs. They need review, testing, and integration into the standard docs (user manual and README in some form).
1. Install Accumulo 1.5.x using the scripts in the scripts directory under ACCUMULO_HOME, which should be /usr/lib/accumulo. Put the package in /usr/lib/accumulo_1.5.0 (or the directory for whatever version you're using) and make /usr/lib/accumulo a symbolic link to that directory.
2. ZooKeeper must be installed on all machines, but it only needs to be running on the ZooKeeper nodes.
3. Make sure the HDFS directory /user/accumulo exists and is owned by the accumulo user (needed because of the trash collection issue).
4. Make sure dfs.durable.sync (or dfs.support.append on some platforms) is enabled. Restart HDFS after changing this setting.
5. For Accumulo with encryption, take the encryption settings from conf/examples/crypto/accumulo-site.xml and the remaining settings from conf/examples/3GB/native-standalone* for performance. You can also increase the memory settings for the caches and memory maps according to the resources available on the cluster.
6. Set instance.secret to a value produced by a password generator.
7. Make sure accumulo-site.xml is readable only by the accumulo user.
8. Change the accumulo_monitor user to accumulo (substitute accumulo for accumulo_monitor in lines 28, 31, 35 of scripts/monitor-only-init.sh). This is one way of getting past the security restrictions on the accumulo-site.xml file and on the accumulo directory in HDFS. An alternative would be to give the accumulo_monitor user access to those resources.
9. From the scripts directory, run ./master-only-init.sh, ./gc-only-init.sh, and ./monitor-only-init.sh on the master node.
10. chown -R accumulo /usr/lib/accumulo*
11. scp or rsync the configured accumulo directory to every node in the cluster.
12. From the scripts directory, run ./tserver-only-init.sh on each of the tservers.
13. Start up all the processes using "service accumulo-master start" or the appropriate commands on each server.
14. Check with jps -m that all the processes started, and check on the monitor page that the expected number of tservers started.
15. Test! Start with the Accumulo shell, then run continuous ingest (CI) if you're ambitious.
16. Monitor the logs via the monitor page periodically over the next half hour to see if there are any errors or warnings. Some things don't cause errors at the API level for a while, but they show up earlier in the logs.
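A few of the steps above (1, 3, 6, 7, and 8) can be sketched as shell helpers. This is a rough sketch under assumptions, not a tested installer: the function names are hypothetical, the hadoop commands assume the hadoop client is on the PATH and is run by an HDFS superuser, the secret generator is just one possible choice, and the line numbers 28, 31, and 35 are taken from step 8 and may differ in other versions of monitor-only-init.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for steps 1, 3, 6, 7, and 8 of the list above.
# Sourcing this file only defines functions; nothing runs until you call one.

# Step 1: make the link (default /usr/lib/accumulo) point at the versioned
# install directory. Assumes the link path is absent or already a symlink.
link_accumulo_home() {
  local versioned_dir="$1" link_path="${2:-/usr/lib/accumulo}"
  ln -sfn "$versioned_dir" "$link_path"
}

# Step 3: create /user/accumulo in HDFS if it is missing and hand it to the
# accumulo user. Assumes hadoop is on the PATH and you are an HDFS superuser.
ensure_hdfs_user_dir() {
  hadoop fs -test -d /user/accumulo || hadoop fs -mkdir /user/accumulo
  hadoop fs -chown accumulo /user/accumulo
}

# Step 6: print 32 random hex characters (16 bytes from /dev/urandom) that
# can be pasted in as the instance.secret value.
generate_instance_secret() {
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# Step 7: make the site file readable and writable by its owner only.
# The owner defaults to accumulo; changing owner normally requires root.
restrict_site_file() {
  local site_file="$1" owner="${2:-accumulo}"
  chown "$owner" "$site_file" && chmod 600 "$site_file"
}

# Step 8: replace accumulo_monitor with accumulo on lines 28, 31, and 35 only.
# A copy of the original script is kept next to it with a .bak suffix.
use_accumulo_user_for_monitor() {
  local script="$1"
  sed -i.bak \
    -e '28s/accumulo_monitor/accumulo/g' \
    -e '31s/accumulo_monitor/accumulo/g' \
    -e '35s/accumulo_monitor/accumulo/g' \
    "$script"
}
```

For example, after sourcing the file as root on the master node, one might run link_accumulo_home /usr/lib/accumulo_1.5.0, then restrict_site_file /usr/lib/accumulo/conf/accumulo-site.xml, then use_accumulo_user_for_monitor /usr/lib/accumulo/scripts/monitor-only-init.sh (paths assumed from step 1). Check the .bak copy against the edited script before running the init scripts, since the line numbers are version-specific.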
Issue Links
- is superseded by ACCUMULO-2606 Remove RPM/DEB packaging from build (Resolved)