Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Gobblin supports multiple execution modes (CLI, standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities for running CLI and admin commands. The problem is that each CLI and execution mode has its own script to manage the service.
Having individual scripts introduces several issues:
- The scripts handle Gobblin variables and user parameters differently and are highly inconsistent with one another, and different scripts support different features.
- Start, stop, status checking, PID handling, and much else vary widely depending on each script's implementation.
- Features such as GC and JVM parameters, log4j file selection, and classpath calculation exist in some Gobblin scripts but not all, which adds to the inconsistent user experience.
- Code duplication: the Gobblin scripts share a lot of common code for handling parameters, starting and stopping services, status checks, PID handling, and so on. Combining them into one script not only makes maintenance easier but also brings clarity and consistency.
- In short, the current 13 different scripts confuse new users about how to use Gobblin.
Solution:
1. There can be a single gobblin.sh script that handles all Gobblin commands and deployment options, with the following signature. NOTE: This
gobblin.sh <command> <params>
gobblin.sh <execution-mode> <start|stop|status>
command values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath
service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service
With the above change, the following become valid commands:
# all under GobblinCli class
gobblin run listQuickApps -> gobblin cli run listQuickApps <params>
gobblin run <quick-app-name> -> gobblin cli run <quick-app-name> <params>
# class: JobStateToJsonConverter
statestore-checker.sh <args> -> gobblin cli job-state-to-json <params>
# class: StateStoreCleaner
statestore-clean.sh <args> -> the class is deprecated so no need to migrate this over.
# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh <args> -> gobblin cli job-store-schema-manager <params>
# class: Cli
gobblin-admin.sh <args> -> gobblin cli admin <args>
# all gobblin deployment modes
gobblin-cluster-master.sh -> gobblin service cluster-master start|stop|status
gobblin-cluster-worker.sh -> gobblin service cluster-worker start|stop|status
gobblin-compaction.sh -> gobblin-compaction.sh (kept as it is for now, can be migrated to new script framework)
gobblin-mapreduce.sh -> gobblin service mapreduce start|stop|status
gobblin-service.sh -> gobblin service service-manager start|stop|status
gobblin-standalone.sh -> gobblin service standalone start|stop|status
gobblin-yarn.sh -> gobblin service yarn start|stop|status
2. The configurations for each mode also need to be structured and de-duplicated so that it is clear which config is picked up for which execution mode. This would be clearly defined in the command help instructions.
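The signature proposed in item 1 could be dispatched along these lines. This is only a minimal sketch, not the actual gobblin.sh: it assumes the `cli` and `service` forms used in the command list above, the function names `contains_word` and `gobblin_dispatch` are hypothetical, and the `echo` lines stand in for the real launch logic.

```shell
# Sketch only: a single entry point that routes "cli" and "service" calls.
# Mode names follow the "gobblin service <mode>" examples in this issue.
# contains_word and gobblin_dispatch are hypothetical names for illustration.

SERVICE_MODES="standalone cluster-master cluster-worker mapreduce service-manager yarn"

# Succeeds when $1 appears in the space-separated list $2.
contains_word() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Routes "cli <params>" and "service <mode> <start|stop|status>".
gobblin_dispatch() {
  local kind="${1:-}"
  case "$kind" in
    cli)
      shift
      # Placeholder: a real script would launch the CLI class here.
      echo "cli: $*"
      ;;
    service)
      local mode="${2:-}" action="${3:-}"
      if ! contains_word "$mode" "$SERVICE_MODES"; then
        echo "unknown execution mode: $mode" >&2
        return 2
      fi
      case "$action" in
        # Placeholder: a real script would start, stop, or query the service here.
        start|stop|status) echo "service: $mode $action" ;;
        *) echo "unknown action: $action" >&2; return 2 ;;
      esac
      ;;
    *)
      echo "usage: gobblin.sh cli <params> | service <mode> <start|stop|status>" >&2
      return 2
      ;;
  esac
}
```

For example, `gobblin_dispatch service yarn status` prints `service: yarn status`, while an unknown mode or action returns a non-zero code with a message on stderr. Keeping validation in one place is what would let every mode share the same start, stop, and status behavior.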
NOTE: This refactoring adds all CLI and service commands to gobblin.sh and hence changes the syntax of all commands and services.
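Since the syntax changes for every command, a small lookup from old script names to the new commands may help during migration. A minimal sketch, assuming a hypothetical helper `legacy_to_new` (not part of Gobblin) whose entries are taken from the mapping listed above:

```shell
# Sketch only: legacy_to_new is a hypothetical helper, not part of Gobblin.
# Prints the new-style command for a legacy script name and returns 0,
# or returns 1 when this issue lists no replacement for it.
legacy_to_new() {
  case "$1" in
    statestore-checker.sh)     echo "gobblin cli job-state-to-json" ;;
    historystore-manager.sh)   echo "gobblin cli job-store-schema-manager" ;;
    gobblin-admin.sh)          echo "gobblin cli admin" ;;
    gobblin-cluster-master.sh) echo "gobblin service cluster-master" ;;
    gobblin-cluster-worker.sh) echo "gobblin service cluster-worker" ;;
    gobblin-mapreduce.sh)      echo "gobblin service mapreduce" ;;
    gobblin-service.sh)        echo "gobblin service service-manager" ;;
    gobblin-standalone.sh)     echo "gobblin service standalone" ;;
    gobblin-yarn.sh)           echo "gobblin service yarn" ;;
    # No replacement listed, e.g. statestore-clean.sh (deprecated)
    # and gobblin-compaction.sh (kept as it is for now).
    *) return 1 ;;
  esac
}
```

A legacy wrapper could, for instance, call this helper to print a hint pointing at the new command before exiting.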
Issue Links
- contains: GOBBLIN-477 Lib Jars in gobblin-mapreduce.sh are hardcoded (Open)
- contains: GOBBLIN-581 Remove unnecessary imports in gobblin-jira-version script (Open)
- Dependency: GOBBLIN-694 improve gobblin-cluster-[master/worker].sh script (Open)
- Dependent: GOBBLIN-843 Separately startable Admin UI & REST Server (Open)
- is depended upon by: GOBBLIN-812 take worker id from command line if specified (Open)
- Parent Feature: GOBBLIN-854 update config reader in standalone mode (Open)
- links to
1. fix help text and align it with variable names | Open | Unassigned
2. remove pid file only when it exists | Open | Unassigned
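The second item above, removing the PID file only when it exists, might look roughly like this. A minimal sketch under assumptions: the function names `remove_pid_file` and `stop_service` and the PID file path argument are hypothetical and not taken from the Gobblin scripts.

```shell
# Sketch only: hypothetical helpers, not taken from the Gobblin scripts.

# Removes the PID file only when it exists, so a repeated "stop" does not fail.
remove_pid_file() {
  local pid_file="$1"
  if [ -f "$pid_file" ]; then
    rm -f "$pid_file"
    echo "removed $pid_file"
  else
    echo "no pid file at $pid_file"
  fi
}

# Stops the process recorded in the PID file if it is still running,
# then cleans up the PID file.
stop_service() {
  local pid_file="$1"
  if [ -f "$pid_file" ]; then
    local pid
    pid="$(cat "$pid_file")"
    # kill -0 sends no signal; it only checks that the process exists.
    if kill -0 "$pid" 2>/dev/null; then
      kill "$pid"
    fi
  fi
  remove_pid_file "$pid_file"
}
```

Checking for the file before removing it keeps `stop` safe to run repeatedly, whether or not the service was ever started.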