diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md b/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md index c058021..8326b5f 100644 --- a/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md +++ b/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md @@ -169,6 +169,7 @@ REST API compatibility corresponds to both the request (URLs) and responses to e * [NodeManager](../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html) * [MR Application Master](../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html) * [History Server](../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html) +* [Timeline Server v1 REST API](../../hadoop-yarn/hadoop-yarn-site/TimelineServer.html) #### Policy diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md index cb8a5d3..5f5bbda 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md @@ -12,8 +12,8 @@ limitations under the License. See accompanying LICENSE file. 
--> -YARN Timeline Server -==================== +The YARN Timeline Server +======================== * [Overview](#Overview) * [Introduction](#Introduction) @@ -21,69 +21,70 @@ YARN Timeline Server * [Timeline Structure](#Timeline_Structure) * [Deployment](#Deployment) * [Configurations](#Configurations) - * [Running Timeline server](#Running_Timeline_server) + * [Running the Timeline Server](#Running_Timeline_Server) * [Accessing generic-data via command-line](#Accessing_generic-data_via_command-line) * [Publishing of application specific data](#Publishing_of_application_specific_data) +* [REST API](#Timeline_Server_REST_API_v1) -Overview +Overview --------- -### Introduction +### Introduction - Storage and retrieval of application's current as well as historic information in a generic fashion is solved in YARN through the Timeline Server. This serves two responsibilities: + The storage and retrieval of an application's current and historic information in a generic fashion is addressed in YARN through the Timeline Server. It has two responsibilities: -#### Application specific information +#### Persisting Application specific information - Supports collection of information completely specific to an application or framework. For example, Hadoop MapReduce framework can include pieces of information like number of map tasks, reduce tasks, counters etc. + Supports collection of information completely specific to an application or framework. For example, the Hadoop MapReduce framework can include pieces of information such as the number of map tasks, reduce tasks, counters etc. 
Application developers can publish the specific information to the Timeline server via `TimelineClient` in the Application Master and/or the application's containers. This information is then queryable via REST APIs for rendering by application/framework specific UIs. -#### Generic information about completed applications - - Previously this was done by Application History Server but with timeline server its just one use case of Timeline server functionality. Generic information includes application level data like queue-name, user information etc in the ApplicationSubmissionContext, list of application-attempts that ran for an application, information about each application-attempt, list of containers run under each application-attempt, and information about each container. Generic data is published by ResourceManager to the timeline store and used by the web-UI to display information about completed applications. - +#### Persisting Generic information about completed applications -### Current Status + Previously this was supported purely for MapReduce jobs by the Application History Server. With the introduction of the timeline server, the Application History Server becomes just one use of the Timeline Server. Generic information includes application level data such as queue-name, user information and the like set in the `ApplicationSubmissionContext`, a list of application-attempts that ran for an application, information about each application-attempt, a list of containers run under each application-attempt, and information about each container. Generic data is published by the `ResourceManager` to the timeline store and used by its web-UI to display information about completed applications. - The essential functionality of the timeline server have been completed and it can work in both secure and non secure modes. The generic history service is also built on timeline store. 
In subsequent releases we will be rolling out next generation timeline service which is scalable and reliable. Currently, Application specific information is only available via RESTful APIs using JSON type content. The ability to install framework specific UIs in YARN is not supported yet. -### Timeline Structure +### Current Status and Future Plans + +1. The core functionality of the timeline server has been completed. +1. It can work in both secure and non secure modes. +1. The generic history service is built on the timeline store. +1. The history can be stored in memory or in a leveldb database store; the latter ensures the history is preserved over Timeline Server ("ATS") restarts. +1. The ability to install framework specific UIs in YARN is not supported. +1. Application specific information is only available via RESTful APIs using JSON type content. +1. The "ATS v1" REST API has been declared one of the REST APIs whose compatibility will be maintained in future releases. +1. The single-server implementation of ATS places a limit on the scalability of the service; it also prevents the service being a High-Availability component of the YARN infrastructure. +1. Future releases will introduce a next generation timeline service which is scalable and reliable. +1. The expanded features of this service *may not* be available to applications using the ATS v1 REST API. That includes extended data structures as well as the ability of the client to fail over between ATS instances. + +### Timeline Structure ![Timeline Structure] (./images/timeline_structure.jpg) -#### TimelineDomain +#### Timeline Domain + + The Timeline Domain offers a namespace for the Timeline server, allowing users to host multiple entities while isolating them from other users and applications. Timeline server security is defined at this level. A Domain primarily stores owner info, read & write ACL information, and created and modified time stamp information. Each Domain is uniquely identified by an ID. 
- Domain is like namespace for Timeline server and users can host multiple entities, isolating them from others. Timeline server Security is defined at this level. Domain majorly stores owner info, read & write ACL information, created and modified time stamp information. Domain is uniquely identified by ID. +#### Timeline Entity -#### TimelineEntity + A Timeline Entity contains the meta information of a conceptual entity and its related events. The entity can be an application, an application attempt, a container or any user-defined object. It contains **Primary filters** which will be used to index the entities in the TimelineStore. Accordingly, users/applications should carefully choose the information they want to store as the primary filters. The remaining data can be stored as unindexed information. Each Entity is uniquely identified by an `EntityId` and `EntityType`. - Entity contains the the meta information of some conceptual entity and its related events. The entity can be an application, an application attempt, a container or whatever the user-defined object. It contains Primary filters which will be used to index the entities in TimelineStore, such that users should carefully choose the information they want to store as the primary filters. The remaining data can be stored as other information. Entity is uniquely identified by EntityId and EntityType. +#### Timeline Events -#### TimelineEvent + A Timeline Event describes an event that is related to a specific Timeline Entity of an application. Users are free to define what the event means, such as starting an application, getting allocated a container, operation failures or other information considered relevant to users (including cluster operators). - TimelineEvent contains the information of an event that is related to some conceptual entity of an application. Users are free to define what the event means, such as starting an application, getting allocated a container and etc. 
-Deployment +Deployment ---------- -###Configurations +### Configurations #### Basic Configuration | Configuration Property | Description | |:---- |:---- | -| `yarn.timeline-service.enabled` | Indicate to clients whether Timeline service is enabled or not. If enabled, the TimelineClient library used by end-users will post entities and events to the Timeline server. Defaults to false. | -| `yarn.resourcemanager.system-metrics-publisher.enabled` | The setting that controls whether yarn system metrics is published on the timeline server or not by RM. Defaults to false. | +| `yarn.timeline-service.enabled` | Indicate to clients whether Timeline service is enabled or not. If enabled, the `TimelineClient` library used by applications will post entities and events to the Timeline server. Defaults to false. | +| `yarn.resourcemanager.system-metrics-publisher.enabled` | The setting that controls whether or not YARN system metrics are published on the timeline server by RM. Defaults to false. | | `yarn.timeline-service.generic-application-history.enabled` | Indicate to clients whether to query generic application data from timeline history-service or not. If not enabled then application data is queried only from Resource Manager. Defaults to false. | -#### Advanced configuration - -| Configuration Property | Description | -|:---- |:---- | -| `yarn.timeline-service.ttl-enable` | Enable age off of timeline store data. Defaults to true. | -| `yarn.timeline-service.ttl-ms` | Time to live for timeline store data in milliseconds. Defaults to 604800000 (7 days). | -| `yarn.timeline-service.handler-thread-count` | Handler thread count to serve the client RPC requests. Defaults to 10. | -| `yarn.timeline-service.client.max-retries` | Default maximum number of retires for timeline servive client. Defaults to 30. | -| `yarn.timeline-service.client.retry-interval-ms` | Default retry time interval for timeline servive client. Defaults to 1000. 
| - #### Timeline store and state store configuration | Configuration Property | Description | @@ -109,10 +110,24 @@ Deployment | `yarn.timeline-service.bind-host` | The actual address the server will bind to. If this optional address is set, the RPC and webapp servers will bind to this address and the port specified in yarn.timeline-service.address and yarn.timeline-service.webapp.address, respectively. This is most useful for making the service listen to all interfaces by setting to 0.0.0.0. | | `yarn.timeline-service.http-cross-origin.enabled` | Enables cross-origin support (CORS) for web services where cross-origin web response headers are needed. For example, javascript making a web services request to the timeline server. Defaults to false. | | `yarn.timeline-service.http-cross-origin.allowed-origins` | Comma separated list of origins that are allowed for web services needing cross-origin (CORS) support. Wildcards `(*)` and patterns allowed. Defaults to `*`. | -| yarn.timeline-service.http-cross-origin.allowed-methods | Comma separated list of methods that are allowed for web services needing cross-origin (CORS) support. Defaults to GET,POST,HEAD. | +| `yarn.timeline-service.http-cross-origin.allowed-methods` | Comma separated list of methods that are allowed for web services needing cross-origin (CORS) support. Defaults to GET,POST,HEAD. | | `yarn.timeline-service.http-cross-origin.allowed-headers` | Comma separated list of headers that are allowed for web services needing cross-origin (CORS) support. Defaults to X-Requested-With,Content-Type,Accept,Origin. | | `yarn.timeline-service.http-cross-origin.max-age` | The number of seconds a pre-flighted request can be cached for web services needing cross-origin (CORS) support. Defaults to 1800. 
| +Note that the selection between the HTTP and HTTPS binding is made in the `TimelineClient` based +upon the value of the YARN-wide configuration option `yarn.http.policy`; the HTTPS endpoint will be +selected if this policy is either of `HTTPS_ONLY` or `HTTP_AND_HTTPS`. +#### Advanced Server-side configuration + +| Configuration Property | Description | +|:---- |:---- | +| `yarn.timeline-service.ttl-enable` | Enable deletion of aged data within the timeline store. Defaults to true. | +| `yarn.timeline-service.ttl-ms` | Time to live for timeline store data in milliseconds. Defaults to 604800000 (7 days). | +| `yarn.timeline-service.handler-thread-count` | Handler thread count to serve the client RPC requests. Defaults to 10. | +| `yarn.timeline-service.client.max-retries` | The maximum number of retries for attempts to publish data to the timeline service. Defaults to 30. | +| `yarn.timeline-service.client.retry-interval-ms` | The interval in milliseconds between retries for the timeline service client. Defaults to 1000. | + + #### Security Configuration Security can be enabled by setting yarn.timeline-service.http-authentication.type to kerberos and further following configurations can be done. @@ -122,10 +137,12 @@ Deployment | `yarn.timeline-service.http-authentication.type` | Defines authentication used for the timeline server HTTP endpoint. Supported values are: simple / kerberos / #AUTHENTICATION_HANDLER_CLASSNAME#. Defaults to simple. | | `yarn.timeline-service.http-authentication.simple.anonymous.allowed` | Indicates if anonymous requests are allowed by the timeline server when using 'simple' authentication. Defaults to true. | | `yarn.timeline-service.principal` | The Kerberos principal for the timeline server. | -| yarn.timeline-service.keytab | The Kerberos keytab for the timeline server. Defaults to /etc/krb5.keytab. | +| `yarn.timeline-service.keytab` | The Kerberos keytab for the timeline server. Defaults on Unix to `/etc/krb5.keytab`. 
| | `yarn.timeline-service.delegation.key.update-interval` | Defaults to 86400000 (1 day). | | `yarn.timeline-service.delegation.token.renew-interval` | Defaults to 86400000 (1 day). | | `yarn.timeline-service.delegation.token.max-lifetime` | Defaults to 604800000 (7 day). | +| `yarn.timeline-service.best-effort` | Should the failure to obtain a delegation token be considered an application failure (option = false), + or should the client attempt to continue to publish information without it (option=true). Default: false #### Enabling the timeline service and the generic history service @@ -156,21 +173,21 @@ Deployment ``` -### Running Timeline server +### Running the Timeline Server Assuming all the aforementioned configurations are set properly, admins can start the Timeline server/history service with the following command: ``` - $ yarn timelineserver + yarn timelineserver ``` - Or users can start the Timeline server / history service as a daemon: + To start the Timeline server / history service as a daemon, the command is ``` - $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start timelineserver + $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start timelineserver ``` -### Accessing generic-data via command-line +### Accessing generic-data via command-line Users can access applications' generic historic data via the command line as below. Note that the same commands are usable to obtain the corresponding information about running applications. @@ -182,10 +199,9 @@ Deployment $ yarn container -status ``` -Publishing of application specific data ------------------------------------------------- +### Publishing application specific data - Developers can define what information they want to record for their applications by composing `TimelineEntity` and `TimelineEvent` objects, and put the entities and events to the Timeline server via `TimelineClient`. 
Below is an example: +Developers can define what information they want to record for their applications by composing `TimelineEntity` and `TimelineEvent` objects, then publishing the entities and events to the Timeline server via `TimelineClient`. Here is an example: ``` // Create and start the Timeline client @@ -216,10 +232,14 @@ Publishing of application specific data // Compose other Event info .... myEntity.addEvent(event); - timelineClient.putEntities(entity); + TimelinePutResponse response = timelineClient.putEntities(entity); } catch (IOException e) { // Handle the exception + } catch (RuntimeException e) { + // In Hadoop 2.6, if attempts submit information to the ATS fail more than the retry limit, + // a RuntimeException will be raised. This is likely to change in future releases, being + // replaced with a IOException that is (or wraps) that which triggered retry failures. } catch (YarnException e) { // Handle the exception } @@ -228,10 +248,181 @@ Publishing of application specific data client.stop(); ``` - **Note** : Following are the points which needs to be observed during updating a entity. +1. Publishing of data to ATS is a synchronous operation; the call will not return until successful. +1. The result of a `putEntities()` call is a `TimelinePutResponse` object. This contains a +(hopefully empty) list of those timeline entities reject by the timeline server, along with +an error code indicating the cause of each failure. + +In Hadoop 2.6 and 2.7, the error codes are: + +| Configuration Property | Description | +|:---- |:---- | +|1 | No start time | +|2 | IOException | +|3 | System Filter conflict (reserved filter key used) | +|4 | Access Denied | +|5 | No domain | +|6 | Forbidden relation | + +Further error codes may be defined in future. + +**Note** : Following are the points which need to be observed when updating a entity. + +* Domain ID should not be modified for already existing entity. 
+ + +* After a modification of a Primary filter value, the new value will be appended to the old value; the original value will not be replaced. +* It's advisable to use the same primary filters for all updates to an entity. If a subsequent update modifies a primary filter, queries using the updated primary filter will not return the information stored before that update. + +## Timeline Server REST API V1 + +Querying the timeline server is currently only supported via REST API calls; there is no API +client implemented in the YARN libraries. In Java, the Jersey client is effective at querying +the server, even in secure mode (provided the caller has the appropriate Kerberos tokens). + +The v1 REST API is implemented under the path `/ws/v1/timeline/` on the ATS web service. + +Here is a non-normative description of the API. + +### Root path + + GET /ws/v1/timeline/ + +Returns a JSON object describing the server instance. + + {"About":"Timeline API"} + + +### Domain summary information `/ws/v1/timeline/domain` + + + GET /ws/v1/timeline/domain?owner=$OWNER + +Returns a list of domains belonging to a specific user, in +the JSON-marshalled `TimelineDomains` data structure. + +The `owner` MUST be set on a GET which is not authenticated. + +On an authenticated request, the `owner` defaults to +the caller. + + PUT /ws/v1/timeline/domain?owner + +A PUT of a serialized `TimelineDomain` structure to this path will +add the domain to the list of domains owned by the specified/current +user. A successful operation returns a status code of 200 and +a `TimelinePutResponse` containing no errors. + + +### Specific information about a Domain `/ws/v1/timeline/domain/{domainId}` + +Returns a JSON-marshalled `TimelineDomain` structure +describing a domain. + +If the domain is not found, then an HTTP 404 response is returned. 
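The domain paths described above can be assembled with nothing more than the JDK. The following is a minimal sketch, not part of the YARN libraries: the class `TimelineUrls`, its method names, and the host name in `main` are hypothetical, and port 8188 is only assumed to be the timeline web application port of the cluster being queried.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not a Hadoop class): builds URLs for the v1 domain paths.
public class TimelineUrls {
    // Base path of the v1 REST API, as given in the text above.
    static final String BASE_PATH = "/ws/v1/timeline/";

    // Form-encode a value for use in a query string.
    public static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // URL listing the domains of an owner; a null owner leaves the
    // parameter off, which is only meaningful on an authenticated request.
    public static String domainsOf(String server, String owner) {
        String url = server + BASE_PATH + "domain";
        return owner == null ? url : url + "?owner=" + encode(owner);
    }

    // URL of a single domain. URLEncoder emits '+' for spaces, which is
    // wrong inside a path segment, so those are rewritten to %20.
    public static String domain(String server, String domainId) {
        return server + BASE_PATH + "domain/" + encode(domainId).replace("+", "%20");
    }

    public static void main(String[] args) {
        // Assumed server address, for illustration only.
        String server = "http://timeline.example.org:8188";
        System.out.println(domainsOf(server, "alice"));
        System.out.println(domain(server, "my_domain"));
    }
}
```

The resulting strings can be handed to any HTTP client, such as the Jersey client mentioned earlier; a 404 response to the single-domain URL means the domain was not found.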
+ +### List entities `/ws/v1/timeline/{entityType}` + +List all entities of a specific type, matching query parameters + +| Query Parameter | Description | +|:---- |:---- | +| primaryFilter| String key:value filter pair | +| secondaryFilter | comma separated list of key:value secondary filters| +| windowStart| start time for list: Long | +| windowEnd | end time for list: Long| +| fromId | ID to start listing from: String | +| fromTs | from timestamp: Long| +| limit| long| +| fields| comma separated list of fields| + +### Retrieve entity by ID `/ws/v1/timeline/{entityType}/{entityId}` + +Retrieves an entity. + +| Query Parameter | Description | +|:---- |:---- | +| fields| comma separated list of fields| + +### List events of an entity type `/ws/v1/timeline/{entityType}/events` + +| Query Parameter | Description | +|:---- |:---- | +| entityId | entity ID| +| secondaryFilter | comma separated list of key:value secondary filters| +| windowStart| start time for list: Long | +| windowEnd | end time for list: Long| +| limit| long| + +### POST new entities `/ws/v1/timeline/` + +POST one or more timeline entities to the service + +submission: `TimelineEntities` + +response: `TimelinePutResponse` + +### Domains `/ws/v1/timeline/domain` + +#### POST new domain `/ws/v1/timeline/domain` + +Creates a new timeline domain, or overrides an existing one. + +When attempting to create a new domain, the ID in the submission MUST +be unique across all domains in the cluster. + +When attempting to update an existing domain, the ID of that domain +must be set. The submitter must have the appropriate permissions +to update the domain. + +submission: `TimelineDomain` + +response: `TimelinePutResponse` + +#### List domains of a user: GET `/ws/v1/timeline/domain` + +Retrieves a list of all domains of a user. If +an owner is specified, that owner name overrides that of the caller. 
+ +| Query Parameter | Description | +|:---- |:---- | +| owner| owner of the domains to list| + +response: `TimelineDomains` + +If the user lacks the permission to list the domains of the specified +owner, a `TimelineDomains` response with no domain listings is +returned. + + +#### Retrieve details of a specific domain: GET `/ws/v1/timeline/domain/{domainId}` + +Retrieves the details of a single domain + +response: `TimelineDomain` + +If the user lacks the permission to query the domain details, +a 404, not found exception is returned; this is the same response which +is returned if there is no entry with that ID. + + +### Response Codes + +1. Queries where a domain, entity type, entity ID or similar cannot be resolved result in +HTTP 404, "Not Found" responses. +1. Requests in which the path, parameters or values are invalid result in +Bad Request, 400, responses. +1. In a secure cluster, a 401, "Forbidden", response is generated +when attempting to perform operations to which the caller does +not have the sufficient rights. There is an exception to this when +querying some entities, such as Domains; here the API deliberately +downgrades permission-denied outcomes to empty or not-found responses. +This hides details of other domains from an unauthorized caller. + +1. If the content of timeline entity PUT operations is invalid, + this failure *will not* result in an HTTP error code being returned. + A status code of 200 will be returned; however, there will be an error code + in the list of failed entities for each entity which could not be added. - * Domain ID should not be modified for already existing entity. - * Its advisable to have same primary filters for all updates on entity. As on modification of primary filter by subsequent updates will result in not fetching the information before the update when queried with updated primary filter. - * On modification of Primary filter value, new value will be appended with the old value. 
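Because an invalid entity still yields a 200 status, callers have to inspect the per-entity error codes themselves. The sketch below shows one way to turn the codes listed for Hadoop 2.6 and 2.7 into readable messages. It is deliberately independent of the Hadoop classes: `TimelinePutErrors` and `describe` are hypothetical names, and the table simply copies the descriptions given earlier in this document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): maps put-error codes to text.
public class TimelinePutErrors {
    private static final Map<Integer, String> DESCRIPTIONS = new LinkedHashMap<>();

    static {
        // Codes as documented for Hadoop 2.6 and 2.7.
        DESCRIPTIONS.put(1, "No start time");
        DESCRIPTIONS.put(2, "IOException");
        DESCRIPTIONS.put(3, "System Filter conflict (reserved filter key used)");
        DESCRIPTIONS.put(4, "Access Denied");
        DESCRIPTIONS.put(5, "No domain");
        DESCRIPTIONS.put(6, "Forbidden relation");
    }

    // Returns the documented description, or a fallback for codes that
    // later releases may add.
    public static String describe(int code) {
        String description = DESCRIPTIONS.get(code);
        return description != null ? description : "Unknown error code " + code;
    }

    public static void main(String[] args) {
        System.out.println(describe(4));
        System.out.println(describe(99));
    }
}
```

The fallback branch matters because further error codes may be defined in future; treating an unrecognized code as a failure is safer than ignoring it.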
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/index.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/index.md index 9637ea0..35925ef 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/index.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/index.md @@ -72,4 +72,6 @@ YARN REST APIs * [Node Manager](./NodeManagerRest.html) +* [Timeline Server](./TimelineServer.html#Timeline_Server_REST_API_v1) +