  1. Kafka
  2. KAFKA-14353

Kafka Connect REST API configuration validation timeout improvements


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: connect

    Description

      Kafka Connect currently defines a default REST API request timeout of 90 seconds. If a REST API request takes longer than this timeout, a 500 Internal Server Error response is returned with the message "Request timed out".

      The POST /connectors and PUT /connectors/{connector}/config endpoints, which create or update connectors, validate the connector configuration internally (the details vary by connector plugin) before writing a message to the Connect cluster's config topic. If the validation takes longer than 90 seconds, the user receives a 500 Internal Server Error response, yet the connector is still created once the validation completes. This makes for a fairly confusing user experience.
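      The behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Kafka Connect's actual code: the class RequestTimeoutSketch, its Outcome type, and the createConnector method are hypothetical names, the 201 status stands in for a successful create, and millisecond delays stand in for the 90-second timeout. It shows that abandoning the wait on a timeout does not cancel the work being waited on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical simulation; none of these names come from Kafka Connect.
final class RequestTimeoutSketch {

    /** What the REST client sees, plus a handle to the work still in flight. */
    static final class Outcome {
        final int httpStatus;
        final CompletableFuture<Boolean> background;

        Outcome(int httpStatus, CompletableFuture<Boolean> background) {
            this.httpStatus = httpStatus;
            this.background = background;
        }
    }

    /** Waits at most requestTimeoutMillis for a validation that takes validationMillis. */
    static Outcome createConnector(long validationMillis, long requestTimeoutMillis) {
        CompletableFuture<Boolean> background = CompletableFuture.supplyAsync(() -> {
            sleep(validationMillis); // stand-in for a slow config validation
            return true;             // stand-in for "config written, connector created"
        });
        try {
            background.get(requestTimeoutMillis, TimeUnit.MILLISECONDS);
            return new Outcome(201, background);
        } catch (TimeoutException e) {
            // Only the wait is abandoned here; the background task keeps running.
            return new Outcome(500, background);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

      Calling createConnector(300, 50) in this sketch yields a 500 status, while the returned background future still completes with true afterwards, which mirrors the "error returned, connector created anyway" experience.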

      This situation is made worse because config validation can run twice for a single request. When Kafka Connect runs in distributed mode, a request to create or update a connector that arrives at a worker other than the current group leader is forwarded to the leader. The configuration is then validated on the initial worker and, if that validation succeeds, again on the leader. As a result, the original create or update request times out if each validation takes longer than 45 seconds.
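      The 45-second figure follows from splitting the 90-second budget across two validation runs. A minimal sketch of that arithmetic, assuming both runs take the same time and ignoring forwarding and other overhead (the class and method names are hypothetical):

```java
// Hypothetical helper; it only restates the arithmetic from the description.
final class ValidationBudget {
    // Default REST request timeout stated in the description.
    static final long REQUEST_TIMEOUT_SECONDS = 90;

    /** Validation runs once on the receiving worker and once more on the leader if forwarded. */
    static long totalValidationSeconds(long perValidationSeconds, boolean forwardedToLeader) {
        int runs = forwardedToLeader ? 2 : 1;
        return runs * perValidationSeconds;
    }

    /** True when validation time alone exceeds the request timeout. */
    static boolean timesOut(long perValidationSeconds, boolean forwardedToLeader) {
        return totalValidationSeconds(perValidationSeconds, forwardedToLeader)
                > REQUEST_TIMEOUT_SECONDS;
    }
}
```

      Under these assumptions, a 46-second validation times out only when the request is forwarded to the leader, whereas a request handled directly by the leader has the full 90 seconds.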

      Slow config validations can occur in certain exceptional scenarios. Consider a database connector whose validation logic queries the information schema for the list of tables and views in order to validate the user's connector configuration. If the database has a very large number of tables and views and is under heavy query load, these information schema queries can be slow to complete.

            People

              Assignee: Yash Mayya
              Reporter: Yash Mayya