Apache Jena / JENA-218

Fuseki should allow timeouts to be specified on a per-request basis

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Fuseki 0.2.1
    • Fix Version/s: None
    • Component/s: Fuseki

      Description

      A query endpoint might want to have different timeouts depending on whether queries are from untrusted or trusted users, or maintenance processes. The timeout could be passed with an X- header, a Timeout header as per http://tools.ietf.org/html/draft-loreto-http-timeout-00, or a query parameter, respecting the system default if none is provided. The query parameter might be less favourable as it'd be harder to filter out for Fuseki instances behind Apache.

      There is a risk that allowing timeouts to be overridden will expose query endpoints that are open to the world to denial-of-service attacks. This can be mitigated by disallowing timeout overrides by default.

      I'm happy to put a patch together and document it at http://incubator.apache.org/jena/documentation/serving_data/.

      1. config-tdb.ttl
        3 kB
        Alexander Dutton
      2. jena-218-1.diff
        11 kB
        Alexander Dutton
      3. jena-218-default-timeout.diff
        9 kB
        Alexander Dutton

        Activity

        ASF subversion and git services added a comment -

        Commit 1507774 from Andy Seaborne in branch 'jena/branches/jena'
        [ https://svn.apache.org/r1507774 ]

        JENA-218

        Changes from jena-218-default-timeout.diff (18/Jul/13).
        default timeout and maximum timeout (as single numeric values).

        Andy Seaborne added a comment - - edited

        This is looking good. I like the proposed configuration file details.

        I'm not worried about not setting the two timeouts separately; in fact, I'm not sure it's a good idea at all because it is complicated to have something set in two places - easy to make mistakes.

        ARQ.queryTimeout

        We can keep compatibility by setting via this first, then proceeding with the defined fuseki: settings. We can't stop ARQ.queryTimeout because there are other ways to set it.

        We can have an ordered hierarchy of settings:

        Any ARQ.queryTimeout – Global (command-line settings) – Server – Service

        which, if set in that order, means the more specific one wins.
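
The "more specific one wins" rule can be illustrated with a short sketch. This is not Fuseki code: the class and method names are hypothetical, and it assumes each layer either supplies a timeout or leaves it unset.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper, not part of Fuseki or ARQ. */
final class TimeoutLayers {
    private TimeoutLayers() {}

    /**
     * Applies the layers in order from least to most specific
     * (e.g. ARQ.queryTimeout, command line, server, service).
     * A later layer that is set replaces any earlier value.
     */
    static OptionalLong resolve(List<OptionalLong> leastToMostSpecific) {
        OptionalLong result = OptionalLong.empty();
        for (OptionalLong layer : leastToMostSpecific) {
            if (layer.isPresent()) {
                result = layer;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        OptionalLong effective = resolve(Arrays.asList(
                OptionalLong.of(30),     // ARQ.queryTimeout (assumed value)
                OptionalLong.empty(),    // command line: not set
                OptionalLong.of(20),     // server
                OptionalLong.of(5)));    // service
        System.out.println(effective);
    }
}
```

Because the layers are applied in order, a service that sets nothing simply inherits whatever the server, command line or ARQ context provided.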

        Command line

        One issue is the command line --timeout X or --timeout X,Y, which sets ARQ.queryTimeout (in ms).

        We could simply deprecate it and replace it with --defaultTimeout and --maximumTimeoutOverride, but the command line is more about easy setup, usually without a config file. I see --timeout X as being the important case: put in some sort of timeout to guard against errant queries.

        Seconds are a better choice. Adding units (e.g. "20ms") seems excessive for Fuseki and can be done with fractional settings. The ARQ API uses milliseconds by default because that is common in APIs; however, for network use, very fine-grained timeouts don't make a lot of sense.

        We could take over --timeout as the initial setting (and the config file overrides) of both fuseki:defaultTimeout and fuseki:maximumTimeoutOverride, and migrate to seconds by assuming (with warning) X000 is ms. So if you want to do simple things, the command line is enough but complicated mixes of defaultTimeout and maximumTimeoutOverride need to be done with a configuration file.
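
One possible reading of the "X000 is ms" migration rule is sketched below. The interpretation is an assumption: a value that is a whole multiple of 1000 is treated as legacy milliseconds and converted with a warning, and anything else is taken as seconds. The class and method names are hypothetical.

```java
/** Hypothetical helper, not part of Fuseki. */
final class TimeoutArg {
    private TimeoutArg() {}

    /**
     * Converts a --timeout argument to seconds.
     * Assumption: values such as 10000 (a whole multiple of 1000) are
     * legacy milliseconds; fractional or small values are already seconds.
     */
    static double toSeconds(String arg) {
        double value = Double.parseDouble(arg.trim());
        if (value >= 1000 && value % 1000 == 0) {
            System.err.println("Warning: --timeout " + arg
                    + " looks like milliseconds; treating it as " + (value / 1000) + "s");
            return value / 1000;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toSeconds("10000")); // legacy milliseconds
        System.out.println(toSeconds("20"));    // seconds
        System.out.println(toSeconds("0.5"));   // fractional seconds
    }
}
```

The heuristic is deliberately lossy: a genuine timeout of 1000 seconds would be misread, which is why the warning matters.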

        ASF subversion and git services added a comment -

        Commit 1507676 from Andy Seaborne in branch 'jena/branches/jena'
        [ https://svn.apache.org/r1507676 ]

        Working area for JENA-218

        ASF subversion and git services added a comment -

        Commit 1507652 from Andy Seaborne in branch 'jena/branches/jena'
        [ https://svn.apache.org/r1507652 ]

        Working area for JENA-218

        Alexander Dutton added a comment -

        The current documentation says that one can attach a timeout to the fu:Server using the ARQ.queryTimeout. This (setting timeouts once for all services) isn't supported by the patch above. The patch also doesn't support setting the two timeouts separately.

        I'm going to propose dropping the suggestion of using ARQ.queryTimeout and being able to do the following:

        [] a fuseki:Server ;
          fuseki:defaultTimeout (10 20) ; # Set the default timeouts to 10s and 20s
          fuseki:allowTimeoutOverride true ;
          fuseki:maximumTimeoutOverride (20 40) ;
          fuseki:services (
            <#service1>
            <#service2>
          ) .
        
        <#service1> a fuseki:Service ;
          … ;
          fuseki:defaultTimeout (5 10) ; # Override the default timeouts
          fuseki:allowTimeoutOverride false .
        
        <#service2> a fuseki:Service ;
          … ;
          fuseki:maximumTimeoutOverride 60 . # Override the maximum timeouts to both be 60s.
        
        Alexander Dutton added a comment -

        This patch changes a few things with respect to how Fuseki handles timeouts, and configuring them:

        • an invalid Timeout header or parameter now results in a 400 (Bad Request), not a 500.
        • an empty timeout parameter is ignored (so HTML forms can include an empty timeout field)
        • sparql.{html,tpl} now include a timeout field

        • better logging (shows how each service is configured, and request logs say when a timeout has been applied)
        • (another) way to set a default timeout (using fu:defaultTimeout)
        • HttpAction now has a timeout property, set by setAnyTimeouts, and used to construct log messages and HTTP error responses.
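
The first two behaviours in the list above (an empty value is ignored, an invalid one is a client error) might look roughly like the following. This is a sketch, not the patch itself: the class, method and exception names are hypothetical, and it assumes whole seconds.

```java
import java.util.OptionalLong;

/** Hypothetical helper, not taken from the patch. */
final class TimeoutParam {
    private TimeoutParam() {}

    /** Signals that the caller should respond with HTTP 400 (Bad Request). */
    static final class BadRequestException extends RuntimeException {
        final int status = 400;

        BadRequestException(String message) {
            super(message);
        }
    }

    /**
     * Parses a timeout given in seconds.
     * A missing or empty value is ignored, so an HTML form may submit a blank field.
     * Anything that is not a non-negative integer is rejected as a client error.
     */
    static OptionalLong parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return OptionalLong.empty();
        }
        long seconds;
        try {
            seconds = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new BadRequestException("Invalid timeout: " + raw);
        }
        if (seconds < 0) {
            throw new BadRequestException("Timeout must not be negative: " + raw);
        }
        return OptionalLong.of(seconds);
    }
}
```

Keeping the "bad input" case as a distinct exception type lets the servlet layer map it to a 400 rather than letting it surface as a 500.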

        Documentation to follow…

        Alexander Dutton added a comment -

        Ah, I hadn't realised it made it in!

        I'll set aside some time over the long weekend to reacquaint myself with it and get something written.

        Andy Seaborne added a comment -

        (trying to clear out JIRA)

        This is marked "needsdocumentation", which seems to be why it's still open. The patch is applied and in the releases.

        Status?

        Alexander Dutton added a comment -

        1/ WRT HTTP headers, one could extend the Timeout header to take both timeouts, but that's liable to encourage people to send two timeouts to other SPARQL services. We could take note of a Fuseki-specific header (e.g. X-Fuseki-Timeout), which will allow both to be set.

        2/ Quite possibly, yes. It'd be a bit more forgiving.

        3/ I standardised on seconds between the config, parameter and header as the Timeout header is specified in seconds. Would it be confusing to have the config (internal) be milliseconds, and the HTTP-based stuff (external) be seconds? Changing arq:queryTimeout to be in seconds would be rather backwards-incompatible at this stage, wouldn't it? (Unless jena-dev says that no one sets the timeout in that manner…)

        Andy Seaborne added a comment -

        I've applied the patch but have some comments.

        1/ Specifying timeouts: there are two timeouts, time to first answer and time to last answer. Setting the overall timeout high but the time to first answer short makes sense, as a timeout after the first answer but before the last is likely to result in bad syntax in the results.

        For compatibility with HTTP, the HTTP timeout is going to have to be one number. This should be time to first answer (and last) because it should correspond to HTTP 408. We now have the typical problem of HTTP response codes - they go out before the results start.

        But for ?timeout= and for the config file, I suggest allowing the "X,Y" form as well as a plain number, i.e. allowTimeoutOverride is a string (as well as a number).

        c.f. QueryExecutionBase.setAnyTimeouts.

        2/ Should the presence of fuseki:maximumTimeoutOverride 10 imply fuseki:allowTimeoutOverride=true?

        3/ Units. Elsewhere, including when setting timeouts via the context, the unit is milliseconds. Granted, it might be better to use seconds uniformly but, given where we are, maybe milliseconds everywhere is better than a mixture.

        ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "10000" ] ;

        I'm open to changing arq:queryTimeout to be seconds.
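
The "X,Y" form from point 1/ could be parsed along these lines. This is an illustrative sketch rather than the logic of QueryExecutionBase.setAnyTimeouts: the class name is hypothetical, and treating a single number as applying to both timeouts is an assumption based on the "time to first answer (and last)" remark.

```java
/** Hypothetical value class, not part of ARQ or Fuseki. */
final class TimeoutPair {
    final long firstAnswer; // time allowed until the first result
    final long lastAnswer;  // time allowed until the last result

    TimeoutPair(long firstAnswer, long lastAnswer) {
        this.firstAnswer = firstAnswer;
        this.lastAnswer = lastAnswer;
    }

    /**
     * Accepts "X" or "X,Y".
     * Assumption: a single value is used for both timeouts.
     * Throws IllegalArgumentException (including NumberFormatException)
     * for any other shape.
     */
    static TimeoutPair parse(String text) {
        // The -1 limit keeps trailing empty fields, so "10," is rejected.
        String[] parts = text.trim().split(",", -1);
        if (parts.length == 1) {
            long t = Long.parseLong(parts[0].trim());
            return new TimeoutPair(t, t);
        }
        if (parts.length == 2) {
            return new TimeoutPair(Long.parseLong(parts[0].trim()),
                                   Long.parseLong(parts[1].trim()));
        }
        throw new IllegalArgumentException("Expected \"X\" or \"X,Y\" but got: " + text);
    }
}
```

Whatever unit is eventually chosen (seconds or milliseconds), the parsing is the same; only the interpretation of the two numbers changes.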

        Paolo Castagna added a comment -

        Alexander, thanks for the patch (I've not looked at it yet). It will not make it into the first Fuseki release, since the vote process for that has already started.

        Alexander Dutton added a comment -

        Replacing previous diff with one that doesn't include local changes to .project and .classpath files.

        Alexander Dutton added a comment -

        First go at a patch, and a config file that enables the new functionality. I've got SPARQL_ServletBase passing a DatasetRef in place of a DatasetGraph to perform (though the default implementation calls perform with the previous signature for compatibility).
        The reporting of the length of the timeout when returning the 408 is currently wrong, as it doesn't take into account any user-provided timeout. I'm not quite sure of the best way to get the actual timeout into the right place without changing rather a lot (one would be to include the QueryExecution or timeout value as a member of the QueryExecutionCancelled exception).
        Still got to work on the documentation. Do we need tests?

        Alexander Dutton added a comment -

        Yes, supporting both headers and query parameters sounds sensible. If both are specified, shall we take the minimum? Taking the minimum would allow e.g. httpd to add a Timeout header for unauthenticated users, which restricts timeouts but does not take away the user's freedom to specify a lower timeout.

        So, the config could be something like:

        <#service3> rdf:type fuseki:Service ;
        fuseki:name "tdb" ; # http://host:port/tdb
        fuseki:serviceQuery "sparql" ; # SPARQL query service
        fuseki:allowTimeoutOverride true ;
        fuseki:maximumTimeoutOverride 4 ;
        fuseki:dataset <#dataset> .

        If allowTimeoutOverride isn't specified, we default to leaving it disabled (hence, backwards compatibility with current behaviour), and if maximumTimeoutOverride is missing, we default to allowing unlimited timeouts (the documentation can suggest that you probably want to specify both together). Do we want this specified in seconds or milliseconds?

        I would argue that it should silently ignore attempts to set a timeout when none is allowed, lest a client is optimistically asking timeouts of everything it queries and gets confused by a 400 or 501.
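
Taken together, the rules proposed in this comment can be sketched as follows. This is not the patch: the class is hypothetical, all values are assumed to be in seconds, and clamping an over-large request to the maximum (rather than rejecting it) is an assumption.

```java
import java.util.OptionalLong;

/** Hypothetical helper, not part of Fuseki. All timeouts are in seconds. */
final class TimeoutPolicy {
    private final boolean allowOverride;        // mirrors fuseki:allowTimeoutOverride
    private final OptionalLong maximumOverride; // mirrors fuseki:maximumTimeoutOverride
    private final OptionalLong defaultTimeout;  // system default; empty means none

    TimeoutPolicy(boolean allowOverride, OptionalLong maximumOverride, OptionalLong defaultTimeout) {
        this.allowOverride = allowOverride;
        this.maximumOverride = maximumOverride;
        this.defaultTimeout = defaultTimeout;
    }

    /** Returns the timeout to apply to a request, given the header and query-parameter values. */
    OptionalLong effective(OptionalLong header, OptionalLong parameter) {
        if (!allowOverride) {
            // Silently ignore the client's request instead of answering 400 or 501.
            return defaultTimeout;
        }
        OptionalLong requested;
        if (header.isPresent() && parameter.isPresent()) {
            // Both given: take the minimum, so a proxy-added header can only restrict.
            requested = OptionalLong.of(Math.min(header.getAsLong(), parameter.getAsLong()));
        } else {
            requested = header.isPresent() ? header : parameter;
        }
        if (!requested.isPresent()) {
            return defaultTimeout;
        }
        if (maximumOverride.isPresent()) {
            // Assumption: clamp to the maximum rather than reject the request.
            return OptionalLong.of(Math.min(requested.getAsLong(), maximumOverride.getAsLong()));
        }
        return requested; // no maximum configured: unlimited override
    }
}
```

With a maximum of 4 and a default of 2, a header of 10 and a parameter of 3 yield 3, a header of 10 alone yields 4, and a request with neither falls back to 2.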

        Andy Seaborne added a comment -

        This would be great.

        I'd like to see at least the ?timeout= form for pragmatic reasons. This makes it similar to other systems. It's much easier to set in the client, where access to setting the HTTP headers can be tricky (e.g. when using a library for HTTP calls, not going raw to java.net or Apache HttpClient). When writing a call, whether scripting or Java, it's easier to do everything in the query string but a semi-standard is also

        Having header and query parameter is possible - it's not either/or.

        The DoS issue is a serious one, I think. From just looking at usage (e.g. DBPedia), people override the timeout as the first "solution" to a query timing out, when the query is just inherently expensive and missing the timeout by a long way. As public-facing data serving is one use for Fuseki, armour-plating the timeout mechanism is required.

        A complicated scheme is to have a second timeout associated with the dataset that is the maximum allowable settings. If absent, any normal timeout set should be the maximum allowed. Setting the max setting very high (or, better, a special value) would be the same as letting the client take full control. Absence, or setting the same as the normal timeout is, in effect, no override as you can only set it shorter but a special value for "not allowed" would make for a better error message like "You can't do that".


          People

          • Assignee: Andy Seaborne
          • Reporter: Alexander Dutton
          • Votes: 0
          • Watchers: 4
