ManifoldCF
  1. ManifoldCF
  2. CONNECTORS-797

Database Exception on Status and Job Management screen

    Details

      Description

      Receive the following message on the ' Status and Job Management' screen:

      Error: Database exception: SQLException doing query (42601): ERROR: syntax error at or near "status" Position: 60

      Digging through the debug logs found:
      DEBUG 2013-10-31 12:29:11,911 (http-apr-8081-exec-5) - Requested query: [SELECT COUNT(dochash) AS doccount FROM jobqueue WHERE id=?status IN (?,?,?,?,?,?) LIMIT 500001]
      DEBUG 2013-10-31 12:29:11,911 (Thread-16) - Actual query: [SELECT COUNT(dochash) AS doccount FROM jobqueue WHERE id=?status IN (?,?,?,?,?,?) LIMIT 500001]
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 0: '1380815567276'
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 1: 'A'
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 2: 'a'
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 3: 'P'
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 4: 'F'
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 5: 'f'
      DEBUG 2013-10-31 12:29:11,912 (Thread-16) - Parameter 6: 'G'
      DEBUG 2013-10-31 12:29:11,918 (http-apr-8081-exec-5) - Reinterpreting exception 'Database exception: SQLException doing query (42601): ERROR: syntax error at or near "status"
      Position: 60'. The exception type is 4
      DEBUG 2013-10-31 12:29:11,918 (http-apr-8081-exec-5) - Exception Database exception: SQLException doing query (42601): ERROR: syntax error at or near "status"
      Position: 60 is possibly a transaction abort signal
      DEBUG 2013-10-31 12:29:11,918 (http-apr-8081-exec-5) - Exception Database exception: SQLException doing query (42601): ERROR: syntax error at or near "status"
      Position: 60 is NOT a transaction abort signal

      Appears that the SQL request is missing a space at 'WHERE id=?status IN (....'

      1. CONNECTORS-797.patch.2
        2 kB
        Karl Wright
      2. CONNECTORS-797.patch
        1 kB
        Karl Wright

        Activity

        Hide
        Karl Wright added a comment -

        r1541192 (release 1.4 branch)

        Show
        Karl Wright added a comment - r1541192 (release 1.4 branch)
        Hide
        Karl Wright added a comment -

        Reopened for pullup to 1.4.1

        Show
        Karl Wright added a comment - Reopened for pullup to 1.4.1
        Hide
        Karl Wright added a comment -

        Well, even if it is better only in 9.2 it may be worth doing. If you think it's worthwhile please open an enhancement ticket.

        Show
        Karl Wright added a comment - Well, even if it is better only in 9.2 it may be worth doing. If you think it's worthwhile please open an enhancement ticket.
        Hide
        Graeme Seaton added a comment -

        Actually - looking at http://wiki.postgresql.org/wiki/Slow_Counting index-only scans are supported in 9.2 and beyond......

        The only benefit in counting against id would be a smaller index size to scan.

        Show
        Graeme Seaton added a comment - Actually - looking at http://wiki.postgresql.org/wiki/Slow_Counting index-only scans are supported in 9.2 and beyond...... The only benefit in counting against id would be a smaller index size to scan.
        Hide
        Karl Wright added a comment -

        I'd say yes except PostgreSQL will likely plan to do a sequential scan simply because the number of rows in the job outstrips all others. But please feel free to experiment.

        Show
        Karl Wright added a comment - I'd say yes except PostgreSQL will likely plan to do a sequential scan simply because the number of rows in the job outstrips all others. But please feel free to experiment.
        Hide
        Graeme Seaton added a comment -

        Hi Karl,

        Applied the patch and it works (can't provide any metrics about speed increase as I was waiting for the REINDEX to complete while waiting for the screen to refresh).

        Was going suggest using jobid to take advantage of the existing index but you beat me to it

        On a similar note I notice quite a few 'SELECT COUNT(dochash) AS doccount FROM jobqueue t1 LIMIT 500001' queries which have a plan of:

        "Limit (cost=432498.24..432498.25 rows=1 width=41)"
        " -> Aggregate (cost=432498.24..432498.25 rows=1 width=41)"
        " -> Seq Scan on jobqueue t1 (cost=0.00..404109.99 rows=11355299 width=41)"

        Would it be worthwhile to change it to 'SELECT COUNT(id)....' to utilise the primary key index (or add a where clause to force the use of an index)?

        Show
        Graeme Seaton added a comment - Hi Karl, Applied the patch and it works (can't provide any metrics about speed increase as I was waiting for the REINDEX to complete while waiting for the screen to refresh). Was going suggest using jobid to take advantage of the existing index but you beat me to it On a similar note I notice quite a few 'SELECT COUNT(dochash) AS doccount FROM jobqueue t1 LIMIT 500001' queries which have a plan of: "Limit (cost=432498.24..432498.25 rows=1 width=41)" " -> Aggregate (cost=432498.24..432498.25 rows=1 width=41)" " -> Seq Scan on jobqueue t1 (cost=0.00..404109.99 rows=11355299 width=41)" Would it be worthwhile to change it to 'SELECT COUNT(id)....' to utilise the primary key index (or add a where clause to force the use of an index)?
        Hide
        Karl Wright added a comment -

        Hi Graeme,

        If you could try the second patch before increasing the maxcount parameter, that would be great.

        If you want an accurate count in the UI, you will definitely need to increase the maxcount parameter eventually though to get it. Otherwise you will get something like this:

        >500000

        Show
        Karl Wright added a comment - Hi Graeme, If you could try the second patch before increasing the maxcount parameter, that would be great. If you want an accurate count in the UI, you will definitely need to increase the maxcount parameter eventually though to get it. Otherwise you will get something like this: >500000
        Hide
        Karl Wright added a comment -

        r1537568 commits the second patch to trunk.

        Show
        Karl Wright added a comment - r1537568 commits the second patch to trunk.
        Hide
        Karl Wright added a comment -

        Second patch, to actually query with the correct row match.

        Show
        Karl Wright added a comment - Second patch, to actually query with the correct row match.
        Hide
        Karl Wright added a comment -

        Increasing the maxcount will make the problem go away, but it won't fix the bug . I'll look at this further.

        Show
        Karl Wright added a comment - Increasing the maxcount will make the problem go away, but it won't fix the bug . I'll look at this further.
        Hide
        Graeme Seaton added a comment -

        Thanks.

        After I posted, went and rechecked the status screen and saw a set of zero's for everything. Will increase the property value to retest (the counts are quite importent to our application).

        Show
        Graeme Seaton added a comment - Thanks. After I posted, went and rechecked the status screen and saw a set of zero's for everything. Will increase the property value to retest (the counts are quite importent to our application).
        Hide
        Karl Wright added a comment -

        Hi Graeme,

        The parameter simply sets the maximum number of documents that the status screen will try to count. On some databases (e.g. Postgresql), to count rows in a table the database has to enumerate through them, which is very very slow if you have a large jobqueue.

        Show
        Karl Wright added a comment - Hi Graeme, The parameter simply sets the maximum number of documents that the status screen will try to count. On some databases (e.g. Postgresql), to count rows in a table the database has to enumerate through them, which is very very slow if you have a large jobqueue.
        Hide
        Graeme Seaton added a comment -

        That did the trick.

        Didn't increase the property "org.apache.manifoldcf.ui.maxstatuscount" value. Would you expect that to improve performance or robustness (Testing against a set of repositories of ~10M docs)?

        Show
        Graeme Seaton added a comment - That did the trick. Didn't increase the property "org.apache.manifoldcf.ui.maxstatuscount" value. Would you expect that to improve performance or robustness (Testing against a set of repositories of ~10M docs)?
        Hide
        Karl Wright added a comment -

        r1537557

        Show
        Karl Wright added a comment - r1537557
        Hide
        Karl Wright added a comment -

        There is another workaround: to set the property "org.apache.manifoldcf.ui.maxstatuscount" to a number that is very large (e.g. 10000000).

        Show
        Karl Wright added a comment - There is another workaround: to set the property "org.apache.manifoldcf.ui.maxstatuscount" to a number that is very large (e.g. 10000000).
        Hide
        Karl Wright added a comment -

        I suggest you try the attached patch against 1.4, because on trunk the schema has changed.

        Show
        Karl Wright added a comment - I suggest you try the attached patch against 1.4, because on trunk the schema has changed.
        Hide
        Karl Wright added a comment -

        Patch which should correct the problem.

        Show
        Karl Wright added a comment - Patch which should correct the problem.
        Hide
        Karl Wright added a comment -

        There's a workaround to just set the limit really high - have to look up the parameter though.

        Show
        Karl Wright added a comment - There's a workaround to just set the limit really high - have to look up the parameter though.

          People

          • Assignee:
            Karl Wright
            Reporter:
            Graeme Seaton
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development