Jackrabbit Content Repository
JCR-1248

Helper Method to escape illegal XPath Search Term

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5
    • Component/s: jackrabbit-jcr-commons
    • Labels: None

      Description

      If you try to perform a search like this

      //element(*, nt:base)[jcr:contains(., 'test!')]

      you get this exception

      javax.jcr.RepositoryException: Exception building query: org.apache.jackrabbit.core.query.lucene.fulltext.ParseException: Encountered "<EOF>" at line 1, column 6.

      1. patch.txt (1 kB, Claus Köll)

          Activity

          Paco Avila added a comment -

          I'm not sure whether this is a bug or a "feature". Given the query

          String term = "pe[]pe";
          String escapedTerm = Text.escapeIllegalXpathSearchChars(term).replaceAll("'", "''");
          String query = "/jcr:root//*[jcr:contains(okm:content,'" + escapedTerm + "')]";

          should it fail, or should the term "pe[]pe" be escaped as "pe\[\]pe"?

          Paco Avila added a comment -

          By the way, this sample code at http://wiki.apache.org/jackrabbit/EncodingAndEscaping is self-referential (q is used in its own initializer):

          String q =
          "/jcr:root/foo/element(*, foo)" +
          "[jcr:contains(@title, '" + Text.escapeIllegalXpathSearchChars(q).replaceAll("'", "''") + "')]" +
          "[@itemID = '" + itemID.replaceAll("'", "''") + "']";
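
One way to avoid the self-reference is to keep the user's search term in its own variable and pass it to the query builder. The sketch below is only an illustration: the class `TitleQueryBuilder`, the methods `quote` and `build`, and the `escaper` parameter are hypothetical names, and `escaper` stands in for a call such as `Text::escapeIllegalXpathSearchChars` so that the example compiles without Jackrabbit on the classpath.

```java
import java.util.function.UnaryOperator;

public class TitleQueryBuilder {

    // Wraps a value in an XPath string literal, doubling embedded single quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds the query from a separate searchTerm variable, so the query
    // string never refers to itself. The escaper is a placeholder for the
    // full-text escaping step (an assumption, not Jackrabbit's API).
    static String build(String searchTerm, String itemID, UnaryOperator<String> escaper) {
        return "/jcr:root/foo/element(*, foo)"
                + "[jcr:contains(@title, " + quote(escaper.apply(searchTerm)) + ")]"
                + "[@itemID = " + quote(itemID) + "]";
    }

    public static void main(String[] args) {
        // Identity escaper used here only to keep the example self-contained.
        System.out.println(build("it's", "42", UnaryOperator.identity()));
    }
}
```

With Jackrabbit available, the third argument would presumably be a method reference to the helper discussed in this issue.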

          Alexander Klimetschek added a comment -

          > A query like this will fail:
          > //element(*, nt:base)[jcr:contains(., 'test \ done')]

          Did you use org.apache.jackrabbit.util.Text.escapeIllegalXpathSearchChars()? If that method has a bug, please file a new issue.

          See also http://wiki.apache.org/jackrabbit/EncodingAndEscaping

          Paco Avila added a comment (edited) -

          A query like this will fail:

          //element(*, nt:base)[jcr:contains(., 'test \ done')]

          Section 6.6.5.2 of the JSR-170 specification says that literal instances of single quote ( ' ), double quote ( " ) and hyphen ( - ) must be escaped with a backslash ( \ ), and that backslash itself must be escaped as a double backslash ( \\ ). I have also noticed that some characters, such as [ and ], need to be escaped.
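
The escaping rule quoted from the specification can be sketched as follows. This is a minimal illustration, not Jackrabbit's implementation: the class `FullTextEscaper` and the method `escapeSearchLiteral` are hypothetical names, and the sketch treats every occurrence of the four characters as a literal, so a hyphen intended as an exclusion operator would lose that meaning. It does not handle [ and ].

```java
public class FullTextEscaper {

    // Prefixes each single quote, double quote, hyphen and backslash
    // with a backslash, following the rule quoted above.
    static String escapeSearchLiteral(String term) {
        StringBuilder sb = new StringBuilder(term.length() + 8);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '\'' || c == '"' || c == '-' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Java literal "test \\ done" is the text: test \ done
        System.out.println(escapeSearchLiteral("test \\ done")); // prints: test \\ done
    }
}
```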

          Jukka Zitting added a comment -

          This turned out to be implemented as a new feature in jcr-commons, changing issue metadata accordingly.

          I guess the original problem (ParseException) is the expected (though undocumented) behavior, so there's no need to fix this for clients that don't use the new helper method.

          Claus Köll added a comment -

          Committed in Rev: 706242

          Claus Köll added a comment -

          I added a helper method in org.apache.jackrabbit.util.Text to escape illegal XPath characters.
          It checks for illegal characters at the end of an XPath search term.
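
The idea of checking only the end of the search term can be sketched as below. This is not the code from patch.txt or from org.apache.jackrabbit.util.Text: the class `TrailingCharEscaper`, the method `escapeTrailing` and the character set in `TRAILING_SPECIALS` are assumptions made for illustration, with the set limited to the characters mentioned in this issue ( ! , [ and ] ).

```java
public class TrailingCharEscaper {

    // Assumed set of characters that break the parser at the end of a term;
    // taken from the characters mentioned in this issue, not from Jackrabbit.
    private static final String TRAILING_SPECIALS = "![]";

    // Escapes the last character with a backslash if it is in the assumed set.
    // Characters elsewhere in the term are left untouched.
    static String escapeTrailing(String term) {
        if (term.isEmpty()) {
            return term;
        }
        int last = term.length() - 1;
        char c = term.charAt(last);
        if (TRAILING_SPECIALS.indexOf(c) >= 0) {
            return term.substring(0, last) + '\\' + c;
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(escapeTrailing("test!")); // prints: test\!
    }
}
```

Because only the last character is examined, a term such as "pe[]pe" passes through unchanged under this sketch.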

          Marcel Reutegger added a comment -

          In addition to the set of special characters already specified in JSR 170, Jackrabbit uses more such characters for extended functionality.

          This set of characters should be limited to the ones really required (e.g. ! is equivalent to -) and clearly documented. It would be nice to also have a utility class that automatically escapes the special characters used in Jackrabbit.

          Ard Schrijvers added a comment -

          Repeated from user-list:

          It seems that in LuceneQueryBuilder at

          Object visit(TextsearchQueryNode node, Object data) {

          it breaks at

          Query context = parser.parse(query.toString());

          where the parser is o.a.j.core.query.lucene.fulltext.QueryParser. It seems to break on strings ending with a "!". Unfortunately, I do not have insight into how the QueryParser works. Perhaps somebody else knows where to look in the QueryParser.


            People

            • Assignee: Claus Köll
            • Reporter: Claus Köll