Uploaded image for project: 'OFBiz'
  1. OFBiz
  2. OFBIZ-5953

Problem with new UtilCodec code caused by HTMLEntityCodec.decode()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Trunk
    • Fix Version/s: 14.12.01, 16.11.01
    • Component/s: framework
    • Labels:
      None

      Description

      From Adrian on ML:

      When I navigate to https://localhost:8443/accounting/control/paymentOverview?paymentId=8004 many exceptions are thrown and the screen fails to render. I tried changing WidgetWorker.java line 74 to localRequestName = UtilCodec.canonicalize(localRequestName, false, false); which fixed the exceptions, but the generated link is wrong. I don't know how to fix it.

      Errors related to this class are also thrown at accounting/control/invoiceOverview. Setting a breakpoint at line 167 of UtilCodec.java shows that 2 HTMLEntityCodec.decode calls transforms the URL from
      EditAcctgTrans?acctgTransId=10070&organizationPartyId=10010 to
      EditAcctgTrans?acctgTransId=10070&organizationPartyId=10010 to
      EditAcctgTrans?acctgTransId=10070∨ganizationPartyId=10010.

      Not sure if the error is in class UtilCode or HTMLEntityCodec.

        Issue Links

          Activity

          Hide
          jacopoc Jacopo Cappellato added a comment -

          Thanks for the report Christian and Adrian.
          As a temporary workaround we can disable the HTMLEntityCodec wight he following patch:

          Index: framework/base/src/org/ofbiz/base/util/UtilCodec.java
          ===================================================================
          --- framework/base/src/org/ofbiz/base/util/UtilCodec.java	(revision 1650452)
          +++ framework/base/src/org/ofbiz/base/util/UtilCodec.java	(working copy)
          @@ -43,7 +43,7 @@
               private static final List<Codec> codecs;
               static {
                   List<Codec> tmpCodecs = new ArrayList<Codec>();
          -        tmpCodecs.add(new HTMLEntityCodec());
          +        //tmpCodecs.add(new HTMLEntityCodec());
                   tmpCodecs.add(new PercentCodec());
                   codecs = Collections.unmodifiableList(tmpCodecs);
               }
          

          This will fix the screens.
          I am digging into it now.

          Show
          jacopoc Jacopo Cappellato added a comment - Thanks for the report Christian and Adrian. As a temporary workaround we can disable the HTMLEntityCodec wight he following patch: Index: framework/base/src/org/ofbiz/base/util/UtilCodec.java =================================================================== --- framework/base/src/org/ofbiz/base/util/UtilCodec.java (revision 1650452) +++ framework/base/src/org/ofbiz/base/util/UtilCodec.java (working copy) @@ -43,7 +43,7 @@ private static final List<Codec> codecs; static { List<Codec> tmpCodecs = new ArrayList<Codec>(); - tmpCodecs.add( new HTMLEntityCodec()); + //tmpCodecs.add( new HTMLEntityCodec()); tmpCodecs.add( new PercentCodec()); codecs = Collections.unmodifiableList(tmpCodecs); } This will fix the screens. I am digging into it now.
          Hide
          gareth.carter Gareth Carter added a comment -

          FYI, duplicate of OFBIZ-5910

          Show
          gareth.carter Gareth Carter added a comment - FYI, duplicate of OFBIZ-5910
          Hide
          jacopoc Jacopo Cappellato added a comment -

          I have spent some time digging into the source code of HTMLEntityCodec (ESAPI) and specifically the method decodeCharacter is relevant here; see:
          https://code.google.com/p/owasp-esapi-java/source/browse/trunk/src/main/java/org/owasp/esapi/codecs/HTMLEntityCodec.java

          As you can see, and as described by the comment:

          • Returns the decoded version of the character starting at index, or
          • null if no decoding is possible.
          • Formats all are legal both with and without semi-colon, upper/lower case:
          • &#dddd;
          • &#xhhhh;
          • &name;

          the codec recognizes the strings "&op" and "&op;" both as the html entity representation of the OR symbol.
          I am not sure if this is right or wrong according to the specifications but it is definitely too strict for OFBiz because it causes problems like the one reported here.
          My next step will be that of finding and studying the source file of the old version of ESAPI and see if the behavior changed since then; as I mentioned, removing the HTMLEntityCodec will fix this issue but I still have to figure out the implications of this change.

          Show
          jacopoc Jacopo Cappellato added a comment - I have spent some time digging into the source code of HTMLEntityCodec (ESAPI) and specifically the method decodeCharacter is relevant here; see: https://code.google.com/p/owasp-esapi-java/source/browse/trunk/src/main/java/org/owasp/esapi/codecs/HTMLEntityCodec.java As you can see, and as described by the comment: Returns the decoded version of the character starting at index, or null if no decoding is possible. Formats all are legal both with and without semi-colon, upper/lower case: &#dddd; &#xhhhh; &name; the codec recognizes the strings "&op" and "&op;" both as the html entity representation of the OR symbol. I am not sure if this is right or wrong according to the specifications but it is definitely too strict for OFBiz because it causes problems like the one reported here. My next step will be that of finding and studying the source file of the old version of ESAPI and see if the behavior changed since then; as I mentioned, removing the HTMLEntityCodec will fix this issue but I still have to figure out the implications of this change.
          Hide
          gareth.carter Gareth Carter added a comment -

          Its more about context. Why use HTMLEntityCodec to decode urls? If UtilCodec is used in other places for other purposes than it will need to split off

          Show
          gareth.carter Gareth Carter added a comment - Its more about context. Why use HTMLEntityCodec to decode urls? If UtilCodec is used in other places for other purposes than it will need to split off
          Hide
          jacopoc Jacopo Cappellato added a comment -

          Yes, this is what I am trying to do; I would like to create two separate mechanisms, one to sanitize urls and other html contents.

          Show
          jacopoc Jacopo Cappellato added a comment - Yes, this is what I am trying to do; I would like to create two separate mechanisms, one to sanitize urls and other html contents.
          Hide
          jacques.le.roux Jacques Le Roux added a comment - - edited

          Thanks Jacopo, excellent article!

          I meant this one http://security.coverity.com/blog/2013/Nov/to-escape-or-not-to-escape-that-is-the-question.html suggested in OFBIz-5910

          Show
          jacques.le.roux Jacques Le Roux added a comment - - edited Thanks Jacopo, excellent article! I meant this one http://security.coverity.com/blog/2013/Nov/to-escape-or-not-to-escape-that-is-the-question.html suggested in OFBIz-5910

            People

            • Assignee:
              Unassigned
              Reporter:
              ofbizzer Christian Carlow
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development