Uploaded image for project: 'Batik'
  1. Batik
  2. BATIK-1018

"XML External Entities" vulnerability

    Details

    • Type: Bug
    • Status: Resolved
    • Resolution: Fixed
    • Affects Version/s: 1.8
    • Fix Version/s: 1.9
    • Component/s: Web Site
    • Labels:
      None
    • Environment:
      Operating System: All
      Platform: All

      Description

      During visualization with Squiggle or rasterization via the CLI tool, XML external entities defined in the DTD are dereferenced and the content of the target file is included in the output.

      The impact of this vulnerability range form denial of service to file disclosure. Under Windows, it can also be used to steal LM/NTLM hashes.

      For some additional information about XXE attacks, please refer to http://cwe.mitre.org/data/definitions/827.html

      How to reproduce:
      $> rasterizer xxe.svg -d xxe.png

      1. ssrf.svg
        0.3 kB
        Lars Krapf
      2. xxe.png
        125 kB
        Nicolas GREGOIRE
      3. xxe.svg
        0.6 kB
        Nicolas GREGOIRE

        Issue Links

          Activity

          Hide
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment -

          Attachment xxe.svg has been added with description: Malicious SVG file

          Show
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment - Attachment xxe.svg has been added with description: Malicious SVG file
          Hide
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment -

          Attachment xxe.png has been added with description: Result of the rasterization of xxe.svg

          Show
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment - Attachment xxe.png has been added with description: Result of the rasterization of xxe.svg
          Hide
          deweese@apache.org Thomas DeWeese added a comment -

          I don't want to dismiss this out of hand but I'm not sure I agree that a vulnerability really exists.

          Given that Batik is more a toolkit than a finished product a lot more of the responsibility for avoiding these issues falls on the users rather than the library. This more or less required given that it's impossible for us to know ahead of time what parts of the system the batik libraries should be allowed to access or not.

          Please note that xxe.svg will fail if you use squiggle and you fetch 'xxe.svg' from a server (I even tried variants like replacing etc/passwd with file:///etc/passwd).

          People using the rasterizer to rasterize random content from the web should be more careful. They can use Java's build in support for policy files to restrict access to the file system. I don't think it would be appropriate for the toolkit to restrict this ahead of time since many legitimate uses may need fairly wide access to the filesystem. I checked and browsers seem to block all access to the file system when loading a file from the disk even if it's co-located. That may make sense for a browser but I think would block many legitimate uses of Batik.

          Show
          deweese@apache.org Thomas DeWeese added a comment - I don't want to dismiss this out of hand but I'm not sure I agree that a vulnerability really exists. Given that Batik is more a toolkit than a finished product a lot more of the responsibility for avoiding these issues falls on the users rather than the library. This more or less required given that it's impossible for us to know ahead of time what parts of the system the batik libraries should be allowed to access or not. Please note that xxe.svg will fail if you use squiggle and you fetch 'xxe.svg' from a server (I even tried variants like replacing etc/passwd with file:///etc/passwd ). People using the rasterizer to rasterize random content from the web should be more careful. They can use Java's build in support for policy files to restrict access to the file system. I don't think it would be appropriate for the toolkit to restrict this ahead of time since many legitimate uses may need fairly wide access to the filesystem. I checked and browsers seem to block all access to the file system when loading a file from the disk even if it's co-located. That may make sense for a browser but I think would block many legitimate uses of Batik.
          Hide
          jeremias@apache.org Jeremias Maerki added a comment -

          I agree with Thomas. In a short experiment, I was able to use XInclude (implemented by Apache Xerces-J) to force the same effect. Batik does not even know about XInclude since it's a parser-level feature.

          However, it might be a good idea to write some documentation about it so users are reminded to secure their applications.

          Show
          jeremias@apache.org Jeremias Maerki added a comment - I agree with Thomas. In a short experiment, I was able to use XInclude (implemented by Apache Xerces-J) to force the same effect. Batik does not even know about XInclude since it's a parser-level feature. However, it might be a good idea to write some documentation about it so users are reminded to secure their applications.
          Hide
          helder.magalhaes@gmail.com Helder Magalhães added a comment -

          (In reply to comment #3)
          > I agree with Thomas.

          I agree with Thomas and Jeremias as well.

          > However, it might be a good idea to write some documentation about it so
          > users are reminded to secure their applications.

          Decreasing severity and moving this to the "Web Site" component, more in the sense of "Documentation" (which doesn't exist); "javadoc" alone doesn't feel right as well: I'd say that these sort of reminders belong to a higher level than Javadoc, although probably something might be done in code documentation as well.

          (In reply to comment #0)
          > During visualization with Squiggle or rasterization via the CLI tool, XML
          > external entities defined in the DTD are dereferenced and the content of the
          > target file is included in the output.
          >
          > The impact of this vulnerability range form denial of service to file
          > disclosure. Under Windows, it can also be used to steal LM/NTLM hashes.

          First of all, thanks for the report!

          Thomas has provided a good insight about this potential issue in comment #2. Based in the feedback and in a few performed tests, I'd say the example provided is roughly equivalent to an ECMAScript getURL fetching the "/etc/passwd" (using the "file" protocol).

          If you still believe this can be considered a security issue then please adjust the priority accordingly. In any case, elaborating a bit longer would help - for further understanding what can be involved or (simply) to serve as base for the documentation improvements.

          Show
          helder.magalhaes@gmail.com Helder Magalhães added a comment - (In reply to comment #3) > I agree with Thomas. I agree with Thomas and Jeremias as well. > However, it might be a good idea to write some documentation about it so > users are reminded to secure their applications. Decreasing severity and moving this to the "Web Site" component, more in the sense of "Documentation" (which doesn't exist); "javadoc" alone doesn't feel right as well: I'd say that these sort of reminders belong to a higher level than Javadoc, although probably something might be done in code documentation as well. (In reply to comment #0) > During visualization with Squiggle or rasterization via the CLI tool, XML > external entities defined in the DTD are dereferenced and the content of the > target file is included in the output. > > The impact of this vulnerability range form denial of service to file > disclosure. Under Windows, it can also be used to steal LM/NTLM hashes. First of all, thanks for the report! Thomas has provided a good insight about this potential issue in comment #2. Based in the feedback and in a few performed tests, I'd say the example provided is roughly equivalent to an ECMAScript getURL fetching the "/etc/passwd" (using the "file" protocol). If you still believe this can be considered a security issue then please adjust the priority accordingly. In any case, elaborating a bit longer would help - for further understanding what can be involved or (simply) to serve as base for the documentation improvements.
          Hide
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment -

          I understand your position but I think that these risks should then be much more visible to casual users of the framework (i.e. documentation improvement).

          Nowadays, it's trivial to find some applications using Batik in a insecure way (allowing the disclosure of local files). Examples:

          • Apache FOP: vulnerable. Repro: FOP document including a malicious SVG image
          • HighCharts JS: vulnerable. Repro: submit a malicious SVG to the on-line export feature of this graph library

          MediaWiki seems impacted too:
          http://www.mediawiki.org/wiki/Manual:$wgSVGConverters

          Regarding XInclude: it is a feature of the XML parser and could be disabled there in security-conscious deployments
          Regarding ECMAScript: it can disabled using command-line options. The main differences with the XXE attack are that this one is scriptless and can't be inhibited using options

          Show
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment - I understand your position but I think that these risks should then be much more visible to casual users of the framework (i.e. documentation improvement). Nowadays, it's trivial to find some applications using Batik in a insecure way (allowing the disclosure of local files). Examples: Apache FOP: vulnerable. Repro: FOP document including a malicious SVG image HighCharts JS: vulnerable. Repro: submit a malicious SVG to the on-line export feature of this graph library MediaWiki seems impacted too: http://www.mediawiki.org/wiki/Manual:$wgSVGConverters Regarding XInclude: it is a feature of the XML parser and could be disabled there in security-conscious deployments Regarding ECMAScript: it can disabled using command-line options. The main differences with the XXE attack are that this one is scriptless and can't be inhibited using options
          Hide
          tony.benbrahim Tony BenBrahim added a comment -

          It will take more than documentation improvements, as some classes do not expose any way to configure the parser or parser factory between creation and use, and require subclassing, access to the source, etc..., we are now well past the point of the casual user. See BATIK-1113 for details.

          More to the point, the ideal fail-safe solution would be to disable XEE by default, and provide methods to turn the features back on, for the handful of users who need this feature. If BATIK is run on a server environment, you most certainly do not want this feature, unless you also fully control external entity resolution and loading. I suspect the majority of users do not know what XML external entities are, do not need them and are not aware of the security implications, so the fail-safe approach seems like the best approach,

          Show
          tony.benbrahim Tony BenBrahim added a comment - It will take more than documentation improvements, as some classes do not expose any way to configure the parser or parser factory between creation and use, and require subclassing, access to the source, etc..., we are now well past the point of the casual user. See BATIK-1113 for details. More to the point, the ideal fail-safe solution would be to disable XEE by default, and provide methods to turn the features back on, for the handful of users who need this feature. If BATIK is run on a server environment, you most certainly do not want this feature, unless you also fully control external entity resolution and loading. I suspect the majority of users do not know what XML external entities are, do not need them and are not aware of the security implications, so the fail-safe approach seems like the best approach,
          Hide
          lmpmbernardo Luis Bernardo added a comment -

          this was disabled in trunk recently. if there is demand to make this configurable then we can implement that feature, but by default XXE will be disabled.

          Show
          lmpmbernardo Luis Bernardo added a comment - this was disabled in trunk recently. if there is demand to make this configurable then we can implement that feature, but by default XXE will be disabled.
          Hide
          chaotic Lars Krapf added a comment -

          Hello

          The fix for this issue seems to be incomplete. You should also disable external DTD resolution to avoid SSRF:

          dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

          See attached ssrf.svg for an example.

          Show
          chaotic Lars Krapf added a comment - Hello The fix for this issue seems to be incomplete. You should also disable external DTD resolution to avoid SSRF: dbf.setFeature( "http: //apache.org/xml/features/nonvalidating/load-external-dtd" , false ); See attached ssrf.svg for an example.
          Hide
          chaotic Lars Krapf added a comment -
          chaotic@m0lly:~$ nc -l 2323
          
          GET / HTTP/1.1
          User-Agent: Java/1.7.0_60-ea
          Host: localhost:2323
          Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
          Connection: keep-alive
          
          Show
          chaotic Lars Krapf added a comment - chaotic@m0lly:~$ nc -l 2323 GET / HTTP/1.1 User-Agent: Java/1.7.0_60-ea Host: localhost:2323 Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2 Connection: keep-alive
          Hide
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment -

          I confirm the SSRF vector. However, this ticket is closed for months. Maybe you should create a new ticket referring to this one...

          Show
          nicolas.gregoire@agarri.fr Nicolas GREGOIRE added a comment - I confirm the SSRF vector. However, this ticket is closed for months. Maybe you should create a new ticket referring to this one...
          Hide
          chaotic Lars Krapf added a comment -

          Hello Nicolas GREGOIRE - Thanks for confirming, and yes, I've already created BATIK-1139 in the meantime.

          Show
          chaotic Lars Krapf added a comment - Hello Nicolas GREGOIRE - Thanks for confirming, and yes, I've already created BATIK-1139 in the meantime.

            People

            • Assignee:
              batik-dev@xmlgraphics.apache.org Batik Developer's Mailing list
              Reporter:
              nicolas.gregoire@agarri.fr Nicolas GREGOIRE
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development