  1. Hadoop HDFS
  2. HDFS-16766

hdfs ec command loads (administrator provided) erasure code policy files without disabling xml entity expansion

Details

    • Reviewed

    Description

      XML External Entity (XXE) attacks can occur when an XML parser supports XML entities while processing XML received from an untrusted source. The attack relies on XML input that contains a reference to an external entity and is parsed by a weakly configured javax.xml.parsers.DocumentBuilder XML parser.

       

      https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/ECPolicyLoader.java#L93
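
      To illustrate what "disabling entity expansion" means for a javax.xml.parsers.DocumentBuilder, here is a minimal sketch of a hardened factory configuration. It is not the patch that was committed for this issue; the class and method names (HardenedXmlParser, newSecureBuilder, accepts) are hypothetical, and the feature URI is the Xerces "disallow-doctype-decl" feature, assumed to be supported by the parser on the classpath (it is by the JDK's built-in one).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only; not part of Hadoop.
class HardenedXmlParser {

  // Builds a DocumentBuilder that refuses DOCTYPE declarations, so neither
  // internal entity expansion nor external entity resolution can happen.
  static DocumentBuilder newSecureBuilder() throws ParserConfigurationException {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Assumption: the underlying parser understands this Xerces feature.
    dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
    return dbf.newDocumentBuilder();
  }

  static Document parse(String xml)
      throws ParserConfigurationException, SAXException, IOException {
    byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
    return newSecureBuilder().parse(new ByteArrayInputStream(bytes));
  }

  // Returns true if the hardened parser accepts the document,
  // false if it rejects it with a SAXException.
  static boolean accepts(String xml) {
    try {
      parse(xml);
      return true;
    } catch (SAXException e) {
      return false;
    } catch (ParserConfigurationException | IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String plain = "<layout><policy>example</policy></layout>";
    String withEntity =
        "<!DOCTYPE layout [<!ENTITY e \"expanded\">]><layout>&e;</layout>";
    System.out.println("plain accepted: " + accepts(plain));
    System.out.println("entity accepted: " + accepts(withEntity));
  }
}
```

      With this configuration any file carrying a DOCTYPE is rejected outright, which is stricter than merely not expanding entities but leaves no room for entity-based tricks.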

      sonatype-2022-5732

      If anyone is landing on this page following the sonatype-2022-5732 alert:

      1. The XML expansion only happens when the hdfs ec command is run from the command line: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html#Administrative_commands
      2. The outcome of entity expansion will be the command failing or running out of memory.
      3. If a cluster admin is loading erasure policies from untrusted sources, there are fundamental process issues to worry about beyond XML references.

      HDFS cluster administrators who receive XML erasure coding policies from untrusted sources (email, etc.) must sanitize the file by removing all &entity; references before using the "hdfs ec" command. Otherwise the tool will fail before it has a chance to apply whatever the malicious EC policy was. Alternatively: do not configure your Hadoop cluster from XML files you haven't written yourself.
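
      As a rough aid for the sanitizing step, the sketch below scans a policy file's text for DOCTYPE/ENTITY declarations and for entity references other than the five predefined XML ones. It is an assumption-laden pre-check, not a Hadoop tool: the class name EcPolicyFilePrecheck and the method looksSafe are hypothetical, and a plain text scan can raise false alarms (for example on text inside comments or CDATA).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical pre-check, for illustration only; not part of Hadoop.
class EcPolicyFilePrecheck {

  // Any DOCTYPE or ENTITY declaration is treated as suspicious.
  private static final Pattern DECLARATION =
      Pattern.compile("<!\\s*(DOCTYPE|ENTITY)", Pattern.CASE_INSENSITIVE);

  // Named entity references other than &amp; &lt; &gt; &quot; &apos;.
  // Numeric character references such as &#65; are not matched.
  private static final Pattern CUSTOM_ENTITY_REF =
      Pattern.compile("&(?!(?:amp|lt|gt|quot|apos);)[A-Za-z_:][\\w.:-]*;");

  static boolean looksSafe(String xml) {
    return !DECLARATION.matcher(xml).find()
        && !CUSTOM_ENTITY_REF.matcher(xml).find();
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: EcPolicyFilePrecheck <policy-file.xml>");
      System.exit(2);
    }
    String xml = new String(
        Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
    if (looksSafe(xml)) {
      System.out.println("No DOCTYPE/ENTITY declarations or custom entity references found.");
    } else {
      System.out.println("Suspicious XML constructs found; do not load this file.");
      System.exit(1);
    }
  }
}
```

      A check like this would run on the file before it is handed to the "hdfs ec" command; it complements, and does not replace, only loading policy files from sources you trust.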

            People

              groot Ashutosh Gupta
              Du Jing