Coordinated disclosure of XML roundtrip vulnerabilities in Go’s standard library

tedunangst · on Dec 14, 2020

I've never liked (nor understood the popularity) of signature schemes that require parsing before verification. This has also led to problems with X.509. And DKIM. And plists. And package managers. And more.

It's much simpler to sign the entire message, unparsed, and it's immune to these issues.

We went through a decade of debate before deciding that "encrypt then mac" is the only right way to do things. That knowledge hasn't trickled down to other domains.

Everlag · on Dec 14, 2020

Aye, this is the 'Cryptograph Doom Principle'[0].

To very lossily summarize: always authenticate before looking at the message.

Its a handy rule of thumb when you're making choices like how to validate a message.

https://moxie.org/2011/12/13/the-cryptographic-doom-principl....

vbezhenar · on Dec 14, 2020

I agree that it's a bad idea. I had an issue with two XML libraries with different languages. One did XML signing, other did verifying. They just did not work, properly signed message failed to validate. I tried to debug, but those standards were incomprehensible, there are thousands of LoC dedicated to normalization and whatnot. You need few dozens of LoC to sign or verify bytes and you need incredible complexity to implement that XML security thing.

But the issue is: those standards are out there and they're used and probably some people will use it in new projects and you have to interoperate with them.

So yeah, don't use those standards when you can, but sometimes you have to.

marcan_42 · on Dec 15, 2020

Yup, as I was reading this I was wondering how this could possibly lead to a critical vuln since nothing would ever depend o... and then I read about how SAML works.

Kill it with fire. This stuff's broken. We know better than to do things this way no. Just no. You sign binary blobs. Signature check fails, your binary blob is garbage and never gets parsed. End of story.

(Mental note: never deploy SAML anywhere)

Aside: I've seen a credit card processor implement nonsense like this, where I had to parse XML with regular expressions to extract the to-be-signed segment, because it was never going to round trip through a typical XML parser. But then again, this was only about the 25th batshit insane and likely insecure thing they were doing, just like every other banking related company, so shrug.

userbinator · on Dec 15, 2020

This has also led to problems with X.509.

Could you explain more about this? I thought the whole point of ASN.1 DER was to have only one canonical representation for a given structured value, and that the signing was done as-is on the sequence of bytes directly. It definitely doesn't have the same problems as XML and other text-based formats.

tedunangst · on Dec 15, 2020

Overshoot on my part. I was thinking of problems like those in https://arxiv.org/abs/1812.04959, but that's a different problem.

dboreham · on Dec 14, 2020

I believe the problem being addressed is where the payload may be transcoded in flight or otherwise not delivered in exactly the same form. Put another way: the signature validates the payload, however it may end up being represented to the validator on delivery. It isn't simply a transport integrity measure.

hnarn · on Dec 14, 2020

I might be missing the point here, but isn't the whole idea of signing a message that it should not be possible to "transcode the message in flight"? If you even allow the message to be "not delivered in exactly the same form" in the first place, you're introducing an attack vector completely without reason, because what you instead could do is let the payload be strongly signed and unchanged, and then have differing parsing rules at the end.

koolba · on Dec 15, 2020

Yep and it’s even worse because the signing and encryption involves XML transforms to canonicalize the source prior to verifying them. So you force the recipient to not only validate a potentially transformed message, but they have to transform it again too!

It’s the perfect intersection of precarious and deranged.

whatshisface · on Dec 14, 2020

You sign your letter and seal it in an envelope. I put your envelope into a cardboard box and give it to your friend. Your friend refuses to open your letter because you did not sign my box.

jeltz · on Dec 14, 2020

No, this is more like your friend refusing to trust the contents of the letter after the mailman cut the letter into small pieces and glued them back together.

hnick · on Dec 15, 2020

I think that would be more analogous to receiving a message, parsing it, then realising the payload is another signed message, and then validating that.

Depending on the situation, signing the container might not even be necessary, much like a zip file without a password that only contains encrypted contents anyway.

syrrim · on Dec 14, 2020

Usually, signing the whole damn thing is too computationally expensive, so you sign a hash instead.

hakre · on Dec 15, 2020

as long as binary safe transport is given, possible. unless that ... . you can't skip the memory protocol.

GoblinSlayer · on Dec 15, 2020

signify works with text, you only decode the signature from base64 string and check the rest in raw.

hakre · on Dec 15, 2020

yes, as long as the text is binary safe ... . XML specifies the character set for the document. IIRC also for base64.

russell_h · on Dec 14, 2020

I'm the maintainer of one of the affected SAML libraries.

People need to stop using SAML. This needs to be a priority. A little background, for those who haven't had the displeasure of working with it:

When a user wants to log into an application (the "Service Provider"), and is required to SSO against an "Identity Provider", the Identity Provider basically generates an XML document with information about the user, then signs that document using a thing known as an XML Digital Signature, or XMLDSIG.

When you think of "signing" a document, normally you would serialize that document out to bytes, apply your signature scheme over the bytes, then send along both the bytes and the signature. But for reasons which are irrelevant to modern implementations, XMLDSIG prefers to stuff the signature metadata back inside the XML document that was just signed. Obviously this invalidates the signature, so you also inject some metadata instructing receivers on how to put the document back how it was. There are several algorithms available for this. Then you ship around that XML document. Basically means that when the Identity Provider receives one of these documents it needs to:

  1. Parse the XML document (which cannot yet be trusted)
  2. Find the signature inside the document
  3. Find the metadata about what algorithm(s) to use to restore the document
  4. Run the document through whatever transforms are described in that metadata (keep in mind that up to this point the document might well have been supplied by an attacker)
  5. Serialize the transformed document back out to bytes, being careful not to touch any whitespace, etc
  6. Verify the signature over the re-serialized document

If all of this succeeds and was implemented perfectly, you can trust the output of step 5. Ideally you should re-parse it. A common failure mode is trusting the original input instead, so be careful about that.

Obviously this is a crazy approach to one of the most security-critical parts of an application on the internet, and it breaks all the time.

Unfortunately people persist in using this fundamentally broken protocol, so huge thank you to the team at Mattermost for their research in this area.

tptacek · on Dec 14, 2020

I share your opinion of SAML, but I have to ask, as someone who has also implemented it in Golang: what gave you any confidence in an implementation backed by encoding/xml? It was to me immediately pretty obvious that DSIG and encoding/xml aren't a fit, if only because of encoding/xml's poor namespace support. There are other DSIG Golang libraries that use an etree-style interface for what I presume is the same reason.

jupenur · on Dec 15, 2020

Blog author here; Russell's implementation is backed by github.com/beevik/etree, but like you said, it's just an interface. The tokenizer is still encoding/xml.

Adding better support for namespaces and providing APIs compatible with dsig doesn't remove the underlying vulnerabilities.

tptacek · on Dec 15, 2020

Ugh. That's disappointing. I loathe SAML, but also think the right thing to do here is to make sure nobody uses encoding/xml as part of their SAML stack.

ptman · on Dec 15, 2020

Pardon my ignorance, but what should be used instead of encoding/xml?

hakre · on Dec 15, 2020

I think libxml does not have that problem, IIRC golang supports c-bindings.

masklinn · on Dec 16, 2020

> I think libxml does not have that problem

I don't know about that. libxml certainly doesn't round-trip XML documents in general (though I don't think it breaks namespaces at least), whether that breaks SAML or not I have no idea.

Anyway from tptacek's other comments it looks like general-purpose XML libraries should not be assumed suitable for SAML, instead they should have purpose-built implementation for the SAML bit, then once the document has been properly validated and the SAML bits stripped off I guess that can be passed onto a general-purpose library:

> SAML libraries should include purpose-built, locked-down, SAML-only XMLDSIGs, and those XMLDSIGs should include purpose-built, stripped-down XMLs.

tptacek · on Dec 17, 2020

I would go out of my way to avoid libxmlsec1 and libxml. I honestly don't understand why it's so hard for a SAML implementation to just bring its own hardened stripped-down XML.

masklinn · on Dec 17, 2020

If I had to hazard a guess, bespoke implementation is usually recommended against, especially for complex formats. That it would be the best practice for saml does sound counter-intuitive.

GoblinSlayer · on Dec 15, 2020

Huh, namespaces are a semantic convention on top of xml syntax, tokenizer can't really implement it.

rsc · on Dec 15, 2020

This is like saying that variable name scoping is a semantic convention on top of the C language grammar and that a lexer can't really implement it. In the case of C, it turns out that the lexer must implement it. In the case of XML, processing name spaces directives during lexing is the right thing to do in nearly all cases. But it's not what these SAML libraries needed.

jupenur · on Dec 15, 2020

Well that didn't stop them from trying

zwass · on Dec 14, 2020

As a service provider, are there viable alternatives to support SSO in an application?

KajMagnus · on Dec 15, 2020

OpenID Connect, like others mention, but there's more in life, than only SSO? Organizations also want to automatically deactivate of user accounts?

There's something called SCIM, "System for Cross-domain Identity Management", that does this, and which you can use together with OpenID Connect (OIDC).

SCIM can automatically deactivate a user account, if the person leaves the organization or moves to a different department. And can auto add and remove him/her to/from various user groups.

But with SAML, managers / admins still need to micro manage the user accounts, e.g. place the user in the correct group, if s/he gets a new job role. SAML only syncs user accounts upon login, from what I've understood. (So if the user stays logged in, then, with SAML, his/her account permissions can get out-of-date?)

SCIM: https://docs.microsoft.com/en-us/azure/active-directory/app-...

Azure AD uses this, and Okta, OneLogin, Github and some others too I suppose.

If anyone has tried SCIM it'd be interesting to hear what you think about it? (I've just read about it)

ucarion · on Dec 14, 2020

Depending on who your customers are, you might get away with only supporting OIDC. But not supporting SAML is going to be a problem as you move into the big enterprises.

Many big companies run on SAML, and expect to auth with vendors over SAML. That's why russell_h's comment is probably futile; it's the enterprises with the big SaaS budgets that keep SAML relevant, and they don't care if HN doesn't like it.

Maybe in about a decade SAML will be less important to enterprises? SAML 2.0 is only about 15 years old.

KajMagnus · on Dec 15, 2020

What about SCIM, "System for Cross-domain Identity Management", instead of SAML, if creating software for enterprises?

SCIM: https://docs.microsoft.com/en-us/azure/active-directory/app-...

ucarion · on Dec 15, 2020

SCIM is a pleasure to implement compared to SAML, no doubt. You might be able to get away with only supporting SCIM, the main thing you'd be missing is "just-in-time" user provisioning.

But given that you'll probably need SCIM at some point anyway, probably a good idea to start with SCIM, and then add SAML only when you need to! It'll also inform what subset of SAML you actually need to implement.

KajMagnus · on Dec 16, 2020

> good idea to start with SCIM, and then add SAML only when you need to

Sounds like a good approach yes. (It seems you've added SCIM to some software? About how long did it take? Was there any "gotchas")

> the main thing you'd be missing is "just-in-time" user provisioning

Hmm could that depend on the organization using the software I'm developing? — Possibly they'll synchronize user accounts and groups, upon installation of the software, and whenever anything changes — and then all user accounts will be ready already, when someone wants to log in.

But if they syncronize only, say, once a day, then, with SAML, one could still log in, and the account would get created and added to the correct groups, also if the sync that would create one's account, hadn't happened yet? (OIDC could help a bit, but it doesn't understand user groups and permissions, only SAML and SCIM does, right.)

toupeira · on Dec 14, 2020

OpenID Connect seems to be pretty well established: https://openid.net/certification/

It defines an authentication protocol on top of OAuth2, and is a different beast from the older OpenID standards.

GoblinSlayer · on Dec 15, 2020

CAS looks like the simplest protocol to me https://en.wikipedia.org/wiki/Central_Authentication_Service

pabs3 · on Dec 15, 2020

If the user interface for TLS client certificates were not so consistently terrible across all browsers on all platforms, I would suggest them.

Mikhail_Edoshin · on Dec 15, 2020

It looks intimidating, but it seems to me that most of what happens are standard XML operations. E.g. 1: read XML, 2: use XPath, 3: read element names, structure, and attributes, 4, some parts of: apply XPath or XSLT, if specified, 5: serialize XML (which implies whitespace treatment). Given that the authors of the standard considered the standard XML operations to be readily available, the cryptographic additions do not seem to be that complex.

ARolek · on Dec 14, 2020

I just wrapped up a Go based SAML integration and I was initially using your package. I encounter a bunch of issues with namespaces, and eventually ended up using xmlsec1 directly against a formed up XML template. I agree with you, SAML should go away. This was my first experience with it, and the nuanced complexities were so exhausting. I’m glad the integration is done...

nine_k · on Dec 15, 2020

I wonder how hard would it be to adopt some SAML 2.0 (or maybe just 1.x) with this, and maybe a few other problematic bits updated, but otherwise unchanged? Do you think the rest is worth keeping?

E.g. we did not stop using TLS when TLS 1.0 proved to have problems; we updated the cryptography and kept using the logic.

ByteJockey · on Dec 15, 2020

But the problem here isn't he encryption. Well, for all I know, the encryption could be completely broken, I'm not a crypto-expert.

But the problem described in the post wasn't the encryption. It was the logic. Specifically the order that things are done in. Parsing something before verifying it can be dangerous.

nine_k · on Dec 15, 2020

Indeed! Let's scratch the XMLDSIG entirely and replace it with a sane scheme.

Does SAML have enough salvageable parts to try fixing that, instead of going with something completely different? SAML is so pervasive that migrating off it can't be cheap or easy.

Lorkki · on Dec 15, 2020

> But for reasons which are irrelevant to modern implementations, XMLDSIG prefers to stuff the signature metadata back inside the XML document that was just signed.

Out of curiosity, what are those reasons?

tannhaeuser · on Dec 14, 2020

XML namespaces were controversial when introduced, and their implementation as privileged "xmlns:..." attributes with complex scoping, layering, and defaulting rules have been criticized many times; see [1] for a reflection from 2010 by an insider admitting to the fact that "every step on the process that led to the current situation with XML Namespaces seems reasonable".

When in 1996-98 W3C/The SGML Extended Review Board subset XML from SGML to define a generic markup convention for use with the expected wealth of upcoming vocabularies on the web, the issue of name collisions between elements (and attributes) from different vocabularies was deemed significant. Of course, in hindsight, with only SVG and MathML (and rarely HTML 5 in XHTML serialization) left on the web and having been incorporated as foreign elements directly into HTML, this seems overkill (even though there are actually collisions between eg. the title element in SVG vs HTML).

There's an alternative (and saner IMHO) approach for dealing with XML namespaces in ISO/IEC 19757-9 [2] by just presenting a canonical (ie. always the same) namespace prefix as part of an element name by a parser API to an app, guided by processing instructions for binding canonical namespace prefixes to namespace URLs, which might also help enterprise-y XML with lots of XML Schema use. Of course, this doesn't help with roundtripping xmlns-bindings (eg. with their exact ordering, possible redundancy, temporary/insignificant namespace prefixes, re-binding in document fragments etc.) through DOM representations, which seems the problem here.

[1]: https://blog.jclark.com/2010/01/xml-namespaces.html

[2]: https://www.iso.org/obp/ui/#iso:std:iso-iec:19757:-9:ed-1:v1...

warp · on Dec 14, 2020

> Despite significant efforts by the Go security team, it has not been possible to patch the vulnerabilities discussed in this blog post.

Well, that is not something you want to see in a public disclosure.

tptacek · on Dec 14, 2020

It's a little-loved library in the standard library that wasn't designed to support cryptographic security, but was by many projects repurposed as such to support SAML. It was a mistake for any SAML project to depend on encoding/xml.

ivanbakel · on Dec 14, 2020

> that wasn't designed to support cryptographic security

An XML library doesn't have to support cryptographic security - it just has to perform XML en/decoding effectively. How can it be a mistake for a project to rely on part of the standard library?

tptacek · on Dec 14, 2020

You write this as if XMLDSIG was straightforward, but people who have worked with XMLDSIG before know that it is not, and people who haven't worked with XMLDSIG should know that they need to research new cryptosystems before slapping them together out of spare parts.

Just to clear it up and state it plainly: it is never reasonable to assume that a given XML library is suitable for building XMLDSIG on top of or alongside.

eznzt · on Dec 14, 2020

I don't know what I'm talking about, but if the XMLDSIG support of encoding/xml does not work properly and no one can fix it, can't they just drop it?

tptacek · on Dec 14, 2020

Drop what? encoding/xml was never a reasonable building block for SAML and XMLDSIG, that was pretty immediately apparent from the library itself, and the Go project never told people they should using encoding/xml this way. They never picked it up in the first place, is what I'm saying.

xyzzy_plugh · on Dec 14, 2020

I don't doubt it was a mistake, but I'm curious what the alternative is here? I'm looking to add SAML to my to project in the medium-term.

tptacek · on Dec 14, 2020

SAML is the only mainstream user of XMLDSIG and 99%+ of the installed base of XMLDSIG. SAML libraries should include purpose-built, locked-down, SAML-only XMLDSIGs, and those XMLDSIGs should include purpose-built, stripped-down XMLs.

The XML isn't even the hard problem here! XMLDSIG and XML Canonicalization are much more complicated than the baseline XML parser.

layer8 · on Dec 14, 2020

> SAML is the only mainstream user of XMLDSIG

That’s not quite accurate. XMLDSIG is widely used in SOAP, and also in the European XAdES signature standard (which is an extension of XMLDSIG).

vbezhenar · on Dec 14, 2020

Yep, in Kazakhstan almost every government web service uses XML signatures for interoperating. Of course nobody sees it outside of those systems, but it's everywhere inside. I have no idea about other countries, but I would not be surprised to find out that there are many other similar countries or organizations where that stuff works inside. You won't know about it until you touch it.

cesarb · on Dec 14, 2020

Here in Brazil too, for instance the NF-e (electronic "nota fiscal", the English term seems to be "receipt", the documentation is at http://www.nfe.fazenda.gov.br) and the recently released digital payment network (PIX, see https://www.bcb.gov.br/content/estabilidadefinanceira/cedsfn... for the current release of the relevant manual) both use XMLDSig.

neilv · on Dec 14, 2020

Yep, I've implemented SAML multiple times with a purpose-built processing.

And I implemented a non-SAML use of XML-DSIG standards, and discovered, when attempting to interoperate, that a major platform vendor's implementation of wasn't compliant, such that hashes would only be correct using that vendor's implementation (which I initially assumed was a mistake of mine, until an expert confirmed the major vendor was actually wrong).

KajMagnus · on Dec 15, 2020

> I'm looking to add SAML to my to project in the medium-term

Did you hear about SCIM, "System for Cross-domain Identity Management"? If combining with OIDC, then, seems to me one gets a more modern alternative to SAML. I've read just a bit about SCIM though.

SCIM: https://docs.microsoft.com/en-us/azure/active-directory/app-...

I wrote more in this comment: https://news.ycombinator.com/item?id=25425665

(What's your project about?)

tatersolid · on Dec 20, 2020

If you want enterprise customers, you still need to support SAML in 2020.

Thaxll · on Dec 14, 2020

Because the standard lib was not designed to handle that use case and so they can't change it right now to not break compatibility, which is why they're going to add new API in Go 1.16 that will release in February.

Groxx · on Dec 14, 2020

Which they also point out may be insufficient:

>By Mattermost’s estimates this new API will not be a reasonable solution for most use cases currently affected by the vulnerabilities. Parsing and resolving namespaces is an essential requirement for correctly implementing SAML, and even considering only a limited set of real-world SAML messages without strict namespacing requirements would be unlikely to allow for a secure implementation.

tptacek · on Dec 14, 2020

Yes: people shouldn't be using encoding/xml to implement SAML, at all. The library was already functionally problematic for SAML, because it doesn't fully implement namespaces. Nor does it implement `xml-exc-c14n`. For the IdP I wrote last year, I just wrote my own XML; it's not that big a deal.

Software security people have understood for a long time that XMLDSIG is sketchy, and that implementations often need to be "bug-compatible" to interoperate safely. SAML is an XMLDSIG protocol. I feel bad for putting it this way, but I think that reasonably skilled security engineers should be alarmed if their platform's standard XML library easily allows you to implement something that claims to be DSIG.

jka · on Dec 14, 2020

Meta-question: are software standards generally becoming more security-friendly over time?

tptacek · on Dec 14, 2020

No, but software itself is, because of designs like WireGuard (which aren't formal standards) and software like libsodium and signify.

jarym · on Dec 14, 2020

Glad this got found. I remember when XML was being widely adopted that there'd be frequent vulnerabilities found in Java-based parsers.

A large part of this stems from how complicated XML can get - if it were only elements and attributes it might have been fine. Namespaces made it a bit more complicated. Processing Instructions made it hideous.

userbinator · on Dec 15, 2020

XML and many other things related to it (including Java, SOAP, CORBA, etc.) are an example of what could be called "Enterprise mindset" taken to an extreme. Insanely high levels of abstraction and indirection, absurd amounts of needless flexibility and generality, and essentially zero thought given to efficiency or simplicity. It's as if the people responsible for these spent all their time thinking "what's the most complicated way to do something?"

ben509 · on Dec 14, 2020

It's really nuts trying to implement something like SAML in XML. The standard is a security minefield.

tptacek · on Dec 14, 2020

What makes it worse is that XMLDSIG is exponentially more complicated. Most of the ecosystem literally shells out to libxmlsec1 and assumes it does the right thing. DSIG is a batshit standard that attempts to support arbitrary combinations of signed and unsigned parts in a single document, tied together with a DOM-like scheme, passed through a canonicalizing transformation that has itself broken SAML before. It's a fractal of bad security design.

bawolff · on Dec 14, 2020

Not to mention that libxmlsec1 has some insane insecure defaults that are effectively undocumented.

(I'd go into more details, but i literally just sent a security report yesterday to a saml library for using it wrong, so i guess i shouldn't post publicly about it until they fix)

swiley · on Dec 14, 2020

You probably shouldn't have posted this either.

llimllib · on Dec 14, 2020

Also it's my experience that nobody follows the standard - producing valid SAML isn't enough, you need to produce the exact SAML your consumer expects or receivers will reject it. (The context here was passing users off from healthcare.gov to issuers)

tptacek · on Dec 14, 2020

What makes it worse is that there are practical reasons to implement that way; I've done so for clients, because of bugs found in other SAML parsers that we couldn't leave people susceptible to. One of the material things you can do to lock down a SAML implementation is to accept only the pattern of XML tokens you expect from mainstream IdPs, and then wait for people to complain.

llimllib · on Dec 14, 2020

I'm so happy I no longer need to work with it. I wrote a manifesto on how it (doesn't) work for the person that replaced me on that project, and it was long, detailed, and angry

jiggawatts · on Dec 14, 2020

I've been cavorting around that minefield recently. I still have some of my legs and a tiny bit of my sanity.

The most recent "fun" I had was that on a Citrix NetScaler, if you enable a certain n-Factor workflow, it sends a SAML request to the IdP that Microsoft products only reject as "invalid XML".

From what I can gather the XML being sent is perfectly valid. The issue must be something hideously subtle, like the white space or UTF-8 encoding being subtly different that is upsetting the Microsoft SAML implementations, but not any others.

Have a look at some SAML XML examples online: https://www.samltool.com/generic_sso_res.php

They're hideous not because they're XML, but because they're bad XML! The SAML standard defines its own "namespace attributes" separately but on top of the XML namespaces!

Similarly, instead of the straightforward way to encode the data:

    <tag prop="attr">value</tag>

They abstract one level up unnecessarily:

    <element name="tag">
        <attribute name="prop">attr</attribute>
        <content>value</content>
    </element>

This is the same mistake people make in database schema design, where they'll have a table with columns called "Key", "ColumnName", and "ColumnValue".

userbinator · on Dec 15, 2020

The issue must be something hideously subtle, like the white space or UTF-8 encoding being subtly different that is upsetting the Microsoft SAML implementations, but not any others.

That is almost certainly the case, as another comment here indirectly references: https://news.ycombinator.com/item?id=25422734

ikiris · on Dec 14, 2020

oh wow that's disgusting, why would someone design something like this.

jiggawatts · on Dec 14, 2020

It's the most egregious example of design-by-committee that I have ever seen.

Everything about SAML is about 10x more complex than it technically needs to be.

On top of that, it has so many optional features that interoperability problems are likely even between 100% standards compliant implementations.

chrsig · on Dec 14, 2020

It's worth noting that the go1 compatibility promise[1] allows for breaking compatibility in the name of security issues:

>Security. A security issue in the specification or implementation may come to light whose resolution requires breaking compatibility. We reserve the right to address such security issues.

If the go team decides that this issue is worth a breaking change is another question entirely.

[1]https://golang.org/doc/go1compat

jerf · on Dec 14, 2020

Anyone have examples of XML that can be mutated? My guess is that it wouldn't take much.

I expect that a similar problem will be found in many other libraries, if the XML was publicized. XML namespaces made a critical... "mistake" is probably too strong, but "design choice that deviated too far from people's mental model" is about right... that has prevented them from being anywhere near as useful or safe as they could be. In an XML document using XML namespaces, "ns1:tagname" may not equal "ns1:tagname", and "ns1:tagname" can be equal to "ns2:tagname". This breaks people's mental models of how XML works, and correspondingly, breaks people's code that manipulates XML.

(I actually used the Go XML library as an SVG validator in the ~1.8 timeframe and had to fork it to fix namespaces well enough to serve in that role. I didn't know about how to exploit it in a specific XML protocol but I've know about the issues for a while. "Why didn't you upstream it then?" Well, as this security bulletin implies, the data structures in encoding/xml are fundamentally wrong for namespaced XML to be round-tripped and there is no backwards-compatible solution to the problem, so it was obvious to me without even trying that it would be rejected. This has also been discussed on a number of tickets subsequently over the years, so that XML namespace handling is weak in the standard library is not news to the Go developers. Note also that it's "round-tripping" that is the problem; if you parse & consume you can write correct code, it's the sending it back out that can be problematic.)

Namespaces fundamentally rewrite the nature of XML tag and attribute names. No longer are they just strings; now they are tuples of the form (namespace URL, tag name)... and namespace URL is NOT the prefix that shows up before the colon! The prefix is an abbreviation of an earlier tag declaration. So in the XML

    <tag xmlns="https://sample.com/1" xmlns:example1="https://blah.org/1">
      <example1:tag xmlns:example2="https://blah.org/2">
        <example2:tag xmlns:example1="https://anewsite.com/xmlns">
          <example1:tag />
        </example2:tag>
      </example1:tag>
    </tag>

not a SINGLE ONE of those "tag"s is the same! They are, respectively, actually (https://sample.com/1, tag), (https://blah.org/1, tag), (https://blah.org/2, tag), and (https://anewsite.com/xmlns, tag). There's a ton of code, and indeed, even quite a few standards, that will get that wrong. (Note the redefinition of 'example1' in there; that is perfectly legal.) Even more excitingly,

    <tag xmlns="https://sample.com/1" xmlns:example1="https://sample.com/1">
      <example1:tag/>
      <example2:tag xmlns:example2="https://sample.com/1" />
    </tag>

ARE all the exact tag and should be treated as such, despite the different "tag names" appearing.

Reserializing these can be exciting, because A: Your XML library, in principle, ought to be presenting you the (XMLNS, tagname) tuple with the abbreviation stripped away, to discourage you from paying too much attention to the abbreviation but B: humans in general and a lot of code expect the namespace abbreviations to stay the same in a round trip, and may even standardize on what the abbreviations should be. There's a LOT of code out there in the world looking for "'p' or 'xhtml:p'" as the tag name and not ("http://www.w3.org/1999/xhtml", "p").

In general, to maintain roundtrip equality, you have to either A: maintain a table of the abbreviations you see, when they were introduced, and also which was used or B: just use the (XMLNS, tagname) and ensure that while outputing that the relevant namespaces have always been declared. Generally for me I go for option B as it's generally easier to get correct and I pair it with a table of the most common namespaces for what I'm working in, so that, for example, XHTML gets a hard-coded "xhtml:" prefix. It is very easy if you try to implement A to screw it up in a way that can corrupt the namespaces on some input.

(Option B has its own pathologies. Consider:

    <tag xmlns:sample="https://example.com/1">
      <sample:tag1 />
      <sample:tag2 />
    </tag>

It's really easy to write code that will drop the xmlns specification on all of the children of "tag", since it didn't use it there, and if your code throws away where the XMLNS was declared and just looks to whether the NS is currently declared, it'll see a new declaration of the "sample" namespace on every usage. Technically correct if the downstream code handles namespaces correctly (big if!), but visually unappealing.)

Not defending Go here, except inasmuch as it's such a common error to make that I have a hard time naming libraries and standards that get namespaces completely correct, for as simple as they are in principle. (I think SVG and XHTML have it right. XMPP is very, very close, but still has a few places where the "stream" tag is placed in different namespaces and you're just supposed to know to handle it the same in all the namespaces it appears it... which most people do only because it doesn't occur to them that technically these are separate tags, so it all kinda works out in the end.... libxml2 is correct but I've seen a lot of things that build on top of it and they almost all screw up namespaces.)

masklinn · on Dec 14, 2020

> I expect that a similar problem will be found in many other libraries, if the XML was publicized. XML namespaces made a critical... "mistake" is probably too strong, but "design choice that deviated too far from people's mental model" is about right... that has prevented them from being anywhere near as useful or safe as they could be. In an XML document using XML namespaces, "ns1:tagname" may not equal "ns1:tagname", and "ns1:tagname" can be equal to "ns2:tagname". This breaks people's mental models of how XML works, and correspondingly, breaks people's code that manipulates XML.

That right there is why I like Clark's notation (despite its unholy verbosity), which I learned of because that's how ElementTree manipulates namespaces: in Clark's notation, the document is conceptually

    <{https://sample.com/1}tag>
      <{https://blah.org/1}tag>
        <{https://blah.org/2}tag>
          <{https://anewsite.com/xmlns}tag />
        </{https://blah.org/2}tag>
      </{https://blah.org/1}tag>
    </{https://sample.com/1}tag>

Which is unambiguous. But as you note adds challenges for round-trip equality (in fact ElementTree doesn't maintain that, it simply discards the namespace prefixes on parsing, which I have seen outright break supposedly XML parsers which were really hard-coded for specific namespace prefixes).

lxml does round-trip prefixes (though it still doesn't round-trip documents) by including a namespace map on each element.

dasyatidprime · on Dec 14, 2020

Not round-tripping prefixes breaks other things, and this is around where I start more thoroughly agreeing that XML namespaces tried to do too much at once: mainly, there are XML-based definitions, I think including XPath and XSLT's use of it (or old XSLT's use of what might not have been XPath yet?), that use the ambient namespace prefix set as context for interpreting attribute values that contain XML names. If you might ever interchange some prefixes, you might even get similar “looks valid but means the wrong thing” problems.

And if you try to combine namespaces with DTDs (which is just an explosive mix to start with, and I think is just recommended to never do) you get other problems, because you're no longer allowed to add arbitrary namespace declarations in the middle, so anything that round-trips prefixes but might ever add redundant declarations of them won't reliably produce something that DTD-validates, and if you're transforming into a DTD from something that might have used other namespaces, you have to make sure to remove all the extra declarations, and…

Note that most of this is still “well-defined”, it's just awkwardly hairy. This is not to be taken as an excuse to implement the standard badly or incorrectly if you're going to handle it at all.

dasyatidprime · on Dec 14, 2020

In other words, the minds of a large number of human specifiers and implementors have their own roundtrip corruption flaws where when you pass the XML namespaces specification through them, you get something out that's incoherent and doesn't interoperate properly, creating representation mismatch problems down the line back in the digital world.

dasyatidprime · on Dec 14, 2020

To add to this with some (potentially out-of-date!) personal experience: sometime around 2008, now somewhere in my dusty directories, I started implementing my own Ruby XML DOM-like library (based on the Parsifal XML lexer in C) mainly to handle this properly, because the most “normal” REXML did something really horrible in its API for namespaced attributes that made them almost unusable by clients of the library. (I forget why I couldn't use the Ruby bindings to libxml2 at the time; I think maybe they didn't support several things I wanted to do, and I couldn't add them without patching and vendoring libxml2, and that would be its own disaster because of shared libraries, etc.) Specifically, attribute accesses didn't always work consistently with namespaces, the callers had to handle prefix management themselves in places, and looping over the attributes of an element required you to handle both “single” attributes and sub-collections of namespaced attributes if there were any qualified and unqualified attributes with the same short name, because the second was what the outer collection was indexed by…

(I originally wrote this comment with a more fully worked-out example, but after viewing it in context I realized it was way too long to be an only-partly-on-topic comment on this thread, so I'll probably move it to a post elsewhere and submit it later.)

lyxsus · on Dec 14, 2020

> This breaks people's mental models of how XML works, and correspondingly, breaks people's code that manipulates XML.

Because they usually have incorrect mental model. Blaming namespaces for name ambiguity would be the same as blaming the code "x = a + b" because "a" and "b" could be defined differently.

Namespace prefixes are absolutely irrelevant, they only exists for your convenience.

tptacek · on Dec 14, 2020

That doesn't seem accurate at all. It would be the case if there was some deterministic abbreviation from URL namespace qualifiers down to namespace prefixes, but there is not; instead, they are template variables, which can be shuffled throughout an XML document, requiring security software to constantly and reliably keep track of the value of the variable at multiple points. People sign URLs and JSON documents all the time with schemes that don't have this goofy property.

There's a similar problem with XML entity references, which have been happily breaking enterprise security for over a decade, because nobody has a good mental model of how entities in XML documents actually behave.

It seems fair at this point to blame the standard.

jerf · on Dec 15, 2020

In hindsight, it probably would have been better to define standard prefixes, let people just sort of register their own for non-standard ones in whatever manner is suitable for their particular top-level document type, and if somebody, somewhere out there did finally manage to stomp on each other, let that particular type of document where that happened deal with it.

While technically suboptimal compared to what currently exists, it would match people's expectations better, and in practice, I can't speak for everybody, but I just don't see a whole lot of documents with hundreds+ namespaces such that collisions are a realistic possibility. And when I do see documents with a lot of namespaces (XMPP, for instance, or XHTML+SVG+some other thing), there's still a top-level type that could keep its own registry just fine. A bit of guidance on naming extensions probably ("don't call it e:, work your name in somehow like with the initial of your company or something") would have 99.9% solved the problem.

Prior to seeing what happened I'd probably still have argued for the current namespaces spec. In principle it doesn't seem that complicated to me. But I'm obviously wrong in practice, because, like I said, I can hardly cite an example of them being used correctly at all.

(Likewise, in hindsight, entities shouldn't have been able to be recursive, and if we were spec'ing out the next generation of XML I'd straght-up remove them except for the ones necessary to XML itself, <, >, and & because UTF covers the major use case of entities now. I'd discard the "terrible, terrible templating language" use case entirely.)

pvg · on Dec 15, 2020

In principle it doesn't seem that complicated to me. But I'm obviously wrong in practice, because, like I said, I can hardly cite an example of them being used correctly at all.

A snarky-but-mostly-true oversimplification: the complexity was necessary because XML was supposed to become a machine-readable interchange format for everything but it ended up not becoming that due to the complexity.

progval · on Dec 14, 2020

> instead, they are template variables, which can be shuffled throughout an XML document, requiring security software to constantly and reliably keep track of the value of the variable at multiple points

Isn't the issue here that they are mixing this templating with the business logic? They should be fine if the XML parser (or some post-processing) expanded the namespaces and business logic didn't see them at all.

> People sign URLs and JSON documents all the time with schemes that don't have this goofy property.

Similarly, that might be a design issue. They should only sign documents they 100% built and serialized themselves, so the set of tags and namespaces.

lyxsus · on Dec 14, 2020

> That doesn't seem accurate at all. It would be the case if there was some deterministic abbreviation from URL namespace qualifiers down to namespace prefixes, but there is not;

I'm not sure what you mean by that, tbh. It seems to me that namespace expansion is absolutely straightforward and deterministic. There're scopes, yes, but they're too well-defined (if that's what you mean).

tptacek · on Dec 14, 2020

Yes, you are describing the same feature I am with slightly different words. It obviously causes problems. You could describe XML entity expansion in simple terms too, and it would remain one of the major causes of game-over vulnerabilities in enterprise software over the last decade.

lyxsus · on Dec 14, 2020

Well, yeah, true.

I believe it's mostly implementation and popularisation problems.

The w3c specs surrounding xml/xpath/xslt/rdf and etc are very well designed but it's possible to appreciate them only after you spend ridiculously unreasonable amount of time reading and putting them all together. Otherwise it looks like a stupid pile of complexity with no purpose.

And what upsets me the most is the lack of really good libraries, everything I worked with just sucks so much.

I still have a hope that maybe in 5-15 years things will change.

layoutIfNeeded · on Dec 14, 2020

>Namespace prefixes are absolutely irrelevant, they only exists for your convenience.

This is false. As soon as you need XML canonicalization you very much need those prefices exactly as they were present in the original document.

lyxsus · on Dec 14, 2020

It doesn't affect data model encoded in document even a tiny bit. Namespace prefixes are irrelevant. If changing these prefixes breaks the program, the program is incorrect.

layoutIfNeeded · on Dec 14, 2020

Again, this is false.

“The C14N-20000119 Canonical XML draft described a method for rewriting namespace prefixes such that two documents having logically equivalent namespace declarations would also have identical namespace prefixes. The goal was to eliminate dependence on the particular namespace prefixes in a document when testing for logical equivalence. However, there now exist a number of contexts in which namespace prefixes can impart information value in an XML document. For example, an XPath expression in an attribute value or element content can reference a namespace prefix. Thus, rewriting the namespace prefixes would damage such a document by changing its meaning (and it cannot be logically equivalent if its meaning has changed).”

https://www.w3.org/TR/xml-c14n/#NoNSPrefixRewriting

lyxsus · on Dec 14, 2020

> However, there now exist a number of contexts in which namespace prefixes can impart information value in an XML document.

Well, yeah. They've given up to a mass amount of half-ass implementations? So what? I think it's our moral duty to ignore it :)

oever · on Dec 14, 2020

DTD does not know about namespaces and checks against "prefix:local-name".

E.g. the xhtml dtd will not accept this:

  <h:html xmlns:h="http://www.w3.org/1999/xhtml"/>

If you want to change prefixes, use XML Schema or Relax NG.

Ygg2 · on Dec 15, 2020

DTD is the devil spawn. Devil here being massive security vulnerabilities.

lyxsus · on Dec 14, 2020

I would say use XML Schema at least. DTD looks alien to XML anyway.

jsmith45 · on Dec 15, 2020

It is not in general legal to change prefixes and reserialize an XML document. Some official XML formats including XML Schema allow attribute values to reference prefixes in xs:QName types. One needs to bind the schema to detect that.

But it gets worth with XSLT using the prefixes in XPath expressions in attributes. If the prefixes are changed those values also need to be updated to change the prefix too, which requires complete knowledge of the format. This is because one cannot programmatically detect something like attributes that use custom data types that reference the prefixes in scope, but XSLT's xpath expressions show that W3C considers it legal to create such custom formats.

GoblinSlayer · on Dec 15, 2020

You can see test cases: https://github.com/mattermost/xml-roundtrip-validator/blob/m...

Though they mention something called xml directive. I don't think such a thing exists.

politician · on Dec 14, 2020

Do you think there is a path forward for the Go team to release an XML library without namespace support that simply errors when they are encountered ("XML namespaces are considered harmful")?

jerf · on Dec 14, 2020

They release something not called "encoding/xml". They could do what they did to the syscall package. The syscall package, by its nature, can't conform to the 1.0 compatibility promise Go itself maintains, because it changes outside of the scope of the Go project. So they froze the syscall package at some point, and then offered one in the golang.org/x/ namespace at https://pkg.go.dev/golang.org/x/sys .

I would again emphasize that encoding/xml, to my knowledge, only has problems with this particular roundtripping use case. It can consume non-namespaced XML correctly, and handle namespaced XML as long as you don't plan on re-emitting XML.

What would probably end up happening is a new package appearing on github.com for this use case, forked off of encoding/xml, for this use case. (If you're looking for a project that might attain some use, this is a likely candidate.) Unlike something like Python where the core packages are often C-based and thus you can expect better performance from the built-in "set" than somebody's pure-Python "set" implementation from before the built-in, encoding/xml is just a pile of pure Go code whose only advantage is that it ships with the compiler. Anyone can replace it without incurring any other disadvantage whenever they like.

(I looked a few versions ago, FWIW; encoding/xml has deviated so much from what I forked that my fork is essentially dead and no longer releasable without basically starting over from scratch. Plus I built it with the idea that it should be a minimal modification (so I could port it forward, which turned out to not work, but it's still how it was built)... if I was truly forking I'd have done some more extensive changes to it to support namespaces in general, rather than for my particular case.)

Anyhow, upshot, the Go project as a whole is not stuck... it is specifically encoding/xml as the standard, built-in library that is stuck. It's not like Go is completely incapable of handling XML correctly from first principles for some reason or anything.

tptacek · on Dec 14, 2020

There is no good reason for the standard library to include a SAML-safe XML, which is its own huge project, and which is useful only for that one standard. SAML implementations should include their own purpose-built, defensively-written XMLs.

sleepydog · on Dec 14, 2020

XML namespaces are ubiquitous. The utility of such a library would be very questionable.

While they do have the problems described, XML namespaces are what allow for abstraction and composition of documents from disparate systems.

tptacek · on Dec 14, 2020

You can do that, but SAML is heavily namespaced.

dolmen · on Dec 15, 2020

No need to be something published by the Go team.

GauntletWizard · on Dec 15, 2020

I'd like to ask everyone here who's familiar with SAML to take a look at SPIFFE[1], which underlies Istio.

I'm biased in this regard, but I view SPIFFE's inclusion of JWT Tokens as an authentication method as fundamentally flawed - By allowing bearer tokens, you are no longer verifying identity, but passing identity around. JWT has also been susceptible in the past[2] to the same kinds of attacks here - Poorly defined verification semantics.

I suspect that buried in the semantics around SPIFFE's SPIRE Server and Agent are a number of vulnerabilities or other ways that trust doesn't mean quite what you think it means. I'd love for someone with interest to take a look. Besides the obvious downsides fundamental to Isitio's MITM Proxy architecture, I think there's more lurking on that edge.

[1] https://spiffe.io/ [2] https://auth0.com/blog/critical-vulnerabilities-in-json-web-...

nimish · on Dec 14, 2020

`encoding/xml` has had broken handling of namespaces for a long time. It’s possible to hack it on but the only reasonable choice is to use a libxml2 binding which also gets you canonicalization, another can of worms.

Unsurprised it can cause security issues, especially in XML-DSig which is a nightmare to handle correctly.

blablabla123 · on Dec 14, 2020

Yup, I think it becomes very quickly obvious when using `encoding/xml` with XMLs that have multiple namespaces that the handling is incomplete. Hard to believe such an xml could even survive one roundtrip. It's also documented that the implementation is incomplete:

Mapping between XML elements and data structures is inherently flawed ... See package json for a textual representation more suitable to data structures.

random5634 · on Dec 14, 2020

XML

I'm amazed people can get it as right as they do half the time? I do think Go will get fixed eventually. It's just too weird if they couldn't fix the core issue? But I've never used XML if I can help it, so I'm absolutely no expert on what would make it impossible to fix something like this.

jerf · on Dec 14, 2020

They can fix the core issue... they can not do so while maintaining the 1.0 backwards compatibility promise. The data structures in encoding/xml in Go 1.0 are fundamentally incorrect for this use case.

random5634 · on Dec 14, 2020

Is you sense they will maintain the promise? That is commitment (and I wouldn't be surprised if true). Could they add a flag / toggle to the existing API you could toggle to change behavior or do they need an entire new API?

forrestthewoods · on Dec 14, 2020

Can we just consider XML itself an unsolvable vulnerability and call it day?

lyxsus · on Dec 14, 2020

What do you suggest to use instead? It's a bit too early to use RDF, people haven't catch up yet.

tptacek · on Dec 14, 2020

There is no good reason that the thing that 98% of people use SAML to do should involve XML.