libfirm doesn't really classify as small to me; it is currently 132,638 lines of code. The cparser alone is 11k lines of code (5 times the size of my C parser).
That's true. I was rather thinking of it in comparison to LLVM and gcc, which are both space-distorting behemoths. (The catchphrase I've seen is that you can compile libfirm in less time than it takes to run the gcc configure script!)
The GCC extensions are just that - they're not part of the standard. You can claim 100% standards conformance without implementing a single one of them.
That said, in practice a large amount of C code out there is not pure standard C, so if you want to be able to compile something "interesting", e.g. a Linux kernel, you'll need to implement some of them.
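To make that concrete, here is a minimal sketch of the sort of non-standard C that shows up in real code. The names `MAX_ONCE`, `is_lower` and `max_once_demo` are made up for illustration and not taken from any project. The statement expression, `__typeof__` and the case range are GNU extensions that gcc and clang accept; a strictly ISO C compiler is free to reject all three.

```c
/* Illustrative only: all names here are hypothetical.
   Builds with gcc or clang; relies on GNU extensions. */

/* Statement expression ({ ... }) plus __typeof__: each argument is
   copied into a local first, so it is evaluated exactly once. */
#define MAX_ONCE(a, b) ({          \
    __typeof__(a) _a = (a);        \
    __typeof__(b) _b = (b);        \
    _a > _b ? _a : _b;             \
})

/* Case range ('a' ... 'z') in a switch: another GNU extension. */
int is_lower(int c)
{
    switch (c) {
    case 'a' ... 'z':
        return 1;
    default:
        return 0;
    }
}

/* Arguments with side effects are safe here because MAX_ONCE
   evaluates x++ and y++ only once each. */
int max_once_demo(int x, int y)
{
    return MAX_ONCE(x++, y++);
}
```

With a plain `((a) > (b) ? (a) : (b))` macro the larger argument would be evaluated twice, so code written against these extensions can't simply be ported to the standard form without care. That's why a compiler that wants to build such code ends up implementing them.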