Support for more CPU types. LLVM is limited to the mainstream architectures.

comex · on Dec 8, 2019

Eh, it’s not as one-sided as that. GCC has a larger number of targets, but LLVM supports several newer targets that GCC doesn’t, like WebAssembly and eBPF (although the latter is coming in GCC 10). But it would certainly be nice for Rust to support both sets of targets.

cbmuser · on Dec 8, 2019

Compare the number of architectures supported by GCC:

> https://en.wikipedia.org/wiki/GNU_Compiler_Collection#Archit...

And the number of architectures supported by LLVM:

> https://en.wikipedia.org/wiki/LLVM#Back_ends

GCC supports vastly more targets.

est31 · on Dec 8, 2019

That's the current state, but LLVM is adding new targets (like the newly added AVR target), while the AVR target in GCC is under threat of removal: https://www.bountysource.com/issues/84630749-avr-convert-the...

vunie · on Dec 8, 2019

GCC still supports more architectures even if AVR support is removed.

comex · on Dec 9, 2019

What part of my comment does that contradict?

saagarjha · on Dec 8, 2019

Can the semantics of Rust handle some of the stranger architectures that GCC supports?

onei · on Dec 8, 2019

In theory, both GCC and LLVM take a front-end (in this case Rust) and compile it down to an intermediate representation (IR). There will likely be some differences between the output from a front-end, but after successive optimisations have been applied these will likely disappear. By the time you get to generating assembly, you can't really tell the difference anymore, so the semantics of the original language don't make an impact.

saagarjha · on Dec 8, 2019

I'm sure there are a number of "reasonable" assumptions that aren't true–probably things like the number of bits in a byte, or the size of a particular integral type, or support for a particular platform behavior.

sanxiyn · on Dec 8, 2019

https://gankra.github.io/blah/rust-layouts-and-abis/ lists assumptions. As you suspected, one assumption is 8-bit bytes.
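A minimal sketch of what that assumption looks like from inside Rust, using only the standard library. The helper name `layout_facts` is made up for illustration; it just reports the width of `u8` in bits, the size of `u8` in bytes, and the sizes of `usize` and a raw pointer on whatever target it is compiled for.

```rust
use std::mem::size_of;

/// Hypothetical helper (not from the linked article): returns
/// (bits in a u8, size of u8 in bytes, size of usize in bytes,
///  size of a thin raw pointer in bytes) for the current target.
fn layout_facts() -> (u32, usize, usize, usize) {
    (
        u8::BITS,               // Rust defines u8 as exactly 8 bits
        size_of::<u8>(),        // u8 occupies one byte, the unit size_of counts in
        size_of::<usize>(),     // usize is pointer-sized
        size_of::<*const u8>(), // thin raw pointer
    )
}
```

Calling `layout_facts()` from `main` and printing the tuple shows the values for your own target. The point is that `size_of` counts in 8-bit units, so a target whose smallest addressable unit is 16 bits has no obvious way to represent `u8` as "one byte".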

Yoric · on Dec 8, 2019

Wait, isn't this the definition of a byte?

sanxiyn · on Dec 8, 2019

It isn't in Texas Instruments TMS320. I will quote from http://processors.wiki.ti.com/index.php/C89_Support_in_TI_Co...

> The C standard uses the term byte to mean the minimum addressable unit in the implementation, which is char, which means a byte on these targets is 16 bits. This is in conflict with the widespread use of byte to mean 8 bits exactly. This is an unfortunate disagreement between C terminology and widespread industry terminology that TI can't do anything about.

hyperman1 · on Dec 8, 2019

Absolutely not. A byte is the smallest block of memory with an address. E.g., you can't take the address of 7 combined bits on x86, but you can for 8.

In the past, architectures differed wildly in the number of bits per byte, e.g. 36 for the machine where the Pascal language was created.

Today, the industry has mostly standardized on 8 bits per byte, but see e.g. the PIC architecture for an example relevant today with a different choice: 8-bit bytes for data, but 10-bit bytes for instructions.

https://en.m.wikipedia.org/wiki/PIC_microcontrollers

cwzwarich · on Dec 8, 2019

> A byte is the smallest block of memory with an address. E.g., you can't take the address of 7 combined bits on x86, but you can for 8.

I think that's an anachronistic/incorrect usage. A lot of machines (including several with 36-bit words that you mentioned) supported larger basic addressable units of memory, but didn't call these larger units "bytes", and distinguished between "bytes" and "words". In fact, one of the elements of the early RISC philosophy was that CPU support for byte accesses (as opposed to word accesses) was extraneous, based on statistics gathered from real programs. Early MIPS/Alpha/etc. machines did not support byte addressing, but the people using them still called 8 bits a byte.

justincormack · on Dec 8, 2019

Arguably the first Alphas could have had a C compiler with 64-bit bytes, but that would have made porting hard. Even then they were forced to add byte operations pretty early on.

detaro · on Dec 8, 2019

Byte is also often defined as the smallest addressable unit in a computer. Nowadays that is most commonly 8 bits, to the point where you can generally assume it, but this was different in the past (6 and 9 bits being especially common alternatives) and still is in some niches like DSPs, which sometimes can only work on wider types. But at least those are then typically powers of two, which makes it easier for many tools.

int_19h · on Dec 8, 2019

There are architectures on which sizeof(char)==sizeof(short)==sizeof(int) in C implementations, because it's the only way to produce efficient code.
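A rough model of how that falls out, written in Rust rather than C. It assumes a compiler that gives each type the smallest whole number of addressable units covering the C standard's minimum widths (char at least 8 bits, short and int at least 16). The names `size_in_units` and `sizes_for` are hypothetical, and real compilers are free to pick wider types than this.

```rust
/// Number of addressable units needed to hold `min_bits` bits when the
/// smallest addressable unit is `unit_bits` wide (rounded up).
fn size_in_units(min_bits: u32, unit_bits: u32) -> u32 {
    (min_bits + unit_bits - 1) / unit_bits
}

/// Minimum widths the C standard requires for char, short and int.
const C_MIN_BITS: [(&str, u32); 3] = [("char", 8), ("short", 16), ("int", 16)];

/// Modelled `sizeof` for each type on a target whose "byte" (in the C
/// sense) is `unit_bits` wide. This is an illustration, not a real ABI.
fn sizes_for(unit_bits: u32) -> Vec<(&'static str, u32)> {
    C_MIN_BITS
        .iter()
        .map(|&(name, bits)| (name, size_in_units(bits, unit_bits)))
        .collect()
}
```

In this model `sizes_for(8)` gives char 1, short 2, int 2, while `sizes_for(16)` gives 1 for all three, which is the sizeof(char)==sizeof(short)==sizeof(int) situation described above.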