> you need hooks from the symbol table back into the tokeniser/parser
Not necessary. You can add a bit of information to each identifier token: is this the name of a type or not?
You have to split up one of the parser passes: if you had a two-pass "tokenizer" + "LR parser" compiler, you now have a tokenizer pass followed by two LR parser passes. The first LR pass does "last-mile tokenization"; another name for it would be a (very short) tokenizer/parser cascade.
The intermediate pass adds a piece of metadata to each identifier token, marking it as the name of a type or not (this makes the grammar actually context-free instead of just nearly context-free). With that metadata in place, the second pass doesn't see this as two identifiers at all, because it knows which identifiers refer to types and which do not.
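Here's a minimal sketch of the idea in Python. Everything in it is made up for illustration: the token shapes, the tiny `typedef`-scanning heuristic, and the example input. A real compiler would drive the intermediate pass from LR tables and track scopes, but the essential move is the same: one pass records which names are types, then stamps that bit onto each identifier token before the second pass runs.

```python
from dataclasses import dataclass

@dataclass
class Token:
    kind: str        # "ident", "star", "semi", "kw_typedef", "kw_int"
    text: str
    is_type: bool = False  # metadata attached by the intermediate pass

def tokenize(src):
    """First pass: plain tokenization, no type knowledge at all."""
    kws = {"typedef": "kw_typedef", "int": "kw_int"}
    out = []
    for word in src.replace(";", " ; ").replace("*", " * ").split():
        if word == ";":
            out.append(Token("semi", word))
        elif word == "*":
            out.append(Token("star", word))
        elif word in kws:
            out.append(Token(kws[word], word))
        else:
            out.append(Token("ident", word))
    return out

def annotate_types(tokens):
    """Intermediate pass ("last-mile tokenization"): collect typedef'd
    names, then mark every identifier token that names a type."""
    type_names = set()
    for i, t in enumerate(tokens):
        if t.kind == "kw_typedef":
            # the identifier just before the semicolon is the new type name
            j = i
            while tokens[j].kind != "semi":
                j += 1
            type_names.add(tokens[j - 1].text)
    for t in tokens:
        if t.kind == "ident" and t.text in type_names:
            t.is_type = True
    return tokens

toks = annotate_types(tokenize("typedef int T ; T * x ;"))
print([(t.text, t.is_type) for t in toks if t.kind == "ident"])
# -> [('T', True), ('T', True), ('x', False)]
```

After this pass, the classic `T * x;` ambiguity disappears for the second parser pass: `T` arrives as a type-marked token, so the grammar can treat "type-identifier star identifier" as a declaration without any feedback hook from the symbol table into the tokenizer.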
It's a different application (computing scope data instead of type data), but I've implemented the basic concept of annotating token data with the result of an intermediate parsing stage; check out line 318 here:
https://github.com/dbpokorny/autoclave/blob/master/tree.js