A token is a sequence of characters that represents a unit, such as a number or an operator.
Tokenising is the first step of a compiler run.
Storing the input as raw chars/strings is not handy: the units would still need to be extracted.
C++ has predefined types, but none that suits a token, so we define our own with a class.
A token consists of a kind and a value. The value is only used for numbers.
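class Token {
public:
    char kind;      // what kind of token: an operator character, or '8' for a number
    double value;   // only used when the token is a number
};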
Token t, t2, t3;   // declare tokens
t.kind = '+';      // token 1 is a +
t2.kind = '8';     // '8' indicates a number
t2.value = 3.14;   // the number's value
t3 = t;            // copy assignment (t3 already exists, so this is not initialisation)
Now, (1.5+4)*11
can be shown as:
kind:  '(' | '8' | '+' | '8' | ')' | '*' | '8'
value:     | 1.5 |     | 4   |     |     | 11
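
The token sequence for (1.5+4)*11 could be built by hand roughly as follows. This is only a minimal sketch: it assumes the Token class from these notes, and the function name expression_tokens is hypothetical, introduced just for illustration.

```cpp
#include <vector>

// Token as defined in these notes: a kind plus a value (value only used for numbers).
class Token {
public:
    char kind;
    double value;
};

// Hypothetical helper (not from the notes): builds the tokens of (1.5+4)*11 by hand.
// Non-number tokens get value 0.0 as a placeholder; their value is never used.
std::vector<Token> expression_tokens() {
    std::vector<Token> toks;
    toks.push_back(Token{'(', 0.0});
    toks.push_back(Token{'8', 1.5});   // '8' marks a number
    toks.push_back(Token{'+', 0.0});
    toks.push_back(Token{'8', 4.0});
    toks.push_back(Token{')', 0.0});
    toks.push_back(Token{'*', 0.0});
    toks.push_back(Token{'8', 11.0});
    return toks;
}
```

Calling expression_tokens() yields seven tokens, three of which are numbers. In practice the tokens would come from the input rather than being written out like this.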
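#include <iostream>
#include <vector>
using namespace std;

Token get_token();      // reads the next token from cin
vector<Token> toks;     // all tokens read so far
int main() {
    while (cin) {       // loop while the input stream is still good
        Token t = get_token();
        toks.push_back(t);
    }
}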