The tokens of a language are the basic building blocks which can be put together to construct programs. A token can be a reserved word (such as int or while), an identifier (such as b or sum), a constant (such as 25 or "Alice in Wonderland"), a delimiter (such as { or ;) or an operator (such as + or =).
For example, consider the following portion of the program we met in this article:
main() {
   int a, b, sum;
   a = 14;
   b = 25;
   sum = a + b;
   printf("%d + %d = %d\n", a, b, sum);
}
Starting from the beginning, we can list the tokens in order:
main - identifier
( - left bracket, delimiter
) - right bracket, delimiter
{ - left brace, delimiter
int - reserved word
a - identifier
, - comma, delimiter
b - identifier
, - comma, delimiter
sum - identifier
; - semicolon, delimiter
a - identifier
= - equals sign, operator
14 - constant
; - semicolon, delimiter
and so on. Thus we can think of a program as a ‘stream of tokens’, which is precisely how the compiler views it. As far as the compiler is concerned, the above could just as well have been written:
main() { int a, b, sum;
a = 14; b = 25; sum = a + b;
printf("%d + %d = %d\n", a, b, sum); }
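The tokens, and the order in which they appear, are exactly the same; to the compiler, it is the same program. To the compiler, only the tokens and their order are important. Spacing matters only where it is needed to keep two adjacent tokens apart (int a is two tokens, but inta is a single identifier) and inside a string constant, where each space is part of the constant. However, layout and spacing are important to make the program more readable to human beings.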
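To make the ‘stream of tokens’ idea concrete, here is a minimal sketch of a token splitter, written for illustration only; it is not how a real C compiler is implemented. The names TokenKind, next_token and same_tokens are hypothetical. The sketch assumes every delimiter and operator is a single character (so == would be split in two), it does not tell reserved words apart from identifiers, and it ignores comments and character constants.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified token categories (illustrative names, not from the text). */
typedef enum { TOK_END, TOK_WORD, TOK_CONSTANT, TOK_SYMBOL } TokenKind;

/* Append c to buf if there is room, always leaving space for '\0'. */
static void put(char *buf, size_t cap, size_t *n, char c) {
    if (*n + 1 < cap) buf[(*n)++] = c;
}

/*
 * Copy the next token found at *src into buf and advance *src past it.
 * White space between tokens is skipped and never stored.
 * Tokens too long for buf are truncated.
 */
TokenKind next_token(const char **src, char *buf, size_t cap) {
    const char *p = *src;
    size_t n = 0;
    TokenKind kind;

    assert(cap > 1);
    while (isspace((unsigned char)*p)) p++;   /* layout is discarded here */

    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        kind = TOK_WORD;                      /* identifier or reserved word */
        while (isalnum((unsigned char)*p) || *p == '_') put(buf, cap, &n, *p++);
    } else if (isdigit((unsigned char)*p)) {
        kind = TOK_CONSTANT;                  /* integer constant such as 25 */
        while (isdigit((unsigned char)*p)) put(buf, cap, &n, *p++);
    } else if (*p == '"') {
        kind = TOK_CONSTANT;                  /* string constant, quotes kept */
        do {
            /* keep a backslash together with the character it escapes */
            if (*p == '\\' && p[1] != '\0') put(buf, cap, &n, *p++);
            put(buf, cap, &n, *p++);
        } while (*p != '\0' && *p != '"');
        if (*p == '"') put(buf, cap, &n, *p++);
    } else {
        kind = TOK_SYMBOL;                    /* one-character delimiter/operator */
        put(buf, cap, &n, *p++);
    }
    buf[n] = '\0';
    *src = p;
    return kind;
}

/* Return 1 if both texts yield the same tokens in the same order, else 0. */
int same_tokens(const char *a, const char *b) {
    char ta[128], tb[128];
    for (;;) {
        TokenKind ka = next_token(&a, ta, sizeof ta);
        TokenKind kb = next_token(&b, tb, sizeof tb);
        if (ka != kb || strcmp(ta, tb) != 0) return 0;
        if (ka == TOK_END) return 1;
    }
}
```

Under these assumptions, passing the two layouts of the program above to same_tokens returns 1, because the white space between tokens is thrown away before any comparison is made. By contrast, same_tokens("int a;", "inta;") returns 0: without the space, int and a merge into the single identifier inta.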