Lexical Elements in C Programming

In C, a token is the smallest lexical element of the language. Tokens fall into five categories:
  • Keyword
  • Identifier
  • Constant
  • String-literal
  • Punctuator
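As a rough illustration of the five categories, the sketch below annotates each line of a small function with the tokens it contains. The function name `describe` and its parameters are made up for this example.

```c
#include <stdio.h>

/* Hypothetical example function: writes "total=<count*10>" into out and
   returns the number of characters written.
   Header line: keywords int, char; identifiers describe, out, size_t,
   size, count; punctuators ( * , ) */
int describe(char *out, size_t size, int count)
{
    /* keyword: int; identifiers: total, count; constant: 10;
       punctuators: = * ; */
    int total = count * 10;

    /* keyword: return; identifiers: snprintf, out, size, total;
       string-literal: "total=%d"; punctuators: ( , ) ; */
    return snprintf(out, size, "total=%d", total);
}
```

Calling `describe(buf, sizeof buf, 4)` with a sufficiently large `buf` stores `total=40` in it.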
In addition, there are preprocessing tokens, which are the minimal lexical elements of the language in translation phases 3 through 6. The categories of preprocessing tokens are:
  1. Header-name
  2. Identifier
  3. Preprocessing number (pp-number)
  4. Character-constant
  5. String-literal
  6. Punctuator
  7. Each non-white-space character that cannot be one of the above
Each preprocessing token that is converted to a token must have the lexical form of a keyword, an identifier, a constant, a string literal, or a punctuator.
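The requirement only applies to preprocessing tokens that actually get converted. A minimal sketch, assuming a helper macro `STR` (a name chosen for this example): `12.3.4` is a single preprocessing number with no valid constant form, yet it is accepted because the `#` operator turns it into a string literal before the conversion to tokens happens.

```c
#define STR(x) #x   /* hypothetical helper: stringizes its argument */

/* 12.3.4 is one pp-number but not a valid integer or floating constant.
   It is stringized during preprocessing, so it is never converted to a
   token and the lexical-form requirement never applies to it. */
const char *odd_number_spelling(void)
{
    return STR(12.3.4);
}
```

Writing `12.3.4` directly in an expression, without `STR`, would be rejected, since the pp-number would then have to become a constant.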

Tokens are the minimal lexical elements of the language in translation phases 7 and 8, that is, once preprocessing is complete.

Preprocessing tokens can be separated by white space, which consists of comments, white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both. White space may appear within a preprocessing token only as part of a header name or between the quotation characters in a character constant or string literal.
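The two rules about white space can be sketched as follows. The function names are invented for this illustration.

```c
/* The comment between the two + signs counts as white space, so they stay
   two separate punctuators and never merge into ++.
   The expression is a + (+b). */
int add_across_comment(int a, int b)
{
    return a +/* separator */+ b;
}

/* White space between the quotes is part of the string literal itself.
   "a  b" holds four characters: a, two spaces, b. */
unsigned long literal_length(void)
{
    return (unsigned long)(sizeof "a  b" - 1);
}
```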

If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token. There is one exception to this rule: header name preprocessing tokens are recognized only within #include preprocessing directives and in implementation-defined locations within #pragma directives. In such contexts, a sequence of characters that could be either a header name or a string literal is recognized as the former.
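A small sketch of the longest-sequence rule: in `a+++b` the first two `+` characters are taken together as `++`, because that is the longest preprocessing token available at that point. The function name and the `a_after` parameter are made up for this example.

```c
/* a+++b is tokenized as  a ++ + b  (post-increment, then addition),
   not as  a + ++ b. The updated value of a is reported through a_after
   so the effect of the increment is visible to the caller. */
int sum_then_increment(int a, int b, int *a_after)
{
    int sum = a+++b;
    *a_after = a;
    return sum;
}
```

With `a = 1` and `b = 2`, the sum uses the old value of `a`, and `a` is incremented afterwards.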

Examples

1. The program fragment `1Ex` is parsed as a preprocessing number token (one that is not a valid floating or integer constant token), even though a parse as the pair of preprocessing tokens `1` and `Ex` might produce a valid expression (for example, if `Ex` were a macro defined as `+1`). Similarly, the program fragment `1E1` is parsed as a preprocessing number (one that is a valid floating constant token), whether or not `E` is a macro name.
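Example 1 can be checked with a short sketch. The helper macros `STR` and `XSTR` and the function names are assumptions made for this illustration. `Ex` is defined as `+1` as in the example, and `E` is defined only to show that it is never expanded inside `1E1`.

```c
#define Ex +1            /* the macro assumed in example 1 */
#define E 5              /* hypothetical macro, never expanded below */
#define STR(x) #x        /* hypothetical helper: stringizes without expansion */
#define XSTR(x) STR(x)   /* hypothetical helper: expands, then stringizes */

/* Two pp-tokens, 1 and Ex; Ex expands, giving 1 +1. */
int spaced_value(void) { return 1 Ex; }

/* One pp-number, so there is no separate identifier Ex to expand. */
const char *unspaced_spelling(void) { return XSTR(1Ex); }

/* One pp-number that is also a valid floating constant; E is not expanded. */
double float_value(void) { return 1E1; }

#undef E
#undef Ex
```

Using `1Ex` directly in an expression would fail to compile, because the pp-number cannot be converted to a valid constant.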

2. The program fragment `x+++++y` is parsed as `x ++ ++ + y`, which violates a constraint on increment operators, even though the parse `x ++ + ++ y` might yield a correct expression.
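For example 2, white space is enough to steer the tokenizer toward the intended parse. A minimal sketch, with a function name chosen for this illustration:

```c
/* Written without spaces, x+++++y is tokenized as  x ++ ++ + y  and is
   rejected. The white space below forces the tokens  x ++ + ++ y. */
int sum_post_and_pre(int x, int y)
{
    /* return x+++++y;      would not compile */
    return x++ + ++y;    /* old value of x plus incremented value of y */
}
```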