Signed Integers
A string is a valid integer if it consists of one or more ASCII digits, optionally prefixed with a U+002D HYPHEN-MINUS character (-).
In simpler terms, a valid integer can be a series of digits, and it can also be a negative integer with a "-" sign preceding the digits.
A valid integer without a "-" prefix represents the number written by its digits in base ten (zero or positive). With a "-" prefix, it represents that number subtracted from zero.
Parsing Algorithm:
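- Let input be the string being parsed, position a pointer to its first character, and sign "positive".
- Skip any ASCII whitespace at position.
- If position is past the end of input, return an error.
- If the character at position is "-", set sign to "negative" and advance to the next character; if that moves position past the end of input, return an error.
- Otherwise, if the character at position is "+", advance past it; if that moves position past the end of input, return an error. (The "+" is ignored, but it is not conforming.)
- If the character at position is not an ASCII digit, return an error.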
- Collect a sequence of ASCII digits, interpret them as a base-ten integer, and store it in the value variable.
- Return the value with the sign applied.
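The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the name parse_integer and the choice to return None for an error are assumptions made for the example.

```python
ASCII_WHITESPACE = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def parse_integer(text):
    """Return the parsed int, or None where the steps above return an error.

    Hypothetical helper name; None stands in for the "error" result.
    """
    position = 0
    negative = False

    # Skip leading ASCII whitespace.
    while position < len(text) and text[position] in ASCII_WHITESPACE:
        position += 1
    if position == len(text):
        return None

    # An optional sign: "-" negates, "+" is accepted but ignored.
    if text[position] == "-":
        negative = True
        position += 1
    elif text[position] == "+":
        position += 1

    # At least one ASCII digit must follow.
    if position == len(text) or text[position] not in ASCII_DIGITS:
        return None

    # Collect the run of digits; anything after it is ignored.
    end = position
    while end < len(text) and text[end] in ASCII_DIGITS:
        end += 1
    # Note: recent Python versions may reject extremely long digit runs in int().
    value = int(text[position:end])
    return -value if negative else value
```

Because only the leading digits are consumed, an input such as "  -42px" parses to -42, while "-" on its own is an error.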
Non-negative Integers
A string is a valid non-negative integer if it consists of one or more ASCII digits.
In this case, there's no need for a prefix like the "-" sign; the string should only contain digits.
A valid non-negative integer represents the number written by its digits in base ten.
Parsing Algorithm:
- Initialize input and value variables.
- Parse the input as an integer using the rules outlined in the "Signed Integers" section.
- If the parsed value is an error or less than zero, return an error.
- Return the parsed value as a non-negative integer.
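A short Python sketch of this wrapper is shown below. To keep the example self-contained, it includes a compact, regex-based stand-in for the signed-integer rules; both function names and the use of None for errors are assumptions for illustration.

```python
import re

# Compact stand-in for the signed-integer rules: optional leading ASCII
# whitespace, an optional sign, then one or more ASCII digits.
_INTEGER_PREFIX = re.compile(r"[ \t\n\f\r]*([+-]?)([0-9]+)")


def parse_integer(text):
    """Hypothetical helper: return an int, or None on error."""
    match = _INTEGER_PREFIX.match(text)
    if match is None:
        return None
    value = int(match.group(2))
    return -value if match.group(1) == "-" else value


def parse_non_negative_integer(text):
    """Return a non-negative int, or None on error."""
    value = parse_integer(text)
    # Reject both parse errors and negative results.
    if value is None or value < 0:
        return None
    return value
```

One consequence of delegating to the signed parser is that "-0" is accepted and yields 0, since the parsed value is not less than zero.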
Floating-point Numbers
A string is a valid floating-point number if it consists of, in order: an optional "-"; either a series of one or more ASCII digits, a "." followed by one or more ASCII digits, or both; and an optional exponent made up of "e" or "E", an optional "-" or "+", and one or more ASCII digits.
This microsyntax allows you to represent numbers with a decimal point and scientific notation.
A valid floating-point number represents the significand (the number before the "e") multiplied by ten raised to the power of the exponent (the number after the "e"). If there is no exponent, it is treated as zero.
Parsing Algorithm:
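- Let input be the string being parsed and position a pointer to its first character; set value, divisor, and exponent to 1.
- Skip any ASCII whitespace at position.
- If position is past the end of input, return an error.
- If the character at position is "-", change value and divisor to -1 and advance past it; if that moves position past the end of input, return an error.
- Otherwise, if the character at position is "+", advance past it (it is ignored, but it is not conforming); if that moves position past the end of input, return an error.
- If the character at position is "." and it is followed by an ASCII digit, set value to zero and jump to the "Fraction" step.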
- If the next character is not an ASCII digit, return an error.
- Collect a sequence of ASCII digits, interpret them as an integer, and multiply it by value.
- If position is past the end of input, jump to the "Conversion" step.
- "Fraction" step: if the character at position is ".", advance past it; then, for each ASCII digit that follows, multiply divisor by ten and add the digit divided by divisor to value.
- If the character at position is "e" or "E", advance past it and read an optional "-" (which changes exponent to -1) or "+", followed by ASCII digits. If digits are present, multiply exponent by their base-ten value and multiply value by ten raised to the power of exponent; otherwise leave value unchanged.
- "Conversion" step: round value to the nearest IEEE 754 double-precision floating-point value.
- If value is too large in magnitude to be represented as a finite double, return an error.
- Return the parsed floating-point number.
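The following Python sketch illustrates one way to follow these steps. It takes a shortcut that is an assumption of this example, not part of the algorithm as written: instead of tracking value, divisor, and exponent arithmetically, it scans the longest valid prefix and hands a normalized string to Python's built-in float(), which performs the double-precision rounding. The names parse_float and collect_digits are illustrative, and None stands in for an error.

```python
import math

ASCII_WHITESPACE = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def collect_digits(text, position):
    """Return (digits, new_position) for the ASCII digit run at position."""
    end = position
    while end < len(text) and text[end] in ASCII_DIGITS:
        end += 1
    return text[position:end], end


def parse_float(text):
    """Return a finite float, or None on error (hypothetical helper)."""
    position = 0
    while position < len(text) and text[position] in ASCII_WHITESPACE:
        position += 1
    if position == len(text):
        return None

    # Optional sign; "+" is accepted but ignored.
    sign = ""
    if text[position] in "+-":
        if text[position] == "-":
            sign = "-"
        position += 1
        if position == len(text):
            return None

    # Either digits, or a "." that is immediately followed by a digit.
    if text[position] == ".":
        if position + 1 >= len(text) or text[position + 1] not in ASCII_DIGITS:
            return None
        integer = "0"
    elif text[position] in ASCII_DIGITS:
        integer, position = collect_digits(text, position)
    else:
        return None

    # "Fraction" step: digits after an optional decimal point.
    fraction = "0"
    if position < len(text) and text[position] == ".":
        digits, position = collect_digits(text, position + 1)
        fraction = digits or "0"

    # Exponent: only applied when at least one digit follows.
    exponent = "0"
    if position < len(text) and text[position] in "eE":
        position += 1
        exponent_sign = ""
        if position < len(text) and text[position] in "+-":
            if text[position] == "-":
                exponent_sign = "-"
            position += 1
        digits, position = collect_digits(text, position)
        if digits:
            exponent = exponent_sign + digits

    # "Conversion" step: float() rounds to the nearest double.
    result = float(f"{sign}{integer}.{fraction}e{exponent}")
    if math.isinf(result):
        return None
    return result + 0.0  # adding 0.0 turns a negative zero into 0.0
```

Delegating the final rounding to float() avoids building very large powers of ten by hand; an input such as "1e400" overflows to infinity and is reported as an error.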
Percentages and Lengths
The HTML standard, as maintained by the WHATWG (Web Hypertext Application Technology Working Group), specifies how to parse dimension values like percentages and lengths. This is essential for various CSS properties and HTML attributes that deal with sizes and dimensions.
The parsing algorithm for percentages and lengths is a step-by-step process:
- Input: Begin with the string to be parsed.
- Position: A pointer initially set to the beginning of the input.
- Skip Whitespace: Remove any leading whitespace from the input.
- Detect Initial Digit: If the current position is either past the end of the input or not an ASCII digit, the parsing fails.
- Collect Digits: Collect a sequence of ASCII digits, interpreting them as a base-ten integer. This sequence represents the value.
- Decimal Point Check: If the character at the current position is a decimal point (U+002E), advance past it and parse the fractional part.
- Parsing Fractional Part: Keep a divisor that starts at one. For each ASCII digit after the decimal point, multiply the divisor by ten and add the digit divided by the divisor to the value.
- Determine the Type: If the character at the current position is a percent sign (U+0025), the value is returned as a percentage; otherwise, including when the end of the input has been reached, it is returned as a length.
This algorithm allows web developers to accurately interpret and utilize dimension values within their HTML and CSS, ensuring proper rendering on various devices and screen sizes.
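As a rough illustration, the dimension-parsing steps might look like this in Python. The function name parse_dimension, the (value, kind) tuple, and the use of None for failure are all assumptions made for the sketch.

```python
ASCII_WHITESPACE = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def parse_dimension(text):
    """Return (value, "length" | "percentage"), or None on failure."""
    position = 0

    # Skip leading ASCII whitespace.
    while position < len(text) and text[position] in ASCII_WHITESPACE:
        position += 1

    # The first remaining character must be an ASCII digit.
    if position == len(text) or text[position] not in ASCII_DIGITS:
        return None

    # Collect the integer part.
    end = position
    while end < len(text) and text[end] in ASCII_DIGITS:
        end += 1
    value = float(int(text[position:end]))
    position = end

    # Optional fractional part: each digit is scaled by a growing divisor.
    if position < len(text) and text[position] == ".":
        position += 1
        divisor = 1
        while position < len(text) and text[position] in ASCII_DIGITS:
            divisor *= 10
            value += int(text[position]) / divisor
            position += 1

    # A "%" right after the number makes it a percentage.
    if position < len(text) and text[position] == "%":
        return value, "percentage"
    return value, "length"
```

Note that anything other than "%" after the number, such as the "px" in "12.5px", is ignored and the value is treated as a length.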
Nonzero Percentages and Lengths
Building on the previous section, nonzero percentages and lengths are dimension values that must not be zero. They are used where a size of zero would be meaningless.
The parsing algorithm is as follows:
- Input: Start with the input string.
- Parse Value: Parse the input string using the rules from the "Percentages and Lengths" section to obtain a dimension value.
- Error Handling: If parsing the value results in an error or if the value is zero, return an error.
- Determine the Type: If the value is a percentage, return it as a percentage. Otherwise, return it as a length.
This process allows developers to handle dimension values effectively, ensuring that non-zero sizes are correctly processed.
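A minimal Python sketch of the nonzero check follows. So that the example runs on its own, it uses a compact regex-based stand-in for the dimension rules; the function names, the tuple shape, and None as the error value are assumptions for illustration only.

```python
import re

# Compact stand-in for the dimension rules: leading ASCII whitespace, digits,
# an optional "." with optional digits, then an optional "%".
_DIMENSION_PREFIX = re.compile(r"[ \t\n\f\r]*([0-9]+)(?:\.([0-9]*))?(%?)")


def parse_dimension(text):
    """Hypothetical helper: return (value, kind), or None on error."""
    match = _DIMENSION_PREFIX.match(text)
    if match is None:
        return None
    value = float(f"{match.group(1)}.{match.group(2) or '0'}")
    kind = "percentage" if match.group(3) == "%" else "length"
    return value, kind


def parse_nonzero_dimension(text):
    """Return (value, kind), or None if parsing fails or the value is zero."""
    parsed = parse_dimension(text)
    if parsed is None:
        return None
    value, kind = parsed
    if value == 0:
        return None
    return value, kind
```

With this sketch, "25%" yields a percentage of 25, while "0%" and "0.0" are both rejected.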
Lists of Floating-Point Numbers
Web development often requires handling lists of floating-point numbers, such as when specifying coordinates, dimensions, or other numerical data in a comma-separated format. The HTML standard defines the rules for parsing these lists.
The parsing algorithm for lists of floating-point numbers is as follows:
- Input: Begin with the input string.
- Position: A pointer initially set to the beginning of the input.
- Collect Numbers: Skip any separators (ASCII whitespace, U+002C COMMA, and U+003B SEMICOLON), then repeatedly collect a sequence of characters up to the next separator and parse it as a floating-point number using the rules from the "Floating-point Numbers" section.
- Error Handling: If parsing a number results in an error, use zero for that number.
- Result: Return a list of the parsed floating-point numbers.
This algorithm is crucial for handling arrays of numerical data in web applications, as it ensures that developers can accurately interpret and use these values in their code.
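The list-parsing loop can be sketched in Python as below. Several details are assumptions of this example: it treats ASCII whitespace, commas, and semicolons as separators and skips leading characters that cannot start a number, as the HTML Standard does; and it uses a simplified regex plus Python's float() as a stand-in for the full floating-point rules. The function names are illustrative.

```python
import math
import re

SEPARATORS = " \t\n\f\r,;"
NUMBER_START = "0123456789.-"

# Simplified stand-in for the floating-point rules (not the full algorithm).
_FLOAT_PREFIX = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)


def parse_float_or_zero(token):
    """Parse a float prefix of token; fall back to 0.0 on any error."""
    match = _FLOAT_PREFIX.match(token)
    if match is None:
        return 0.0
    number = float(match.group(0))
    return 0.0 if math.isinf(number) else number


def parse_float_list(text):
    """Return the list of floats found in text (hypothetical helper)."""
    numbers = []
    position = 0

    # Skip leading separators.
    while position < len(text) and text[position] in SEPARATORS:
        position += 1

    while position < len(text):
        # Skip leading characters that cannot start a number.
        while (position < len(text)
               and text[position] not in SEPARATORS
               and text[position] not in NUMBER_START):
            position += 1

        # Collect everything up to the next separator.
        start = position
        while position < len(text) and text[position] not in SEPARATORS:
            position += 1
        numbers.append(parse_float_or_zero(text[start:position]))

        # Skip the separators before the next number.
        while position < len(text) and text[position] in SEPARATORS:
            position += 1

    return numbers
```

An unparsable entry does not abort the whole list; it simply contributes a zero at its position.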
Lists of Dimensions
Lists of dimensions are used in various web design contexts, allowing developers to specify multiple values associated with units like percentage, relative, or absolute. Parsing such lists is defined in the HTML standard as well.
The parsing algorithm for lists of dimensions is as follows:
- Input: Start with the raw input string.
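- Processing: If the last character of the input is a comma, remove it, then split the string into tokens separated by commas.
- Result Initialization: Create an empty list to store pairs of numbers and units.
- Parse Each Token: For each token, read the leading digits and any fractional part as the number, then set the unit to percentage if the next non-whitespace character is "%" or to relative if it is "*". If neither is present, the unit is absolute. An empty token is treated as relative with a number of zero.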
- Store Results: Add the parsed number and unit as a pair to the result list.
- Return: Return the list of number/unit pairs.
This algorithm enables developers to handle lists of dimensions efficiently, facilitating responsive web design and dynamic content layout.
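A possible Python rendering of these steps is shown below. It is a sketch under stated assumptions: Python's str.split is used as an approximation of the comma-splitting step, whitespace inside the fractional part is not handled, and the name parse_dimension_list and the (number, unit) tuples are illustrative choices.

```python
ASCII_WHITESPACE = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def _digit_run_end(token, position):
    """Return the index just past the ASCII digit run starting at position."""
    while position < len(token) and token[position] in ASCII_DIGITS:
        position += 1
    return position


def parse_dimension_list(raw):
    """Return a list of (number, unit) pairs (hypothetical helper)."""
    # Drop a single trailing comma, then split on commas.
    if raw.endswith(","):
        raw = raw[:-1]

    result = []
    for token in raw.split(","):
        token = token.strip(ASCII_WHITESPACE)
        value = 0.0
        unit = "absolute"

        # An empty token counts as a relative unit with a number of zero.
        if token == "":
            result.append((value, "relative"))
            continue

        # Integer part, if any.
        position = 0
        end = _digit_run_end(token, position)
        if end > position:
            value += int(token[position:end])
        position = end

        # Fractional part, if any.
        if position < len(token) and token[position] == ".":
            position += 1
            end = _digit_run_end(token, position)
            if end > position:
                value += int(token[position:end]) / 10 ** (end - position)
            position = end

        # Unit marker after optional whitespace.
        while position < len(token) and token[position] in ASCII_WHITESPACE:
            position += 1
        if position < len(token):
            if token[position] == "%":
                unit = "percentage"
            elif token[position] == "*":
                unit = "relative"

        result.append((value, unit))
    return result
```

For example, "50%, 2*, 100" produces a percentage, a relative value, and an absolute value, in that order.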