HTML assumes the ISO Latin-1 character set. This is an 8-bit, 256-character character set that is an extended form of ASCII. Most browsers support it. However, it is safest to stick to 7-bit printable ASCII if possible.
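As a minimal sketch of sticking to 7-bit ASCII, an accented Latin-1 character can be written as a character reference rather than as a raw 8-bit byte. The sample word is hypothetical, and the example assumes the browser recognizes both the named entity &eacute; and the numeric reference &#233; (233 being the Latin-1 code for e with an acute accent):

```html
<!-- Each spelling displays the word "cafe" with an acute accent on the e, -->
<!-- yet the source contains only 7-bit ASCII characters. -->
caf&eacute;
caf&#233;
```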
Just four characters are special in the HTML syntax: <>&". These can be escaped as follows: &lt; for <, &gt; for >, &amp; for &, and &quot; for ".
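A short illustration of the escapes in running text, assuming the standard entity names &lt;, &gt;, &amp;, and &quot;. The sample sentence is hypothetical; a browser should display it with a literal less-than sign, ampersand, greater-than sign, and double quotes in place of the four entities:

```html
<!-- Hypothetical sample text using all four escapes. -->
if a &lt; b &amp; b &gt; c then say &quot;ok&quot;
```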
HTML is a free-form text language that collapses any run of whitespace characters (spaces, tabs, newlines) into a single space. By default, all text is "paragraphed": it is flowed into filled paragraphs regardless of where the lines break in the source.
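The whitespace rule can be seen in a small sketch. The sample text is hypothetical. The extra spaces and line breaks in the source should not survive into the displayed page, and tabs would be treated the same way:

```html
<!-- These three source lines display as the single line -->
<!-- "The quick brown fox", with one space between words. -->
The      quick
   brown
      fox
```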
Comments in HTML code are delimited by <!-- and -->. For example:
<!-- This is a comment in some HTML text. -->
<!-- This is a second comment line. -->
Comments don't nest. Formally, comments are allowed to span multiple
lines, but some browsers can't cope with multiple-line comments.
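Given that caveat, one cautious approach is to write a long note as a series of single-line comments, each with its own delimiters. This is only a sketch of that workaround, not a rule from any specification:

```html
<!-- Formally allowed, but may confuse some browsers:
     a single comment spanning two lines. -->

<!-- Safer: the same note written as -->
<!-- two separate single-line comments. -->
```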