Sp 46

From Htmlpedia

OpenSP: Character not allowed in prolog

Cause:

The document prolog contains impermissible characters.

Example:

Bad

This text throws one error for each character <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


Good

<!-- Comments and whitespace are okay. -->

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Good

<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Solution:

Remove the illegal characters.

The "document prolog" is everything that comes before the doctype declaration at the beginning of an HTML or XML document. In general, you should not put anything at all into the prolog.

There are some exceptions.

  • It's okay to put whitespace characters (space, carriage return, linefeed, tab) into the prolog.
  • HTML-style comments are okay too.
  • An XML declaration such as the one in the third example is valid.

However, it's best to avoid even these, due to some bugs in Internet Explorer.

  • If there are any characters at all in the prolog of an HTML document, Internet Explorer will render the page in Quirks Mode instead of Standards Mode, which will change the appearance of the page.
  • On a related note, if your web server sends "Content-Type: application/xhtml+xml" to Internet Explorer, then Internet Explorer will not render the page at all.
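
The prolog rules described above (whitespace, comments, and an XML declaration are tolerated; anything else is reported) can be checked mechanically. The following is only a minimal sketch in Python, not how OpenSP itself works: the function name stray_prolog_text is made up for illustration, and the regular expressions are a rough approximation that does not handle edge cases such as a comment that itself contains the text "<!DOCTYPE".

```python
import re

# Constructs the article lists as acceptable before the doctype.
_ALLOWED = re.compile(
    r"""
    \s+                  # whitespace (space, carriage return, linefeed, tab)
    | <!--.*?-->         # an HTML-style comment
    | <\?xml\b.*?\?>     # an XML declaration
    """,
    re.VERBOSE | re.DOTALL,
)

_DOCTYPE = re.compile(r"<!DOCTYPE", re.IGNORECASE)


def stray_prolog_text(document: str) -> str:
    """Return the characters before the doctype that are not whitespace,
    a comment, or an XML declaration (hypothetical helper)."""
    match = _DOCTYPE.search(document)
    if match is None:
        return ""  # no doctype found, so this sketch has nothing to check
    prolog = document[: match.start()]
    # Delete everything that is allowed; whatever is left is suspect.
    return _ALLOWED.sub("", prolog)
```

An empty result means the prolog contains nothing beyond the allowed constructs; a non-empty result is the text the validator would likely complain about.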

Occasionally the validator complains about characters in the prolog that do not appear in your preferred text editor. These characters, which typically appear as ï»¿ when you view the source from a browser, are a Byte Order Mark (BOM). In documents encoded as UTF-16, the BOM indicates the byte order, that is, whether the most significant byte of each 16-bit code unit comes first or last. UTF-8 has no byte-order ambiguity, so a BOM there serves only as an optional encoding signature. When the validator does not recognize it, the three bytes of the BOM (EF BB BF) are treated as characters in the prolog and cause the validation error that brought you to this page. Many common text editors, including Notepad, silently add a BOM to documents that are saved as UTF-8. To remove the BOM, you may need to use a hex editor or an editor that can save UTF-8 without a BOM.
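
If no hex editor is at hand, a short script can detect and remove a leading UTF-8 BOM. The following is a minimal sketch in Python, assuming the file is small enough to read into memory; the function names strip_utf8_bom and remove_bom_from_file are made up for illustration, while codecs.BOM_UTF8 is the standard library constant for the bytes EF BB BF.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_from_file(path: str) -> bool:
    """Rewrite the file at path without its BOM.

    Returns True if a BOM was found and removed, False if the file was
    left untouched. (Hypothetical helper; keep a backup of the file.)
    """
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Working on raw bytes rather than decoded text avoids any re-encoding of the rest of the document; only the first three bytes are ever dropped.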
