Here are some interesting stats from Google’s “Reader” blog:
% of errors | Error description |
---|---|
15.6% | Input claims to be UTF-8 but contains invalid characters. |
14.9% | Opening and ending tags mismatch |
13.9% | An undefined entity is used (e.g. ``&nbsp;`` in an XML document without importing the HTML set) |
7.8% | Document expected to begin with a start tag, but no ``<`` was found |
5.7% | Disallowed control characters present |
5.5% | Extra content at the end of the document |
4.2% | Unterminated entity reference (missing semi-colon) |
4.2% | Unquoted attribute value |
3.8% | Premature end of data in tag (truncated feed) |
3.3% | Naked ampersand (should be represented as ``&amp;``) |
2.1% | XML declaration allowed only at the start of the document |
1.8% | Namespace prefix is used but not defined |
0.75% | Comment not terminated |
0.64% | Attribute without value |
0.17% | Unescaped ``<`` not allowed in attribute values |
0.11% | Malformed numerical entity reference |
0.11% | Unsupported/invalid encoding |
0.10% | Comment must not contain '--' |
0.10% | Attribute defined more than once |
0.07% | Char out of allowed range |
0.03% | Comment not terminated |
0.02% | Sequence ``]]>`` not allowed in content |
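
To get a feel for how a strict parser reacts to a few of these error classes, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `first_error` and the sample fragments are made up for illustration and are not taken from the Reader data. The exact error wording comes from the underlying expat parser and may vary between versions.

```python
import xml.etree.ElementTree as ET


def first_error(data):
    """Return the parser's error message for `data`, or None if it is well-formed."""
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical feed fragments, each illustrating one error class from the table.
SAMPLES = {
    "well-formed": "<feed><title>ok</title></feed>",
    "mismatched tags": "<feed><title>oops</feed>",
    "undefined entity": "<feed>a&nbsp;b</feed>",
    "naked ampersand": "<feed>AT&T</feed>",
    "unquoted attribute": "<link href=example/>",
    "truncated feed": "<feed><title>cut off",
    "extra content": "<feed/><feed/>",
    # Bytes input that declares UTF-8 but contains a byte that is never valid UTF-8.
    "invalid UTF-8": b'<?xml version="1.0" encoding="UTF-8"?><feed>\xff</feed>',
}

for label, data in SAMPLES.items():
    print(f"{label}: {first_error(data) or 'no error'}")
```

A strict parser stops at the first problem it meets, so a feed with several of these defects reports only one of them.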