Ticket #1211 (closed enhancement: wontfix)
The regular expression in Xinha.RE_doctype needs correction
| Reported by: | guest | Owned by: | gogo |
|---|---|---|---|
| Priority: | normal | Milestone: | 0.96 |
| Component: | Xinha Core | Version: | trunk |
| Severity: | normal | Keywords: | |
| Cc: |
Description
Hi,
I intend to use (after modifications) the Equation plugin for inserting mathematical formulas written in MathML. I am experimenting with the old HTMLArea now.
I discovered that the editor does not deal correctly with the following DOCTYPE definition:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd" [ <!ENTITY mathml "http://www.w3.org/1998/Math/MathML"> ]>
My attention was drawn to line 435 (SVN revision 1001) of XinhaCore.js, which reads:
Xinha.RE_doctype = /(<!doctype((.|\n)*?)>)\n?/i;
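This regular expression does not fit: the lazy (.|\n)*? stops at the first ">", which in the DOCTYPE above is the one closing the <!ENTITY ...> declaration inside the internal subset, so the match ends before the final " ]>".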
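A minimal sketch of the failure, runnable in any current JavaScript engine. The surrounding document (the html variable) is a hypothetical example, not something taken from Xinha:

```javascript
// The pattern quoted above from XinhaCore.js.
const RE_doctype = /(<!doctype((.|\n)*?)>)\n?/i;

// The DOCTYPE from this ticket, with an internal subset in [ ... ].
const doctype =
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" ' +
  '"http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd" ' +
  '[ <!ENTITY mathml "http://www.w3.org/1998/Math/MathML"> ]>';

// Hypothetical document, used only to show what is left after stripping.
const html = doctype + '\n<html></html>';

const matched = html.match(RE_doctype)[1];
const leftover = html.replace(RE_doctype, '');

// The lazy quantifier stops at the ">" that closes <!ENTITY ...>.
console.log(matched === doctype);           // false
console.log(matched.endsWith('MathML">'));  // true
console.log(JSON.stringify(leftover));      // begins with " ]>"
```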
I have found a good source of information to solve the problem: Robert D. Cameron "REX: XML Shallow Parsing with Regular Expressions" http://www.cs.sfu.ca/~cameron/REX.html
Using the ideas from that paper, I arrived at this regular expression:
Xinha.RE_doctype = /<!doctype(([ \n\t\r]+([A-Za-z_:]|[^\x00-\x7F])([A-Za-z0-9_:.-]|[^\x00-\x7F])*([ \n\t\r]+(([A-Za-z_:]|[^\x00-\x7F])([A-Za-z0-9_:.-]|[^\x00-\x7F])*|"[^"]*"|'[^']*'))*([ \n\t\r]+)?(\[(<(!(--[^-]*-([^-][^-]*-)*->|[^-]([^\]"'><]+|"[^"]*"|'[^']*')*>)|\?([A-Za-z_:]|[^\x00-\x7F])([A-Za-z0-9_:.-]|[^\x00-\x7F])*(\?>|[\n\r\t ][^?]*\?+([^>?][^?]*\?+)*>))|%([A-Za-z_:]|[^\x00-\x7F])([A-Za-z0-9_:.-]|[^\x00-\x7F])*;|[ \n\t\r]+)*]([ \n\t\r]+)?)?>)?)|(([ \n\t\r]+)?>)/i;
In my editor the same variable is HTMLArea.RE_doctype.
Wow, it is complex, but it works so far. I propose changing Xinha.RE_doctype to the expression above.
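For readability, the same idea can be assembled from named fragments, roughly the way the REX paper builds its expressions. This is only a sketch under some assumptions, not the exact expression proposed above:

- All fragment names are illustrative.
- The closing ">" is required.
- The whole DOCTYPE is captured in group 1, with an optional trailing newline, as in the current pattern.
- The character class that excludes "]" is written with an escaped \], because standard JavaScript reads [^] on its own as "any character".

```javascript
// Building blocks; names are illustrative, not Xinha or REX identifiers.
const S = '[ \\n\\t\\r]+';
const NameStart = '(?:[A-Za-z_:]|[^\\x00-\\x7F])';
const NameChar = '(?:[A-Za-z0-9_:.-]|[^\\x00-\\x7F])';
const Name = NameStart + NameChar + '*';
const Quoted = '(?:"[^"]*"|\'[^\']*\')';

// Items allowed inside the internal subset [ ... ].
const Comment = '--[^-]*-(?:[^-][^-]*-)*->';
// "\\]" is escaped on purpose (see the note above about "[^]").
const MarkupDecl = '[^-](?:[^\\]"\'><]+|' + Quoted + ')*>';
const PI = '\\?' + Name + '(?:\\?>|[\\n\\r\\t ][^?]*\\?+(?:[^>?][^?]*\\?+)*>)';
const PERef = '%' + Name + ';';
const Item =
  '(?:<(?:!(?:' + Comment + '|' + MarkupDecl + ')|' + PI + ')|' + PERef + '|' + S + ')';

// Root name plus optional PUBLIC/SYSTEM identifiers.
const Ident = S + Name + '(?:' + S + '(?:' + Name + '|' + Quoted + '))*';
// Optional internal subset.
const Subset = '(?:\\[' + Item + '*\\](?:' + S + ')?)?';

const RE_doctype = new RegExp(
  '(<!doctype' + Ident + '(?:' + S + ')?' + Subset + '>)\\n?', 'i');

// Usage with the DOCTYPE from this ticket and a hypothetical document body.
const doctype =
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" ' +
  '"http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd" ' +
  '[ <!ENTITY mathml "http://www.w3.org/1998/Math/MathML"> ]>';
const html = doctype + '\n<html></html>';

const found = html.match(RE_doctype);
const stripped = html.replace(RE_doctype, '');
console.log(found[1] === doctype);  // true
console.log(stripped);              // <html></html>
```

Like the single-line expression, this contains a nested quantifier in the markup declaration part, so it may backtrack heavily on long malformed input.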
Regards. Ivan Tcholakov
