Bug: Topic text is handled inconsistently with respect to protective encoding
I might have gotten confused somewhere, but it appears to me that there may be some old or partially completed stuff hanging around in the TWiki core.
Let me trace how text is passed around from view through the various stages of the edit cycle:
- view
   - Text is taken from file
   - Text is rendered and substituted for %TEXT% in template (for display)
   - Alternatively, in raw view mode, text is encoded to protect &, %, <, and >, and tabs are converted to 3 spaces
   - (Except in DEVELOP, where the conversion to 3sp is not done)
- edit
   - Text is taken from file, or optionally from URL parameter text
   - Text is run through decodeSpecialChars if taken from URL parameter text (the "special characters" are &, <, >, the quote character, and a sequence of newlines)
   - Text is encoded to protect &, <, and >, and tabs are converted to 3 spaces
   - (Except in DEVELOP, where the conversion to 3sp is not done)
   - Text is substituted for %TEXT% in template (for inclusion in the textarea)
   - Text will be passed in URL parameter text (a textarea input element) to the save script
- preview
   - Text is taken from URL parameter text
   - In the text, 3 spaces are converted to tabs
   - Text is rendered and substituted for %TEXT% in template (for display)
   - Another copy of the text is run through encodeSpecialChars (for later saving)
   - Text will be passed in URL parameter text (a hidden input element) to the save script
- save
   - Text is taken from URL parameter text (may come from textarea or hidden)
   - Text is run through decodeSpecialChars
   - In the text, 3 spaces are converted to tabs
   - Text is written to file
In the above trace, one can observe the following inconsistencies:
1. When text is shown in raw form (in raw view mode and in edit mode), sometimes the % character is protected, sometimes not.
2. When text is passed as a URL parameter from edit to the save script, it is not encoded, but when it is passed from preview to the save script, it is encoded.
The difference in (1) should be easy to resolve. Either % needs to be protected or it does not, and whichever we choose should be done everywhere. I suggest that we write a single function that is called from everywhere.
The difference in (2) is trickier. As the encoding is applied when passing data in a URL parameter from preview to save, but not from edit to save, it appears that most of the encoding may not really be needed; otherwise the edited text would always be messed up.
However, according to reports in SectionalEditPlugin (reported by MarioFrasca), there is one problem with passing text in a URL parameter:
- On Firefox (and possibly other browsers; it is claimed that all browsers other than IE are affected), all leading and trailing \n are chopped off when posted as the value of a hidden input element.
From these reports and what I see in the code, I conclude that most of the protective encoding is not needed for (2), but that we need to protect leading \n for non-IE browsers. The problem reported by MarioFrasca does not show up when going through the preview script (as there the protection through the %_N_% translation takes effect, albeit it might be overkill to replace all \n rather than just the leading ones). However, it does show up when saving directly, as we obviously cannot ask the user to type %_N_% whenever \n is meant.
The consequence appears to be (I could not verify this, as I don't have browsers other than IE, but the bug reports on SectionalEditPlugin are convincing) that whenever text is edited and saved, all leading and trailing newlines disappear unless we protect them somehow. The current solution has a consistent loss of a single leading \n character in both the edit-save and edit-preview-save cycles. I don't quite know what the spec says about preservation of leading newlines, though.
Either way, with respect to (2) I suggest:
- We need to clarify whether protection of &, <, >, the quote character, and sequences of newlines is required when passing text as URL parameters.
- We need to figure out a uniform way to protect leading and trailing newlines when passing the edited text to the save script.
Finally, I would like to understand the difference between the two kinds of protection:
- &, <, >, and the quote character (for parameter passing)
- &, <, >, and % (for display in a textarea)
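For concreteness, the two protections can be sketched as follows. This is an illustrative sketch only, not the actual TWiki code; the function names are made up, and the newline handling assumes the %_N_% translation mentioned above.

```perl
# Protection (a): for parameter passing (&, <, >, the quote character,
# and newlines, here protected with the %_N_% token).
sub encodeForParam {
    my( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/\n/%_N_%/g;
    return $text;
}

sub decodeForParam {
    my( $text ) = @_;
    $text =~ s/%_N_%/\n/g;
    $text =~ s/&quot;/"/g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&lt;/</g;
    $text =~ s/&amp;/&/g;   # last, so the other entities survive the round trip
    return $text;
}

# Protection (b): for display in a textarea (&, <, >, and %).
sub encodeForTextArea {
    my( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/%/&#037;/g;
    return $text;
}
```

Writing both down side by side makes the question concrete: (a) must round-trip exactly, while (b) only has to display correctly in the browser.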
Test case
Environment
--
ThomasWeigert - 18 Mar 2005
Impact and Available Solutions
Follow up
Discussions on the leading \n bug have moved to SomeBrowsersLoseInitialNewlineInTextArea. Let's keep this topic for discussion of the general problem of collapsing all the different encodings.
Thomas, note that there are other protective encodings for form fields. It is those encodings that I tried to collapse with the "standard" encodings.
--
CrawfordCurrie - 19 Mar 2005
Guide to related discussions:
The most relevant text I could find in the HTML standard is in section 17.13.4 of the W3C recommendation. It appears that form input data is by default encoded as application/x-www-form-urlencoded and should, therefore, need no protective encoding except for the & (see also the discussion Ampersands in URI values).
There may be some issues hidden with respect to our platforms: at times there are \r\n at the ends of lines, but at other times there are just \n. When an HTML page is submitted, the expectation is that (as in all MIME transmissions) CRLF is used to separate lines (see the quoted standard). I wonder if the leading newline loss (see SomeBrowsersLoseInitialNewlineInTextArea) is a victim of this?
Either way, it appears that much of the protective encoding applied is unnecessary.
--
ThomasWeigert - 19 Mar 2005
Moved some material to the more relevant SomeBrowsersLoseInitialNewlineInTextArea --TW
- Actually, I think that the moved material is more relevant here. in the first instance I had written it there because it was an answer to some discussion there, but, as it was generalizing the discussion and this topic was intended for the more general issue, I removed it from there and pasted it here. well, let's decide in which topic to discuss how we address our consistent or inconsistent treatment of text encoding before we look for solutions. --MF
- Please see how the related areas of this discussion have been factorized for easier treatment. --TW
Thomas, you write "form input data". do I get you right reading you so: the content of a TEXTAREA? or are you speaking more in general, also of what I call the value attribute of a type HIDDEN INPUT element? in the remainder of this contribution I assume you solely meant the first thing; here is the reason: as far as I could understand, the P in PCDATA (content type) stands for "processed", so we don't need to care about that (exception made for the some browsers lose initial newline in text area problem). if we want to keep our text untouched, we have to process it ourselves before we put it into an attribute of type CDATA. I hope that this is what we are discussing here...
as I see it now, we have two choices:
- we assume the user agent adheres to the w3 description in the most free way (worst case of standard behaviour).
- we make a survey of current user agents and describe the common ground.
at the moment twiki does the first thing (protecting [\ \t\n] whitespace), while the SectionalEditPlugin and its siblings, which is where the discussion originated, do neither.
whichever the choice, as we receive the text value in the save script, we can:
- make save aware of the type of data received (PCDATA or CDATA).
- take additional initialization care so that the CDATA processed by us is processed back to PCDATA, so that the save script does not see the difference.
- apply the transformation in any case, making sure that t*t = t (applying it more than once has no further effect).
at the moment twiki does the third thing.
my earlier change proposals were based on insufficient knowledge of the w3 recommendations, so please disregard them. sorry and thanks!
--
MarioFrasca - 20 Mar 2005
Excuse me for stating the obvious; I need to formalise a bit to get my head round this. Looking from the server side, I need all data exchanges to be content-preserving.
- If I pass $text in a form field to the browser, and the browser posts it back without any intermediate changes, I must be able to recover $text exactly as originally written.
- This applies to INPUT type="text" (single-line edit), INPUT type="hidden" (buffer), and TEXTAREA (multi-line edit) elements.
The attached CGI script allows you to explore the exchange when there is no encoding applied, for these three types. Install it in your twiki bin directory; it has no dependencies. The encoding in the "Received" strings is just # and then the ASCII decimal code for non-\w characters.
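As a reference point, the diagnostic display encoding amounts to something like the sketch below. This is a hypothetical reconstruction, not the attached script itself: the text above says non-\w characters, but the result tables later in this topic show punctuation passing through untouched while \r\n shows up as #13;#10;, so the sketch encodes only non-printable characters.

```perl
# Hypothetical reconstruction of the exchange script's display
# encoding: every character outside printable ASCII is shown as '#'
# plus its decimal code and ';', so "\r\n" displays as "#13;#10;".
sub showReceived {
    my( $text ) = @_;
    $text =~ s/([^\x20-\x7e])/'#' . ord( $1 ) . ';'/ge;
    return $text;
}
```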
A small amount of experimentation reveals how brain-dead the browsers really are. It is notable that when you pass a linefeed as the first character, it gets treated just like a \n (at least in Firefox it does). The encoding/decoding of other characters (e.g. ©) seems to be ok, though I am left wondering what would happen if the browser locale were different from the server locale.
Try these:
- enter A in the textfield
- URL-escape things on the URL line, and see the effect on the received text
--
CrawfordCurrie - 20 Mar 2005
The discussion in this topic applies to all situations where input is passed from client to server via form fields or URL parameters (really the same thing). However, as noted, there are two situations in the standard:
- PCDATA (passed by textarea fields), which supports multiline text
- CDATA (passed by everything else), which does not guarantee to preserve linefeed or newline characters.
A wrinkle in this discussion is that apparently some (or all?) browsers drop the leading newline in a textarea, albeit this seems not licensed by the standard.
--
ThomasWeigert - 20 Mar 2005
Crawford, Thomas, let me put that part here again: since the leading \n problem is closed, we can leave that topic alone...
From http://www.w3.org/TR/REC-html40/interact/forms.html I read:
In general, a control's "initial value" may be specified with the control element's value attribute. However, the initial value of a TEXTAREA element is given by its contents,
17.4 the INPUT element
<!ELEMENT INPUT - O EMPTY -- form control -->
<!ATTLIST INPUT
...
value CDATA #IMPLIED -- Specify for radio buttons and checkboxes --
...
17.7 the TEXTAREA element
<!ELEMENT TEXTAREA - - (#PCDATA) -- multi-line text field -->
so in our save script we are receiving the text from a PCDATA or a CDATA, not knowing which was the case...
from the same source (http://www.w3.org/TR/REC-html40/types.html#type-cdata) I read:
User agents may ignore leading and trailing white space in CDATA attribute values
(e.g., " myval " may be interpreted as "myval"). Authors should not declare
attribute values with leading or trailing white space.
...
CDATA is a sequence of characters from the document character set and may include
character entities. User agents should interpret attribute values as follows:
- Replace character entities with characters,
- Ignore line feeds,
- Replace each carriage return or tab with a single space.
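Spelled out as code, the user-agent rules quoted above amount to something like this. An illustrative sketch only: character entity replacement is omitted, and the leading/trailing stripping is a "may", not a "must".

```perl
# What a conforming user agent may do to a CDATA attribute value,
# following the rules quoted above (entity replacement omitted).
sub normalizeCDATA {
    my( $value ) = @_;
    $value =~ s/\n//g;        # ignore line feeds
    $value =~ s/[\r\t]/ /g;   # each CR or tab becomes a single space
    $value =~ s/^ +| +$//g;   # leading/trailing white space MAY be ignored
    return $value;
}
```

Which is exactly why newlines handed over in a value attribute cannot be relied upon to survive.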
I think that it is time to refactor this page again, respecting the discussion guide above. who dares?
--
MarioFrasca - 20 Mar 2005
Mario, the save script receives data as PCDATA when invoked from edit, but as CDATA when invoked from preview.
Therefore, care has to be taken that preview encodes the text to avoid any newline being lost. Edit need not worry, except for the leading newline apparently being lost by some (or all?) browsers.
Note that the text you quote is about any whitespace characters, not just leading or trailing newlines, as CDATA should not even contain newlines. However, as we have seen, browsers do pass some newlines along, but not always the leading ones.
--
ThomasWeigert - 20 Mar 2005
Thomas, it is good to see that we agree down the whole line. what you state here summarizes things once again. in my opinion any of us may go on and rewrite the description of the problem...
please, don't use undefined terms. I did not manage to find any definition of what you call a "url-parameter"... can we stick with PCDATA, CDATA, input element, textarea, value, type, hidden... thanks. goodnight,
--
MarioFrasca - 20 Mar 2005
Crawford, here is the result of my testing using your nice bin/exchange script:
| Browser | Input | Received (Text) | Received (Textarea) | Received (Hidden) |
| Firefox | abc#13;#10;#13;#10;def#13;#10;#13;#10; | abc | abc#13;#10;#13;#10;def#13;#10;#13;#10; | abc#13;#10;#13;#10;def |
| IE | abc#13;#10;#13;#10;def#13;#10;#13;#10; | abcdef | abc#13;#10;#13;#10;def#13;#10;#13;#10; | #13;#10;abc#13;#10;#13;#10;def#13;#10;#13;#10; |
This confirms and clarifies what we have been seeing:
- While IE passes on in hidden fields exactly what it was handed, Firefox drops leading and trailing newlines.
- Both browsers drop the leading newlines in a textarea, but keep the rest intact.
- Firefox only keeps the first legal string of characters in a text field, while IE filters out illegal characters.
This, for example, implies that IE will not lose any leading newline inserted in the edit topic textarea, but Firefox will.
I also ran a test with blanks and there both browsers are consistent and pass the full character string through unaltered.
From this it is not obvious what these browsers actually implement, as they interpret the W3C spec quoted earlier differently for blank whitespace vs. newline whitespace.
The consequences for us are clear: we must encode linefeeds when passing text through hidden elements. We probably should make it spec that the textarea does not have leading newlines; otherwise we will end up having to use your javascript trick or something similar.
--
ThomasWeigert - 21 Mar 2005
Here is the same experiment for non-alpha characters:
| Browser | Input | Received (Text) | Received (Textarea) | Received (Hidden) |
| Firefox | abc~!@#$%^&*()_+>?/.,-=:"< | abc~!@#$%^&*()_+>?/.,-=:"< | abc~!@#$%^&*()_+>?/.,-=:"< | abc~!@#$%^&*()_+>?/.,-=:"< |
| IE | abc~!@#$%^&*()_+>?/.,-=:"< | abc~!@#$%^&*()_+>?/.,-=:"< | abc~!@#$%^&*()_+>?/.,-=:"< | abc~!@#$%^&*()_+>?/.,-=:"< |
This seems to indicate that we are encoding much too much. The only problem I had was with <, which messed up the whole output afterwards if followed by certain characters, such as ? or /.
My conclusion is that we should encode (in TWiki::Render::encodeSpecialChars) only newlines and <.
--
ThomasWeigert - 21 Mar 2005
Thanks for the excellent analysis. Before we make any code changes, I want to have testcases in place. We need to eliminate any chance of this re-occurring in the future. I can test firefox, konqueror and mozilla, and IE on checked-in code. Is this a sufficient set, or do we need to add opera and/or safari?
--
CrawfordCurrie - 21 Mar 2005
Crawford, Thomas, my idea is that in TWiki::Render::encodeSpecialChars we could add a parameter which specifies whether we are encoding for PCDATA or for CDATA. in the second case we should also encode newlines and tabs; otherwise only, as Thomas also says, <.
also: if you want to eliminate any chance that this reoccurs in the future, we cannot base the decision on the current browsers alone, but also on the w3c recommendations.
one more thought: may it be that the <textarea> is parsed including the first following \n, if present? this would explain the behaviour... just a thought; it could be tested with an xml parser... maybe I'll check it later.
cheers,
--
MarioFrasca - 21 Mar 2005
Reasonable, except that there is no way to know what you are encoding for. In the places where that method is called, all it knows is that it is about to replace %TEXT% with the value. The only solution is to replace %TEXT% with %CDATA_TEXT% and %PCDATA_TEXT% as appropriate in the templates. Clunky, but nothing better springs to mind. I'd have preferred %CDATA{"!%TEXT%"}%, but that won't work, as the %TEXT% substitution is done after common tags expansion is finished. (why? good question. make a mental note to research that)
BTW I also prefer a separate function in this case; it is better practice to avoid an "if" statement. As they say, "every IF adds a BUT".
--
CrawfordCurrie - 21 Mar 2005
There is nothing we can do about PCDATA, as we do not have a way of intercepting the passing of that data other than by Crawford's javascript trick, which I don't think is worth it.
For CDATA, it appears all we need to do is encode newlines and <, I think.
There is no need to differentiate text types in the templates.
Oh, and by the way, we should deploy the protective encoding systematically where it is needed (i.e., wherever we pass text containing these characters in URL parameters).
--
ThomasWeigert - 21 Mar 2005
Crawford, why do you say that there is no way for us to know which encoding we should produce? when we are filling in a CDATA field, we know it is a CDATA field, don't we? I can't think of an example where we do not. I'm thinking of edit putting the text inside the textarea. edit knows it is a PCDATA field, so it knows it should only protect the < character (or whatever; anyway no \n nor tabs). similarly, preview knows it is putting the text inside the value of a hidden input element, that is, it is passing CDATA, so it should also protect the other characters at risk (the \n and \t)... you surely have reasons for stating what you state, but I don't understand...
--
MarioFrasca - 21 Mar 2005
Mario, what you say above makes only partial sense in the context of twiki (or other web tools for that matter).
The edit script is generating the edit topic; I cannot do anything about the encoding of the textarea (other than using the javascript trick Crawford talked about earlier). The contents of the textarea (or any other input field, for that matter) are passed by the browser to the server (using the standard encoding rules defined for forms). We cannot intercept that.
The only control we have is when dealing with hidden fields, as these are generated by the scripts (and, of course, the initial values of the other fields, albeit there the concern is that it has to display right for the user).
Thus the focus has to be on
- encoding hidden fields properly, and
- not relying on browsers when assembling the resultant final text into what is saved or displayed.
Mario, it would be fruitful for this discussion if you familiarized yourself with the internal working of twiki. In particular, how data is passed between browsers, scripts, and servers throughout the view-edit-(preview)-save cycle.
--
ThomasWeigert - 21 Mar 2005
<rant mode>
Thomas, in an earlier contribution signed by me which you factored away, I wrote something along the lines of: we are here to make TWiki a better tool and the internet a better place. do we agree on this? so please lower the tone when you're replying to my posts, or avoid replying altogether. thanks.
</rant mode>
now to the point:
When we are talking about what the server passes to the browser, it knows what kind of data it is encoding. At least this is my understanding and this is the reason for asking Crawford why he states what he states, so Crawford please if you can explain what you meant, I'll read you with interest. as things are now, I too don't see the need for distinguishing PCDATA_TEXT and CDATA_TEXT in the templates.
More interestingly, when we're talking about what the client passes to the server, there in fact is a problem in the save script, where it does not know whether what it is receiving has been passed as the content of a textarea or as the value of a hidden input. (see my contribution in this very topic, version 1.4)
so actually my remark to Crawford is that it is on the other side that I see problems. I've been experimenting with a very small modification in the only place where TWiki uses new CGI... save expects a text PCDATA, which is what it does receive when the client posts the form built by the edit script. on the other hand, save can be called after the client posts the form built by the preview script, which holds a CDATA text. if we call the parameters differently, say 'hid_text' and 'text', a very limited change in UI.pm does the trick:
if( $ENV{'DOCUMENT_ROOT'} ) {
    # script is called by browser
    $query = new CGI;
+   if( $query->param( 'hid_text' ) ) {
+       $query->param( 'text' => &recoverFromCDATA( $query->param( 'hid_text' ) ) );
+   }
    # SMELL: The Microsoft Internet Information Server is broken with
    # respect to additional path information. If you use the Perl DLL
    # library, the IIS server will attempt to execute the additional
you could even think about looping over the whole post and recovering from CDATA all posted data whose name matches /^hid_/... in the remainder of TWiki we would only need to deal with the PCDATA encoding.
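For what it's worth, the recoverFromCDATA helper assumed by the patch above is hypothetical (it does not exist in TWiki); under the special-character encoding discussed in this topic it might look like:

```perl
# Hypothetical helper assumed by the UI.pm patch above: undo the
# protective encoding that preview applies before placing the text in
# the hidden field (the special characters discussed in this topic).
sub recoverFromCDATA {
    my( $text ) = @_;
    $text =~ s/%_N_%/\n/g;   # restore protected newlines
    $text =~ s/&quot;/"/g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&lt;/</g;
    $text =~ s/&amp;/&/g;    # last, so other entities survive
    return $text;
}
```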
well, this is it for today, keep it cool,
--
MarioFrasca - 21 Mar 2005
Mario, what this topic addresses is the problem that in many different places there are encodings applied to parameters passed from the browser to the server which are not consistent. See, for example, TWiki::Render::encodeSpecialChars; but then there are other places where these encodings are applied manually. E.g., the SectionalEditPlugin code which I inherited had a function which applied a totally different encoding, missing the protection of newlines but protecting many irrelevant things. In this topic we have discussed how we can make this encoding more consistent, and also what it really should be.
Your concern as expressed by the above suggestion appears to be to avoid applying a decoding unnecessarily. This is an appropriate performance concern, but it probably does not have much impact on overall TWiki performance. But you are right in the above suggestion: if we were to differentiate whether data came from the edit script (via savemulti) vs. the preview script, we could avoid applying the decoding step in the former situation, as the text is not encoded when it comes from the textarea input (it is PCDATA).
The other issue that we have been discussing, which belongs in the other topic, SomeBrowsersLoseInitialNewlineInTextArea, is that when data is being passed from the textarea, some browsers drop the leading newline. That behavior is what Crawford and I were concerned about when we were saying that you cannot prevent it from happening other than by resorting to javascript trickery.
--
ThomasWeigert - 21 Mar 2005
I'd still like to get to the point where we decide:
- What should the appropriate encoding be that is applied to hidden text?
- Can we replace all occurrences of this encoding by a standard function provided in TWiki?
- Should we differentiate in the save script where the data is coming from, to avoid one unneeded decoding?
--
ThomasWeigert - 21 Mar 2005
- We require hidden and textarea to be content-preserving
- Additional encodings can be applied to hidden (it's hidden, after all) but not to textarea.
- We can live with textfield (input type="text") stripping leading/trailing [\r?\n]
I can't see a solution to textarea other than JavaScript. The encoding used with hidden can be any reversible encoding. For example, s/([^ -~]|%)/'%'.sprintf('%02x',ord($1))/ge; (reverse: s/%([0-9a-f]{2})/chr(hex($1))/ge;).
The current encodings are a mess, and need sorting out. Some reasons:
- urlEncode encodes \n as <br />, for goodness' sake!
- encodeSpecialCharacters collapses \r's on either side of \n's!
- entityEncode only handles a subset of the 7-bit characters!
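For concreteness, a self-contained round-trip version of the reversible hidden-field encoding suggested above (the helper names are made up; the decode pattern expects the two lowercase hex digits that the %02x format produces):

```perl
# Reversible encoding for hidden-field values: every character outside
# printable ASCII, plus '%' itself, becomes %hh (two lowercase hex
# digits). Decoding reverses it exactly, so decode(encode(t)) == t.
sub hideEncode {
    my( $text ) = @_;
    $text =~ s/([^ -~]|%)/'%' . sprintf( '%02x', ord( $1 ) )/ge;
    return $text;
}

sub hideDecode {
    my( $text ) = @_;
    $text =~ s/%([0-9a-f]{2})/chr( hex( $1 ) )/ge;
    return $text;
}
```

Encoding '%' itself is what makes the scheme unambiguous: a literal %0a in the input becomes %250a on the wire and decodes back to %0a.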
--
CrawfordCurrie - 22 Mar 2005
Crawford, hold on: 4 questions in a row! what was the problem with textarea? you mean the loss of the first \n? as for SomeBrowsersLoseInitialNewlineInTextArea, hasn't that been sorted out? or did I miss anything?
--
MarioFrasca - 22 Mar 2005
It appears that the initial \n in textarea is sorted out; I thought we still had different behaviour with IE, but from your test results that appears not to be the case.
I've been looking at the encode/decode functions, and I think we need these:
- entityEncode - encode to HTML entities - entityDecode
- urlEncode - encode to %nn, e.g. %10 - urlDecode
However the existing implementations are crap, and need generalising.
The current uses of encodeSpecialChars can be replaced with:
- encodeCDATA($text, $hidden) - encode for CDATA - decodeCDATA($text)
The $hidden flag marks when the data is being encoded in a hidden field. In this case, it will add a unique byte sequence, e.g. 0xDEC0DE, to the start and end of the value and then urlEncode it. If decodeCDATA sees this byte sequence, it will apply the decoding. This technique needs to be applied evenly to $text and form values - anything where a hidden may be used.
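A hedged sketch of how that proposal could look (the marker value and the inlined url-style encoding are illustrative, not committed code):

```perl
# Sketch of the encodeCDATA/decodeCDATA proposal: when $hidden is set,
# bracket the value with a unique byte sequence (0xDEC0DE) and then
# url-encode it; decodeCDATA only decodes when it finds that marker.
my $MARKER = "\xDE\xC0\xDE";

sub encodeCDATA {
    my( $text, $hidden ) = @_;
    return $text unless $hidden;
    $text = $MARKER . $text . $MARKER;
    $text =~ s/([^ -~]|%)/'%' . sprintf( '%02x', ord( $1 ) )/ge;
    return $text;
}

sub decodeCDATA {
    my( $text ) = @_;
    my $decoded = $text;
    $decoded =~ s/%([0-9a-f]{2})/chr( hex( $1 ) )/ge;
    # only apply the decoding if the marker brackets the value
    if( $decoded =~ s/^\Q$MARKER\E(.*)\Q$MARKER\E$/$1/s ) {
        return $decoded;
    }
    return $text;   # was never encoded by us: pass through untouched
}
```

The marker is what would let save stay ignorant of whether the text arrived from the textarea (edit) or from a hidden field (preview).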
There is an outstanding question about the encoding used to protect characters in field data stored in topics. This currently uses a subset of encodeSpecialChars. I think we should increment the data format number and change this to entity encoding. What do you think? (The major impact of this would be that older versions of TWiki would not correctly read field values from data generated by this version.)
--
CrawfordCurrie - 22 Mar 2005
I think that I have to think about it... it all sounds quite reasonable, but I don't yet see where we are right now. ...I must experiment with the data stored in the topics; I'll be back.
about encoding for hidden: it really is encoding to CDATA, no?
--
MarioFrasca - 22 Mar 2005
Crawford, regarding your suggestions/question above:
- I do not think we should encode the textarea, as this requires the use of Javascript. In such a central area as the editing box we cannot rely on Javascript, I believe. That means we need to live with the leading \n being dropped.
- Thus the only encoding of URL parameters needs to be for hidden fields. These need to encode all characters that cannot be in CDATA, and also characters that may cause a mess otherwise, based on observation. I believe these are linefeeds, newlines, and <.
- We need to agree on the algorithm here, e.g., whether \n and \r are combined, etc.
- The "hidden" flag above is unnecessary, as no other input fields should be encoded (there is no way of sensibly doing that).
- The adding of byte sequences to the front and back seems unnecessary for hidden fields. You need to encode all the problem characters, even if observation teaches us that the browsers we looked at only appear to mess with the leading and trailing newlines. I cannot see anything in the W3C spec that would guarantee us that this remains the case.
Thus, with respect to parameters, it appears to me that the only thing to do is to find all the places where we apply such encoding and use a standard function.
Secondly, we need to do the same analysis for the encoding applied to protect rendered text, which is a second can of worms.
--
ThomasWeigert - 22 Mar 2005
Ah, were it that simple. When you are editing a topic and hit "change form", the $text is encoded in a hidden parameter for passing to the "change form" oops script. This text value is subsequently passed back to edit when you select the new form type. Without the hidden encoding, it would appear that the content of the editing textarea changed in mid flow as it lost its leading newline. If we take the stance that leading/trailing newlines are fair game everywhere, then this is a moot point. However, this feels like dodging the issue.
As for other characters in the hidden fields, entity encoding should suffice.
The entityEncode function is used for protecting rendered text in raw mode. It needs to be used for verbatim as well. I think those are the only two places that matter - unless someone else knows differently?
--
CrawfordCurrie - 22 Mar 2005
when you are editing a topic and don't see the "change form", you ask someone to help. the same you do when a native English speaker uses slang. or when a tuesday feels like monday.
--
MarioFrasca - 22 Mar 2005
Sorry, "Replace form" (or "Add form" if the topic doesn't have a form).
OK, despite an influenza-fuelled haze, I have performed the following experiment:
- Replaced all inline HTML with calls to CGI (e.g. CGI::start_form)
- Removed all encoding of CDATA and PCDATA (to let CGI deal with it). This involved moving the textarea out of the edit.tmpl into code, so I could leverage the CGI encoding.
- Deleted the methods encodeSpecialChars and decodeSpecialChars
I've only tried it on firefox so far, but it works perfectly (i.e. it works the same as it did before)
Interestingly enough, the move to using CGI for HTML composition has radically improved the readability of the code.
--
CrawfordCurrie - 22 Mar 2005
this sounds good... (?) why don't you put it somewhere we can test it with other browsers? if your influenza allows you... you know that my system can be at your disposition, if necessary. what does CGI-encoded CDATA look like? I'm curious. all right, more sleepy than curious, but still curious.
... if it is the CGI producing the textarea, does it include the extra leading \n?
--
MarioFrasca - 22 Mar 2005
Crawford, you have proposed two different things; this last one, letting CGI do all the work, sounds interesting...
the previous one: two different functions for encoding for CDATA and PCDATA. I would say that you apply first the one (toPCDATA) and, if the data has to go into a value attribute, you encode it with toCDATA; so in fact, no, there is no need for that second parameter. well, unless you want to have one single function and keep a second parameter to specify whether you are encoding for CDATA or PCDATA. I'm still just talking about the encoding at the server side. on the client side, I don't see the problem. the browser passes any CDATA as it has received it, and the server knows how to redecode it. after the data has been reduced to PCDATA, it can be redecoded... but enough about this; I'm a lot more interested in the CGI doing all the work... (laziness, what a good property)
--
MarioFrasca - 23 Mar 2005
CGI is now doing all the work (on DevelopBranch). Please watch it like a hungry owl, ready to swoop down on anything that's wrong. There were a number of mysterious and unexplained encoding/decoding steps that I never fully understood and have now removed; when we see it go wrong, we can recode them and this time explain them with a comment. (you can't make an omelette without breaking eggs)
--
CrawfordCurrie - 24 Mar 2005