Motivation
Since Perl 5.8, every string carries a UTF-8 flag that is either on or off. TWiki has processed data with the UTF-8 flag off for the most part. Meanwhile, recent versions of the CGI module assume that the UTF-8 flag is set correctly, that is, that a UTF-8 string has the UTF-8 flag turned on. Without decode_utf8(), character corruption happens in various cases.
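To make the flag behavior concrete, here is a minimal sketch using only the core Encode module. It is not taken from TWiki code; the variable names are illustrative. It shows a byte string with the flag off, the same text after decode_utf8() with the flag on, and the kind of corruption that can appear when the two are mixed.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Raw UTF-8 bytes for "é" (U+00E9), as they might arrive from a file or a request.
my $bytes = "\xC3\xA9";

# Byte string: the UTF-8 flag is off and length() counts bytes.
printf "bytes: flag=%d length=%d\n", (utf8::is_utf8($bytes) ? 1 : 0), length($bytes);

# decode_utf8() returns a character string with the UTF-8 flag on.
my $chars = decode_utf8($bytes);
printf "chars: flag=%d length=%d\n", (utf8::is_utf8($chars) ? 1 : 0), length($chars);

# Mixing the two: Perl upgrades the byte string as if it were Latin-1,
# so the undecoded "é" becomes two separate characters (U+00C3, U+00A9).
my $mixed = $chars . $bytes;
printf "mixed: length=%d\n", length($mixed);

# Encoding the mixed string for output double-encodes the undecoded part.
my $out = encode_utf8($mixed);
print "mixed bytes: ", join(' ', map { sprintf '%02X', ord } split //, $out), "\n";
```

The last line prints six bytes rather than the four one would expect for two copies of "é", which is one form the character corruption can take.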
Description and Documentation
So far, TWiki::Render and TWiki::Form call decode_utf8() (only if the charset is UTF-8). This is insufficient for non-ASCII characters in TWiki Forms to be handled correctly; TWiki::Form::* and TWiki::UI::Save need to call decode_utf8() as well. Though this is a relatively small change, it changes the TWiki core's behavior significantly enough that plugins such as AutoSectionsPlugin need to know whether the additional decode_utf8() calls are present, so that they don't cause character corruption. So $TWiki::Plugins::VERSION needs to be changed to '6.11' (so far it's '6.10').
Examples
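A plugin could branch on the version number roughly as sketched below. This is only an illustration under the assumptions of this proposal: core_decodes_form_data() is a hypothetical helper, not part of TWiki or AutoSectionsPlugin, and the 6.11 threshold is the value proposed above. In a real plugin the argument would be $TWiki::Plugins::VERSION.

```perl
use strict;
use warnings;

# Hypothetical helper (not a TWiki API): returns 1 when the given plugin API
# version is at least 6.11, the version this proposal assigns to a core that
# has the additional decode_utf8() calls, and 0 otherwise.
sub core_decodes_form_data {
    my ($api_version) = @_;
    return 0 unless defined $api_version;
    return $api_version >= 6.11 ? 1 : 0;
}

# Stand-ins for $TWiki::Plugins::VERSION on an older and a newer core.
for my $version ('6.10', '6.11') {
    if (core_decodes_form_data($version)) {
        print "$version: core already decodes form data; do not decode again\n";
    }
    else {
        print "$version: core leaves form data as bytes; plugin must decode\n";
    }
}
```

The comparison is numeric, so '6.10' is treated as 6.1 and falls below 6.11.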
Impact
Implementation
--
Contributors:
Hideyo Imazu - 2019-10-31
Discussion
This looks good.
It would be helpful to add some documentation in the release notes. I created a placeholder: TWiki.TWikiReleaseNotes06x02.
--
Peter Thoeny - 2019-11-01
I implemented and tested it, then realized that it doesn't work well. To support Unicode characters well, we have no option but to handle Unicode characters internally (= turning on the UTF-8 flag). This is no easy task. I hope I will be able to come up with a plan.
--
Hideyo Imazu - 2019-11-07