JavaScript's escape and encodeURI vs. PHP $_POST

I just stumbled upon an odd encoding issue with a web application.

Basically, data comes into our PHP application through JavaScript's XMLHttpRequest (ajax). The data is sent with the standard form encoding (application/x-www-form-urlencoded) and picked up by PHP through the $_POST array. Any string in a form POST request is 'urlencoded', also known as percent-encoding. As an example, this will turn a space into the often-seen %20.

Normally everything in the $_POST and $_GET arrays is already decoded, so when you're dealing with these arrays you don't really have to think about this. This time, however, I was dealing with some non-Latin Unicode characters, and for some reason they were never decoded and ended up in the database as raw url-encoded strings.

Doing a bit of research led me to the following: normally any special character is encoded as %XX, where XX is two hexadecimal digits. Each such pair simply represents one byte. The values I got were different altogether and took the form %uXXXX. I just assumed this was part of standard URI-encoding for Unicode characters, so I was a bit surprised to see that PHP didn't just pick them up.
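To make the difference between the two forms concrete, here is a minimal sketch that builds both by hand. The helper names `percentEncodeUtf8` and `percentUEncode` are made up for this illustration, and unlike the built-in functions they encode every character, including plain ASCII. The first writes out the UTF-8 bytes of the string one by one; the second writes out the 16-bit code units that JavaScript uses internally.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string as %XX.
function percentEncodeUtf8(str) {
  const bytes = new TextEncoder().encode(str); // UTF-8 bytes of the string
  return Array.from(bytes, function (b) {
    return "%" + b.toString(16).toUpperCase().padStart(2, "0");
  }).join("");
}

// Hypothetical helper: write every UTF-16 code unit of a string as %uXXXX.
function percentUEncode(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    out += "%u" + str.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0");
  }
  return out;
}

console.log(percentEncodeUtf8("☢")); // %E2%98%A2 (three bytes)
console.log(percentUEncode("☢"));    // %u2622 (one code unit)
```

The %XX form is a byte sequence that any url-decoder can turn back into bytes, while the %uXXXX form only makes sense to a decoder that knows about this particular syntax.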

After a bit more research, I found out that the %u representation was rejected by the W3C, which is probably also why the PHP authors decided not to implement it. JavaScript actually has two different methods to do percent-encoding, namely:

escape("☢"); // returns %u2622
encodeURI("☢"); // returns %E2%98%A2

Guess which one we were using?

Even though the %u syntax is arguably better at representing Unicode characters, the W3C seems to have voted against it for backwards compatibility reasons. Before this happened, the escape method had already been adopted in JavaScript, which in turn caused me to stumble upon this problem and write an article about it.
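One way to avoid the problem is to build the POST body with encodeURIComponent instead of escape. The sketch below is only an illustration under my own assumptions: `toFormBody` is a hypothetical helper, and the field names `symbol` and `note` are made up. It uses encodeURIComponent rather than encodeURI because encodeURI leaves characters such as & untouched, which would break a form body.

```javascript
// Hypothetical helper: turn a flat object into an
// application/x-www-form-urlencoded request body.
function toFormBody(fields) {
  return Object.keys(fields)
    .map(function (key) {
      // encodeURIComponent emits UTF-8 bytes as %XX and also escapes & and =
      return encodeURIComponent(key) + "=" + encodeURIComponent(String(fields[key]));
    })
    .join("&");
}

// Made-up example fields.
const body = toFormBody({ symbol: "☢", note: "a&b c" });
console.log(body); // symbol=%E2%98%A2&note=a%26b%20c

// For comparison: encodeURI does not escape the ampersand.
console.log(encodeURI("a&b")); // a&b
```

The resulting string can be passed to XMLHttpRequest's send() together with a Content-Type header of application/x-www-form-urlencoded. Spaces come out as %20 rather than the + that browsers use for forms, but both are valid percent-encoding for a space.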

The more you know..

Comments

  • Rob

    Wow, imagine that, once again a web developer is completely screwed because of stupid people who use stupid characters in their own special stupid languages! Here in Belgium, with two languages, we deal with these sorts of problems all the time. That's why next to my keyboard on my desk I have a big red button labeled: "Kill all french speaking people and their stupid language". I press it at least once per day. If I close my eyes and see the nukes going off I actually start to feel better! Regardless, when sending text to PHP from JS via Ajax there are some weird encoding issues. I remember having this issue with the pipe character '|'. I thought that in addition to encodeURI there was also another encodeXXX function floating around that did the correct thing.
  • Evert

    Well, I think the original character sets were rather discriminatory against non-English languages. This isn't just about a few accented characters from French, but includes Chinese, Japanese, etc. Ignoring that there are people who don't just use the 26-letter alphabet kinda reduces your market.
  • Geoff

    Great information... good article... but what's with link to the article... in the article? Seems a bit loopy (pun intended) and redundant.
  • Evert

    Yea sorry about that.. I was a bit tired and it made me giggle. I was hoping it would throw people off :)
  • Rene Pot

    Might come in handy someday! Thanks!
  • Md. Arif Ul Hoque

    Thanks for great post.
  • Ben Klang

    From my reading of encodeURI() it does encode most characters, but not several important ones. Perhaps most importantly, encodeURI() does not encode ampersand (&) where escape() does. For at least my uses, this makes encodeURI() not a drop-in replacement for escape. However, there is encodeURIComponent() that does cover the remaining characters. So far I think this is a winner, but perhaps someone will correct me.