I just stumbled upon an odd encoding issue with a web application.
Normally everything in the $_POST and $_GET arrays is already decoded, so when you're dealing with these arrays you don't really have to think about this. This time however, I was dealing with some non-latin unicode characters and for some reason they were never decoded and ended up in de database as raw url-encoded strings.
Doing a bit of research led me to the following: normally any special character is encoded as %XX, X and X being 2 hexadecimal values. These values simply represent bytes. The values I got were different altogether and took the form %uXXXX. I just assumed this was part of standard uri-encoding for unicode characters, so I was still a bit shook-up to see that PHP didn't just pick them up.
escape("☢"); // returns %u2622 encodeURI("☢"); // returns %E2%98%A2
Guess which one we were using?
The more you know..