Unicode nearing 50% of the web

According to a recent post on the Google Blog, Unicode is nearing 50% uptake on the web. A rather steep graph as well:

[Graph: Unicode uptake on the web]

This is pretty good news. I've had the 'pleasure' of working on a number of integration projects where the third party was still using ISO-8859-1 (aka Latin-1). When that is the case, it's usually not by choice but because of their software's default settings (browsers, MySQL, etc.). I for one hope non-Unicode charsets will soon be a thing of the past.

One other note in the post was about ligatures, such as ﬁ and the Dutch ĳ. If this is the first time you've heard about these, you might be surprised to see that you can (likely) only copy-paste ĳ as a whole, and not just the i or the j. It's one Unicode character, not two. It just made me wonder: what kind of software would generate these, and more importantly, why?
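
To see the "one character, not two" point for yourself, here is a small Python sketch. It is not from the Google post; the describe helper is a made-up name for illustration, and the two code points are written as escapes (U+FB01 for the fi ligature, U+0133 for the Dutch ij ligature) so they survive copy-pasting. It uses the standard unicodedata module to show that each ligature is a single code point, and that NFKC compatibility normalization splits it back into two plain letters.

```python
import unicodedata

# The two ligatures mentioned above, written as escapes on purpose.
FI_LIGATURE = "\ufb01"  # LATIN SMALL LIGATURE FI
IJ_LIGATURE = "\u0133"  # LATIN SMALL LIGATURE IJ


def describe(ch):
    """Hypothetical helper: summarize a character before and after NFKC."""
    folded = unicodedata.normalize("NFKC", ch)
    return {
        "name": unicodedata.name(ch),
        "code_points": len(ch),          # 1 for a ligature
        "nfkc": folded,                  # plain letters after folding
        "nfkc_code_points": len(folded), # 2 after folding
    }


for ch in (FI_LIGATURE, IJ_LIGATURE):
    print(describe(ch))
```

Note that the canonical forms (NFC and NFD) leave these characters alone; only the compatibility forms (NFKC and NFKD) fold them. So if you want a search for "ij" to match text containing the ligature, NFKC is probably the form to normalize with.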