The problem with password_hash()
PHP 5.5 introduced a new set of functions to hash and validate passwords in PHP: password_hash(), password_verify() and friends.
These functions have several things going for them:
- They have a great API.
- They solve a problem that is solved incorrectly often in PHP, making many PHP applications vulnerable.
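For readers who haven't used the API yet, here is a minimal sketch of the hash-and-verify round trip. The password strings are made up for illustration; everything else is the built-in API as it ships since PHP 5.5.

```php
<?php
// Hash a password with the default algorithm. The algorithm, cost and
// salt are all embedded in the returned string, so only the hash
// needs to be stored.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// Later, at login: check a submitted password against the stored hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('wrong guess', $hash);

var_dump($ok);  // bool(true)
var_dump($bad); // bool(false)

// If the default algorithm or cost changes in a later PHP release,
// this reports that the stored hash should be regenerated.
$stale = password_needs_rehash($hash, PASSWORD_DEFAULT);
```

Because the salt is generated for you and stored inside the hash, there is very little left to get wrong, which is a large part of why the API is pleasant to use.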
For projects where I’m able to require PHP 5.5 as a minimum version, I use these functions, and for projects that require PHP 5.4, I use the password_compat library, which implements the exact same API in PHP and does so quite excellently.
However, the initial introduction and RFC for these functions made me uneasy, and I felt like a lone voice against many in that I thought something bad was happening. I felt that they should not be added to the PHP engine.
I think that we should not extend the PHP engine when it’s possible to write the same API in userland, unless there are significant benefits to doing it in C, such as performance.
Since the heavy lifting of the password functions is done by underlying libraries that are already exposed to userland-PHP, it didn’t make sense to me to expose it as well in the core.
There are several drawbacks to writing things in C for PHP:
- The release schedule is tied to the release schedule of PHP. Lots of people can’t or won’t update their PHP, so if a vulnerability or bug is introduced in the functions, they have no easy option to upgrade.
- I’d argue that it’s easier to introduce buggy code in unmanaged languages such as C, as opposed to PHP.
- By adding it to the language, it also extends the ‘PHP specification’ and forces alternative implementations such as HHVM to duplicate the API, adding more work for everyone involved and also increasing the surface of potential vulnerabilities again in the future.
I can’t think of any technical reasons why it would not be better written in PHP. In fact, there is a PHP version that does the exact same thing, and is actually also recommended on php.net for older PHP versions.
So what are the non-technical benefits?
- Adding these functions to PHP may give them more legitimacy.
- Adding the functions to PHP perhaps gives them a broader audience and more visibility.
I think those benefits are perfectly valid, and especially considering that this is a security-related topic, probably outweigh the drawbacks of adding it to the PHP engine.
But it also illustrates that the PHP community has a problem. Python, a language in many respects similar to PHP, also comes with a large list of default modules containing APIs that Python developers can generally depend on.
The difference is that many of these modules are written in Python, and not C. Why are we not ‘eating our own dogfood’ in PHP? Perhaps PEAR was once that, but there’s no real replacement.
If code for PHP is required to be written in C to be considered legitimate and dependable, I think we need to admit we have a problem.
The article title is misleading, but other than that, I agree with your claim that implementing something as important as password_hash() in the core is a bad idea.
Good point on the title.
Lud Akell •
Man, this is exactly what I thought! This text shows up in Google searches about password_hash, but it doesn't necessarily have to do with the function. There is no problem with the function itself.
Michael D Johnson •
I wholeheartedly agree. I rather like Python's methodology here. I don't know about details in core, but I can't imagine it being hard to just preload some standard PHP library, a la autoload. If done right, it always exists in the opcache and it should be in some sort of read-only memory, shared among instances.
The upside of course is more widespread use of *proper* password hashing for such a crucial feature. And should it ever become insufficient, then upgrading PHP isn't the only option. With its clear-cut API, it's kinda trivial to provide a drop-in replacement. Monkeypatching works as well as `#define password_hash password_hash2`.
Python's bundled module system, btw, comes with its own set of issues. The clusterf*** that was urllib* is why it hasn't widely caught on for web apps. And that's only slowly being displaced by requests. Same goes for the original PEAR bundling. Most of the elected few core packages just had unlovely APIs.
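As a rough sketch of what a drop-in replacement could look like in PHP itself: unqualified function calls inside a namespace resolve to a namespaced function first and only then fall back to the global one, so a wrapper with the same name can shadow the built-in without touching call sites. The `App\Security` namespace and the minimum cost of 12 are assumptions invented for this example, not anything the core or a library prescribes.

```php
<?php
namespace App\Security; // hypothetical namespace, for illustration only

// Same name and argument order as the built-in, so code inside this
// namespace picks it up without any changes at the call site.
function password_hash($password, $algo, array $options = [])
{
    // Assumed policy: never allow a bcrypt cost below 12.
    // The 'cost' option only applies to bcrypt.
    if ($algo === PASSWORD_BCRYPT
        && (!isset($options['cost']) || $options['cost'] < 12)) {
        $options['cost'] = 12;
    }

    // Delegate to the real implementation in the global namespace.
    return \password_hash($password, $algo, $options);
}

// Resolves to App\Security\password_hash(), not the built-in.
$hash = password_hash('secret', PASSWORD_BCRYPT);
```

The resulting hash is still an ordinary bcrypt hash, so the built-in password_verify() keeps working on it.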
Evert, I'm disappointed in this. You mention the actual specific reason it was added to the core and then failed to address it. One correct implementation in the entire history of PHP isn't a very good track record. One of the primary reasons that PHP has a bad rap is because it makes it so hard to do security. It was added to the core because userland developers, except for a couple, failed at the task.
None of this is news to you.
One of the things I tried to address in this blog post, and did (probably more clearly) on twitter, is that I feel that adding the password functions to core is the best possible decision for PHP today. What I'm hoping is that a few years down the road, the implementation could have been written in PHP and be perceived as just as good.
So yes, I see the benefits as I pointed out in my article. The article was not specifically about the password functions, it's about the last sentence:
> If code for PHP is required to be written in C to be considered legitimate and dependable, I think we need to admit we have a problem.
Pádraic Brady •
I think Python is the most compelling point. PHP tried this, sort of, with PEAR, but it was kept outside in its own silo where it was technically optional, oversight was imperfect, and its distribution mechanism became its undoing.
What nobody would want to see, however, is having a PHP Standard Library that starts displacing anything more than what might be described as essentials to the language. We don't really want to leave behind all the gains we've made with packages and the rapid erosion of NIH and protectionism.
Imagine if PHP added some full featured PHP package to core, and then some standard came along that everyone is excited to adopt, and then all we hear for the next decade is "backwards compatibility" while it sits there like a lump ignoring all progress beyond its borders and largely preventing adoption of said standard. Which means writing alternatives beyond PHP and letting its Standard Library go rot.
That's pretty much the nightmare scenario, but that could work within certain restrictions to prevent PHP taking on monopolistic tendencies around an overreaching Standard Library.
I'm not sure it's necessarily about legitimacy and dependability, but probably more about accessibility. Put simply, adding something as important as *good* password hashing into the core means more people will get it right without the need to have the foresight to search for a 3rd party library.
Bear in mind that if a developer doesn't realise the importance of proper hashing, they probably don't realise that there are 3rd party libraries that will do a better job of it than they ever could - hence so many people are still using broken crypto like MD5 because it's there by default and it's easy to use.
I for one think it's a great idea. It doesn't matter how many years into the future you look, there will always be amateur coders who don't really know what they're doing, and providing them with the tools to do important jobs from the get-go is a step towards making the web a more secure place.
I wholeheartedly agree with everything you said. But my main problem is, why does it have to be in C when it could have been in PHP?
I think the current PHP project does not allow core extensions to be created in PHP, and I think that's a major flaw.
I would love to see PHP go in a direction where a function like password_hash is part of PHP, part of the php documentation, always available and written in PHP.
Some very good points here, it's changed the way I think about it.
The ongoing password hashing competition https://password-hashing.ne... will result in a new standard soon. One of the finalists, battcrypt, has been specifically designed to use an existing PHP primitive and hence work well in userland PHP. It is a shame that even if it is chosen it will probably be implemented in C to work with the password_hash API or else be seen as illegitimate.
An advantage of a large stdlib, which PHP has, is that you can get to a point where you can build something reliable without reinventing the wheel either on your own or by pulling in a bunch of userland packages, which might conflict dep-wise to the point that you're either forced to use (and maintain) older versions, or have to do like Guzzle did and swap namespaces so you can run the newer, BC-incompatible version alongside the older one.
Don't get me wrong, I love Composer and haven't written any PHP extensions (once PHP 7 drops I may get into that biz...who knows?). But in the particular case of password_* being able to point to a best practice in core, "Make sure you have a reasonably recent runtime and then just use this function," is a huge step forward security-wise for the language that powers a huge portion of the web, dev'd on by folks who vary experience-wise.
Another example here is the RFC for a reliable, real CSPRNG in core, because too many people are using rand(), shuffle(), mt_rand() and uniqid() improperly. You can pull in randomlib...or you can point people to use a vetted function, with polyfills available for older versions, that is a sound building block.
On the other side of the issue are things that are amazing when standardized, but are not matters of flat-out security so you can implement them in PHP or as an extension (the latter for performance). The PSR-7 HTTP message interface is a solid example of this. No need to put it into core because folks who want it will opt into it, and folks who don't aren't doing something horribly insecure by default (SQLi can happen in PSR-7 too). Same thing for the PSR-3 logging interface (in contrast to PHP's internal error and exception handling system, to which new levels have been added in recent history, which is something you want to be able to rely on).
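To make the "vetted function" idea concrete, here is a small sketch assuming PHP 7 or later, where the CSPRNG functions shipped as random_bytes() and random_int(). The token length and PIN range are arbitrary choices for the example.

```php
<?php
// 16 bytes from the operating system's CSPRNG, hex-encoded so the
// result is safe to put in a URL or cookie (32 hex characters).
$token = bin2hex(random_bytes(16));

// A uniformly distributed integer in an inclusive range, suitable for
// things like one-time PINs. Unlike mt_rand(), it is not predictable
// from earlier outputs.
$pin = random_int(0, 999999);

printf("token=%s pin=%06d\n", $token, $pin);
```

Neither call needs seeding or configuration, which is the point: the safe option is also the shortest one to type.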
By the way, Composer (which is itself in userland and is aggressively updated) seems to me to be a solid replacement for PEAR. Maybe I'm missing something though.
In sum, while I agree with the last sentence of your post, please pick on some other piece of core. Heck, pick on uniqid(); I think it fits your description of "why should this be in core" a lot better than password_hash() and password_verify() unless I'm mistaken.