December 03, 2008

Forking and MySQL connections

For some of our long-running processes we use PHP. It makes total sense from our perspective, because we can re-use all our existing business logic from our main PHP web application.

To make things more efficient, I recently started working on forking off a couple of worker processes.

This application is essentially the core of our transcoder. The parent process retrieves new jobs from the queue and fires up a number of workers to actually transcode the files. The main problem is that the parent opens a MySQL connection and fires off some queries before forking, so every child inherits that connection. When a child process is done, it closes the MySQL connection, regardless of whether the child ever used it, and the parent is left with a dead connection.

This means I'll have to close all MySQL connections before forking and reconnect right after. No big deal, but still a bit annoying.

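<?php

$db = new MySQLi('hostname', 'user', 'password');

$pid = pcntl_fork();

if ($pid === -1) {

    // pcntl_fork() returns -1 on failure, which would otherwise count as "parent"
    die('Could not fork');

} elseif ($pid) {

    // parent: wait for the child to exit, then try to use the connection
    $status = 0;
    pcntl_wait($status);

    $result = $db->query('select version()');
    if ($db->error) echo $db->error;

} else {

    // the child process does nothing and exits gracefully,
    // closing the inherited connection on the way out

}

?>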

Output:

MySQL server has gone away
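
The close-before-fork, reconnect-after workaround could be sketched roughly like this. It is only a sketch, not the actual transcoder code: connect(), fork_worker() and run_job() are made-up names, the credentials are the same placeholders as above, and it assumes the mysqli and pcntl extensions on a PHP version with scalar and void type declarations (7.1 or later).

```php
<?php

// Hypothetical connection factory; host and credentials are placeholders.
function connect(): mysqli
{
    return new mysqli('hostname', 'user', 'password');
}

// Hypothetical helper: fork, run $work in the child and exit,
// return the child's PID to the parent.
function fork_worker(callable $work): int
{
    $pid = pcntl_fork();

    if ($pid === -1) {
        throw new RuntimeException('Could not fork');
    }

    if ($pid === 0) {
        // child: do the work, then exit without returning to the caller
        $work();
        exit(0);
    }

    return $pid;
}

// Hypothetical parent routine: query, close, fork, reconnect.
function run_job(): void
{
    $db = connect();
    $db->query('select version()');  // stand-in for reading the job queue

    $db->close();                     // close before forking

    $pid = fork_worker(function () {
        // the child would transcode here, opening its own connection if needed
    });

    $db = connect();                  // reconnect right after forking

    $status = 0;
    pcntl_waitpid($pid, $status);     // reap the child

    // this connection was never shared with the child, so it is still usable
    $result = $db->query('select version()');
    echo $result->fetch_row()[0], "\n";
}
```

Calling run_job() against a reachable server should then print the server version rather than the error, because the child never gets a copy of an open connection to close.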

Comments

kvz • Dec 02, 2008
In PEAR I have created System_Daemon (http://pear.php.net/package/System_Daemon), which may help you fork your processes, log to files and generate init.d files. This does not fix your MySQL issue, but may at least ease up the forking.
Evert • Dec 02, 2008
I'll definitely have a look, thanks!
lfx • Dec 03, 2008
One dirty solution is not to close the connection; MySQL drops it by itself after some idle time. So why bother? :)
Evert • Dec 03, 2008
Because I need a connection :) even after forking
Andrey • Dec 03, 2008
In my opinion this is not a MySQL specific problem but a PHP one. You create a resource or object in the parent and it gets copied in the child. Once the child finishes Zend will call destructors. Can you try that with files or sqlite DBs?

Andrey
Evert • Dec 03, 2008
It's actually explicitly mentioned in the MySQL manual.

http://dev.mysql.com/doc/mysql/en/gone-away.html
Dave • Dec 03, 2008
Do you absolutely HAVE to have an existing connection BEFORE you fork the process? I would avoid that at all costs from my own experience, where the connection is not copied to the child process. The same applies to any open resources as well. I think it may be that part of PHP's PCNTL / POSIX layer is broken. Another thing: if you register a destructor before forking, it will be called twice - by the child and the parent - so you can get very weird results sometimes.

Best practice (from my own experience) is to fork your process and THEN do the processing once it has forked.

Finally; don't forget to: declare(ticks=1); and register a signal handler; otherwise your only way of killing the script will be a "kill -9 PID" with no chance of graceful exiting. Which is very likely not what you want!
Evert • Dec 03, 2008
Hi Dave,

I do very much need the MySQL connection. In essence I have one managing process that monitors a Queue from MySQL. As soon as new jobs come in, it will fire up workers. These workers stay around for a limited amount of time, and die when they are done.

So the processing does happen in the child-process, I just need (minor) mysql queries before they get started.

I personally avoid destructors altogether, I haven't yet found a case where I really needed one because there is no way to explicitly destroy an object.

As for the signal handling.. that's interesting. I do get zombies if I don't properly do pcntl_wait(), but I have no problem just killing them when the parent process is also gone. I personally just really dislike 'ticks', and they are also considered deprecated since 5.3. I believe the alternative is the non-blocking signal checkers also introduced in 5.3.
Dave • Dec 03, 2008
Then how about instead: forking the process into the "master" and then the master forking child processes as needed once it itself has been daemonised? i.e. calling pcntl_fork() twice. That way you can set up your monitoring environment and then launch your child processes. Or you could launch the child processes as CLI scripts that are simply controlled via the daemonised script? That is a method I've used in the past.

I have not looked at PHP 5.3 yet, but previously ticks were required; without them you could not set a custom signal handler and were forced to rely on the defaults, which meant no clean way of killing your script - hence my comments!

Looks like I will have to go do some reading up on PHP 5.3's PCNTL / POSIX changes.
Dave • Dec 03, 2008
Follow-up and going kind of off-topic too (sorry):

According to the PHP manual pcntl_signal relies on ticks to attach a custom signal handler (http://ca.php.net/manual/en/function.pcntl-signal.php) but ticks are being removed (just as you said) as of PHP6 (http://ca.php.net/manual/en/control-structures.declare.php#control-structures.declare.ticks) as declare() is being repurposed to set the character encoding.

So in that case, how on earth do you attach a signal handler?? Unless I am missing something very obvious, as of PHP6 there will be no ability to do this - unless pcntl_signal() is updated - but there are no examples or updates to the docs reflecting the state of declare(ticks) or what the "new" method is.
Evert • Dec 03, 2008
Using PHP 5.3 you can check for signals in a non-blocking fashion.. Ticks in general are a bit of a hack, and a little nasty I feel :). Sleep()ing the master and checking for signals every x seconds will work fine for my purposes. Quite frankly, I'm not monitoring for signals at all at this point. For my purposes, I have trouble finding a reason to do this.. but maybe I'm completely missing something.
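
For reference, the sleep-and-check approach could look roughly like this. It's just a sketch: run_master() and $checkQueue are made-up names, and it assumes the pcntl extension with pcntl_signal_dispatch() (PHP 5.3 and up; the type declarations need 7.1).

```php
<?php

// Hypothetical master loop: runs $checkQueue until SIGTERM arrives.
function run_master(callable $checkQueue): void
{
    $running = true;

    // Register the handler; no declare(ticks=1) needed.
    pcntl_signal(SIGTERM, function (int $signo) use (&$running) {
        $running = false;
    });

    while ($running) {
        $checkQueue();            // look for new jobs, start workers, etc.

        sleep(1);                 // may return early if a signal arrives

        // Run the handlers for any signals received since the last check.
        pcntl_signal_dispatch();
    }
}
```

The handler only flips a flag, so the loop finishes its current pass and the master exits at a point of its own choosing.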

As to your other point, I would be able to fork the 'master', but then I'd need to set up some IPC/semaphore stuff to communicate the jobs..

That all being said, disconnecting/reconnecting to MySQL is really cheap. It also happens for every single HTTP request. I don't really have any issues doing this, so it's not much of a concern for me right now :)
Geoff • Dec 03, 2008
In my experience, using popen() or proc_open() works much better than forking. Yes, it's a slightly more expensive operation, but you don't have to worry about the child destroying the parent's resources.
MagicalTux • Dec 06, 2008
I personally had to fight this, and found the best solution is to close all MySQL connections after a fork.

I have a SQL object with a forked() static method that will close all open SQL handles and notify all existing SQL instance objects.

When another query is run, each instance will automatically reconnect on an "as needed" basis.
Evert • Dec 06, 2008
Our MySQL class also only makes the connection upon hitting the first query.

However, I close all connections right before I fork.. Seemed to be the best way to go about it.