Well, after months of instability with my hunchentoot-based webserver, I finally, once again, got around to trying to figure out what the source of the instability was. I had come to blame SBCL's sb-ext:run-program functionality, as I was able to crash the server fairly reliably using apachebench. I was also seeing sporadic crashes, somewhat randomly, after the server had been up for a week or so. So, this was a pretty strong hint that it had something to do with sb-ext:run-program. Folks who were much more knowledgeable than I about the SBCL internals, including Francois-Rene Rideau and Gabor Melis, looked at cleaning up possible sources of race conditions and generally robustifying sb-ext:run-program, but none of the fixes seemed to make the situation better.
Compounding my difficulties was the fact that I was running the server on FreeBSD, which doesn't see quite the level of SBCL testing/hacking that, say, Linux does, so I thought it possible that there was a bug either in the way SBCL handles signals on FreeBSD or in FreeBSD itself. Finally, I got around to roughly replicating my setup on another computer, in this case a MacOS box which, when subjected to the same stressful conditions, gave me a helpful error message that said something about being unable to open a pipe, or perhaps that there were too many open pipes.
This got me thinking "wait a minute, I'm just calling the program via sb-ext:run-program and getting a stream to read data back from the program; who's closing the stream and getting rid of the process?" Then it dawned on me that perhaps nobody was, and that these processes were sticking around, consuming scarce resources like pipes and, eventually, causing the server to crash. Sure enough, waiting for the process to finish and then closing the process cleared up my problem.
I should point out that SBCL's sb-ext:run-program has an argument that seems relevant here, which is the :wait argument. One can specify :wait t, which will wait until the process has finished. This seemed to work in some cases, but fail in others. Eventually, it occurred to me that it was failing in the cases where the output was larger than in the cases where it was succeeding. I think what was going on was that the external program was writing data to the stream until it filled up some buffer, at which point it blocked waiting for the data to be read, which wasn't going to happen until after sb-ext:run-program returned, which in turn was waiting for the process to finish. There could be something else going on here, but it seems to me that :wait t, while somewhat in spirit what I want, isn't going to do it for me. In this case, I'm just launching a process and expecting to get some data back from it; this isn't, say, a window manager that's going to live on for the life of the SBCL process, or beyond. Since :wait t didn't do what I needed, I was back to :wait nil. Now that I had figured out I needed to close the process, I came up with:
(defmacro with-input-from-program ((stream program program-args environment)
                                   &body body)
  "Creates a new process running PROGRAM, using PROGRAM-ARGS as the
list of arguments to the program and ENVIRONMENT as its environment.
Binds the STREAM variable to an input stream from which the output of
the process can be read and executes BODY as an implicit progn."
  #+sbcl
  (let ((process (gensym)))
    `(let ((,process (sb-ext:run-program ,program
                                         ,program-args
                                         :output :stream
                                         :environment ,environment
                                         :wait nil)))
       (when ,process
         (unwind-protect
              (let ((,stream (sb-ext:process-output ,process)))
                ,@body)
           ;; Cleanup runs even if BODY unwinds: wait for the child to
           ;; exit, then close its streams to release the pipes.
           (sb-ext:process-wait ,process)
           (sb-ext:process-close ,process)))))
  #-sbcl
  `(error "Not implemented yet!"))
which I can use a la with-input-from-string to read the data from the
external process:
(with-input-from-program (in path nil env)
  (loop for line = (chunga:read-line* in)
        until (equal line "")
        do (destructuring-bind
                 (key val)
               (ppcre:split ": " line)
             (setf (hunchentoot:header-out key) val)))
  (let ((out (flexi-streams:make-flexi-stream
              (tbnl:send-headers)
              :external-format tbnl::+latin-1+)))
    (copy-stream in out 'character)))
and now the server seems a lot happier.
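For what it's worth, the ordering that seems to matter (read the output first, then wait, then close) can also be written as a plain function. This is only a rough sketch under a few assumptions: program-output-lines is a made-up name, not something SBCL provides; "/bin/ls" in the usage comment is just an example path; and it relies on my guess above that the child can only exit once its output has been read.

```lisp
;; Sketch only: PROGRAM-OUTPUT-LINES is a hypothetical helper, not part of SBCL.
#+sbcl
(defun program-output-lines (program args)
  "Run PROGRAM with the list ARGS, returning its output lines and exit code."
  (let ((process (sb-ext:run-program program args
                                     :output :stream
                                     :wait nil)))
    (when process
      (unwind-protect
           (let ((lines (loop with in = (sb-ext:process-output process)
                              for line = (read-line in nil)
                              while line
                              collect line)))
             ;; The output has been read to end of file, so the child is
             ;; no longer blocked on a full pipe and waiting is safe.
             (sb-ext:process-wait process)
             (values lines (sb-ext:process-exit-code process)))
        ;; Always close the process streams so the pipes are released.
        (sb-ext:process-close process)))))

;; Example call (PROGRAM must be a full path unless :search t is passed):
;; (program-output-lines "/bin/ls" '("-l" "/"))
```

The macro above leaves the reading to its body; this version does the reading itself, which makes it a little harder to wait on a process whose output nobody has drained.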



