poll can set both POLLIN and POLLHUP. We were checking for
POLLHUP first, which meant we often missed the "exit" command. This
meant the LightProcess would incorrectly report that it was exiting
because it lost its parent, rather than because it was told to exit.
Fixing this so it's easier to diagnose what is going on.
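
A minimal sketch of the ordering issue, not the actual LightProcess code: the names ChildEvent, classify and readLastCommand are hypothetical and exist only for illustration. It checks POLLIN before POLLHUP, so a command that was queued just before the writer closed its end is still read.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cstddef>
#include <string>

// Hypothetical outcome names, for illustration only.
enum class ChildEvent { Command, ParentGone, Nothing };

// Classify one poll() result, preferring readable data over hangup.
// poll() may report POLLIN and POLLHUP together, so test POLLIN first.
ChildEvent classify(short revents) {
  if (revents & POLLIN) return ChildEvent::Command;
  if (revents & (POLLHUP | POLLERR)) return ChildEvent::ParentGone;
  return ChildEvent::Nothing;
}

// Write a command into a pipe, close the write end, then poll the read end.
// Returns the command that was read, or an empty string if none was seen.
std::string readLastCommand(const std::string& cmd) {
  int fds[2];
  if (pipe(fds) != 0) return "";
  ssize_t n = write(fds[1], cmd.data(), cmd.size());
  close(fds[1]);  // read end now has data queued and a hangup pending
  std::string out;
  struct pollfd pfd = {fds[0], POLLIN, 0};
  if (n > 0 && poll(&pfd, 1, 0) > 0 &&
      classify(pfd.revents) == ChildEvent::Command) {
    char buf[64];
    ssize_t r = read(fds[0], buf, sizeof(buf));
    if (r > 0) out.assign(buf, static_cast<std::size_t>(r));
  }
  close(fds[0]);
  return out;
}
```

With the checks in the opposite order, the same revents value would be classified as ParentGone and the queued "exit" would never be read.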
Otherwise, the SIGCHLD handler will trigger a server shutdown.
The only time this is bad is when LightProcess::Close is called
from the segfault handler, because it triggers a server shutdown
while we're dealing with the initial crash, which almost always
leads to misleading core dumps.
I was learning from @jdelong and he said that you should use
double quotes for local includes and angle brackets for library
includes. I asked why our code was the way it was, and he said he wanted
to clean it up. I beat him to it :)
Conflicts:
hphp/runtime/base/server/admin_request_handler.cpp
hphp/runtime/vm/named_entity.h
I think the cause of our issues with random failed-to-exec
errors is this SIGCHLD handler. When the parent process asks a light
process to do a waitpid operation, we can now get EINTR for two
reasons. Signals used to interrupt it only when SIGALRM fired to
implement a timeout, but now SIGCHLD appears able to interrupt it as
well. This could lead us to report the waitpid as failing, which looks
like it can leave ExecFuture thinking running==false and reading the
exit code as -1.
Also, now that we have a SIGCHLD handler (are we sure we want to keep
this?), using waitpid() without an EINTR loop is probably similarly
broken in the parent process. (Also, isn't it basically incorrect to
use wait functions without an EINTR loop in general, though?) I
didn't add an EINTR loop in do_waitpid in this diff because it's using
signals for timeout (it really should be checking a volatile
sig_atomic_t that gets set by the handler, though).
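
A rough sketch of the EINTR loop idea, combined with the sig_atomic_t flag suggested above. This is not what do_waitpid does in this diff (it is left unchanged); g_timedOut, onTimeoutAlarm, waitpidRetry and runChildAndWait are hypothetical names, and installing the handler is left out.

```cpp
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical flag that a SIGALRM handler would set to request a timeout.
volatile sig_atomic_t g_timedOut = 0;

// Hypothetical handler; only async-signal-safe work: set the flag.
extern "C" void onTimeoutAlarm(int) { g_timedOut = 1; }

// Retry waitpid() when it is interrupted by an unrelated signal
// (SIGCHLD, for example), but stop retrying once the timeout flag is set.
pid_t waitpidRetry(pid_t pid, int* status, int options) {
  pid_t ret;
  do {
    ret = waitpid(pid, status, options);
  } while (ret == -1 && errno == EINTR && !g_timedOut);
  return ret;
}

// Fork a child that exits with `code` and return the exit code the parent
// observes, or -1 if forking or waiting failed.
int runChildAndWait(int code) {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) _exit(code);
  int status = 0;
  if (waitpidRetry(pid, &status, 0) != pid) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the flag is that EINTR alone no longer says why the call was interrupted; the loop only gives up when the timeout handler has actually run.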
Also, this means using Process::Exec is suspect in general. All uses
looked like debugger or tests only (and when hphpc runs hhvm in a
subprocess), so I didn't nuke them (yet). I deleted some random
network.h dead code that uses it, though.
This should only happen if there's a bug in our code and the
child process crashes, or if one gets killed by the OOM killer. In
either case, it's probably not safe for the parent process to continue
uninterrupted, so shut down.
g++-4.7.1 treats "FOO"bar as a C++11 literal operator, even
if bar is a macro with an expansion such as "BAR", so add a space
after the quote (this seems like a bug, and I fixed a bunch of these
a while ago, but we just added a slew of PRI*64 macros which break
under 4.7.1).
Also, it warned that "explicit by-copy capture of 'this' redundant"
for a lambda declared [=, this] - so I removed the this.
We also needed more than the 60 levels of template expansion that were
allowed by the makefile.
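
For illustration, the spacing fix for the PRI*64 macros looks roughly like this. formatId is a hypothetical helper, not code from this diff.

```cpp
#include <cinttypes>
#include <cstdio>
#include <string>

// Hypothetical helper showing the spacing fix for format macros.
std::string formatId(std::int64_t id) {
  char buf[64];
  // With a space, "id=%" and the expansion of PRId64 are two adjacent
  // string literals that get concatenated. Without the space, the
  // compiler may treat PRId64 as a user-defined literal suffix instead
  // of expanding the macro.
  std::snprintf(buf, sizeof(buf), "id=%" PRId64, id);
  return buf;
}
```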
Per @mwilliams' suggestion, this is the first stage in a staggered approach to replacing int64 with int64_t. More precisely, I inserted "typedef ::int64_t int64;" in util/base.h and dealt with the consequences.
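
A small sketch of what the typedef buys during the transition, assuming nothing about util/base.h beyond the quoted line. addLegacy is a hypothetical function standing in for existing code that still uses the old name.

```cpp
#include <stdint.h>
#include <type_traits>

// The alias quoted above: the old name now refers to the standard type.
typedef ::int64_t int64;

// Both names denote exactly the same type, so they can be mixed freely
// while call sites are migrated one by one.
static_assert(std::is_same<int64, ::int64_t>::value,
              "int64 must alias int64_t");
static_assert(sizeof(int64) == 8, "int64 must be 64 bits wide");

// Hypothetical legacy function still written against the old name.
int64 addLegacy(int64 a, int64 b) { return a + b; }
```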
This change is mostly for FB internal organizational reasons.
Building is not affected beyond the fact that the target now
lands in hphp/hhvm/hhvm rather than src/hhvm/hhvm.