Previously it returned a bool to say whether or not it had
succeeded, but always initialized the iter. The other iterators
branch on failure.
This adds the branch target which brings it closer to the other
iterators, and avoids the need to do a (pointless) CIterFree on
the failure path just to keep the validator happy.
I've also added a translation for it.
This was barely a demonstrable win in sandbox mode when I first wrote
it and it's not even used in hhir. It's probably been bitrotting and is causing
crashes for some people. Time to say goodbye.
Most of the SpillStacks we do are just to keep the in-memory VM stack
clean in case a helper call throws an exception. Many of these exceptions are
vanishingly rare, so let's stop making the fast path slower for them. This diff
adds support for catch traces, which are just like normal exit traces with one
major exception: they're never jumped to from translated code. If we're
unwinding a TC frame and discover that the current rip has a catch trace
registered, the unwinder will execute it before resuming unwinding.
The unwinder parts were fairly straightfoward, then I ran into a bunch of
issues with SetM. The SetElem instruction sometimes consumes a reference to one
of its inputs, which is bad news for our optimizations. To make everything work
again, I changed SetElem to throw an exception in cases where it would've
previously decreffed the value input. This is a special exception that the new
unwinding code recognizes. When one of these is caught, the catch trace
executed for that TC frame will finish up the vector instruction and push the
value provided by the exception on the stack. This allows us to optimize most
traces under the assumption that SetM's output is the same as its input,
letting the catch trace clean things up if the helper decides that's not the
case and throws.
There is one common case where SetM's output isn't the same as its input:
setting an offset in a string base returns a new String. Luckily, we can detect
most of these at compile time so SetElem returns a new StringData* when the
base is known to be a string. If the base might be a string, SetElem will
return nullptr if the base wasn't a string, and a StringData* if it was. We
test the output in these cases and side exit if the return value of SetElem is
non-null (I have yet to see this side exit happen outside of artificially
created test cases).
Once I got things working in the vector translator, I went through every other
call to spillStack and exceptionBarrier, replacing them with catch traces as
appropriate.
Functions like array_filter and array_map need "withRef" semantics,
and while it can be simulated in php via copy-on-write, doing so
is very inefficient.
This adds
SetWithRefLM
SetWithRefRM
- similar to SetM but binds the value if it was a reference. L reads a local, R reads a return value
WIterInit
WIterInitK
WIterNext
WIterNextK
- essentially the same as the corresponding opcodes without W, but the value local is set by reference if the array element was a reference.
Will be needed for array_filter/array_map etc
This sets things up so that if we define a builtin in systemlib, we rename
the corresponding c++ builtin with the prefix __builtin_, so its still available
(in case the php builtin wants to delegate some edge cases, and to make
it easy to run comparisons between the php and c++ implementations).
Also did a little reorganization to get rid of Func::isPHPBuiltin,
and use an Attr to identify functions as builtins. C++ builtins can
still be identified by checking the Func::info() method. This is needed
to allow builtin methods defined in php (such as array_map) to lookup their
arguments in the correct context.
The change to stop zeroing locals during RetC was too
aggressive---we stopped doing it during unwinding also. When
unwinding, we need to zero locals and $this because another
destructing object may still run debug_backtrace, and we won't have
the *pc == OpRet{C,V} trick to tell us to ignore the junk.
Make sure m_received is always null at ContSend, ContRaise,
and ContNext, and this information to generate better code for these
bytecode instructions.
This diff changes HHVM so that arrays are considered to implement the
Traversable and KeyedTraversable interfaces (which have no methods).
The idea is that these interfaces will be useful for parameter type
constraints for PHP code that wants to be compatible with both arrays and
collections (and possibly other objects that implement these interfaces).
The Func*'s in the native func unit are all persistent, which means
that we can strip them, and skip the Unit::merge call on every
request. But we didnt set the flag to compact the unit if the only
things that needed stripping were Func*'s. Fixed that, and also fixed
it so all the builtins are marked persistent even in non-repo-auth
mode (funcs are not if rename function is enabled, because we implement
fb_rename_function by swapping target cache entries).
This adds a new trace macro to allow tracing to the ring buffer in release builds. It also performs normal tracing, too, like TRACE(n, ...). Converted a number of trace/log messages in the debugger to the new macro so we have this data when we get a core file. I've converted things that we've found useful lately, but this will be adjusted over time quite a lot as we discover new things that help us find problems more quickly, or find messages that turn our to be useless or spammy.
I'm committing my work on HphpArray in bite-sized pieces for easier review. This piece replaces calls to NEW with a factory method. There are many problems with operator new, starting with the fact that the allocator cannot communicate properly with the constructor.
The newly introduced factory method should return ArrayData but that causes many issues right now, so I left that step to a future diff.
This diff introduces a new "eager" vmreganchor, which we call
from some of the heaviest users of vmreganchor. For an eager
reganchor, we save the rbp, pcOff and spOff at the call site
in the MInstrState, and the reganchor itself reads it off
from the MInstrState (instead of following the rbp chain
and looking up the hash table). Currently, the set of functions
which use eager vmreganchor is based on profiling.
Fix a sigfault when we eval a snippet of PHP for the debugger. If there is no activation record (between the end of a request and PSP, before first function execution of the request, etc.) we'd segfault setting up the invocation. Fixed to tolerate that case.
I was learning from @jdelong and he said that you should use
double quotes for local includes and angle brackets for library
includes. I asked why our code was the way it was, and he said he wanted
to clean it up. I beat him to it :)
Conflicts:
hphp/runtime/base/server/admin_request_handler.cpp
hphp/runtime/vm/named_entity.h
It turned out a lot of the namespace stuff still worked. The biggest thing for the first pass is that we don't fallback to the global function or constant if there isn't a namespaced one.
Also, when a constant has a ##\## anywhere in it it throw an error when it isn't defined, instead of assuming the string.
PolicyArray splits the ArrayData implementation in two parts. ArrayShell implements the baroque ArrayData implementation in terms of a much smaller statically-bound core that concerns itself exclusively with the storage strategy for the array. This way the two aspects can be worked on separately, and different stores can be easily plugged into the given ArrayShell.
To give a better perspective, the featured SimpleArrayStore has about 340 lines all told (including solid documentation), whereas ArrayShell has some 1100 lines. The shell can be reused with other stores of unbounded sophistication. The store needs to implement only 22 primitives, some of which are trivial. Basically a new store implementation saves 1100 of difficult-to-get-right lines of code right off the bat, and only needs to focus on implementing 22 well-defined primitives.
Things to watch for when reviewing:
- Is the store API small/expressive enough? What should be added/removed?
- Are various forms of duplication present? If so, how could code be factored better?
- Any subtle change in semantics from HphpArray and friends that should be minded?
Known issues:
* Currently ArrayShell is defined as:
class ArrayShell : public ArrayData, private SimpleArrayStore { ... };
when in fact it should be:
template <class StorePolicy>
class ArrayShell : public ArrayData, private StorePolicy { ... };
Extenuating circumstances related to allocator design prevent use of templates at this point. I'll work on these in parallel with the review.
* growNoResize is almost always prefaced by a test-and-grow. This sequence should be encoded in a function.
* The growth policy is spread all over the place in a subtle form of duplication.
* The current store design makes almost no effort to be particularly efficient.
This func took 2 params and ignored the second one. While I was there I made the functions call eachother. Hopefully inlining is smart enough to let me clean this code.
I got a bit carried away, but I think this is better.
In order to ensure that we can continue to interpret code when setting up a step out, even if we're stepping out to jitted code, we set the saved RIP in each ActRec on the stack to the retFromInterpretedFrame() helper. However, this is wrong for generator functions. There is a different variant of the function for generators, so use that when appropriate. Also added a check to ensure we only changed return addresses to jitted code. Mark thought of a rare case where we could have a return address to a C++ helper in there, so leave those alone.
The debugger's Instrument command does not work with the JIT; any instrumentation requests will be ignored for jitted code. We believe that no one is using this at this time. I'm starting with a small diff to disable the command and remove the impact it has on the interpreter for now, just in case we're wrong. Once the command has been disabled for a few weeks I'll come back and remove all of the code (task 2376711).
InstHelpers was just pure dead code, so I nuked it.
Rather than test RuntimeOption::EvalJit and 5 thread locals to determine
whether or not to run the jit on each re-entry, maintain one thread local.
Make various RequestInjectionData fields private to ensure that the jit
flag is kept in sync.
This cleans up the code a lot, and takes it out of various hot
paths. It will impact perf for requests where intercepts are used,
but no longer penalizes requests that don't use intercepts.
We're paying for some null checks here each time we do a
backtrace. Inspected all usage sites of the other StaticStrings to
ensure adding "consts" shouldn't change behavior (similar to that
ternary operator issue @smith ran into). Also remove a bunch of
temporary smart pointers that we don't need anymore.
We already look up the PC for every frame as we're taking the
backtrace, so we can just skip them when going through a backtrace
frame. Also, there is currently a guarantee that if an exception
propagates through a frame where pc was pointing at RetC, all the
locals and the $this pointer have already been decref'd, so we don't
need to use a null check to determine this.
This improves both Next and Out to avoid interpreting and stepping everything between when they start and finish. Out now lets the program run free until a pseudo-breakpoint at the return site it hit. Next continues to single-step the source line being stepped over, but now lets the program run free under calls made from that line.
The logic in Next regarding "calls made from that line" is extremely generic. We don't look at, say, call opcodes and decide to do something special. Rather, when we find we're off the original source line and a frame deeper we setup a "step out" operation much like the Out command then let the program run free. When we reach our return point, we continue stepping like normal. This accounts for not just calls, but iterators, and anything else that causes more PHP to run under the original source line.
This change moves the flow control logic down in to the respective cmds: Next, Step, Out, Continue. These cmds get a crack at executing at various points in the interrupt/command processing path. These cmds now own setting up the last location filter, whether they need VM interrupts, and whether they're done or not.
There are two dynamic_cast<HphpArray*> instances in bytecode.cpp that cause code to fail when using other types of arrays. However, the necessity of those casts seems to have disappeared in the meantime because simply removing them compiles and runs. I have ran full 'fbmake runtests' after removing the first cast (all passed). Then I removed the second cast (in lexical order) and ran the tests again. This time test/zend/good/ext-hash/hash_file_basic.php failed, but when I ran it again in separation it passed.
I did an experiment turning off all stack checks, and it
looks like it does cost something. This seemed like an easy way to
turn it off some of the time, but maybe we should resurrect the SEGV
handler.
We store a bunch of data into an ActRec on the stack, then
call a php-level function that calls a C++ function to copy it into an
ActRec on the heap. This should hopefully be a little better.
We can't box a non-ref value for a ref param because
we could be modifying an array with refCount != 1.
zend warns, skips the call and returns null. For now, this
makes us warn, but do the call (without modifying the array).
A file scope flag controls whether to skip the call or not,
and should be changed (and eliminated) once all our tests pass.
With the flag turned on, we still dont match zend's behavior,
because its happy to go ahead and call the function with no
warning in the case of a literal array parameter
(cuf('foo', array(1))), even though it warns and skips it when
the array is in a variable ($a=array(1); cuf('foo', $a)).
Sets up the translator analyze pass to create a Tracelet for
the callee at every statically-known FCall. If the callee has an
appropriate shape (in this diff, it must be a function consisting of
"return $this->foo" for a declared property), we can inline it in
HHIR. Restructures the IR relating to frames some so we can eliminate
the stores relating to ActRec in this simple case (see the comments in
dce.cpp and hhbctranslator.cpp for details). Includes partial support
for inlining callees with locals, but it's disabled for now because
they will keep the frame live.
This is a partial step towards merging the HPHP::VM namespace
up into its parent. To keep it reviewable/mergeable I'm not doing
everything at once here, but most of the code I've touched seems
improved. I've drawn an invisible line around the jit, Unit and
its cohort (Class, Func, PreClass, etc.); we'll get back to them
soon.
Add a lot of comments to the debugger based on my current understanding of it. These may change in the future as we learn more, but they're helpful right now.
Also moved a few small things around in the code to clarify their purpose or scope. I.e., making a few things private, renaming a few functions, etc. No real logic changes, though. Also minor dead code removal. Also a few lint errors.