When object support was first added to HHVM, a class named "Instance"
was introduced (deriving from ObjectData) to represent instances of user
defined classes. Since then, things have evolved and HPHPc and HPHPi have
been retired, and now there really is no needed to have ObjectData and
Instance be separate classes anymore.
As a first step towards merging ObjectData and Instance together, this diff
puts their definitions in the same .h file and puts their implementations
in the same .cpp file. A few small changes were necessary to fix issues
with cyclical includes: (1) Repo/emitter related parts of class.cpp and
class.h were moved to class-emit.cpp and class-emit.h; (2) the contents of
"vm/core_types.h" was moved to "base/types.h"; and (3) a few functions that
didn't appear to be hot were moved from .h files and the corresponding .cpp
files.
Updates continuations to allow yielding of a key-value
pair from a generator. Adds bytecode instructions (PackContK,
ContKey) for using the new feature, and adds IR instructions
(ContUpdateIdx, ContIncKey) to help get it down to the metal
(in particular, ContIncKey attempts to keep the current use-cases
as fast as possible).
Instead of doing name-based checks, using a flag in the runtime seems better. I tried to have a flag on the FuncEmitter that was set when you made these weird functions, but there was a few situations that needed to do name-based checks
* .hhas files have no control over anything but the name
* the parser would have to pass the info through the AST to the emitter
Instead I opted for the simpler approach for now until we want to solve those.
While debugging a sandbox crash, I spent some time looking at a huge
sequnce of conditional masks, compares and branches that didnt seem
to belong in the code I was debugging. Finally realized that it was
a reffiness check. It looked way too complicated, so I investigated.
Part of the problem was that we were avoiding a malloc in the case
of a zero param function at the expense of an extra check.
Instead, this diff always sets up at least 64 bits worth of m_refBitVec.
But by using the space set aside for the pointer in Func::m_shared it
avoids a malloc for any function with fewer than 65 arguments, and
avoids the numParams check for the first 64 parameters.
In addition, the existing code was spitting out a generic test for
the guard condition - (mask & bits) == value - where mask and value
are known constants. Since the most common case is that value == 0
(all the parameters are expected to be by value), we can usually omit
the compare. In addition, since most functions only have a small
number of parameters, we can usually get away with 8 bit, or 32
bit operations.
The result is that for a typical function (fewer than 64 args, args
expected to be by value) the reffiness guard is now
test <mask>, Func::m_refBitVec[0]
jne exit
Rather than:
move <mask>, reg1
xor reg2, reg2
cmp 1, Func::m_numParams
jnl ok
test AttrVarArgs, Func::m_attrs
jne exit
jmp done
ok:
load reg3, Func::m_refBitVec[0]
and mask, reg3
cmp reg3, reg2
jne exit
done:
I need to walk a live eval stack, handling FPI regions and
generators, and duplicating that logic in yet another place seems
wrong. This makes the Stack::toString stuff use a reusable one---at
some point I'll see if the unwinder can use it as well, but it works
also for my use case. Also, Stack::toString was broken for mutable
array iterators---fixes that. And use a union now for struct Iter.
And don't include bytecode.h from func.h.
The codebase had several namespace-level static data definitions and function definitions. Using namespace-level "static" in a .h file is near-always a bad idea, as follows:
- for simple types, static is implied. Example: "const int x = 42;" is the same as "static const int x = 42;"
- for aggregate types, static linkage implies that a copy of the aggregate will appear in every compilation unit that includes the header.
- for functions, static means the function will have a separate body generated in each compilation unit including the header.
- in several places functions were defined 'static inline', which is just as bad. 'inline' is just a hint which means the function may end up having a body (and actually more due to 'static').
True, gnu's linker has means to remove duplicate definition, but that's not always guaranteed or possible (think e.g. static functions that define static data inside, ouch). So static is useless at best and pernicious at worst. We should never, ever use static at namespace level in headers. I will create a bootcamp task for a lint rule.
I expected the performance to be neutral after the change, but in fact there's a significant drop in instruction count and therefore a measurable reduction in CPU time: https://our.intern.facebook.com/intern/perflab/details.php?eq_id=431903
I was going to #include translator.h in a header I had for
talking to the region selector thing and decided to just get this over
with instead. (It shouldn't need to #include that.) Found a few
other unused things to remove while at it.
Will be needed for array_filter/array_map etc
This sets things up so that if we define a builtin in systemlib, we rename
the corresponding c++ builtin with the prefix __builtin_, so its still available
(in case the php builtin wants to delegate some edge cases, and to make
it easy to run comparisons between the php and c++ implementations).
Also did a little reorganization to get rid of Func::isPHPBuiltin,
and use an Attr to identify functions as builtins. C++ builtins can
still be identified by checking the Func::info() method. This is needed
to allow builtin methods defined in php (such as array_map) to lookup their
arguments in the correct context.
- Do the actual work of implementing idx() in PHP
- Remove the old C++ builtin (which needs to be done at the same time)
- Remove OSS test dependencies on idx()
I was learning from @jdelong and he said that you should use
double quotes for local includes and angle brackets for library
includes. I asked why our code was the way it was, and he said he wanted
to clean it up. I beat him to it :)
Conflicts:
hphp/runtime/base/server/admin_request_handler.cpp
hphp/runtime/vm/named_entity.h
This cleans up the code a lot, and takes it out of various hot
paths. It will impact perf for requests where intercepts are used,
but no longer penalizes requests that don't use intercepts.
I did an experiment turning off all stack checks, and it
looks like it does cost something. This seemed like an easy way to
turn it off some of the time, but maybe we should resurrect the SEGV
handler.
This is intended so reflection can be used (via getTypehintText and getReturnTypehintText) to regenerate code the user annotated with types. Essentially using reflection to intrispect code in order to generate type safe
(hack safe) code. That is particularly important for the tools that do dependency injection. The runtime should be oblivious to the change as the rich type annotation is currently only stored for the sake of reflection. For
functions the values are in the shared portion which is cold and should also take care of traits.
This is the first part of the work to expose type constraint and generic all the way to reflection. This first DIFF exposes the return type with generic types coming next.
We need to make sure that we execute the code
corresponding to the actual Func*. This re-uses the
prolog array to hold the entry points for the cloned
Func*.
In Zend 5.3 they decided that closures should inherit the ##$this## from the containing scope. This brings us close to paraity with them. The remaining thing is to make
function() use ($this) {}
fatal after we purge it from WWW.
I got bit by a really nasty bug where I cloned a ##Func## and didn't set the ID. Then the JIT happily used it and I got really weird segfaults.
Making the ##m_funcId## private turned out to be really hard (~30 callsites) so @andrewparoski said the ##translate()## and ##retranslate()## were the most important places. We just have to remember to do it. If there are other important places I'll just bite the bullet and make it private with an accessor.
This should save days of many future engineers.
For closures, I'm going to clone the ##__invoke## method and set the runtime ##m_cls## to be whatever the containing class was (or nullptr if there isn't one). Instead of finding all the callsites that decRef the ##ActRec##'s ##m_this## based on ##isMethod##, I think it is clenaer to make ##isMethod## reflect the state of the ##Func*##
Instead of having the body of the closure be in the ##__invoke()## on the ##Closure## class, instead we make an anonymous function on the real class and put the body there. The signature for this function is:
function methodForClosure$1234($arg1, $arg2, ..., $use1, $use2, ...)
and then ##__invoke## now just takes all the params that were passed to it, puts them as the first args to the anonymous function, then takes all the use variables it had saved up and passed them in as the next params.
I tried to not have an ##__invoke## at all, but I ended up basically doing the same parameter and use var repacking in iopFCall (and would have had to do it in x86 code too). I opted for doing the rejiggering in bytecode. If I did it in raw PHP I think it would have been much slower with many ##func_get_args()## and array operations.
This diff refactors some of the VM's logic for iterators (with a focus on
mutable iteration), delivering several improvements:
1) MIterCtx was renamed to MArrayIter, and the m_key and m_val fields
were eliminated.
2) Eliminated the need for MArrayIter to dynamically allocate a
MutableArrayIter object, and removed other layers of indirection as
well.
3) Reduced the size of HPHP::VM::Iter from 64 bytes down to 32 bytes.
4) Removed the "if (siPastEnd())" check when adding a new element to an
HphpArray or a ZendArray.
5) Moved all of the iterator logic into a single .cpp file.
This diff reworks FullPos's to point to current element instead of pointing
to the next element. It also splits up the IterFree instruction into two
instructions (IterFree and MIterFree). These changes allowed various logic
to be simplified and data structures to be reduced in size. There is
definitely more opportunity for refactoring, but I know the JIT helpers for
iteration have been carefully tuned and so I'll leave further refactoring
for future diffs.
Finally, I spent a little time cleaning up the bytecode spec a bit, mostly
with respect to iteration.
This change is mostly for FB internal organizational reasons.
Building is not effected beyond the fact that the target now
lands in hphp/hhvm/hhvm rather than src/hhvm/hhvm.