For PhpFiles, there was a mixed management system, where references from
translated code were held until the code was made unreachable, while
references from interpreted code were held for the duration of the
current request. Under the new scheme, PhpFiles are always treadmilled.
They are owned by the FileRepository, and so need to be ref-counted
because the FileRepository can have the same PhpFile under multiple
paths. But we don't ref-count the *uses* of PhpFiles anymore.
Classes still need to be ref-counted, but as soon as their Unit
goes away, most of their internals can be freed. We just need to
hold onto them if derived classes are referencing them. Even in
that case, the next time we try to instantiate the derived class,
we can kill any that reference this Class.
There were also lots of holes where references were not dropped,
or owned data structures were not destroyed. eg a Class's methods
were not destroyed when the Class was destroyed; there were
several paths where an entry was erased from the file map, but the
corresponding PhpFile was not decRef'd - or worse, a null entry
was left in the map (something we had asserts to check for).
This tries to make the handling more consistent.
When object support was first added to HHVM, a class named "Instance"
was introduced (deriving from ObjectData) to represent instances of user
defined classes. Since then, things have evolved and HPHPc and HPHPi have
been retired, and now there really is no needed to have ObjectData and
Instance be separate classes anymore.
As a first step towards merging ObjectData and Instance together, this diff
puts their definitions in the same .h file and puts their implementations
in the same .cpp file. A few small changes were necessary to fix issues
with cyclical includes: (1) Repo/emitter related parts of class.cpp and
class.h were moved to class-emit.cpp and class-emit.h; (2) the contents of
"vm/core_types.h" was moved to "base/types.h"; and (3) a few functions that
didn't appear to be hot were moved from .h files and the corresponding .cpp
files.
Adds an extension command to the debugger which provides
heap tracing functionality, and adds the framework needed to get
heap traces. The heaptrace command can dump the current heap to the
debugger session or it can save a GraphViz specification to a file.
I'm hoping to add a backend which can put the graph in GML format,
so it can be explored interactively. I also hope at some point to
see this integrated into FBIDE if we can do that.
Since TypedValue::operator= is dangerous, something like
tvTeleport is usually what you want to use, but it doesn't work with
temporary TypedValues (e.g. return values of things like make_tv or
cellAdd), because it took the arguments by pointer. This also means
we can change it to take parameters by value later without updating
callsites. This diff does the tvDup family and changes tvTeleport to
tvCopy. I'll gradually get the other ones done, but I just need these
for now to work with temporaries for changing SetOp to not use Variant
arithmetic.
It was used to indirect from an array to a Class's propVec during
86pinit methods, but this isn't a significant technical obstacle.
We now just pass in a regular HphpArray and pull values out of it after
invoking the method.
Part 2 will do the same for 86sinit methods (it's a slightly more
involved change, and I want to keep it maximally broken down for
revertibility). Part 3 will delete NVTW itself.
Instead, just check the class is defined and is the class we
expected at translation time (side exiting if not), and just use the
scalar value directly. If the class is Persistent or a parent of the
current context we don't need to check the Class* either.
We want to throw exceptions if someone does "instanceof Foo",
where Foo is a type alias. The easiest way to make sure the
instanceof bitmask optimization doesn't get in the way of this is to
not allow it on non-unique classes, assuming this is ok perf-wise.
Instead of doing name-based checks, using a flag in the runtime seems better. I tried to have a flag on the FuncEmitter that was set when you made these weird functions, but there was a few situations that needed to do name-based checks
* .hhas files have no control over anything but the name
* the parser would have to pass the info through the AST to the emitter
Instead I opted for the simpler approach for now until we want to solve those.
I noticed that directorty structure of hphp/system was a bit scattered, so
I consolidated things to reduce the total number of folders and to put
related things together with each other.
This diff moves the contents of "hphp/system/classes_hhvm" into
"hphp/system", it moves the contents of "hphp/system/lib" into
"hphp/system", moves "hphp/idl" to "hphp/system/idl", and moves the
contents of "hphp/system/globals" into "hphp/system/idl".
This reverts commit 2e9677b7c3f37e9627b9cbc9a6ddec82a10e7215.
Third time is the charm. I hid it from reflection, but I missed `get_class_methods`.
The diff betweenn this and what was reverted is https://phabricator.fb.com/P2217891 and then I did https://phabricator.fb.com/P2217904 because it looked like it should be done.
nvSet() only casts the value from TypedValue* to const Variant&; do it
at callsites. Inlined array_setm_ik1_v0() and array_setm_s0k1_v0() into
their only remaining callsites in translator-runtime.cpp.
We were being too conservative in determining whether to use
the static property cache when accessing a property from outside the
class. In fact we'd never use it in that case, because we required
that the class not need initialization (and any class with static
properties needs initialization!).
We can relax this restriction to just enforce that the class not
define sinit or pinit methods -- scalar properties are fine.
The codebase had several namespace-level static data definitions and function definitions. Using namespace-level "static" in a .h file is near-always a bad idea, as follows:
- for simple types, static is implied. Example: "const int x = 42;" is the same as "static const int x = 42;"
- for aggregate types, static linkage implies that a copy of the aggregate will appear in every compilation unit that includes the header.
- for functions, static means the function will have a separate body generated in each compilation unit including the header.
- in several places functions were defined 'static inline', which is just as bad. 'inline' is just a hint which means the function may end up having a body (and actually more due to 'static').
True, gnu's linker has means to remove duplicate definition, but that's not always guaranteed or possible (think e.g. static functions that define static data inside, ouch). So static is useless at best and pernicious at worst. We should never, ever use static at namespace level in headers. I will create a bootcamp task for a lint rule.
I expected the performance to be neutral after the change, but in fact there's a significant drop in instruction count and therefore a measurable reduction in CPU time: https://our.intern.facebook.com/intern/perflab/details.php?eq_id=431903
Func's are allocated using Util::low_malloc, so need to be
free'd with Util::low_free, rather than boost::checked_deleter.
In addition, there is some munging of the Func address going on,
so we really need to call Func::destroy instead.
Will be needed for array_filter/array_map etc
This sets things up so that if we define a builtin in systemlib, we rename
the corresponding c++ builtin with the prefix __builtin_, so its still available
(in case the php builtin wants to delegate some edge cases, and to make
it easy to run comparisons between the php and c++ implementations).
Also did a little reorganization to get rid of Func::isPHPBuiltin,
and use an Attr to identify functions as builtins. C++ builtins can
still be identified by checking the Func::info() method. This is needed
to allow builtin methods defined in php (such as array_map) to lookup their
arguments in the correct context.
I'm committing my work on HphpArray in bite-sized pieces for easier review. This piece replaces calls to NEW with a factory method. There are many problems with operator new, starting with the fact that the allocator cannot communicate properly with the constructor.
The newly introduced factory method should return ArrayData but that causes many issues right now, so I left that step to a future diff.
Changed the Set of all interfaces imlemented by a Class with an IndexedMap. Also changed the classof (and thus instanceof) to lookup in that map by name to see if an interface is imlemented by a class. That replaces the
previous mechanism that would walk the class hierarchy dereferencing PreClass in order to see if the interface was in the class inheritance. The new model seems to lead to better performance in terms of CPU instructions, lodas
and stores. The latest perflab also showed improved CPU time though that may be just lucky noise.
We have also removed the Class::classof (PreClass*) method which leads to a slightly cleaner code.
I was learning from @jdelong and he said that you should use
double quotes for local includes and angle brackets for library
includes. I asked why our code was the way it was, and he said he wanted
to clean it up. I beat him to it :)
Conflicts:
hphp/runtime/base/server/admin_request_handler.cpp
hphp/runtime/vm/named_entity.h
I did an experiment turning off all stack checks, and it
looks like it does cost something. This seemed like an easy way to
turn it off some of the time, but maybe we should resurrect the SEGV
handler.
We can't box a non-ref value for a ref param because
we could be modifying an array with refCount != 1.
zend warns, skips the call and returns null. For now, this
makes us warn, but do the call (without modifying the array).
A file scope flag controls whether to skip the call or not,
and should be changed (and eliminated) once all our tests pass.
With the flag turned on, we still dont match zend's behavior,
because its happy to go ahead and call the function with no
warning in the case of a literal array parameter
(cuf('foo', array(1))), even though it warns and skips it when
the array is in a variable ($a=array(1); cuf('foo', $a)).
The frontend appears to alread pessimize its prediction of
object property types whenever an UnsetContext might affect a given
property. We can use this to avoid checking for KindOfUninit when
doing specialized prop gets in the vectortranslator. This is
hopefully going to be worth more than it sounds, because it will let
us avoid a spillStack, which is a use of the frame and will prevent an
inlined getter from removing the ActRec (unless we implement some kind
of sinking for frames).
This is the first part of the work to expose type constraint and generic all the way to reflection. This first DIFF exposes the return type with generic types coming next.
Adds runtime support for non-class typehints. Typedefs are
introduced using type statements, and autoloaded via a new autoload
map entry. Shapes are parsed but the structure is currently thrown
away and treated as arrays at runtime. This extends the NamedEntity
structure to sometimes cache 'NameDefs', which are either Typedef*'s
or Class*'s. VerifyParamType now has to check for typedefs if an
object fails a class check, or when checking non-Object types against
a non-primitive type name that isn't a class.
Access to TypedValue.m_aux must now be via TypedValueAux. For now,
TypedValueAux is an empty subclass with accessors to m_aux, which is
now private. Once RefData.m_tv is moved out from under RefData._count,
we can move TypedValue.m_aux to TypedValueAux.
Removed unnecessary initialization of m_aux.u_hash from c_Vector.
This diff removes initializing stores to TypedValue._count, renames
_count to m_aux, and makes m_aux a union with members typed
and named according to their specialized uses. The few remaining
uses of that field for random tweaks are more obvious and easy to
grep for.
TypedValue no longer extends Value, (allowing m_data to move to a
different offset in the future), and Variant now extends TypedValue,
so we only have to maintain one definition.
HphpArray now explicitly uses TypedValue.m_pad instead of overlapping
TypedValue with an anonymous struct, again so we don't have to maintain
another structure to match TypedValue's layout.
The JIT's were using offsetof(TypedValue, _count) all over the place
for access to String/Array/Object/RefData::_count. Instead, use
FAST_REFCOUNT_OFFSET.
This diff updates the parser and runtime to support using collection
literals in initializer expressions for instance properties, static
properties, parameters, and static locals.
The runtime as-is was able to correctly handle collection literals in
initializers for static properties, parameters, and static locals. However,
for instance properties I needed to way to make it so that each instance
got a fresh copy of the collection literal.
To achieve this, I added an attribute to indicate that a property requires
'deep' intiialization. When this attribute is set, the Class machinery
will not call setEvalScalar() on the initial value (the value produced by
86pinit), and it will make a deep copy of the initial value when a new
instance of the Class is allocated.
Instead of having the body of the closure be in the ##__invoke()## on the ##Closure## class, instead we make an anonymous function on the real class and put the body there. The signature for this function is:
function methodForClosure$1234($arg1, $arg2, ..., $use1, $use2, ...)
and then ##__invoke## now just takes all the params that were passed to it, puts them as the first args to the anonymous function, then takes all the use variables it had saved up and passed them in as the next params.
I tried to not have an ##__invoke## at all, but I ended up basically doing the same parameter and use var repacking in iopFCall (and would have had to do it in x86 code too). I opted for doing the rejiggering in bytecode. If I did it in raw PHP I think it would have been much slower with many ##func_get_args()## and array operations.
In the case of a race to create a new class, we did
that, corrupting jemalloc's data structures in strange ways
(it didnt notice, even with a debug build, but bad things
happened later)
This change is mostly for FB internal organizational reasons.
Building is not effected beyond the fact that the target now
lands in hphp/hhvm/hhvm rather than src/hhvm/hhvm.