Merge ContDone+ContExit into ContRetC and add support for passing results. This result-passing mechanism is not exposed to PHP, because ReturnStatements in generators do not contain a result expression. It is, however, exposed through the restored hphp_continuation_done() built-in to allow experimentation.
The idea is that once we introduce the ContYield opcode (a merge of all opcodes used by YieldExpression), we could change ContRetC and ContYield to leave the result and done-status on the stack and leave it up to the caller (ContNext/ContSend/ContRaise) to fill in the Continuation fields. This will make these opcodes more generic and useful for other things, while allowing us to move some properties to the VM and kill opcodes like ContCurrent.
Split up ConvToDbl into ConvArrToDbl, ConvBoolToDbl, ConvIntToDbl,
ConvObjToDbl, ConvStrToDbl and ConvGenToDbl, so that different flags
can be set for each instruction.
Continuation::send() uses the m_received field to transfer the value inside
the continuation. This field remains set until the continuation is
iterated the next time.
Have ContReceive unset this field early so that the memory can be
reclaimed.
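A minimal sketch of the idea, not the actual runtime code: ToyContinuation, its received member (standing in for m_received), and receive() (standing in for what ContReceive does) are hypothetical names. It only illustrates that clearing the field at the moment the value is handed over releases the continuation's reference.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a continuation object.
struct ToyContinuation {
  // Plays the role of m_received in the text above.
  std::shared_ptr<std::string> received;

  // send() parks the value on the continuation.
  void send(std::shared_ptr<std::string> v) { received = std::move(v); }

  // receive() hands the value to the generator body and clears the field
  // in the same step, so the continuation no longer keeps it alive.
  std::shared_ptr<std::string> receive() {
    return std::exchange(received, nullptr);
  }
};
```

Once the caller drops the value returned by receive(), nothing on the continuation holds a reference to it any more.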
VectorTranslator is now good enough that all of this code can be
removed without causing a perf regression. Instructions, loads, and stores are
all slightly up, but CPU time looks like noise.
ZendPack::unpack() always returns a signed int32_t,
regardless of the actual storage type being unpacked. For 32-bit
unsigned types ('L', 'N', 'V', and 'I' on I32 systems) this
means overflowing in the helper, and we have to explicitly
recast it to a uint64_t to get the data back out.
This was already handled for L, N, and V, but was missed for I.
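The overflow can be illustrated with a small sketch. The function names are hypothetical, and going through uint32_t before widening is an assumption about how the recast is done, not a copy of the actual fix.

```cpp
#include <cstdint>

// Naive widening sign-extends: the bit pattern 0xFFFFFFFF held in an
// int32_t is -1, and it stays -1 as a 64-bit value.
int64_t widen_naive(int32_t v) { return static_cast<int64_t>(v); }

// Reinterpret the 32 bits as unsigned first, then widen. This recovers
// the original unsigned value (assumed approach, see lead-in).
int64_t widen_unsigned32(int32_t v) {
  return static_cast<int64_t>(static_cast<uint32_t>(v));
}
```

With this, a helper result of -1 is recovered as 4294967295 for the unsigned formats, while values that fit in 31 bits are unchanged.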
Of the four horsemen of the SmartAllocator, ArrayData was the only virtual
call. This meant an extra layer of indirection when coming from the TC
to allow the C++ compiler to emit its virtual call, and slightly larger
callsites when using non-generic paths.
While we're moving in this direction, consolidate ArrayData introspection
on its type enum. isSharedMap() was previously implemented with a vtable
slot, and we had no way of asking if an ArrayData was a NameValueTable.
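As a rough sketch of introspection through a type enum rather than a vtable slot: ArrayKind, its enumerators, and ToyArrayData are hypothetical names, since the real enum is not shown here.

```cpp
#include <cstdint>

// Hypothetical kind tags; the real enum's names and values are assumptions.
enum class ArrayKind : uint8_t { kHphpArray, kSharedMap, kNameValueTable };

struct ToyArrayData {
  explicit ToyArrayData(ArrayKind k) : kind(k) {}

  // Non-virtual queries: a load and a compare, no vtable slot involved.
  bool isSharedMap() const { return kind == ArrayKind::kSharedMap; }
  bool isNameValueTable() const { return kind == ArrayKind::kNameValueTable; }

  ArrayKind kind;
};
```

Adding a new query such as isNameValueTable() then costs one more enumerator comparison instead of another virtual method.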
If a GenArrayWaitHandle is imported, the current implementation imports
only the active child. Try to import other children as well to increase
parallelism.
enterContext() can throw an exception if a cross-context dependency
cycle is found. It is safe to ignore such an exception when importing
non-active children. The import will be attempted and fail again once
onBlocked() reaches the dependency that causes the cycle.
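The control flow described above might look roughly like the following sketch. import_children and the enter callback (a stand-in for enterContext()) are hypothetical, and std::runtime_error is only a placeholder for whatever exception the cycle check raises.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>

// Imports the active child first, then opportunistically imports the rest.
// Returns how many children were imported. `enter` is a hypothetical
// stand-in for enterContext() and may throw on a dependency cycle.
std::size_t import_children(std::size_t count, std::size_t active,
                            const std::function<void(std::size_t)>& enter) {
  std::size_t imported = 0;
  enter(active);  // a cycle on the active child still propagates
  ++imported;
  for (std::size_t i = 0; i < count; ++i) {
    if (i == active) continue;
    try {
      enter(i);
      ++imported;
    } catch (const std::runtime_error&) {
      // Safe to ignore for non-active children: the import is retried
      // (and fails again) once the blocking dependency is reached.
    }
  }
  return imported;
}
```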
Tx64 only translates CGetS when the context class is the same as
the class being looked up, to avoid the need for
accessibility checks. The IR can translate it more often, but was
using its fast path even when it was not safe to do so: if a previous
(safe) access had allocated a targetcache entry, later accesses would
be able to get to the property without an access check. For now just
limit the optimization to safe cases.
Verify sanity of input array of dependencies in advance. Callers get the
error earlier and it also makes it easier to work with the array outside
of the main loop.
Our ext_hhvm generated code is casting TypedValue* to Value*
on the assumption that the offset of TypedValue::m_data is 0.
Fix this assumption, and also while in the same code, replace
some (t == KindOfString || t == KindOfStaticString) with
IS_STATIC_STRING(t), which does a single bit test instead of
two comparisons.
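To illustrate why a single bit test can replace the two comparisons, here is a sketch with a made-up encoding. The ToyDataType values and kToyStringBit are assumptions chosen so that both string kinds, and only those, share one bit; they are not the runtime's real DataType values.

```cpp
#include <cstdint>

// Hypothetical type tags: both string kinds carry bit 0x10, nothing else does.
enum ToyDataType : uint8_t {
  ToyNull = 0x00,
  ToyInt = 0x01,
  ToyDouble = 0x02,
  ToyStaticString = 0x10,
  ToyString = 0x11,
  ToyArray = 0x20,
};

constexpr uint8_t kToyStringBit = 0x10;

// Two comparisons, as in the code being replaced.
constexpr bool is_string_two_compares(ToyDataType t) {
  return t == ToyString || t == ToyStaticString;
}

// One bit test, valid only because of the encoding chosen above.
constexpr bool is_string_bit_test(ToyDataType t) {
  return (t & kToyStringBit) != 0;
}

static_assert(is_string_bit_test(ToyStaticString), "static strings match");
static_assert(!is_string_bit_test(ToyArray), "arrays do not match");
```

The trick depends entirely on how the tags are numbered, so it has to live next to the enum definition.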
Many destructors were going through a C++ trampoline that did
nothing but turn a C call into a method call. Get rid of these where
possible. ArrayData still uses a virtual release, which is unfortunate
but cannot be helped at the moment.
Change callers of determine_charset to check for nulls, instead of calling html_supported_charset (to pre-validate the charset name) and then calling determine_charset (which has to run through a list of names anyway).
A few simple refactorings and optimizations for array_data. Perflab results (CPU time %) are on the desired side of noise: 22/23 show reduced times, 12/23 green, no red, best -2.2%, worst +0.3%.
In an unlikely situation, a user of ext_asio may tamper with a Continuation
before passing it to the ext_asio extension. Let's fail with an
exception if this happens. Previously, such a user bug would go unnoticed,
but would not harm the ext_asio code.
This check will be needed once we implement optimistic execution. With
optimistic execution, we iterate the continuation and defer construction of
the ContinuationWaitHandle until the first blocking event occurs. During
this phase, the standard dependency loop detection is skipped and the code
would try to iterate a continuation that is already being iterated.
I got to do a few optimizations:
* burn in the number of use vars
* not write out uninitialized use vars
* use the staticness of the method as a hint for the incRef
There is not a clear path forward to safely using MMX registers as
scratch storage. Doing so spoils the state of the legacy x87 FPU, and the
x64 ABI uses the x87 FPU to implement long double. Our options are to
either:
1. Prohibit use of long double in source code (or x87 instructions in
machine code). Given that we have a dynamically linked binary that
includes system libraries outside our control, open source that needs
to be patched, fbcode, etc., this could prove difficult.
2. Always execute an FPU resetting instruction before transitioning from
TC code to C++. For all I know, these instructions are cheap, but it
still seems like an unfortunate overhead to impose everywhere; since
we don't want to mark every helper with a "reset fpu" call, we'd
probably end up bloating callsites.
3. Admit this doesn't work.
In the absence of evidence this matters for perf, I lean towards 3.
They both returned the late static bound class, not the context
class. This meant that e.g. "constant('self::FOO')" was actually
returning what "constant('static::FOO')" should have returned.
In addition, we often want the Class*, not its name, so
change them to return Class*. The remaining places that then
read the name from the Class* should be fixed to use the Class*
directly (in a later diff).
Finally, I noticed that while "defined()" was recently fixed to
support "static::", "constant()" was not. Pulled out a common
function to find the correct Class*.
In the case where m_this was already in a register,
and known to be non-zero, we would fail to zero the ActRec's
m_this field, which could result in a dangling reference
to $this being captured by debug_backtrace.
Put the store on the presumably cold path. Fix DecRefThis to
do the same.
Avoid punning TypedValue* to String&/Array&/Object& in FCallBuiltin
(all three implementations). Our native function calling conventions
require passing pointers into a TypedValue for these types, and
pointers-to-scratch for return values.
In the HHIR case, I removed the optional "return pointer" argument
from the IR CallBuiltin instruction. The C++ value-passing ABI details
are now handled in cgCallBuiltin and are no longer exposed in the IR.
The argument types are still PtrTo*, but we handle the address fixups
in CodeGenerator.
The callback passed to asio_set_on_{failed, started}_callback was
never null because of the way types for input parameters work in
extensions. Null from user PHP code was converted to a stdClass
object, triggering an exception.
ConvToArr has at least two variants: one that holds on to the object being converted and one that does not. Having separate opcodes allows this distinction to be made. Making separate opcodes for each type of operand makes for a more consistent IR.
The interpreter fix was different from the jit/ir fix because ##translateFPushObjMethodD()##'s ##i.inputs[0]->rtt.valueClass()## is null (with a TODO in the code to make it not null). It then goes into the slow path.
Another one like this and I think I should have a different attribute for static closures :(
The only places where ReturnStatement is constructed are:
- onReturn(check_yield=true) -> not allowed in generator
- onReturn(check_yield=false) -> coming from transform_yield_break, right after creating hphp_continuation_done()
- MethodStatement, end of function call -> hphp_continuation_done() is created at end of generator in prepare_generator()
The emitter emits ContExit in ReturnStatements used in generators. As
can be seen from the analysis above, it's always preceded by emitting
ContDone from hphp_continuation_done(). Let's emit ContDone inside the
ReturnStatement directly and kill usage of hphp_continuation_done().
transform_yield_break() becomes a simple onReturn(check_yield=false), so
let's inline it into onYield and create ReturnStatement directly. After
this change, check_yield flag is always true and can be killed.
ContExit was also used after emitting a generator method in case the end
of method is still reachable. ContDone is added so that the generator is
properly closed. I believe this is never actually used, as MethodStatement
creates ReturnStatement at the end of method anyway.
I directly copied continuations for this, but they never have params. Closures sometimes have params, and the closure itself will be the first local AFTER those.
Added a new IRInstruction, DecRefNZOrBranch, that decrefs but
branches out of the trace if the reference count is about to go to
zero. This guarantees that no destructor will run, and thus no memory
side effects on trace. Tracked the last value available in memory in
TraceBuilder and used it to convert DecRef instructions to
DecRefNZ. These 2 changes allow us to eliminate more IncRef-DecRef
pairs, in particular cases due to SetS, SetG & SetM; for example:
  85: SetS
    (20) t12:Cls = LdStack<Cls> t10:StkPtr, 1
    (21) t13:Str = LdStack<Str> t10:StkPtr, 2
    (23) t15:PtrToGen = LdClsPropAddr t12:Cls, t13:Str, Cls(0)
    (24) DecRef t13:Str
    (25) t16:PtrToCell = UnboxPtr t15:PtrToGen
    (26) t17:Cell = LdMem<Cell> t16:PtrToCell, 0
    (27) t18:Obj = IncRef t11:Obj
    (28) StMem [t16:PtrToCell]:Obj, t11:Obj
    (36) DecRefNZOrBranch t17:Cell -> L4
  L5:
    (38) DefLabel
  86: PopC
    (39) DecRefNZ t18:Obj
In the above example, memory tracking and DecRefNZOrBranch allowed us
to change instruction (39) from a DecRef to a DecRefNZ. Subsequent
dead-code elimination will remove the IncRef instruction (27) and
DecRefNZ instruction (39).
This diff does not handle SetM.
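The semantics of DecRefNZOrBranch described above can be sketched in plain C++. ToyRefCounted and both helper functions are hypothetical and model only the counting behavior, not the IR or the generated machine code.

```cpp
#include <cstdint>

// Hypothetical refcounted value; `destroyed` records that a destructor ran.
struct ToyRefCounted {
  int32_t count;
  bool destroyed;
};

// Plain DecRef: may drop the count to zero and run the destructor.
void dec_ref(ToyRefCounted& o) {
  if (--o.count == 0) o.destroyed = true;
}

// DecRefNZOrBranch: decrement only when the count stays non-zero.
// Returns true to stay on trace; returns false (count untouched) when the
// decrement would reach zero, which models branching out of the trace.
bool dec_ref_nz_or_branch(ToyRefCounted& o) {
  if (o.count == 1) return false;
  --o.count;
  return true;
}
```

Because the on-trace path never reaches zero, no destructor can run there, which is what makes the later IncRef/DecRefNZ pair removable.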
Unhack the parser and introduce YieldExpression, which emits the
equivalent set of opcodes that were previously emitted by the bunch of
expressions/statements generated by the parser.
YieldExpression expects the evaluation stack to contain just the value
being yielded, so {,List}AssignmentExpression needs to evaluate the RHS
first. The previous code had the same behavior.
This will let us consolidate continuation-related opcodes and make
them less tied with continuation objects.
The alias manager does not know whether generator parameters are passed by
reference. This didn't matter, because every generator had at least one
function call (hphp_continuation_done()) that pretty much disabled unused
variable elimination.
This diff fixes that, lets us get rid of artificial function calls in
generators, and will allow later improvements in the alias manager.
There is a runtime option to filter out notices and warnings,
but strict_warnings were left out. Bundle them with notices.
We raise a lot of strict_warnings, and when we fix hphpiCompat
(to match Zend better) we will raise a lot more, so this could
matter.
Most of this is pretty boring and mechanical. I added
VectorProp and VectorElem flags to help deal with the increasing
number of vector-related opcodes.