This diff addresses what we called "step 1" in the task: simply ensure that any C++ exceptions that escape a destructor get rethrown and can continue to propagate naturally. The exception is remembered on the thread, and rethrown when we check for surprises later. If multiple destructors let C++ exceptions escape the last one to escape will be the one rethrown at the next surprise check.
This also ensures that C++ exceptions prevent more PHP code from running, by omitting calls to __destruct methods as we unwind the stack.
Finally, this also enables surprise checks for OnFunctionExit unless we're unwinding, in which case surprises remain unchecked so they can propagate later.
This is different than Zend's behavior, where destructors do run as fatals unwind.
My recent change to now/heredoc processing resulted in variables in heredocs mistakenly being tacked onto the end of the previous line in the output. This tends not to be a problem when emitting things that get parsed by something else, like emitting PHP. But it's noticable when emitting raw text.
Generators should specify function names as the name instead of the
full name for consistency with Zend 5.5 (and more importantly because
this breaks PHPUnit), this was exposed by the backtrace removal of
file and line info
As a user, this often annoyed me that I couldn't get any debugging info
out of the name and then @ahupp asked for it.
I didn't really know what to name closure ones as the name is done at
emission time. So I went with ##{closure}##. I'm not sure how I feel
about that since it will be uncorrelated with the emitted name.
While I was in there, I deleted a lot of unused code.
I'm totally open to other names and ideas.
I think the only case where we currently don't spillstack as
part of a vector translation is CGetM of a defined property. However,
this can still raise exceptions in the unlikely case of an unset
property. For now, just spill before any vector translation---a
follow up diff relaxes this to not do it in cases where we know the
property can't be undefined. Presumably later we'll push the
spillstacks into the unlikely paths for these translations (or put
them in unwind handlers)---currently we can't handle control flow
where the join point has a different stack.
A small change to optimize very simple binary operators in the parser. This avoids building very large parse trees for super-large expressions and folds binary operators involving two scalars directly in the parser. I've limited this to simple scalars since it's easy to prove they don't have anything too interesting going on in the other analysis phases leading up to BinaryOpExpression's normal folding. This works for all binary operators.
This is the first part of the work to expose type constraint and generic all the way to reflection. This first DIFF exposes the return type with generic types coming next.
Adds runtime support for non-class typehints. Typedefs are
introduced using type statements, and autoloaded via a new autoload
map entry. Shapes are parsed but the structure is currently thrown
away and treated as arrays at runtime. This extends the NamedEntity
structure to sometimes cache 'NameDefs', which are either Typedef*'s
or Class*'s. VerifyParamType now has to check for typedefs if an
object fails a class check, or when checking non-Object types against
a non-primitive type name that isn't a class.
If the global was not defined, it would try to return
a pointer into the MInstrCtx, which was not passed in.
We dont actually need it though, because in that case we can
just return a pointer to init_null_variant.
The specialized RetC sequence in tx64 destroyed each local,
and then set it to null in case onFunctionExit captured a
backtrace. But if the destructor of a sub-object captures
a backtrace, its already too late.
The same code in hhir didn't null out the local at all.
I found a bajillion more tests in the Zend repo, and I'm going to have to namespace things. I think this solution will also be nice for the porting our C++ strings. One directory per package and then when you want to run them all you run the top-level.
Fixed a few issues in the lexer for heredocs and nowdocs. The source file is read in 64k chunks, and any time a doc was split across a buffer boundary the lexer would fail to consume the doc properly. Modified the lexer to refill the file buffer when it is used, or when more data is needed in a variety of cases. Also fixed a number of other corner cases where we'd fail to recognize the doc end label or other special characters. The old code was also a bit over- and under-flowy.
This is a first pass looking for suggestions.
348 / 1268 passed (27.44%)
I modified the script, then ran:
./tools/import_zend_test.py /tmp/php-5.4.11/
which created all the zend tests. When they send out a new update, this script is idempotent so running it on the new tests should just make the working directory in a good enough state to commit.
We need to make sure that we execute the code
corresponding to the actual Func*. This re-uses the
prolog array to hold the entry points for the cloned
Func*.
Fairly straightforward. The only interesting part is the
addition of ElemUX, which behaves similarly to ElemDX in terms of
side-effects (but not exactly the same).
TranslatorX64 interps cases that it expects to fail but I
didn't like that, so the hhir version never punts or interps. The fast
path code reuses the stuff Jordan added for InstanceOfD and should be
the same as tx64's, modulo some branch fusing that doesn't appear to
matter in perflab.
It now handles filenames in strings that have been
var_dumped, without having to special case them. I also renamed it to
default.filter and changed verify to run it on every test. This lets us
remove a lot of symlinks that used to point to filepath.filter and
changed the expected output for a bunch of tests.
We are inconsistent with how many spaces happen after ##:## Errors have one, but warnings and notices have 2. I went with 1 because that is what Zend does.
The code was assuming that the result of the assignment
would be the value assigned, but its actually a new string
containing the first character of the value assigned. The result
was that the SetM had already decRef'd the rhs, and then the
jitted code decRef'd it again. Since that last decRef was a decRefNZ,
we would often get away with simply leaking both the original rhs and
the value of the SetM. But the new testcase crashes in a DEBUG build
without the fix.
The vector translator produced a type that was
boxed-array-or-string, which we couldnt guard on.
Since applying a SetM to a string will almost always produce
a string, change it to report string instead (which will still
be guarded on).
Tx64 only translates CGetS when you do it with the context
class the same as the class being looked up, to avoid the need for
accessibility checks. The IR can translate it more often, but was
using its fast path even when it was not safe to do so: if a previous
(safe) access had allocated a targetcache entry, later accesses would
be able to get to the property without an access check. For now just
limit the optimiation to safe cases.
In the case where m_this was already in a register,
and known to be non-zero, we would fail to zero the ActRec's
m_this field, which could result in a dangling reference
to $this being captured by debug_backtrace.
Put the store on the presumably cold path. Fix DecRefThis to
do the same.
The interpreter fix was different than the jit/ir fix because ##translateFPushObjMethodD()##'s ##i.inputs[0]->rtt.valueClass()## is null (with a TODO in the code to make it not null). It then goes into the slowpath.
Another one like this and I think I should have a different attribute for static closures :(
Most of this is pretty boring and mechanical. I added
VectorProp and VectorElem flags to help deal with the increasing
number of vector-related opcodes.
I added the check for this in the interpreter but ##f_array_map## re-enteres the VM via a different path than FCall. Here is the equivilent check for the VM.
PHP added this in 5.4 so that you can say your closure shouldn't capture ##$this##. https://wiki.php.net/rfc/closures
Should we add it? Many of their unit tests use it.
Instead of putting a boolean somewhere I used the same attr framework and set the static bit there. Thoughts? It only needs one change in the ##FPushFunc##.
In Zend 5.3 they decided that closures should inherit the ##$this## from the containing scope. This brings us close to paraity with them. The remaining thing is to make
function() use ($this) {}
fatal after we purge it from WWW.
The simplifier was not promoting the boolean to integer in some
cases. I hit this for the add-zero case, and code inspection revealed
similar bugs in sub-zero and multiply-one cases.
Any vector instructions that take a pointer to a base and might modify
it are now flagged with MayModifyStack. Any that actually do modify the stack
(this can be determined by looking at the input types) will produe two dests:
their original result and a new StkPtr. getStackValue calls into VectorEffects
to extract the new type and/or value. The instructions currently assume that
the stack cell they might be modifying is already synced to memory. This may
change in the future when we don't have to do a full SpillStack before the
helpers.
This mimics what TranslatorX64 does in translateSetMArray,
but it does it with fewer helpers and (often) fewer instructions in
translated code. I also found a bug in both jits and the interpreter
when dealing with arrays that hold refs to themselves. The new test
case exercises the fix, which involved a bit of refactoring of the
refcounting logic.
Enabling VectorTranslator while punting to tx64 is no longer a
regression so I removed the punt in emit().