This implements the "last" unimplemented methods in ArrayShell. The sorting-related methods need no implementation because PolicyArray scales to HphpArray before sorting.
They aren't used anymore, except to implement isVectorData(),
which we can make pure-virtual and implement more efficiently
in each subclass of ArrayData.
The in-situ allocation case can be easily detected by checking m_data == m_inline_data.slots. Therefore there's no need for two distinct flags to track current and future allocation strategy.
The codebase had several namespace-level static data definitions and function definitions. Using namespace-level "static" in a .h file is near-always a bad idea, as follows:
- for simple types, static is implied. Example: "const int x = 42;" is the same as "static const int x = 42;"
- for aggregate types, static linkage implies that a copy of the aggregate will appear in every compilation unit that includes the header.
- for functions, static means the function will have a separate body generated in each compilation unit including the header.
- in several places functions were defined 'static inline', which is just as bad. 'inline' is just a hint which means the function may end up having a body (and actually more due to 'static').
True, gnu's linker has means to remove duplicate definition, but that's not always guaranteed or possible (think e.g. static functions that define static data inside, ouch). So static is useless at best and pernicious at worst. We should never, ever use static at namespace level in headers. I will create a bootcamp task for a lint rule.
I expected the performance to be neutral after the change, but in fact there's a significant drop in instruction count and therefore a measurable reduction in CPU time: https://our.intern.facebook.com/intern/perflab/details.php?eq_id=431903
ArrayData gets on a diet, down to 32 bytes
devirtualized destructor call in HphpArray::release()
aggressively inlined constructors for ArrayData (do away with defaulted arguments)
eliminate a dead memset() for the hashtable in HphpArray(uint size, const TypedValue* values)
I've run a bunch of experiments on HphpArray speed and am still yet to have a major breakthrough. However, there are a few clear small winners that I'm submitting with this diff. CPU instructions, CPU loads, and CPU stores are all as green as the Eternal Hunting Grounds. Speed is drowned in noise but I sususpect will be measurable on these changes combined.
PolicyArray pased all tests but failed in production. After browsing on my own box I figured that what happened was the method addLval() with a string key is called in production on many pages, but never in the tests. @ptarjan maybe you could look into adding such a test?
This diff still does not instantiate the policy-based array (I did instantiate it during testing).
I was learning from @jdelong and he said that you should use
double quotes for local includes and angle brackets for library
includes. I asked why our code was the way it was, and he said he wanted
to clean it up. I beat him to it :)
Conflicts:
hphp/runtime/base/server/admin_request_handler.cpp
hphp/runtime/vm/named_entity.h
PolicyArray splits the ArrayData implementation in two parts. ArrayShell implements the baroque ArrayData implementation in terms of a much smaller statically-bound core that concerns itself exclusively with the storage strategy for the array. This way the two aspects can be worked on separately, and different stores can be easily plugged into the given ArrayShell.
To give a better perspective, the featured SimpleArrayStore has about 340 lines all told (including solid documentation), whereas ArrayShell has some 1100 lines. The shell can be reused with other stores of unbounded sophistication. The store needs to implement only 22 primitives, some of which are trivial. Basically a new store implementation saves 1100 of difficult-to-get-right lines of code right off the bat, and only needs to focus on implementing 22 well-defined primitives.
Things to watch for when reviewing:
- Is the store API small/expressive enough? What should be added/removed?
- Are various forms of duplication present? If so, how could code be factored better?
- Any subtle change in semantics from HphpArray and friends that should be minded?
Known issues:
* Currently ArrayShell is defined as:
class ArrayShell : public ArrayData, private SimpleArrayStore { ... };
when in fact it should be:
template <class StorePolicy>
class ArrayShell : public ArrayData, private StorePolicy { ... };
Extenuating circumstances related to allocator design prevent use of templates at this point. I'll work on these in parallel with the review.
* growNoResize is almost always prefaced by a test-and-grow. This sequence should be encoded in a function.
* The growth policy is spread all over the place in a subtle form of duplication.
* The current store design makes almost no effort to be particularly efficient.