We need to add the include directories of both the srcdir tree and the
builddir tree to AM_CPPFLAGS if a header file is generated. Rather than
inflate AM_CPPFLAGS, move mpif90model.h.in to src/include, since that
directory is already in AM_CPPFLAGS.
The internal code then needs to deal with both the types from the ABI
header and the actual types used by MPICH. Generate mpi_abi_internal.h
from mpi_abi.h by renaming all MPI_ type names to use the ABI_ prefix.
Generate mpi_abi_util.c to initialize an internal table for converting
builtin datatypes and ops.
Generates:
src/binding/abi/mpi_abi_internal.h
src/binding/abi/mpi_abi_util.c
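The renaming step can be illustrated with a minimal Python sketch. It is
not the actual generator: the function name rename_abi_types and the
sample header text are hypothetical, and it assumes type names are
introduced by simple typedefs without nested semicolons.

```python
import re

# Hypothetical header excerpt, for illustration only.
SAMPLE_HEADER = """\
typedef struct MPI_ABI_Comm *MPI_Comm;
typedef struct MPI_ABI_Datatype *MPI_Datatype;
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm);
"""


def rename_abi_types(header_text):
    """Rename MPI_ typedef names to ABI_; leave other MPI_ symbols alone."""
    # Collect the names declared by simple typedefs, e.g.
    # "typedef struct MPI_ABI_Comm *MPI_Comm;" declares MPI_Comm.
    names = set(re.findall(r"^typedef\b[^;]*?\b(MPI_\w+)\s*;",
                           header_text, flags=re.MULTILINE))
    if not names:
        return header_text
    # Match whole identifiers only, longest names first.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")
    return pattern.sub(lambda m: "ABI_" + m.group(0)[len("MPI_"):],
                       header_text)


if __name__ == "__main__":
    print(rename_abi_types(SAMPLE_HEADER))
```

Only names that appear as typedef targets are rewritten, so function
names such as MPI_Send keep their prefix while their parameter types
change.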
Add a port of cpi.c to CUDA. Use a GPU kernel to compute the partial
areas at each process, then sum them with a final MPI_Reduce from device
memory into CPU memory. This is intended to be used as a smoke test for
functioning GPU support.
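The decomposition described above can be sketched in plain Python. This
is not the CUDA code from the commit: the loop in partial_area stands in
for the per-process GPU kernel, the sum in estimate_pi stands in for
MPI_Reduce, and both function names are hypothetical. It assumes the
usual cpi approach of integrating 4/(1+x^2) over [0, 1] with the
midpoint rule, with intervals dealt out to ranks in a strided fashion.

```python
def partial_area(rank, nprocs, n):
    """Midpoint-rule partial sum owned by one rank (kernel stand-in)."""
    h = 1.0 / n
    total = 0.0
    # Rank r handles intervals r+1, r+1+nprocs, r+1+2*nprocs, ...
    for i in range(rank + 1, n + 1, nprocs):
        x = h * (i - 0.5)
        total += 4.0 / (1.0 + x * x)
    return h * total


def estimate_pi(nprocs, n):
    """Sum the per-rank partial areas (MPI_Reduce stand-in)."""
    return sum(partial_area(rank, nprocs, n) for rank in range(nprocs))


if __name__ == "__main__":
    print(estimate_pi(nprocs=4, n=10000))
```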
It is unnecessary to put embedded libraries in deeply nested paths. Put
them all in the modules dir, similar to how mpich does it.
The hwloc path was broken when we rearranged the source tree. It didn't
show up in our testing because, by default, we reuse the hwloc compiled
by the main mpich build.
In practice, one would never need to run extractcvars with different
source directories. Hard-code the directory list in the extractcvars
script and remove a few unnecessary steps.
The script only needs @abs_srcdir@ substituted. Since in practice we
only run the script from the mpich top directory during autogen, remove
the extra autoconf step.
This is similar to many autogen scripts, e.g. the binding generation
scripts, which always assume they are run from the top directory.
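As a rough illustration of why no configure-time substitution is needed
when the working directory is known, the sketch below fills in the
placeholder from the current directory. The function name
substitute_abs_srcdir and the sample template line are hypothetical;
the actual script may simply compute the path itself.

```python
import os


def substitute_abs_srcdir(template_text, top_dir=None):
    """Replace @abs_srcdir@ with the absolute path of the top directory."""
    # Assumption: the caller runs from the top of the source tree, so the
    # current working directory is the value autoconf would have supplied.
    if top_dir is None:
        top_dir = os.getcwd()
    return template_text.replace("@abs_srcdir@", os.path.abspath(top_dir))


if __name__ == "__main__":
    # Hypothetical template line, for illustration only.
    print(substitute_abs_srcdir('root = "@abs_srcdir@"'))
```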
Rather than searching for the main hydra code, move the main code into
the mpiexec folder and flatten its directory layout.
There was a --with-hydra-ui configure option that was designed to build
different versions of mpiexec. This design seems to have never panned
out. Remove the option for now.
The interface generation is carried out in the autogen stage, and it is
easy to simply re-run the script if necessary. MAINTAINER_MODE adds
complexity without much benefit, and it is fragile as we make changes to
the generation mechanism. Remove it for now.
These two functions are not Fortran bindings. It is much easier to
maintain them directly.
TODO: generate these Fortran-related C bindings with
maint/gen_binding_c.py.
Remove src/mpi/coll/allgather/allgather.c,
src/mpi/coll/allgather/allgather_init.c,
src/mpi/coll/iallgather/iallgather.c, etc.
Run maint/gen_coll.py in autogen.sh.
GPU tests have a large latency during initialization: several seconds,
and worse with more processes and devices. Therefore, generating many
small tests makes the overall GPU testing time impractically long.
The solution is to run more gpu tests inside a single test. Since the
benefit of script generation is lost in this case, it is easier to just
directly use a static testlist.gpu.
NOTE: the set of GPU tests is currently incomplete. I'll add more as I
keep tuning the tests.
We have not been using rlog for quite a while. With the newer MPI Tool
facility and events interface, renewed interest in rlog is very
unlikely. Thus, remove it from the source tree.
Remove generation of src/util/logging/common/state_names.h in
maint/extractstates.in. The header file is only needed by rlog, which
has been broken and hasn't been in use for a while. It is very unlikely
that we will ever need to revisit rlog.
Instead of generating these proxy functions with the buildiface script,
split them out into separate C code.
There is no real difference between the three different versions of the
proxy functions (comm, type, win); merge them to simplify the code.
The old f08 code uses proxy functions in Fortran. However, the
call-by-value interface in Fortran is not consistently supported by
gfortran or ifort. Since we need to cross the C/Fortran boundary at some
point anyway, having the proxy functions in C is much simpler. This
commit lets the f08 binding use the same f90 proxy interface.
The new infrastructure allows selection logic to be specified as json
files, instead of being hardcoded in the C files. A generic json file
is provided, which is converted to a C string and used internally if
no external json file is specified.
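The conversion of the generic json file into a C string could look
roughly like the following Python sketch. This is an assumption about
the shape of that step, not the script used by the commit; the function
name json_to_c_string and the variable name generic_selection_json are
hypothetical.

```python
import json


def json_to_c_string(json_text, var_name="generic_selection_json"):
    """Emit a C string definition holding json_text, one literal per line."""
    # Fail early on malformed input (raises ValueError).
    json.loads(json_text)
    literals = []
    for line in json_text.splitlines():
        # Escape backslashes first, then double quotes, for a C literal.
        escaped = line.replace("\\", "\\\\").replace('"', '\\"')
        literals.append('    "%s\\n"' % escaped)
    # Adjacent C string literals are concatenated by the compiler.
    return "const char %s[] =\n%s;\n" % (var_name, "\n".join(literals))


if __name__ == "__main__":
    # Hypothetical selection rule, for illustration only.
    print(json_to_c_string('{\n  "algorithm": "ring"\n}'))
```

Embedding the default this way keeps the library usable without any
external file, while an external json file can still override it at
run time.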
This commit includes a mostly ground-up rewrite of Hydra. The new code
uses better abstractions, more scalable data structures, and more
efficient internal communication. Support for hierarchical launches is
added for improved startup times on large numbers of nodes. This code
should be considered experimental. It may change drastically, and will
likely be renamed.
Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
Make changes to the structure of the CH4 shared memory modules to allow
non-direct builds (as opposed to direct builds, where the shared memory
module code is inlined "directly" into the MPI-level code). This makes
the "glue" code (now
just called `src`) required and makes the POSIX code required as well
(in fact, it's not used directly in the CH4-level code in places).
This simplifies the autoconf code since we will always assume that the
`shm/src` code will be the starting point for shared memory calls. In
the future, when we have multiple submodules in shared memory that can
be built together, we will probably need to reintroduce some of the
autoconf magic within the shared memory code itself (instead of at the
CH4 layer).
This assumes that the decision to inline will always happen at the CH4
layer (and not inside the shared memory code itself). When we have
multiple shared memory modules in the future, they should all be inlined
underneath the shared memory code (if they will be inlined at all).
Signed-off-by: Min Si <msi@anl.gov>
Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>