2390 Commits

Author SHA1 Message Date
Ken Raffenetti ec4cc89f5e build: Move benchmark generation to autogen.sh in testsuite
These are only needed by the testsuite. If mydef is unavailable for some
reason, e.g. running autogen.sh from a testsuite-only tarball,
regeneration of the benchmarks will be skipped.
2024-10-16 11:01:24 -05:00
Hui Zhou 96cf5c4ee3 test/coll: fix typo in allgather_gpu.c
The oddmem and evenmem arguments were mistakenly swapped.
2024-10-04 12:22:03 -05:00
Hui Zhou 99c6adff2a test/bench: add support for device memory
Add device memory support using mtest_common utilities. This adds a
dependency on the utility libraries, which the makefile already
imports.

However, this removes the simplicity of building a single
source with mpicc or mydef_run. If one doesn't need to test device memory,
one can simply comment out "$include macros/mtest.def" to restore the
simplicity.
2024-10-01 22:43:35 -05:00
Hui Zhou f2add2bed1 test/bench: add Makefile and testlist
"make testing" in test/mpi/bench should work.
2024-10-01 22:43:23 -05:00
Hui Zhou e4d96f828e test/runtests: add TestBench result check
This check does not capture output (thus test results will show in
console log) and only checks for exit code - zero means success and
nonzero means failure.

We'll use this check for benchmark tests.
2024-10-01 22:43:23 -05:00
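The exit-code-only check described in e4d96f828e can be sketched in C as follows. This is only an illustration of the idea, not the runtests implementation (which is not shown in this log); the helper name `bench_passed` is hypothetical, and it assumes a POSIX system where `system()` and the wait-status macros are available.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run a command without capturing its output and
 * return 1 (pass) only if it exits normally with status 0. */
int bench_passed(const char *cmd)
{
    assert(cmd != NULL);

    /* Output from the child goes straight to the console log. */
    int status = system(cmd);
    if (status == -1) {
        return 0;               /* the command could not be started */
    }

    /* Zero exit code means success; anything else means failure. */
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A caller would pass the full launch command for one benchmark and count the test as failed whenever `bench_passed` returns 0.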
Hui Zhou 6633f0a001 autogen: convert mydef code in autogen
We could add rules to work directly with mydef code in the Makefile, but
converting the code in autogen removes the mydef dependency.

Also fix a spelling error.
2024-10-01 22:42:55 -05:00
Hui Zhou 30f2bbd438 test/mpi: add p2p benchmarks in test/mpi/bench
Add point-to-point benchmark code in MyDef. The tests have automatic
warm-ups and adjust the number of iterations for measurement accuracy.
They produce latency measurements with standard deviations and equivalent
bandwidths.

MYDEF_BOOT=[topsrc_dir]/modules/mydef_boot
export PATH=$MYDEF_BOOT/bin:$PATH
export PERL5LIB=$MYDEF_BOOT/lib/perl5
export MYDEFLIB=$MYDEF_BOOT/lib/MyDef

To run:
    mydef_page p2p_latency.def  # -> p2p_latency.c
    mpicc p2p_latency.c && mpi_run -n 2 ./a.out

Alternatively use mydef_run (uses settings from config):
    mydef_run p2p_latency.def

Next commit will add "make testing".
2024-10-01 22:42:00 -05:00
Ken Raffenetti 098c0f0bb3 test/mpi: Fix argument passing to pt2pt_large test
Need to prefix arguments meant for test binaries with arg=.
2024-09-17 13:17:04 -05:00
Ken Raffenetti e14ba22f80 test/mpi: Fix assertions in mpit_isendrecv.c
A typo caused this assertion to not actually check equality (and always
evaluate to true).
2024-09-04 12:54:08 -05:00
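The log for e14ba22f80 does not show the actual typo in mpit_isendrecv.c. As a purely hypothetical illustration of how a check can stop comparing, the sketch below uses assignment in place of `==`; the function names are invented for this example.

```c
#include <assert.h>

/* Hypothetical buggy check: "=" assigns instead of comparing, so the
 * result only depends on whether expected is nonzero. */
int check_with_typo(int actual, int expected)
{
    return (actual = expected) != 0;    /* overwrites actual, compares nothing */
}

/* Corrected check: a real equality comparison. */
int check_fixed(int actual, int expected)
{
    return actual == expected;
}
```

With this form of the typo, `assert(check_with_typo(1, 2))` passes even though the values differ, while `assert(check_fixed(1, 2))` aborts as intended.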
Yanfei Guo 1ddd46aa4d test: fix warning in format specifiers 2024-08-27 12:03:40 -05:00
Hui Zhou 2a680b9529 test/xfail: nonblocking3 10 is too stressful for xpmem
The machine that runs the xpmem tests, yuzu, doesn't have enough cores, and
running the 10-process nonblocking3 test is too stressful for it.
2024-08-12 16:17:50 -05:00
Hui Zhou 7d4f51ca2c test/threads: lower the stress for mt_probe_sendrecv_huge
It is stressful to issue many fi_read calls in the get huge path, especially
under thread contention. Lower the count to ensure more
reliable CI results. A smaller count should be sufficient to test
correctness.
2024-08-12 16:17:50 -05:00
Hui Zhou 9c48a130ad test: fix typos related to testlist.collalgo 2024-08-12 16:17:50 -05:00
Hui Zhou b0e16278c8 test/configure: disable coll-error tests by default
Add configure option --enable-coll-error-tests.

Disable the collective length error tests by default.
2024-08-08 14:45:00 -05:00
Hui Zhou 3f8eacdb2b test/info: add prints in impls/mpich/info/memory_alloc_kinds.c
Print messages about the errors.
2024-07-16 22:09:58 -05:00
Hui Zhou 4ee98961eb test/configure: autoconf check use_hydra option
Skip tests with hydra-only option when MPIEXEC is not hydra.
2024-07-16 14:08:38 -05:00
Hui Zhou 343bb66771 test/jenkins: xfail bcasttest using pipelined_tree
The pipelined_tree algorithm can generate lots of unexpected messages
when large messages consist of many chunks. Thus, I think the algorithm,
without control over in-flight chunks, is questionable.
2024-06-13 14:06:58 -05:00
Hui Zhou e1cd32c600 test: check MTestGetStressLevel in rma/manyget.c
A large number of one-sided MPI_Get calls has a similar issue to the pingping
tests: unhandled messages accumulate at the receiver side, which may
run out of receive buffers.

Run fewer iterations by default and increase stress if MPITEST_STRESS is
set.
2024-06-13 14:06:58 -05:00
Hui Zhou 0cc44a5cd6 test: check MTestGetStressLevel in pingping test
The pingping tests quickly accumulate a large number of unexpected messages on
the receiver side with eager small messages. This is more a performance
issue than a correctness issue. Add a barrier by default and set
MPITEST_STRESS to run stress tests, e.g. in nightly.
2024-06-13 14:06:58 -05:00
Hui Zhou 755682f038 test/mtest: add MPITEST_STRESS env
Add environment variable MPITEST_STRESS to control the level of stress
testing. Increasing the stress level runs some tests with more iterations,
larger counts, or more processes.
2024-06-13 14:06:58 -05:00
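A minimal sketch of how a test might consume MPITEST_STRESS, as introduced in 755682f038. The real MTestGetStressLevel signature is not shown in this log, so `parse_stress_level`, `get_stress_level`, and `scaled_iterations` are hypothetical names, and the linear scaling is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: turn the variable's text into a non-negative level.
 * Unset, empty, or non-numeric values mean level 0 (no extra stress). */
int parse_stress_level(const char *value)
{
    if (value == NULL) {
        return 0;
    }
    int level = atoi(value);
    return level > 0 ? level : 0;
}

/* Hypothetical stand-in for a stress-level query. */
int get_stress_level(void)
{
    return parse_stress_level(getenv("MPITEST_STRESS"));
}

/* Assumed policy: run base_iters by default and add one more
 * multiple of base_iters per stress level. */
int scaled_iterations(int base_iters, int level)
{
    assert(base_iters > 0 && level >= 0);
    return base_iters * (1 + level);
}
```

A test would call `scaled_iterations(base, get_stress_level())` once at startup, so a default run stays light and a nightly job can raise the load by exporting the variable.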
Hui Zhou 845a96418c test/io: fix i_noncontig_coll2.2
The test uses MPI_Gatherv to gather procnames into a `char **procname`
and uses `int *disp` to capture the difference between pointers. It
overflows on 64-bit systems when two malloc'ed pointers differ by more than
INT_MAX. We fix it by using a single malloc'ed procname_buffer.

It is not clear why we didn't catch this issue in previous CI testing.
2024-05-29 07:05:15 -05:00
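The single-buffer approach from 845a96418c can be sketched as below. This is not the test's actual code: `alloc_procnames` and the fixed slot size `NAME_LEN` are assumptions made for the example. The point is that displacements become offsets into one allocation, so they are bounded by the buffer size instead of by the distance between unrelated heap pointers.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define NAME_LEN 256    /* assumed fixed slot size per process name */

/* Hypothetical helper: allocate one contiguous buffer for all names and
 * fill disp[i] with the offset of slot i. Returns NULL on failure. */
char *alloc_procnames(int nprocs, int *disp)
{
    assert(nprocs > 0 && disp != NULL);

    /* Offsets stay below nprocs * NAME_LEN; refuse if that exceeds int. */
    if (nprocs > INT_MAX / NAME_LEN) {
        return NULL;
    }

    char *buf = malloc((size_t) nprocs * NAME_LEN);
    if (buf == NULL) {
        return NULL;
    }

    for (int i = 0; i < nprocs; i++) {
        disp[i] = i * NAME_LEN;     /* offset within buf, not a pointer difference */
    }
    return buf;
}
```

The resulting `disp` array can be handed to a gather call as receive displacements, and slot `i` is then read at `buf + disp[i]`.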
Hui Zhou 07decd2408 test/config: replace AC_HELP_STRING
AC_HELP_STRING has been deprecated since autoconf 2.70 and needs to be
replaced by AS_HELP_STRING.
2024-05-28 12:30:46 -05:00
Hui Zhou 49caba2dbd config: remove AC_PROG_CC_C99
Since autoconf 2.70, AC_PROG_CC will enable c99 by default and will warn
about AC_PROG_CC_C99.
2024-05-28 12:30:46 -05:00
Yanfei Guo d85ab90476 romio: format cleanup
Fix code format checking for pre-commit hook.
2024-05-22 09:56:36 -05:00
Rob Latham 38471c0204 romio: "large count" improvements
Huge patch set touching almost all of romio, but there should now be far fewer
places where we store potentially large values in an int.  Passes
'-fsanitize=undefined' and also reduces '-Wshorten-64-to-32' warnings.
2024-05-14 15:08:13 -05:00
Wei-keng Liao 5fd586398a test case for large count / large size I/O
Derived from a pnetcdf-generated workload
2024-05-14 15:08:13 -05:00
Xi Luo a3eb635a48 test/xfail: xfail ipc read allgather and allgatherv tests
ipc_read allgather and allgatherv fail on CUDA and HIP because
MPL_gpu_imemcpy is not implemented.
2024-05-09 09:48:36 -05:00
Xi Luo 49f25a13c0 test/coll: add tests for ipc_read allgather and allgatherv
Add allgather_gpu and allgatherv_gpu to test the ipc_read algorithms.
2024-05-09 09:48:36 -05:00
wkliao a68350b483 more on fixing detection of python version 3 2024-05-01 07:53:35 -07:00
Hui Zhou 28964c3ecd test/rma: enhance win_shared_query_null.c
Add a test to cover the issue of querying a single-process window.
2024-04-27 09:01:30 -05:00
Hui Zhou 54ad8ebcd9 test: modify the error tests for user error handler
It is not clear whether an MPI function should return MPI_SUCCESS after
it calls the user error handler. Update the tests not to
assume that MPI will return the error code after invoking the error
handler.

Rationale: if users want the MPI function to return an error code, they
should use MPI_ERRORS_RETURN instead. Otherwise, we assume the user
error handler "resolved" the error. Users can always throw an exception
in their error handlers.
2024-04-25 21:53:33 -05:00
Thomas Gillis 3dd97a9d73 test/part: improve coverage of part comm testing
- add prime numbers to detect possible aggregation issues
- decrease the number of partitions to realistic numbers
2024-04-23 12:14:05 -05:00
Thomas Gillis 8b02901930 test/part: strided datatype test for partitioned communication
This test asserts the behavior of (possible) message aggregation and
correctness when sending the partitions
2024-04-23 12:14:05 -05:00
Hui Zhou 4d50f00c87 test: extend to batchRun to xfail tests due to mem
Move the check "$test_opt->{mem} > $g_opt{memory_total}" to LoadTests so
that the skip happens before RunMPIProgram or AddMPIProgram, so that it
works for batchRun.
2024-03-14 14:37:09 -05:00
Hui Zhou 7aec2772d0 test: enhance coll_large test
Add options to test more than INT_MAX count of contig and non-contig
datatypes.

To allow the tests to run with minimal memory, we use MPI_CHAR as the
basic type. Modify macros CTYPE and MPITYPE to test different basic
types.
2024-03-14 11:51:37 -05:00
Hui Zhou 7679615a12 test: enhance pt2pt_large test
Add options to test more than INT_MAX count of contig and non-contig
datatypes.

To allow the tests to run with minimal memory, we use MPI_CHAR as the
basic type. Modify macros CTYPE and MPITYPE to test different basic
types.
2024-03-13 12:29:50 -05:00
Hui Zhou 38958f74b4 test/configure: remove --enable-large-tests option
Different machines have different amounts of memory, and a single option
is too coarse to control which tests can be safely run.

Remove the --enable-large-tests option and rely on the per-test "mem="
option to control whether the tests will be run. The default
MPITEST_MEMORY_TOTAL is set to 4 GB (the maximum for 32-bit nodes). Thus,
all tests that require more than 4 GB of memory will be skipped by default.
For CI testing, the job script or the system should set
MPITEST_MEMORY_TOTAL appropriately to enable those large tests that can
fit.
2024-03-13 12:29:48 -05:00
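The skip rule described in 38958f74b4 (and the comparison quoted in 4d50f00c87) can be sketched as follows. The real logic lives in the runtests script, not in C; the function names here are hypothetical, and treating both "mem=" and MPITEST_MEMORY_TOTAL as whole gigabytes is an assumption for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define DEFAULT_MEMORY_TOTAL_GB 4   /* default stated in the commit message */

/* Hypothetical: interpret the MPITEST_MEMORY_TOTAL text (or NULL when the
 * variable is unset) as a number of gigabytes, falling back to the default. */
long memory_total_gb(const char *value)
{
    if (value == NULL) {
        return DEFAULT_MEMORY_TOTAL_GB;
    }
    long total = strtol(value, NULL, 10);
    return total > 0 ? total : DEFAULT_MEMORY_TOTAL_GB;
}

/* A test is skipped only when its "mem=" requirement exceeds the total. */
int should_skip(long test_mem_gb, long total_gb)
{
    assert(test_mem_gb >= 0 && total_gb > 0);
    return test_mem_gb > total_gb;
}
```

Under these assumptions a test tagged with a requirement of 8 is skipped by default, and runs once the job script exports a total of 8 or more.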
Hui Zhou 9b3ebd5c93 test: remove largetest guard for some datatype tests
The option --enable-large-tests enables those tests that require a large
amount of memory. However, the tests large_count and large_type
merely test datatype creation and do not require large memory.

Always enable both tests. Add a check in the tests so that they are skipped on
32-bit systems.
2024-03-13 12:23:07 -05:00
Ken Raffenetti 3d07ddeb79 test/coll: Use a single GPU per process in reduce.c
The reduce.c test used rank in comm to specify the GPU device used in
memory allocation. While iterating over multiple comms, the device could
change. This is an uncommon use-case, and one that is not yet supported
by UCX. Change the test to use world rank to specify the device. If
desired, we can add a multi-GPU test separately to verify that
capability.
2024-03-07 14:21:41 -06:00
Hui Zhou 663ca35b5f test/f08: add test for status conversion functions
Add test for
    MPI_Status_f082c
    MPI_Status_c2f08
    MPI_Status_f082f
    MPI_Status_f2f08
    MPI_Status_f2c
    MPI_Status_c2f
in C, and test for
    MPI_Status_f082f
    MPI_Status_f2f08
in Fortran.
2024-03-07 12:44:14 -06:00
Hui Zhou 1d4f5e17b8 misc: prefix warnings message with "MPICH"
Applications and users need to tell that messages are from MPICH.
2024-02-26 23:06:36 -06:00
Jan Ciesko 011b080479 Add Qthreads support 2024-02-19 16:22:20 -06:00
Hui Zhou 523fdf9e86 test/pt2pt: add send_tests.c
Add a test to cover all modes of basic send, including eager/rndv,
expected/unexpected, whether to use ssend, or any source/any tag
receive. With odd/even testing, it also tests both shmmod and netmod.
2024-01-03 10:03:37 -06:00
Hui Zhou c97a98a158 test/spawn: add namepub_conn test
Add a simple name publishing test using MPI_Open_port, and verify using
MPI_Comm_accept and MPI_Comm_connect.
2023-12-13 20:54:31 -06:00
Hui Zhou 1329092c15 test: enable testing mpi-abi 2023-12-12 22:42:43 -06:00
Hui Zhou 54ae15cb1c test: suppress warnings under abi build
We should not assume handles to be of integer type.
2023-12-12 22:42:43 -06:00
Hui Zhou 42270f6ad6 test: fix misc compilation warnings
* Avoid printing handles as int
* MPI_MESSAGE_NULL may not be MPI_REQUEST_NULL
* MPI_Comm may not be the same as MPI_Win
* Cast to (long long) to print MPI_Count
2023-12-12 22:42:43 -06:00
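The last bullet of 42270f6ad6 reflects a general C portability pattern: when the width of an integer typedef is not fixed, cast to `long long` and print with `%lld`. The sketch below uses a stand-in typedef, `example_count_t`, instead of MPI_Count so that it compiles without MPI; the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a count type whose exact width the caller should not assume. */
typedef int64_t example_count_t;

/* Hypothetical helper: format a count portably. The cast makes the
 * argument match %lld regardless of how the typedef is defined. */
int format_count(char *buf, size_t size, example_count_t count)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lld", (long long) count);
}
```

The same cast applies directly in a `printf` call, which avoids format-specifier warnings when the typedef maps to `long` on one platform and `long long` on another.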
Hui Zhou 2d935de1eb test/f77: remove the extra declaration for MPI_Aint_add
Now that we have added the declarations of MPI_Aint_add and MPI_Aint_diff in
mpif.h, we do not need to declare them in the test. They should have
been fixed in mpif.h in the first place.
2023-12-07 22:03:57 -06:00
Hui Zhou fd2eaac882 test/pt2pt: add test recvnull
Test that MPI_Irecv and MPI_Recv from MPI_PROC_NULL return a status with
MPI_SOURCE=MPI_PROC_NULL and MPI_TAG=MPI_ANY_TAG.
2023-11-30 13:37:54 -06:00
Hui Zhou 06b8659992 test: xfail cxx/fortran ucx tests that uses MPI_Open_port
The CXX binding is deprecated. The Fortran tests do not use configure
macros. Xfail those MPI_Open_port tests for ch4:ucx for now.
2023-11-18 19:59:50 -06:00