20090 Commits

Author SHA1 Message Date
Hui Zhou dc5e6185d7 Merge pull request #7072 from hzhou/2407_errhan
errhan: fix NULL reference in MPIR_Err_return_session_init

Approved-by: Ken Raffenetti
2024-08-07 10:48:22 -05:00
Hui Zhou 7335a4efc8 errhan: fix NULL reference in MPIR_Err_return_session_init
Fix a potential NULL errhandler case in MPIR_Err_return_session_init. It
is tricky to handle errors when everything may be uninitialized.
2024-08-07 10:43:15 -05:00
Hui Zhou 920ff67a3f Merge pull request #7085 from hzhou/2408_pmi_hwloc
mpir/pmi: fix use of PMIx_Load_topology

Approved-by: Gengbin Zheng
Approved-by: Ken Raffenetti
2024-08-05 11:04:44 -05:00
Hui Zhou 1a6aee94c5 mpir/pmi: fix use of PMIx_Load_topology
The "ptopo.source" from PMIx_Load_topology may not be exactly "hwloc". Newer
versions of PMIx append the hwloc version, e.g. "hwloc:2.9.0".
Thus, we need to use strncmp instead of strcmp.

If PMIx_Load_topology fails, we have the option of falling back. But we
may want to trust that PMIx_Load_topology works and see its
error if it fails.
2024-08-03 08:45:51 -05:00
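The prefix comparison described in commit 1a6aee94c5 can be sketched as follows. This is a minimal illustration, not the MPICH code: the helper name `source_is_hwloc` is hypothetical, and only the two example strings "hwloc" and "hwloc:2.9.0" come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not an MPICH function): returns true when a
 * topology source string names hwloc, with or without a version
 * suffix such as "hwloc:2.9.0". */
static bool source_is_hwloc(const char *source)
{
    assert(source != NULL);
    /* Compare only the first strlen("hwloc") characters, so a trailing
     * ":<version>" does not cause a mismatch as it would with strcmp. */
    return strncmp(source, "hwloc", strlen("hwloc")) == 0;
}
```

Note that a plain prefix match also accepts any other string that merely begins with "hwloc"; whether that matters depends on which source strings PMIx can actually report.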
Hui Zhou c04284b9e5 Merge pull request #7081 from hzhou/2407_gather_init
coll: fix MPI_Gather_init using the sched_binomial algo

Approved-by: Ken Raffenetti
2024-08-01 14:37:18 -05:00
Hui Zhou 64a150f8dd coll: fix MPI_Gather_init using the sched_binomial algo
We need to schedule the local copy for the persistent collective to work.
2024-08-01 10:58:20 -05:00
Hui Zhou 00af4b112f Merge pull request #7070 from hzhou/2407_yaksa_reduction
coll: add MPIR_CVAR_ENABLE_YAKSA_REDUCTION_THRESHOLD

Approved-by: Ken Raffenetti
Approved-by: Gengbin Zheng
2024-07-25 14:34:26 -05:00
Hui Zhou 37cda22eab coll: add MPIR_CVAR_YAKSA_REDUCTION_THRESHOLD
We call MPIR_Typerep_reduce_is_supported to determine whether we do the
collective host buffer swap in reduce and allreduce. We may want to make
a better decision based on message size, so we are adding the count to
the parameters.

Add a cvar to disable yaksa reduction for large messages.
2024-07-25 14:30:26 -05:00
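A size-gated support check of the kind commit 37cda22eab describes might look like the sketch below. Everything here is an assumption made for illustration: the function name, the parameter list, the byte-based threshold, and the choice to treat a message exactly at the threshold as still eligible are not taken from MPICH.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch: decide whether to use the accelerated reduction
 * path. op_supported stands in for the existing capability check, and
 * threshold_bytes stands in for the value of the new cvar (its real
 * units and default are not stated in the log). */
static bool yaksa_reduce_enabled(bool op_supported, size_t count,
                                 size_t elem_size, size_t threshold_bytes)
{
    if (!op_supported)
        return false;
    if (elem_size == 0)
        return true;            /* zero-byte message: nothing to gate */
    /* Equivalent to count * elem_size <= threshold_bytes, written as a
     * division so the product cannot overflow. */
    return count <= threshold_bytes / elem_size;
}
```

Passing the count into the check is what makes a per-message decision possible; without it the answer could depend only on the operation and datatype.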
Rob Latham 0be7e7ab45 Merge pull request #7071 from roblatham00/gpfs-large-count-fixup
romio: Gpfs large count fixup
2024-07-25 14:28:36 -05:00
Rob Latham c9f5523e32 romio: 64 bit warnings on tests
A start at updating some ROMIO tests while maintaining MPI-3 support.
2024-07-25 13:56:55 -05:00
Rob Latham 082d7cda1c romio: update GPFS allocations for 64 bit changes
fixes: pmodels/mpich#7068
2024-07-25 13:56:55 -05:00
Hui Zhou 2de40142e7 Merge pull request #7076 from hzhou/2407_ipc_dt
ch4/ipc: fix premature free of src_dt_ptr

Approved-by: Ken Raffenetti
2024-07-25 10:43:51 -05:00
Hui Zhou 301b45bcb3 ch4/ipc: fix premature free of src_dt_ptr
Apparently the datatype may not be held while it is in
MPIR_Ilocalcopy_gpu. Save the datatype and only free it in
MPIDI_IPC_complete.
2024-07-25 10:38:53 -05:00
Hui Zhou 090e935fe7 ch4/mpidig: move unexpected mrcv fields into union
Move path-specific fields into a union. This avoids confusion on the
purpose of these fields especially when we add additional
path-dependent fields.
2024-07-25 10:38:53 -05:00
Hui Zhou 6918174fdd Merge pull request #7051 from rithwiktom/check_composition_fix
coll: Check constraints before using composition algo

Approved-by: Hui Zhou
2024-07-25 08:59:55 -05:00
rithwik.tom 3145bc3331 ch4/coll: Fix memory limit check for Allgather and AlltoAll
2024-07-24 15:02:20 -05:00
rithwik.tom b1e4ff25a4 ch4/coll: Check constraints before using composition algo
The memory constraint condition is not checked when selecting the
composition algo for Allgather and AlltoAll using a tuning file.
2024-07-24 15:02:20 -05:00
Hui Zhou 6b79140249 Merge pull request #7075 from BKitor/bkitor/rocm6_poitnerGetAttr
mpl/gpu: Update MPL_gpu_query_pointer_attr

Approved-by: Ken Raffenetti
2024-07-24 13:21:26 -05:00
Benjamin Kitor 05a20e3a45 mpl/gpu: Update MPL_gpu_query_pointer_attr
Handle hipPointerGetAttributes changes introduced in ROCm 6
2024-07-24 11:26:02 -05:00
Hui Zhou b9e13b6009 Merge pull request #7065 from zhenggb72/putipc
mpl/ze: use zeMemPutIpcHandle to release IPC handles

Approved-by: Hui Zhou
2024-07-24 10:15:00 -05:00
Gengbin Zheng f3da0881d9 mpl/ze: use zeMemPutIpcHandle to release IPC handles
Use zeMemPutIpcHandle to release IPC handles obtained from zeMemGetIpcHandle
instead of calling close(), which leads to a crash at the finalize step.
Also clean up caches in the zeMemFree hook functions.
2024-07-24 09:14:10 -05:00
Hui Zhou 73476b282f Merge pull request #7007 from dycz0fx/release-gather-reduce
ch4/posix: Fix release gather reduce

Approved-by: Hui Zhou
2024-07-24 08:18:58 -05:00
Xi Luo a3f0593095 coll/release_gather: fix uninitialized fields
MPIDI_POSIX_Bcast_tree_type and MPIDI_POSIX_Reduce_tree_type are used
without proper initialization. Initialize the values in
MPIDI_POSIX_nb_release_gather_comm_init.
2024-07-22 13:14:22 -07:00
Xi Luo 33171e63f3 coll/release_gather: remove relaxation in release_gather_release
As the parent can be changed in release_gather_release for reduce, each
rank needs to wait until the parent has updated the flag; otherwise, a
rank could access memory that is not available for use.
2024-07-22 13:14:22 -07:00
Xi Luo 51bf31bb3c coll/release_gather: add reduce tree and kval for large messages
Add MPIR_CVAR_REDUCE_INTRANODE_MSG_SIZE_THRESHOLD so MPIR_CVAR_REDUCE_INTRANODE_TREE_KVAL
and MPIR_CVAR_REDUCE_INTRANODE_TREE_TYPE are used when the message size is smaller than
or equal to this threshold; while MPIR_CVAR_REDUCE_INTRANODE_TREE_KVAL_LARGE and
MPIR_CVAR_REDUCE_INTRANODE_TREE_TYPE_LARGE are used when the message size is larger than
this threshold.
2024-07-22 13:14:22 -07:00
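The selection rule in commit 51bf31bb3c (regular parameters at or below the threshold, the `_LARGE` parameters above it) can be sketched as below. The struct layout, the function name, and the numbers in `example_tuning` are hypothetical; only the "smaller than or equal" versus "larger than" split comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for one tree configuration. */
struct tree_params {
    int tree_type;              /* illustrative encoding of the tree type */
    int kval;                   /* branching factor */
};

/* Hypothetical stand-in for the five cvars named in the commit. */
struct reduce_tuning {
    size_t msg_size_threshold;
    struct tree_params small;   /* used when msg_size <= threshold */
    struct tree_params large;   /* used when msg_size >  threshold */
};

/* Made-up values for illustration only. */
static const struct reduce_tuning example_tuning = { 1024, {0, 4}, {1, 2} };

static struct tree_params select_reduce_tree(const struct reduce_tuning *t,
                                             size_t msg_size)
{
    assert(t != NULL);
    return (msg_size <= t->msg_size_threshold) ? t->small : t->large;
}
```

With `example_tuning`, a 1024-byte message still takes the regular parameters, and a 1025-byte message is the first to take the large-message ones.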
Hui Zhou 2fc3a6336e Merge pull request #7037 from hzhou/2406_vci
coll: nonblocking collective request to use per-comm vci

Approved-by: Ken Raffenetti
2024-07-22 09:31:46 -05:00
Hui Zhou 12621c7bfb ch4/ofi: add MPIR_CVAR_CH4_OFI_ENABLE_INJECT
The ofi injected messages may require explicit progress to kick the
message out. But this is difficult to control due to the lack of request
handles. This is problematic when users issue an inject and then immediately
dive into computation, or when they move on to a different dedicated vci
and neglect to progress the previous vci, which, by all indications,
has completed and doesn't require progress.

Add the cvar as a remedy.

We still believe this is a libfabric issue.
2024-07-22 09:17:42 -05:00
Hui Zhou c722a9bace coll: use MPID_Request_create_from_comm
Use MPID_Request_create_from_comm in MPIDU_Sched_start and
MPIR_TSP_sched_start to create nonblocking collective request in the
per-comm vci.
2024-07-22 09:17:42 -05:00
Hui Zhou e0fbd58685 mpid: add MPID_Request_create_from_comm
For nonblocking collectives, if the internal pt2pt operations are using a
non-0 vci, we need to create the nonblocking collective request in the same
vci pool, or it won't be progressed effectively. In fact, current
multi-vci nonblocking collectives rely on global progress to work.

Add MPID_Request_create_from_comm to create request from the per-comm
vci when it is enabled.
2024-07-22 09:17:42 -05:00
Hui Zhou 88e6e22e41 Merge pull request #7060 from zhenggb72/init-scale
misc: Improve scalability of MPI_Init

Approved-by: Ken Raffenetti
2024-07-17 14:59:22 -05:00
Hui Zhou 0329863459 errhan: make sure builtins are initialized in session init
MPI_Session_init accepts builtin error handlers from parameters, but the
error systems may not be initialized yet. Make sure the builtins are
initialized so we can recognize, for example, MPI_ERRORS_RETURN.

The MPIR_Process.memory_alloc_kinds may not be initialized yet,
initialize them separately for now.
2024-07-16 22:11:50 -05:00
Hui Zhou d87b628f54 session: fix getting MPIR_Session_get_memory_kinds_from_info
Now that we check for session info before the actual init,
MPIR_Process.memory_alloc_kinds may not have been initialized yet when
we call MPI_Session_init. Leave the temp var memory_alloc_kinds NULL if
it is not provided in session info hints, and inherit it from
MPIR_Process.memory_alloc_kinds after it is initialized.
2024-07-16 22:11:41 -05:00
Hui Zhou fc0d5a685d session: check info before MPIR_init_thread
In MPI_Session_init, do not go through MPIR_init_thread if there
are errors in the info parameter.
2024-07-16 22:09:58 -05:00
Hui Zhou aa5373d63e init: move MPIR_pmi_finalize into mpir_init.c
Since we call MPIR_pmi_init in mpir_init.c, move MPIR_pmi_finalize into
mpir_init.c for consistency.
2024-07-16 22:09:58 -05:00
Hui Zhou 3f8eacdb2b test/info: add prints in impls/mpich/info/memory_alloc_kinds.c
Print messages about the errors.
2024-07-16 22:09:58 -05:00
Hui Zhou 4ee98961eb test/configure: autoconf check use_hydra option
Skip tests with hydra-only option when MPIEXEC is not hydra.
2024-07-16 14:08:38 -05:00
Hui Zhou 810fb1da81 mpir/pmix: cleanup pmix_fence_nspace_proc
Pass in const namespaces and parent rank instead of the whole
pmix_proc_t. First, pmix_proc_t may be too big to pass as a parameter;
second, pmix_proc_t is too opaque for the semantics here.
2024-07-16 14:08:22 -05:00
Hui Zhou a3023ae735 mpir/pmix: fix session re-init
We need to make sure the globals pmix_proc, pmix_wcproc, and pmix_parent stay
const after init.

This fixes the session_re_init and spawn_rootargs tests.
2024-07-16 14:07:46 -05:00
Hui Zhou 47b09995f8 hydra: use dynamic buffers in kvs commands
The PMI KVS commands may exceed the static buffer size due to long value
lengths. This fixes the following error:

    mpiexec: src/pmi_wire.c:810: PMIU_cmd_output_v2:
        Assertion `!PMIU_cmd_is_static(pmicmd)' failed.
2024-07-16 14:07:46 -05:00
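The idea behind commit 47b09995f8, sizing the buffer from the actual value length instead of relying on a fixed-size array, can be sketched as below. This is not Hydra's PMI wire code: the function name `format_kvs_pair` and the "key=value" layout are assumptions chosen only to illustrate the allocation pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "key=value" in a heap buffer whose size is
 * computed from the inputs, so an arbitrarily long value cannot
 * overflow a fixed-size buffer. Returns NULL on allocation failure;
 * the caller frees the result. */
static char *format_kvs_pair(const char *key, const char *value)
{
    assert(key != NULL && value != NULL);
    /* key + '=' + value + terminating NUL */
    size_t n = strlen(key) + 1 + strlen(value) + 1;
    char *out = malloc(n);
    if (out == NULL)
        return NULL;
    snprintf(out, n, "%s=%s", key, value);
    return out;
}
```

The cost is one allocation per command and an explicit `free` by the caller, in exchange for removing the length limit that triggered the assertion.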
Gengbin Zheng 92d3764917 hwloc: try PMIx_Load_topology to load topology
PMIx uses PMIx_Load_topology to load the hwloc topology, which avoids multiple
processes redundantly probing the hardware and creating congestion.

Add MPIR_pmi_load_hwloc_topology as a wrapper function for
PMIx_Load_topology. It provides a fallback implementation when
PMIx_Load_topology is not available.
2024-07-16 14:07:46 -05:00
Gengbin Zheng 1ff17b5c51 mpir_pmi: add MPIR_pmi_barrier_only for PMIx does not collect data
Add a lightweight version of MPIR_pmi_barrier() for PMIx which does not
collect data. Call MPIR_pmi_barrier_only() in MPII_Init_thread.
Also add MPIR_CVAR_CH4_INIT_SKIP_PMI_BARRIER, which by default skips
the barrier.
2024-07-16 14:07:46 -05:00
Gengbin Zheng 2979e710fe ch4: fix various bugs for MPIR_CVAR_CH4_ROOTS_ONLY_PMI mode
2024-07-16 14:07:46 -05:00
Ken Raffenetti 4bfb25cfde Merge pull request #7066 from raffenet/hip-free
mpl/gpu: Fix typo in HIP memory hook

Approved-by: Hui Zhou <hzhou321@anl.gov>
2024-07-16 11:14:25 -05:00
Ken Raffenetti 53d23c0a14 mpl/gpu: Fix typo in HIP memory hook
Accidental double locking led to a deadlock in the free memory hook.
2024-07-15 10:55:36 -05:00
Ken Raffenetti 4c372ab562 Merge pull request #7062 from raffenet/yaksa-module
submodules: Update yaksa

Approved-by: Hui Zhou <hzhou321@anl.gov>
2024-07-12 09:49:50 -05:00
Ken Raffenetti 907edef298 submodules: Update yaksa
pmodels/yaksa#249: util: disable leaked handles warning by default
pmodels/yaksa#253: hip: add configure check for hipPointerAttribute_t.memoryType
2024-07-12 09:49:33 -05:00
Hui Zhou 0b75e39fd0 Merge pull request #6829 from dycz0fx/json_gpu
Select GPU-optimized collective algorithms with JSON

Approved-by: Hui Zhou
2024-07-11 19:55:28 -05:00
Xi Luo e3d99b0810 ch4: support json for gpu collectives
2024-07-11 14:36:51 -07:00
Xi Luo 4affd43246 posix: support json for gpu collectives
2024-07-11 14:36:51 -07:00
Hui Zhou 16c1eeb7b7 Merge pull request #7040 from hzhou/2306_shm_options
ch4/config: refine --with-ch4-shmmods options

Approved-by: Ken Raffenetti
2024-07-11 15:12:33 -05:00