468 Commits

Author SHA1 Message Date
Hui Zhou 7a5f0f9d39 mpir/pmi: protect 3rd-party pmi from job attributes
Define the macro PMI_FROM_3RD_PARTY if linked with a 3rd party pmi
library such as cray, Slurm, or openpmix. These libraries often run into
issues when queried with nonexistent job attribute keys such as
PMI_process_mapping, PMI_hwloc_pmi, etc.

We recently added a few patches to work around openpmix failing on
nonexistent job attributes.

Cray PMI used to work, but we encountered a hang with PALS v1.3.4.

This commit skips querying these keys as job attributes to bypass these
potential issues. They are often not supported by 3rd party pmi anyway.
2024-10-09 14:46:43 -05:00
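The guard described in the commit above could look roughly like the following minimal sketch. Only the macro name PMI_FROM_3RD_PARTY comes from the commit message; `query_job_attr` and `get_optional_job_attr` are hypothetical names used for illustration, and the macro is defined locally only so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the build system defines PMI_FROM_3RD_PARTY when an external
 * PMI library is linked. It is defined here only to make the sketch
 * self-contained. */
#ifndef PMI_FROM_3RD_PARTY
#define PMI_FROM_3RD_PARTY 1
#endif

/* Hypothetical stand-in for a PMI job-attribute lookup. Returns 0 on
 * success and nonzero when the key is unavailable. */
int query_job_attr(const char *key, char *buf, size_t buflen)
{
    (void) key;
    if (buflen > 0)
        buf[0] = '\0';
    return 1;   /* pretend the launcher does not know the key */
}

/* Hypothetical helper: returns 1 if the attribute was found, 0 otherwise.
 * With a 3rd-party PMI the query is skipped entirely and reported as
 * "not found", so the caller takes its normal fallback path. */
int get_optional_job_attr(const char *key, char *buf, size_t buflen)
{
    assert(key != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';
#ifdef PMI_FROM_3RD_PARTY
    return 0;
#else
    return query_job_attr(key, buf, buflen) == 0;
#endif
}
```

Treating a skipped key the same way as a missing key means callers need no special case for 3rd-party libraries.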
Hui Zhou 2afc14fa0d config: add options to suppress some warnings
The PGI compilers, now nvc, emit many warnings by default, some of which
are not worth fixing. For example, the branch_past_initialization
warning forces splitting the declaration and initialization of
local variables in local scopes.

Suppress them by options instead.
2024-09-24 12:14:16 -05:00
Yanfei Guo b9f2188568 configure: cleanup arch specific fast options
1. Separate arch related fast options to `sse2`, `avx` and `avx512f`.
   The names represent the corresponding compiler "-m" flags. The `sse2`
   is enabled by default because it is widely supported. These options
   are consolidated in MPL's configure. Note that these flags are only
   added to CFLAGS when building MPL, not the parent library.

2. The MPL_Memcpy_stream function is uninlined to accommodate this
   change. Neither AMD nor Intel architectures show a performance
   difference from uninlining it (data included in the PR).
2024-08-15 12:39:48 -05:00
Hui Zhou bbd586f9bf configure: disable tag error bits by default
Propagating errors in collectives is a very fragile and complex way of
collective error checking and deadlock prevention. It also takes away 2
bits from the valuable tag space. Disable it by default.

[This is a testing commit].
2024-08-08 14:45:00 -05:00
Gengbin Zheng 92d3764917 hwloc: try PMIx_Load_topology to load topology
PMIx uses PMIx_Load_topology to load the hwloc topology so that multiple
processes do not redundantly probe the hardware and create congestion.

Add MPIR_pmi_load_hwloc_topology as a wrapper function for
PMIx_Load_topology. It provides a fallback implementation when
PMIx_Load_topology is not available.
2024-07-16 14:07:46 -05:00
Yanfei Guo 3dcd227208 mpl: Add SSE2 version of non-temporal memcpy
Non-temporal stores are also available in the SSE2 instruction set, which
should be generally available on x86 architectures. The SSE2 version
serves as a fallback if AVX is not available.
2024-07-02 21:32:09 -05:00
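As a rough illustration of the technique named in the commit above, the following sketch copies a buffer with SSE2 non-temporal stores. The function name `stream_memcpy_sse2` is hypothetical and this is not the MPL implementation; it assumes a compiler that provides `<emmintrin.h>` and falls back to plain `memcpy` when SSE2 is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical name: copy n bytes using non-temporal (streaming) stores
 * where SSE2 is available. */
void stream_memcpy_sse2(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    char *d = dst;
    const char *s = src;

    /* _mm_stream_si128 requires a 16-byte aligned destination, so copy
     * the unaligned head with a regular memcpy first. */
    size_t head = (16 - ((uintptr_t) d & 15)) & 15;
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Main loop: unaligned load, non-temporal store that bypasses the cache. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *) s);
        _mm_stream_si128((__m128i *) d, v);
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();

    /* Copy the remaining tail of fewer than 16 bytes. */
    memcpy(d, s, n);
#else
    memcpy(dst, src, n);
#endif
}
```

Non-temporal stores tend to pay off only for large copies whose destination is not read again soon, which is why a real implementation would normally choose this path based on message size.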
Hui Zhou 49caba2dbd config: remove AC_PROG_CC_C99
Since autoconf 2.70, AC_PROG_CC will enable c99 by default and will warn
about AC_PROG_CC_C99.
2024-05-28 12:30:46 -05:00
Hui Zhou 99bf411132 errhan: remove error message level
Remove the configure option
--enable-error-messages={all,generic,class,none}. Support "all" by
default.

attr: pass thru user callback error

Return codes from attribute callback functions are supplied by the user
and are potentially meaningful to users. Return the code directly rather
than creating a new one.
2024-04-03 10:04:01 -05:00
Ken Raffenetti 7c5f41102a configure.ac: Remove use of -commons,use_dylibs
The macOS linker in Xcode 15 only accepts -commons,use_dylibs in
-ld_classic mode. Historically we used this option for building MPICH
with the ifort compiler, which is no longer supported on macOS. Remove
the use of this flag at build time and also from the wrapper
scripts. See more discussion at
https://gitlab.com/petsc/petsc/-/merge_requests/7341#note_1810586723.
2024-03-13 09:55:48 -05:00
Ken Raffenetti 4371df303e env: Use LDFLAGS without MPICHLIB_LDFLAGS in wrapper scripts
MPICHLIB_LDFLAGS should only be used to build the MPICH library.
2024-03-11 09:17:42 -05:00
Ken Raffenetti 2f4bbe1300 Revert "Remove MPICHLIB_LDFLAGS/LIBS"
This reverts commit
4d93cefa8f. pmodels/mpich#6904 has a valid
use-case that we should support. An additional commit will modify the
compile scripts so these flags will not leak out of the build.
2024-03-11 09:17:42 -05:00
Hui Zhou 9859e8b74f configure: use flat_namespace to build libmpifort.dylib
On the latest macOS, dylibs must be built with -Wl,-flat_namespace
in order for common block variables, such as MPI_IN_PLACE, to be
recognized.
2024-02-23 14:59:18 -06:00
Hui Zhou 30347861fd env: fix mpicc scripts
These were broken earlier when we consolidated the abi scripts.
2024-02-20 17:44:28 -06:00
Hui Zhou edbe7c1c0a configure.ac: fix spelling 2024-02-20 17:44:26 -06:00
Lisandro Dalcin f3e092ab69 abi: Single source mpi{cc|cxx}[_abi] with -mpi-abi option
* Add option -mpi-abi to mpicc/mpicxx to use the standard MPI ABI.
* Implement mpicc/mpicc_abi and mpicxx/mpicxx_abi from a single source
script to ease maintenance, avoid code duplication, and prevent the
code from becoming out of sync.
2024-02-19 19:26:59 +03:00
Hui Zhou 96b5aaf63c build: add --disable-doc configure option
Add a configure option to skip installing manpages, html docs, and pdf
docs.
2024-02-14 12:19:41 -06:00
Hui Zhou d24ad0f7b3 config: fix typo in help text for --with-wrapper-dl-type
It should be --with-wrapper-dl-type instead of --enable-wrapper-dl-type.
2024-02-05 15:11:38 -06:00
Hui Zhou 9053d16474 mpi.h: add MPI_REAL2 and MPI_COMPLEX4
These two types are optional in the standard. We should support them if
the compiler does.
2024-01-09 20:36:56 -06:00
Hui Zhou 8ca9ebd26c misc: move mpif90model.h.in to src/include
We need to add the include dirs of both the srcdir tree and the builddir
tree to AM_CPPFLAGS if a header file is generated. Rather than inflate
AM_CPPFLAGS, move mpif90model.h.in to src/include since it is already in
AM_CPPFLAGS.
2023-12-18 16:00:38 -06:00
Hui Zhou 7542ead643 datatype: move MPI_Type_create_f90_xxx to C bindings
Always build these routines as they are part of the C APIs. When the
fortran binding is disabled, simply return MPI_DATATYPE_NULL.
2023-12-15 10:35:33 -06:00
Hui Zhou e5b8534b17 romio: add --enable-mpi-abi to romio build
* Pass --enable-mpi-abi to romio configure
* Build libromio_abi.so and libpromio_abi.so when enabled
* Include mpi_abi.h instead of mpi.h when BUILD_MPI_ABI is defined
* internally adio.h will include mpio.h without abi, and
  romio_abi_internal.h with abi. I.e. mpio.h is skipped ifdef
  BUILD_MPI_ABI.
2023-12-12 22:42:43 -06:00
Hui Zhou 9786ab2bba ABI: add build files
Use the --enable-mpi-abi configure option to enable building and
installing libmpi_abi.so.
2023-12-12 22:42:43 -06:00
Hui Zhou 859441675a fortran/use_mpi: fix makefile for PMPIBASEMOD
Typos from copy pasting resulted in missing pmpi_base.mod rules.
2023-10-30 13:49:29 -05:00
Hui Zhou cea5a2d2d7 configure: check Fortran ISO_C_BINDING
ISO_C_BINDING is an intrinsic module available since Fortran 2003. Check
for this feature so we can provide the generic interface using Type(c_ptr),
e.g. in MPI_Alloc_mem.
2023-10-25 16:55:08 -05:00
Hui Zhou 4be0b8f56f cxx: replace HAVE_ROMIO in mpicxx.h.in
The HAVE_ROMIO macro is defined in mpichconf.h, not available in
mpicxx.h. Instead, use autoconf macro HAVE_CXX_IO and do substitution
into "#if 1" or "#if 0" at configure time.

The PAC_HAVE_ROMIO macro is called inside the enable_romio branch, thus
it won't work with --disable-romio. The check is simple enough to put in
configure.ac directly.
2023-10-20 11:29:00 -05:00
Ken Raffenetti 30b1baa17a mpir_pmi: Add workarounds for older OpenPMIx versions
PMIX_INFO_LOAD and PMIX_ERR_NOT_IMPLEMENTED are deprecated but still
used by some client implementations. Our internal client will not
support them, so add workarounds.
2023-10-12 15:48:18 -05:00
Ken Raffenetti 63f7c825c2 configure: Update PMI options to support embedded PMIx client
Enable PMIx client support in the embedded libpmi.so when there is no
external PMIx library.
2023-10-12 15:48:17 -05:00
Hui Zhou 54bdf44257 request: add MPICH_DEBUG_REQUEST
When MPICH_DEBUG_REQUEST is defined (config option to be added), we add
an info string to the request objects that can be used to attach
readable debug information to the request.
2023-10-03 09:29:06 -05:00
Hui Zhou 4471c39afd configure: disable protocols if using external pmi
When both an external pmi library and the builtin pmi library are
configured, disable the conflicting pmi protocols in the builtin libpmi.
2023-09-25 12:55:09 -05:00
Hui Zhou 2403683394 configure: allow independent pmi options
Allow users to force the --with-pm and --with-pmilib options. This lets
users build hydra even with an external pmi library, or build the embedded
libpmi concurrently with an external pmi library (e.g. libpmix). This
may not work due to conflicts, but it is at the user's discretion.

Users may load 3rd party pmi libraries with:
   --with-pmi1=path   (libpmi.so)
   --with-pmi2=path   (libpmi2.so)
   --with-pmix=path   (libpmix.so)

As of this commit, the following options are more or less independent:
   --with-pmi=default|pmi1|pmi2|pmix
   --with-pmilib=default|no|mpich|install
   --with-pm=default|no|gforker|remshell|hydra

The default option can be internally overridden. with_pmilib=no will
not build the internal (embedded) libpmi. with_pm=no will not build the
internal pm.

A special option is --with-pmi=slurm. This is mainly because Slurm installs
header as #include "slurm/pmi.h".
2023-09-06 17:10:33 -05:00
Hui Zhou 68dce30446 pmi: remove USE_PMI{1,2,X}_API
These configure macros are no longer used since we support runtime PMI
versions.
2023-09-06 09:58:26 -05:00
Hui Zhou 3b698f9db7 configure/pmi: remove --with-pmi=oldcray option
Now we can directly link with cray pmi using --with-pmi=path or
--with-pmi2=path.
2023-09-06 09:58:26 -05:00
Hui Zhou e782914cd0 configure/pmi: fix checking external pmi library
Make sure to set LIBS before checking PMI capability.

Also disable the built-in pm if the user is configuring with an external
pmi library. Hydra can be installed separately if needed.
2023-09-05 16:46:51 -05:00
Hui Zhou 4a3c30bc43 pmi: check if PMI2_Set_threaded exist
Skip PMI2_Set_threaded if it does not exist. For example, it does not
exist in Cray's libpmi2.so.
2023-09-05 15:09:05 -05:00
Hui Zhou 8fe676b596 pmi: check whether PMI2_keyval_t is missing
Cray PMI does not define PMI2_keyval_t. Check for it in configure and
typedef INFO_TYPE PMI2_keyval_t if needed. This will allow mpich to
compile using Cray PMI. The PMI2_keyval_t is only used in
PMI2_Job_Spawn. The name publishing API always uses NULL for its info
pointers. It is possible that our INFO_TYPE is incompatible with Cray
PMI's internal type, which will break PMI2_Job_Spawn. If so, this needs
to be fixed in the future, possibly on the Cray side.
2023-09-05 15:09:05 -05:00
Hui Zhou 986e9541f3 configure: only substitute fortran files if configured
If the user adds the --disable-fortran option, make configure not depend
on the fortran files that may be missing from autogen.
2023-07-26 11:18:04 -05:00
Hui Zhou cd9dd949dc config: check and define ENABLE_PMI[12X]
Define ENABLE_PMI1, ENABLE_PMI2, or ENABLE_PMIX if available. They can
be defined simultaneously, which allows users to select the pmi version
at runtime.
2023-07-12 10:43:06 -05:00
Mikael Simberg cf362c3947 Use more specific -Werror=type-safety flag to check if type_tag_for_datatype is qualifier-agnostic 2023-07-11 10:05:10 -05:00
Hui Zhou af516b85cd config: add config option to use global config file
Allow --with-configfile to set global config file. --without-configfile
will disable the use of global config file.
2023-05-24 11:57:37 -05:00
Hui Zhou ed0e2c3c8b config: configure for threadcomm feature
Threadcomm feature is enabled when compiler TLS is available and per-vci
critical section is enabled.
2023-05-05 09:26:07 -05:00
Ken Raffenetti aef80b96d0 misc: Use C99 snprintf 2023-05-02 10:49:43 -05:00
Hui Zhou e1b63e9c08 configure: remove export MPILIBNAME etc
No subconfigure needs access to $MPILIBNAME or $PMPILIBNAME.
2023-05-01 15:08:10 -05:00
Hui Zhou 098c34b14e config: default --enable-error-checking to runtime
There should be negligible difference between error-checking all and
runtime, but the latter allows users to use MPIR_CVAR_ERROR_CHECKING to
disable error checking at runtime.
2023-03-21 16:01:33 -05:00
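The difference between compile-time and runtime error checking in the commit above can be sketched as follows. Only the name MPIR_CVAR_ERROR_CHECKING comes from the commit message; the helper names, the accepted spellings of "off", and the sample argument check are assumptions made for this illustration and do not reflect MPICH's actual parsing code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Interpret a boolean-like setting. The accepted spellings are an
 * assumption for this sketch. */
int parse_bool_setting(const char *value, int dflt)
{
    assert(dflt == 0 || dflt == 1);
    if (value == NULL || value[0] == '\0')
        return dflt;
    if (strcmp(value, "0") == 0 || strcmp(value, "no") == 0 ||
        strcmp(value, "false") == 0)
        return 0;
    return 1;
}

/* Runtime switch: error checking stays on unless the environment
 * variable turns it off. */
int error_checking_enabled(void)
{
    return parse_bool_setting(getenv("MPIR_CVAR_ERROR_CHECKING"), 1);
}

/* Hypothetical argument check that honors the runtime switch.
 * Returns -1 for an invalid count when checking is enabled, 0 otherwise. */
int validate_count(int count, int check_enabled)
{
    if (check_enabled && count < 0)
        return -1;
    return 0;
}
```

The checks are still compiled in, so the cost with checking enabled is one extra branch per check, which matches the "negligible difference" claim in the commit message.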
Hui Zhou fb4cab797e config: format configure pmi configure help text
Make the help text on pmi option clearer.
2023-03-21 11:42:04 -05:00
Hui Zhou 12e2a2eb23 configure: deprecate the convenience options in --with-pmi
We allowed use of --with-pmi=[cray|slurm] as convenience options.
Deprecate these options to encourage downstream to use standardized PMI
interfaces.

Rename the cray option to oldcray since the newer Cray environment
should work with the standard `--with-pmi=path` or `--with-pmi2=path`
options directly.

The --with-pmilib option is used to configure special settings for slurm
and cray pmi. Deprecate this option as downstream, e.g. cray, is fixing
the non-conformant implementations.

Users can still manually set --with-pmi in order to work with old Slurm
and Cray libraries until the newer implementations from Slurm and Cray
become ubiquitous.
2023-03-19 20:59:29 -05:00
Hui Zhou 44ee0bc2f8 configure: remove --with-pmilib=pmix option
Users can configure pmix directly with the `--with-pmix` option.
2023-03-19 20:59:29 -05:00
Hui Zhou 87607ff358 configure: add --with-pmi2=[path]
Allow the user to set the pmi2 library/header path. This is similar to
the `--with-pmix=[path]` option.
2023-03-19 20:59:29 -05:00
Hui Zhou f20b7d1bea configure: add --with-{cuda,hip,ze} options to configure
These options are parsed by MPL. Prevent the mpich configure from warning
about "unrecognized options" by adding these options to the main configure.
2023-03-17 17:21:17 -05:00
Hui Zhou ab17b262bc ch3/sock: ignore SIGUSR1
When hydra runs with -disable-auto-cleanup, it sends SIGUSR1 to notify
processes of failures. ch3:sock does not support this feature, but it
should at least ignore the signal to avoid being killed by it.
2023-02-17 16:25:31 -06:00
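Ignoring the signal, as the commit above describes, needs only the standard `signal` call. The sketch below assumes a POSIX system that defines SIGUSR1; the function names are hypothetical and the code is not taken from ch3:sock.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical init hook: install SIG_IGN for SIGUSR1 so that a failure
 * notification from the process manager does not terminate the process.
 * Returns 0 on success, -1 on failure. */
int ignore_sigusr1(void)
{
    return signal(SIGUSR1, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Small self-check: deliver SIGUSR1 to the current process and report
 * whether execution continued past the raise. */
int survives_sigusr1(void)
{
    int rc = ignore_sigusr1();
    assert(rc == 0);
    (void) rc;
    return raise(SIGUSR1) == 0;
}
```

The default disposition of SIGUSR1 is to terminate the process, so without this hook an unhandled notification would kill the rank.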
Ken Raffenetti b78242cc38 examples: Add cudapi example code
Add a port of cpi.c to CUDA. Use a GPU kernel to compute the partial
areas at each process, then sum them with a final MPI_Reduce from device
memory into CPU memory. This is intended to be used as a smoke test for
functioning GPU support.
2023-01-24 16:14:43 -06:00
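The computation that the cudapi example distributes can be sketched on the CPU in plain C. This is not the committed example: it replaces the GPU kernel with an ordinary loop and simulates the final MPI_Reduce by summing the partial areas in a loop over ranks. The function names are hypothetical; the midpoint-rule integration of 4/(1+x^2) over [0,1] follows the classic cpi.c approach.

```c
#include <assert.h>

/* Partial area computed by one rank: midpoint rule over n intervals,
 * with intervals dealt out to ranks round-robin. */
double partial_area(int n, int rank, int size)
{
    assert(n > 0 && size > 0 && rank >= 0 && rank < size);
    double h = 1.0 / (double) n;
    double sum = 0.0;
    for (int i = rank + 1; i <= n; i += size) {
        double x = h * ((double) i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}

/* Stand-in for the sum reduction: add up the partial areas of all ranks. */
double estimate_pi(int n, int size)
{
    double pi = 0.0;
    for (int rank = 0; rank < size; rank++)
        pi += partial_area(n, rank, size);
    return pi;
}
```

In the GPU version each rank would compute its partial area in a kernel and leave the result in device memory, so a successful reduction also confirms that the MPI library can read device buffers.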