What fault tolerance techniques does Open MPI plan on supporting?

Open MPI plans on supporting the following fault tolerance techniques:

• Coordinated and uncoordinated process checkpoint and restart, similar to those implemented in LAM/MPI and MPICH-V, respectively.
• Message logging techniques, similar to those implemented in MPICH-V.
• Data reliability and network fault tolerance, similar to those implemented in LA-MPI.
• User-directed and communicator-driven fault tolerance, similar to those implemented in FT-MPI.

The Open MPI team will not limit its fault tolerance techniques to those mentioned above, but intends to extend beyond them in the future.

3. Does Open MPI support checkpoint and restart of parallel jobs (similar to LAM/MPI)?

The current stable release of Open MPI does not support the checkpointing and restarting of processes. However, the Open MPI development trunk does contain such support. The Open MPI team is actively working on integrating a variety of checkpoint and restart techniques into Open MPI, including similar functional
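
For readers unfamiliar with the term, the idea behind coordinated checkpoint and restart can be sketched with a toy simulation. This is only a conceptual illustration in plain Python and does not use Open MPI or any of its interfaces: the `Process` class, the `coordinated_checkpoint`, `restart`, and `run` functions, and all parameters are hypothetical names made up for this sketch. It assumes every process takes its snapshot at the same step, so the saved snapshots together form a consistent global state that all processes can roll back to after a failure.

```python
import copy


class Process:
    """Hypothetical stand-in for one parallel process (not an Open MPI type)."""

    def __init__(self, rank):
        self.rank = rank
        self.state = {"step": 0, "total": 0}

    def compute(self):
        # One unit of deterministic work per step.
        self.state["step"] += 1
        self.state["total"] += self.rank + self.state["step"]


def coordinated_checkpoint(procs):
    # All processes are snapshotted at the same step, so the list of
    # snapshots is a consistent global state.
    return [copy.deepcopy(p.state) for p in procs]


def restart(procs, snapshot):
    # Roll every process back to the last global checkpoint.
    for p, saved in zip(procs, snapshot):
        p.state = copy.deepcopy(saved)


def run(num_procs=3, steps=6, checkpoint_every=2, fail_at=5):
    """Run the toy job; simulate one failure when the step equals fail_at."""
    procs = [Process(rank) for rank in range(num_procs)]
    snapshot = coordinated_checkpoint(procs)  # initial checkpoint
    failed_once = False
    while procs[0].state["step"] < steps:
        for p in procs:
            p.compute()
        step = procs[0].state["step"]
        if step == fail_at and not failed_once:
            # Simulated failure: work since the last checkpoint is lost.
            failed_once = True
            restart(procs, snapshot)
            continue
        if step % checkpoint_every == 0:
            snapshot = coordinated_checkpoint(procs)
    return [p.state for p in procs]


if __name__ == "__main__":
    # The run with a simulated failure should match a failure-free run.
    print(run() == run(fail_at=None))
```

In this sketch the work done between the last checkpoint and the failure is simply recomputed after the rollback, which is the basic trade-off of the approach: more frequent checkpoints cost more during normal operation but lose less work on failure.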
