Does BLCR support checkpointing parallel/distributed applications?
Not by itself. But by using checkpoint callbacks (see previous FAQ), some MPI implementations have made themselves checkpointable by BLCR. You can checkpoint/restart an MPI application running across an entire cluster of machines with BLCR, without any application code modifications, if you use one of these MPI implementations (listed alphabetically):

• LAM/MPI 7.x or later
• MPICH-V 1.0.x
• MVAPICH2 0.9.8 or later
• Open MPI 1.3 or later

See the documentation of your specific MPI for usage instructions. In almost all cases you will need to use a tool provided by the MPI implementation to request a checkpoint or restart, rather than using BLCR's cr_checkpoint and cr_restart utilities.

At this time we are aware of at least three other MPI implementations that are working on BLCR support, but our information is not always the latest. If in doubt, check the support channels of your favorite MPI implementation.

Note that any questions about using these MPI implementations with