FEAP User Forum
FEAP => Parallel FEAP => Topic started by: JStorm on July 29, 2020, 02:09:40 AM
-
Is it possible to determine how long a process spends waiting for the others?
I would like to use this to identify slow nodes on an HPC cluster that hold up the whole solution.
-
This is tricky, I think. It will depend a lot on the particular allocation of nodes that the job scheduler is giving you.
PETSc does give a number of options for timing and they do help identify code that is unbalanced; see https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#chapter.13 .
With regard to particular nodes, you could try placing calls to PetscTime in your code (this returns the current time in seconds from some reference, typically the epoch). If you print this together with the value of rank, you will see at what time each process arrived at the print statement, which will tell you which nodes are slower than the others.
I don't know if there is a Fortran wrapper for PetscTime, so you will just have to try. I also do not know whether it is synchronized across all processes. If not, you can directly use the MPI_Wtime() function, which returns a real*8 time in seconds since a fixed reference (such as the epoch). The function MPI_Wtick() gives you the clock resolution. I believe the MPI clocks are not guaranteed to be synchronized across processes; see the value of the attribute MPI_WTIME_IS_GLOBAL.
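If you want to check this directly, here is a minimal standalone sketch (not FEAP code; the program and variable names are just illustrative, and it assumes the Fortran mpi module):

program clock_check
  use mpi
  implicit none
  integer :: ierr, rank
  integer (kind=MPI_ADDRESS_KIND) :: wtime_is_global
  logical :: flag
  real (kind=8) :: t

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! Wall-clock time and clock resolution on this rank
  t = MPI_Wtime()
  write(*,*) 'rank', rank, 'time', t, 'tick', MPI_Wtick()

  ! Query whether MPI_Wtime is synchronized across all ranks
  call MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_WTIME_IS_GLOBAL, &
                         wtime_is_global, flag, ierr)
  if (flag) write(*,*) 'rank', rank, 'MPI_WTIME_IS_GLOBAL =', wtime_is_global

  call MPI_Finalize(ierr)
end program clock_check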
-
One option for dealing with the sync problem is to do something like:
use mpi
implicit none
real (kind=8) :: myt
! setups.h provides FEAP's parallel setup data, including 'rank' used below
#include "setups.h"
! Only the elapsed time on the local clock is used, so the processes'
! clocks do not need to be synchronized.
myt = MPI_Wtime()
! a bunch of code
myt = MPI_Wtime() - myt
write(*,*) rank,myt
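If you run on many processes and the per-rank output becomes hard to scan, one possible way to summarize it is to reduce the elapsed times. This is only a sketch, not FEAP code: tmin, tmax, and ierr are illustrative variables, and it assumes myt and rank as above with MPI already initialized.

real (kind=8) :: tmin, tmax
integer :: ierr
! Smallest and largest elapsed times over all processes; a large
! spread points to load imbalance or a slow node.
call MPI_Reduce(myt, tmin, 1, MPI_DOUBLE_PRECISION, MPI_MIN, 0, MPI_COMM_WORLD, ierr)
call MPI_Reduce(myt, tmax, 1, MPI_DOUBLE_PRECISION, MPI_MAX, 0, MPI_COMM_WORLD, ierr)
if (rank .eq. 0) write(*,*) 'min/max elapsed time:', tmin, tmax

Because only elapsed times are compared, this still does not require the ranks' clocks to be synchronized.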
-
Thank you Prof. Govindjee, I will give it a try.