FEAP User Forum
FEAP => Parallel FEAP => Topic started by: abbhuiya on July 22, 2014, 06:06:31 PM
-
Dear Professor Taylor,
I solved a problem with 1 million nodes using 2, 4, 12, and 24 processors (FGM-Average Implicit Method). The timings are reasonable for up to 12 processors, but whenever I use more than 12 processors the problem takes a very long time to solve. For example:
No. of processors    Time
2                    40m35.5325s
4                    20m23.5985s
12                   8m59.705s
24                   239m38.704s
Each node of our high-performance computing system has 12 cores (the system uses Sun Grid Engine).
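To put the timings above in perspective, here is a minimal Python sketch that converts them to seconds and computes speedup and parallel efficiency relative to the 2-processor run. The function names are made up for illustration, and the node count simply assumes that processes fill one 12-core node before spilling onto the next, which may not match how the scheduler actually places them.

```python
import math
import re

# Wall-clock times exactly as reported in the post.
timings = {2: "40m35.5325s", 4: "20m23.5985s", 12: "8m59.705s", 24: "239m38.704s"}
CORES_PER_NODE = 12  # from the post: 12 cores per node


def to_seconds(stamp):
    """Convert a string such as '40m35.5325s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def scaling_table(timings, cores_per_node=CORES_PER_NODE):
    """Return (procs, nodes, seconds, speedup, efficiency) rows.

    Speedup is measured against the smallest processor count, and
    efficiency is speedup divided by the increase in processor count.
    """
    base = min(timings)
    t_base = to_seconds(timings[base])
    rows = []
    for procs in sorted(timings):
        seconds = to_seconds(timings[procs])
        speedup = t_base / seconds
        efficiency = speedup * base / procs
        nodes = math.ceil(procs / cores_per_node)  # assumed packing
        rows.append((procs, nodes, seconds, speedup, efficiency))
    return rows


rows = scaling_table(timings)
print(f"{'procs':>5} {'nodes':>5} {'seconds':>10} {'speedup':>8} {'effic.':>8}")
for procs, nodes, seconds, speedup, efficiency in rows:
    print(f"{procs:>5} {nodes:>5} {seconds:>10.1f} {speedup:>8.2f} {efficiency:>8.1%}")
```

With these numbers the 24-processor run comes out slower than the 2-processor run (speedup below 1), and it is the only run that needs more than one node under the packing assumption.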
I contacted the system administrators about the issue. They said that when I use 12 or fewer cores, the processes communicate through memory; otherwise, the communication goes over the network between nodes. They suggested checking the part of the code that controls how the processes communicate with each other. I am not sure where to look for this (that is, in which section of the code the communication among the cores is defined for parallel computing).
I would be very grateful for any help or hints on this.
My best Regards to you
abbhuiya
-
Search for MPI in the parfeap directory.
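For anyone who wants to script that search, a rough Python sketch follows. It walks a source tree and lists every line that mentions an `MPI_` routine. The directory name passed in and the list of file extensions are assumptions for illustration; adjust them to the actual layout of your parfeap directory.

```python
import os
import re

# Matches names such as MPI_Send or mpi_allreduce (Fortran is case-insensitive).
MPI_CALL = re.compile(r"\bMPI_\w+", re.IGNORECASE)


def find_mpi_calls(root, extensions=(".f", ".F", ".f90", ".c", ".h")):
    """Return (path, line number, text) for each source line mentioning MPI_*."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue  # skip files that are not source code
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if MPI_CALL.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # "parfeap" is a placeholder path; point it at your own source directory.
    for path, lineno, text in find_mpi_calls("parfeap"):
        print(f"{path}:{lineno}: {text}")
```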
But note that most of the parallel data transfers are done during the solve, which is handled by PETSc, so this may not be very helpful.
It would probably be better to look at the PETSc log summaries for clues about what is eating up all the time.
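As a hedged pointer on where those summaries come from: PETSc prints its performance summary at the end of a run when the program is started with the `-log_summary` option (renamed `-log_view` in later PETSc releases). The Python sketch below only assembles such a command line. The `mpirun` launcher, the `./feap` executable path, and the assumption that options on the command line reach PETSc are illustrative; under Grid Engine the actual launch line lives in your job script and may look different.

```python
import shlex


def build_run_command(nprocs, feap_exe, log_option="-log_summary", extra_options=()):
    """Assemble an MPI launch command that asks PETSc for its log summary.

    feap_exe is a placeholder for the path to your parallel FEAP executable.
    Use log_option="-log_view" with newer PETSc releases.
    """
    return ["mpirun", "-np", str(nprocs), feap_exe, *extra_options, log_option]


cmd = build_run_command(24, "./feap")
print(shlex.join(cmd))
```

Comparing the summary from a 12-processor run with the one from a 24-processor run should show which phases grow once the job spans two nodes.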
-
Dear Professor Taylor,
I am not sure where I can get the PETSc log summaries. Are they the same thing as the FEAP log summary? Which part should I look at? I can attach the file if you would like to take a look and suggest something.
Best Regards
abbhuiya