Author Topic: Problem with MPI

Sebastian Castro

Problem with MPI
« on: January 15, 2014, 08:43:43 AM »
Hi everyone,

I'm running a problem in parFEAP with a user element and material that I wrote. When I use 4 cores the problem is solved correctly, but when I increase the number of cores the program always stops at the same iteration and prints the following message:

Fatal error in MPI_Recv: Error message texts are not available

I'm using version 8.3, and the server where I run FEAP has 64 cores (I use at most 16 of them to execute the program). Do you know what the problem could be?

Thank you

FEAP_Admin

Re: Problem with MPI
« Reply #1 on: January 15, 2014, 03:26:31 PM »
Have you set the correct number of processes when you called mpiexec/mpirun?
And did you partition for the correct number of processes?
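As a purely illustrative sketch: if the mesh was partitioned into 8 domains, so that the partitioner wrote Ifile_0001 through Ifile_0008 for an input file Ifile (the name and the count here are just placeholders), then the parallel solve must be launched with a matching process count, e.g.

mpiexec -np 8 $FEAPHOME8_3/parfeap/feap

A mismatch between -np and the number of partition files can produce run-time MPI errors like the one you are seeing.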
« Last Edit: January 15, 2014, 03:35:18 PM by FEAP_Admin »

Sebastian Castro

Re: Problem with MPI
« Reply #2 on: January 15, 2014, 05:45:06 PM »
Yes, I verified that using htop. The partitioning seems fine: if my input file is called Iblock, I can see all the Iblock_XXXX files corresponding to the cores I'm using.

I noticed another thing: if I run my problem with 8 cores, the program stops at iteration 300 (I need to compute 450), but with 16 cores the problem appears at iteration 150.

The command lines I'm using are

$FEAPHOME8_3/parfeap/feap

for partitioning, and

$PETSC_DIR/externalpackages/mpich2-1.0.8/bin/mpiexec -np nproc $FEAPHOME8_3/parfeap/feap -ksp_type cg -pc_type jacobi

for solving the problem. Is that right?

Thanks

FEAP_Admin

Re: Problem with MPI
« Reply #3 on: January 16, 2014, 12:27:27 AM »
That seems correct. There could be a problem with your installation, or there could be a problem with the partitioning.
If you post the un-partitioned input file, I can try to run it (assuming it isn't too big a problem). If you are concerned about
posting the input file publicly, you can also email it.

Sebastian Castro

Re: Problem with MPI
« Reply #4 on: January 20, 2014, 05:24:29 AM »
Thanks for your reply. I decided to install the latest version of FEAP, and it works for the various examples I have in both the serial and parallel versions (I used PETSc 3.4.3 with the configuration suggested in the manual). I noticed that FEAP works well with up to 7 cores. However, when I use 8 or more, I receive the following message during partitioning:

Input Error: Incorrect numflag.
   *WARNING* Length allocation for:TEMP3 Length =           0

I attach the input file I'm using (it's a simple example) and its solve.XXXX file.

Prof. S. Govindjee

Re: Problem with MPI
« Reply #5 on: January 20, 2014, 10:10:34 PM »
The problem runs fine for me regardless of the number of processors I choose (I tried 7 and 10).

Are you sure your PETSc/MPI installation is working correctly?  Have you independently
tried to run plain MPI programs on more than 7 processors, and PETSc programs
on more than 7 processors?
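For example, something along these lines (a rough sketch: the mpiexec path is the one from your earlier post, ex2 is one of the standard KSP tutorial programs shipped with PETSc 3.4, and PETSC_DIR and PETSC_ARCH must be set for the make to work):

$PETSC_DIR/externalpackages/mpich2-1.0.8/bin/mpiexec -np 8 hostname

cd $PETSC_DIR/src/ksp/ksp/examples/tutorials
make ex2
$PETSC_DIR/externalpackages/mpich2-1.0.8/bin/mpiexec -np 8 ./ex2

The first command only checks that MPI will actually start 8 processes; the others check that a plain PETSc program runs correctly on 8 processes.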

Sebastian Castro

Re: Problem with MPI
« Reply #6 on: January 21, 2014, 11:19:17 AM »
First of all, thanks for answering all my posts. This time I installed what I believe really is the latest version of FEAP (ver 8.4.1d), and it's working well (I used 16 cores).

Now I'm trying to reduce the execution time, so I am going to read about the different PETSc solvers. By the way, do you recommend any particular solver and preconditioner?

Thanks again for everything  :)

FEAP_Admin

Re: Problem with MPI
« Reply #7 on: January 21, 2014, 12:46:02 PM »
For mechanics-oriented problems in 3D, GAMG seems to work well.  You can also try ML.

For direct solvers you can try MUMPS or SuperLU.  We have used both with some success, but we have also
occasionally had problems with them, either crashing or returning incorrect solutions.
Spooles was reliable, but it is not part of the latest PETSc release.

It is important to run OUTD with the correct options.  See the newest parallel manual.
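Purely as an illustration of the command-line form (option names as in PETSc 3.4, the process count is a placeholder, and MUMPS/SuperLU_DIST/ML are only available if they were built into your PETSc configuration):

mpiexec -np 8 $FEAPHOME8_3/parfeap/feap -ksp_type cg -pc_type gamg
mpiexec -np 8 $FEAPHOME8_3/parfeap/feap -ksp_type cg -pc_type ml
mpiexec -np 8 $FEAPHOME8_3/parfeap/feap -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps
mpiexec -np 8 $FEAPHOME8_3/parfeap/feap -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist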