Author Topic: parfeap not running on remote machine

sg

  • New Member
  • *
  • Posts: 7
parfeap not running on remote machine
« on: February 12, 2024, 08:12:23 PM »
Hello,

I recently upgraded from FEAP 8.4 to 8.6. Both the serial and parallel versions work fine on my laptop (Mac arm64). However, parallel FEAP fails to run on a remote cluster: PETSc invokes MPI_Abort() during initialization and the job crashes without running. The serial version of FEAP runs fine on the remote machine. I tested PETSc v3.13.2 after installation, and it runs all the examples successfully. There are no errors during FEAP compilation either.

Can anyone point me to where things could be going wrong? I have attached a folder containing the input files and the error log files.

Thanks

JStorm

  • Sr. Member
  • ****
  • Posts: 250
Re: parfeap not running on remote machine
« Reply #1 on: February 13, 2024, 02:32:40 AM »
Are you sure that PETSc and all libraries (such as MPI) are compiled with the same compiler as parFEAP?
FEAP 8.6 with PETSc 3.13.2 is working for me on a desktop PC and on clusters.
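
One way to check which Fortran compiler sits behind a given command is a tiny program that prints the compiler identification. The sketch below is only an illustration, not part of FEAP or PETSc; it assumes a compiler that supports the Fortran 2008 intrinsics compiler_version() and compiler_options(). Build it once with the MPI wrapper used for PETSc and once with the compiler used for parFEAP (wrapper names such as mpif90 vary by cluster and are an assumption here), then compare the output.

```fortran
! Illustrative sketch only; not part of the FEAP or PETSc distributions.
! Prints the identification of the compiler that built this program.
program which_compiler
  use, intrinsic :: iso_fortran_env, only: compiler_version, compiler_options
  implicit none

  ! Vendor and version string of the compiler behind the build command
  write(*,'(a)') 'Compiler: ' // compiler_version()

  ! Options the compiler was invoked with (wrappers often add their own)
  write(*,'(a)') 'Options:  ' // compiler_options()
end program which_compiler
```

If the two builds report different vendors or major versions, the libraries were probably not built with a consistent toolchain.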

sg

Re: parfeap not running on remote machine
« Reply #2 on: February 13, 2024, 10:35:51 AM »
I am using the preinstalled (default) openmpi/impi module on the cluster, and I compile PETSc and parFEAP with the same compiler (GNU/Intel). Is there anything I need to do differently? Could you share your makefile for parallel FEAP on a cluster?

Attaching a debugger gives the following error:

(gdb) c
Continuing.
 
Thread 1 "feap" received signal SIGSEGV, Segmentation fault.
p_res_norm (grnorm=19.05255888325765, reln=<error reading variable: Cannot access memory at address 0x0>)
    at p_res_norm.f:68
68      p_res_norm.f: No such file or directory.
(gdb) bt
#0  p_res_norm (grnorm=<error reading variable: Cannot access memory at address 0x7fffdecc17f0>,
    reln=<error reading variable: Cannot access memory at address 0x0>) at p_res_norm.f:68
Backtrace stopped: Cannot access memory at address 0x7fffdecc1488
(gdb)
« Last Edit: February 13, 2024, 11:52:34 AM by sg »

Prof. S. Govindjee

  • Administrator
  • FEAP Guru
  • *****
  • Posts: 1164
Re: parfeap not running on remote machine
« Reply #3 on: February 14, 2024, 02:08:56 AM »
I think the issue is that the call to p_res_norm( ) in parfeap/usolve.F is not correct in version 8.6.

Try the following; in parfeap/usolve.F change
Code:
      real (kind=8) :: grnorm
to
Code:
      real (kind=8) :: grnorm, reln

Also in parfeap/usolve.F change
Code:
          call p_res_norm( grnorm )
to
Code:
          call p_res_norm( grnorm, reln)

The error has been corrected in the upcoming version 8.7, but the fix does not seem to have been backported to version 8.6.  I believe this will fix your problem.  Sorry about the hassle.
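
For readers wondering why a missing argument compiles cleanly and only fails at run time: when a subroutine is called without an explicit interface, the compiler generally cannot compare the call against the definition, so the callee ends up using an address that was never passed. That is consistent with the gdb output above, where reln sits at address 0x0. The sketch below is a minimal illustration, not FEAP source. The names res_norm_demo and demo_res_norm are hypothetical, the body is a placeholder, and the argument intents are assumptions. It places the routine in a module so that the compiler sees the interface and rejects the one-argument call at compile time.

```fortran
! Illustrative sketch only; demo_res_norm is a hypothetical stand-in
! for p_res_norm and does not reproduce the FEAP routine.
module res_norm_demo
  implicit none
contains
  subroutine demo_res_norm( grnorm, reln )
    real (kind=8), intent(in)  :: grnorm   ! first argument, read by the callee
    real (kind=8), intent(out) :: reln     ! second argument, written by the callee
    reln = grnorm                          ! placeholder computation (assumption)
  end subroutine demo_res_norm
end module res_norm_demo

program call_demo
  use res_norm_demo                        ! makes the interface explicit
  implicit none
  real (kind=8) :: grnorm, reln

  grnorm = 19.05d0

  ! With the explicit interface, the next line would be a compile-time error
  ! (missing actual argument) instead of a run-time segmentation fault:
  ! call demo_res_norm( grnorm )

  call demo_res_norm( grnorm, reln )       ! matches the two-argument definition
  write(*,*) 'reln = ', reln
end program call_demo
```

Uncommenting the one-argument call should make the build fail, which is the behavior one would want for the mismatch in usolve.F.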

sg

Re: parfeap not running on remote machine
« Reply #4 on: February 14, 2024, 01:19:46 PM »
That worked like a charm! Thank you, professor!

ParFEAP 8.6 also works well with the latest version of PETSc (v3.20.4).
« Last Edit: February 14, 2024, 01:21:39 PM by sg »

Prof. S. Govindjee

Re: parfeap not running on remote machine
« Reply #5 on: February 16, 2024, 04:27:25 PM »
Good to know.  And good to know that parFEAP is playing nice with the newest PETSc.