Author Topic: Metis/Parmetis partitioning limits  (Read 14119 times)

tgross

  • Jr. Member
  • **
  • Posts: 18
Metis/Parmetis partitioning limits
« on: September 03, 2013, 02:52:46 AM »
Hello!

I am trying to run large FE models using parFEAP. I am using an isotropic 8-node hexahedral mesh, a small-displacement formulation, and a simple elastic material (to begin with).
When I increase the number of model DOFs, I run into partitioning problems with both metis and parmetis:

Parmetis:
For problems with around 20 million DOF, parmetis does not finish writing the graph file (around 1.2 GB) and aborts with the following error message:
At line 2593 of file pmacr7.F (unit = 12, file = 'graph.file')
Fortran runtime error: End of file

Metis:
When using metis for partitioning, problems with 20 million DOF could be solved easily. At around 60 million DOF, however, metis also runs into problems. It crashes with the following error message:
[METIS Fatal Error] ***Memory allocation failed for AllocateWorkSpace: edegrees. Requested size: -2020186992 bytes
PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range

Only a few percent of the machine memory was used when metis crashed.

Are there any known limits in model size for metis/parmetis?
Did anyone encounter similar problems?
Are there settings which would allow me to process larger FE models?

Thank you for your help!
Best regards,
Thomas

Prof. S. Govindjee

  • Administrator
  • FEAP Guru
  • *****
  • Posts: 1164
Re: Metis/Parmetis partitioning limits
« Reply #1 on: September 03, 2013, 08:54:40 PM »
The "Requested size: -2020186992 bytes" statement indicates that the program tried to allocate an array whose size is larger than can be stored in a signed 32-bit integer, so the requested byte count wrapped around to a negative value.

The basic limitation is that in the flat file (problem defined on a single processor) no required array may exceed ~2x10^9 in length.
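
To make the wrap-around concrete, here is a minimal C sketch of how a byte count above 2^31 - 1 turns negative when it is held in a signed 32-bit integer. The helper names `wrap_to_int32` and `fits_in_int32` are hypothetical and are not part of metis or FEAP; the sketch only illustrates the arithmetic and does not identify which allocation actually failed.

```c
#include <stdint.h>

/* Reduce a 64-bit byte count to the value a signed 32-bit variable would
   hold on a two's-complement machine. Unsigned arithmetic is used so the
   result is well defined in standard C. (Hypothetical helper.) */
static int32_t wrap_to_int32(int64_t value)
{
    uint32_t low = (uint32_t)value;              /* value modulo 2^32 */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;                     /* still non-negative */
    return (int32_t)((int64_t)low - 4294967296LL); /* lands in the negative range */
}

/* Returns 1 if a byte count can be stored in a signed 32-bit integer,
   0 otherwise. (Hypothetical helper.) */
static int fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

For example, `wrap_to_int32(2274780304LL)` gives -2020186992, so one request consistent with the error message is about 2.27x10^9 bytes (or that value plus a multiple of 2^32), which is just over the signed 32-bit limit.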

To track down the problem and see whether there is a workaround or fix, you should run in the debugger to see precisely which allocation request is causing the program to crash. I would concentrate on the metis partitioning, since the parmetis partitioning gets very little use.

If you can provide a bit more debugging information, we may be able to help.

Prof. S. Govindjee

  • Administrator
  • FEAP Guru
  • *****
  • Posts: 1164
Re: Metis/Parmetis partitioning limits
« Reply #2 on: September 03, 2013, 08:57:20 PM »
One other comment. If you compile the program with 64-bit default integers (this is a compile option; make sure to set ipr = 1 in main/feap.f) and also build petsc with 64-bit integers, then the problem may go away. I have never tried this combination of options, though, so I cannot say whether it will work.

tgross

  • Jr. Member
  • **
  • Posts: 18
Re: Metis/Parmetis partitioning limits
« Reply #3 on: September 10, 2013, 01:31:02 AM »
Thank you very much for your answer. We are currently testing your suggestions and I will post an update soon.

Best regards
Thomas

MarkusB

  • New Member
  • *
  • Posts: 9
Re: Metis/Parmetis partitioning limits
« Reply #4 on: December 12, 2013, 08:16:27 AM »
Dear Prof. Govindjee,

I tried compiling fresh versions of feap and petsc with 64-bit integers, as suggested.
However, this does not solve the problem.
I think I know one reason why:
In my humble opinion, the problem with 64-bit integers is located in the partition sub-program of parfeap.
It is written using exclusively 32-bit integers (C "int"). For 64-bit integers to work with metis, the pre-defined type "idx_t" should be used instead.
Metis and Parmetis provide special types and macros for the integer type and the MPI integer flag in their header files.
If metis has been compiled using 32-bit integers, idx_t will be a 32-bit int. If metis has been compiled using its 64-bit integer option, idx_t will be a 64-bit integer (int64_t).
Cf. also: https://svn.alcf.anl.gov/repos/libs/METIS/include/metis.h
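
As a rough illustration of the idx_t point, the following C sketch mimics the width-dependent typedef pattern with stand-in names (`SKETCH_IDXTYPEWIDTH`, `sketch_idx_t`, `SKETCH_IDX_MAX`, `adjacency_entries`), all of which are hypothetical and chosen so that the snippet compiles without metis.h. Real code would include metis.h and use idx_t directly. The vertex count and degree in the usage note below are made-up numbers, not taken from the models in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for the index-width switch in metis.h, so this
   sketch builds without the METIS headers. */
#ifndef SKETCH_IDXTYPEWIDTH
#define SKETCH_IDXTYPEWIDTH 64
#endif

#if SKETCH_IDXTYPEWIDTH == 32
typedef int32_t sketch_idx_t;
#define SKETCH_IDX_MAX INT32_MAX
#else
typedef int64_t sketch_idx_t;
#define SKETCH_IDX_MAX INT64_MAX
#endif

/* Total number of adjacency entries for a graph in which every vertex has
   the same degree. Returns -1 if an argument is negative or if the product
   would not fit in sketch_idx_t. (Hypothetical helper.) */
static sketch_idx_t adjacency_entries(sketch_idx_t nvtxs, sketch_idx_t degree)
{
    if (nvtxs < 0 || degree < 0)
        return -1;
    if (degree != 0 && nvtxs > SKETCH_IDX_MAX / degree)
        return -1;                    /* product would overflow the index type */
    return nvtxs * degree;
}
```

With the 64-bit setting, `adjacency_entries(100000000, 26)` returns 2600000000, which is larger than INT32_MAX. A count like this held in a plain C "int" would overflow, which is why any code that builds the graph arrays needs to use the same index type as the library.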

For now, we are trying to avoid files that are too large for a 32-bit metis.

Cheers, Markus