Topic: large mesh partitioning with parfeap 8.4

blackbird

  • Full Member
  • ***
  • Posts: 100
large mesh partitioning with parfeap 8.4
« on: May 17, 2018, 01:57:46 AM »
Dear all,

I obtained a problematic partitioning for a large mesh (800k elements) with parfeap 8.4. I would like to share the input file (Ishpb), the mesh information (netz), and the output file (Oshpb) here, but the files are larger than 1 MB, so I have made them available in the cloud at the following link:

https://cloudstore.zih.tu-dresden.de/index.php/s/yMws9gQIuwqKS2n

The setup works fine with 24 partitions. With 240 partitions, most of the parts contain ~2k elements, but part 39 has 246 elements and part 46 has 0 elements (see Oshpb). The part with 0 elements in particular makes it impossible to run the simulation.

Do you have any idea why the partitioning fails in this way? Do you know of a workaround?

Best
Christian

Prof. S. Govindjee

  • Administrator
  • FEAP Guru
  • *****
  • Posts: 1160
Re: large mesh partitioning with parfeap 8.4
« Reply #1 on: May 17, 2018, 06:51:58 AM »
That should not happen but it will be hard to debug on something this large.  The first thing to figure out is if the graph being sent to METIS is correct.  If that is the case, is the result coming back from METIS correct or does it have these zero partitions?  If it is correct, then one will need to see why FEAP is making an error.

I don't know very much about the METIS algorithm but you should try reading about its limitations.  Also you should perform a series of tests to see at which number of processors it breaks down.  That may also give a clue as to what the problem is.

blackbird

Re: large mesh partitioning with parfeap 8.4
« Reply #2 on: May 22, 2018, 01:12:45 AM »
It seems I found a very challenging mesh. As you suggested, I tried several different numbers of partitions (each a multiple of 24, as this is the number of cores per node at our HPC facility), and the problem already occurs for 48 partitions.

Now I checked the METIS website and they claim to handle " ... Graphs with several millions of vertices can be partitioned in 256 parts ... ". So the size should not be the problem.

Is there a specific part of the code you would recommend to check whether this is a problem of parFEAP or METIS?

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #3 on: May 22, 2018, 03:31:30 AM »
The first thing you need to do is verify that the node graph that FEAP computes and passes to Metis is correct; see parfeap/unix/smetis.c .
If that is correct then the problem is likely with Metis.  If it is incorrect, then the problem is with the construction of the node graph.

How to test the node graph is the hard part: since it is rather large, it will be hard to just look at it.  You will need to research some basic algorithms for computing properties of graphs and then try to verify that the properties are correct for your mesh.
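
As one possible starting point for such a test, here is a minimal Python sketch that checks a few basic properties of an adjacency structure in the compressed (CSR) form that METIS takes: array bounds, isolated vertices, self-loops, and missing reverse edges. The array names xadj and adjncy follow the METIS manual; 0-based numbering and the function name check_csr_graph are assumptions made for illustration. How smetis.c actually fills these arrays is not shown in this thread, so the graph would first have to be dumped from FEAP in some way.

```python
def check_csr_graph(xadj, adjncy):
    """Return a list of problems found in a 0-based CSR adjacency structure."""
    n = len(xadj) - 1
    problems = []
    if xadj[0] != 0 or xadj[-1] != len(adjncy):
        problems.append("xadj does not span adjncy")

    # Collect the neighbor set of every vertex first.
    neighbors = []
    for v in range(n):
        if xadj[v + 1] < xadj[v]:
            problems.append(f"xadj decreases at vertex {v}")
        neighbors.append(set(adjncy[xadj[v]:xadj[v + 1]]))

    # Then check each vertex against the sets of its neighbors.
    for v, nbrs in enumerate(neighbors):
        if not nbrs:
            problems.append(f"vertex {v} is isolated")
        if v in nbrs:
            problems.append(f"vertex {v} has a self-loop")
        for u in sorted(nbrs):
            if not 0 <= u < n:
                problems.append(f"vertex {v} lists out-of-range neighbor {u}")
            elif v not in neighbors[u]:
                problems.append(f"edge {v}->{u} has no reverse edge")
    return problems


# Triangle 0-1-2 plus vertex 3, which has no edges.
xadj = [0, 2, 4, 6, 6]
adjncy = [1, 2, 0, 2, 0, 1]
print(check_csr_graph(xadj, adjncy))  # ['vertex 3 is isolated']
```

An empty list means none of these particular checks failed; it does not prove that the graph matches the mesh.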

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #4 on: May 22, 2018, 04:46:59 AM »
Looking more closely at the output, it seems that you are not getting empty partitions as far as the node graph is concerned.    You have a positive number of nodes in each partition.  But somehow FEAP is determining that you have no elements in the partition.  I will have to think about why this could happen.   If you can debug further yourself that will also be helpful.

FYI, all the relevant code is contained in parfeap/pmacr7.F
« Last Edit: May 22, 2018, 04:49:22 AM by Prof. S. Govindjee »
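
To illustrate how a partition can own nodes and still end up without elements, the following Python sketch tallies nodes and elements per partition for a toy mesh. It is not taken from pmacr7.F. The rule that an element counts toward every partition owning at least one of its nodes is an assumption, and the names tally_partitions and part_of_node are hypothetical.

```python
from collections import Counter


def tally_partitions(part_of_node, elements, nparts):
    """Count nodes and elements per partition.

    part_of_node: dict mapping node number -> partition index
    elements:     list of tuples of node numbers (connectivity)
    Assumed rule: an element counts toward every partition that
    owns at least one of its nodes.
    """
    nodes_per_part = Counter(part_of_node.values())
    elems_per_part = Counter()
    for conn in elements:
        for p in {part_of_node[n] for n in conn}:
            elems_per_part[p] += 1
    return ([nodes_per_part[p] for p in range(nparts)],
            [elems_per_part[p] for p in range(nparts)])


# Two triangles on nodes 1..4; node numbers 5 and 6 exist in the
# partition map but are not referenced by any element.
elements = [(1, 2, 3), (2, 3, 4)]
part_of_node = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1}
print(tally_partitions(part_of_node, elements, 2))  # ([4, 2], [2, 0])
```

In this toy case partition 1 holds two nodes but no elements, because the nodes assigned to it are not referenced by any element.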

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #5 on: May 22, 2018, 04:54:35 AM »
Do you have a visualization of the mesh?  I could not get one on my computer (not enough memory).

blackbird

Re: large mesh partitioning with parfeap 8.4
« Reply #6 on: May 22, 2018, 04:57:49 AM »
I made the same observation (the nodes are OK, but there are no elements), so I thought about the bandwidth of the mesh I am providing. Here is the solution to obtain proper partitions:

1. Write an optimized input file with the commands
OPTI
OUTM

2. Partition this flat input file

However, I wonder whether providing the mesh as a separate file may be the problem here. I did not consider this, as the mesh (file "netz") has the same structure as FEAP's flat input, but maybe the include command in the input file is the problem?

blackbird

Re: large mesh partitioning with parfeap 8.4
« Reply #7 on: May 22, 2018, 05:00:51 AM »
Here is the picture of the mesh: a cylindrical bar created with ANSYS.

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #8 on: May 22, 2018, 05:05:33 AM »
Thanks.  I also managed to get an image by dumping a paraview input file (PVIEw).

When you say you were able to get a correct partitioning, is it the OPTI or the OUTM that is needed?
I have experienced the need to use OUTM before but never the need for OPTI.

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #9 on: May 22, 2018, 05:11:08 AM »
I see the problem!

Your ansys mesh has duplicate nodes.  The file contains 873744 nodes, but not all of them are used.  The mesh itself only uses 822141 unique nodes.   OUTM takes care of this for you, and thus the partitioning with Ishpb.rev then produces a valid result.  OPTI is not needed (and I think should be avoided).

blackbird

Re: large mesh partitioning with parfeap 8.4
« Reply #10 on: May 22, 2018, 05:19:20 AM »
I agree that OPTI is not necessary.

However, it is still strange, as I had already cleared the duplicate nodes in ANSYS (NUMMRG, NODE), and the last node in the input mesh (number 873744) is used in several elements (e.g. 799889, 799890, ...). Is there any information on how OUTM identifies duplicate nodes?

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #11 on: May 22, 2018, 05:21:31 AM »
More precisely, your file skips some of the node numbers.  You have 822141 nodes in your file.  But the max node number used is 873744.  So some of the node numbers are skipped and that is causing the problem.  The partitioner requires all the nodes be used.

Prof. S. Govindjee

Re: large mesh partitioning with parfeap 8.4
« Reply #12 on: May 22, 2018, 05:27:07 AM »
Look for example at node 306612.  The next node is 306629.  So 16 node numbers (306613 to 306628) have been skipped right there.
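
A quick way to find such gaps before partitioning is to scan the list of node numbers. The Python sketch below is only an illustration: it assumes the node numbers have already been read from the mesh file into a list and that the numbering is meant to start at 1, and the function names are hypothetical. For the mesh discussed here, with 822141 nodes defined and a maximum node number of 873744, the gaps should add up to 873744 - 822141 = 51603 missing numbers, provided no node number appears twice.

```python
def find_numbering_gaps(node_ids, first=1):
    """Return (start, end) ranges of node numbers missing from node_ids.

    Assumes every id is >= first; duplicates are ignored.
    """
    gaps = []
    expected = first
    for nid in sorted(set(node_ids)):
        if nid > expected:
            gaps.append((expected, nid - 1))
        expected = nid + 1
    return gaps


def count_skipped(gaps):
    """Total number of node numbers covered by the given gap ranges."""
    return sum(end - start + 1 for start, end in gaps)


# The jump from node 306612 to node 306629 mentioned above.
ids = [306611, 306612, 306629, 306630]
gaps = find_numbering_gaps(ids, first=306611)
print(gaps, count_skipped(gaps))  # [(306613, 306628)] 16
```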

blackbird

Re: large mesh partitioning with parfeap 8.4
« Reply #13 on: May 22, 2018, 07:35:42 AM »
Thank you very much!

FYI, ANSYS can also do this with NUMCMP,NODE. Now that I know the reason, I will issue this command for my FEAP meshes.

Thanks!
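
For meshes that do not come from ANSYS, the same kind of compression can be done in a small preprocessing script. The Python sketch below renumbers the defined nodes consecutively from 1 and updates the element connectivity to match. It is not how NUMCMP or OUTM are implemented; the data layout (a dict of coordinates and a list of connectivity tuples) and the name compress_node_numbers are assumptions for illustration.

```python
def compress_node_numbers(coords, elements):
    """Renumber nodes consecutively from 1, keeping their original order.

    coords:   dict mapping old node number -> coordinate tuple
    elements: list of tuples of old node numbers
    Returns (new_coords, new_elements, new_id), where new_id maps
    old node numbers to new ones.
    """
    new_id = {old: new for new, old in enumerate(sorted(coords), start=1)}
    new_coords = {new_id[old]: xyz for old, xyz in coords.items()}
    # Raises KeyError if an element references an undefined node.
    new_elements = [tuple(new_id[n] for n in conn) for conn in elements]
    return new_coords, new_elements, new_id


# Four nodes with gaps in the numbering, two triangles.
coords = {1: (0.0, 0.0), 2: (1.0, 0.0), 17: (0.0, 1.0), 40: (1.0, 1.0)}
elements = [(1, 2, 17), (2, 40, 17)]
new_coords, new_elements, new_id = compress_node_numbers(coords, elements)
print(new_elements)  # [(1, 2, 3), (2, 4, 3)]
```

Keeping the returned new_id mapping makes it possible to translate results back to the original node numbers afterwards.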