Author Topic: Convergence problems with v8.6 (very slow compared to v8.5)  (Read 32067 times)

Prof. S. Govindjee

  • Administrator
  • FEAP Guru
  • *****
  • Posts: 1164
Re: Convergence problems with v8.6 (very slow compared to v8.5)
« Reply #30 on: June 22, 2021, 11:19:59 AM »
One thought is that there is a memory overwrite error somewhere that is being exposed by the large number of includes or the large number of materials.

It would be good to first isolate which one is the cause, materials or includes.

Then, as painful as it may be, I would run the code with valgrind to check for something getting clobbered.  The fact that this does not appear in 8.5 could be a fluke of how the memory blocks are being assigned.

JStorm

  • Sr. Member
  • ****
  • Posts: 250
Re: Convergence problems with v8.6 (very slow compared to v8.5)
« Reply #31 on: June 22, 2021, 12:59:25 PM »
If the issue is caused by a memory violation, then this could perhaps be tested with valgrind on a constrained problem.
All element and material routines would still be executed (including the memory bug), but the large system of equations can be avoided by fixing all DOFs via boundary conditions.

Prof. S. Govindjee

« Reply #32 on: June 22, 2021, 04:00:42 PM »
Hard to say if this suggestion will work, since the memory corruption could be to the tangent's memory...

arktik

  • Jr. Member
  • **
  • Posts: 46
Re: Convergence problems with v8.6 (very slow compared to v8.5)
« Reply #33 on: June 24, 2021, 08:46:42 AM »
My ongoing troubleshooting with parallel v8.6.1j has shed some new light on the possible origins of the bug. I haven't yet incorporated debugging with valgrind. So far it's more of a mechanistic debugging  :-\

1. Two identical BVPs (DOF>1E6) were tested: one where the mesh is generated with FEAP BLOCk and a second where the mesh is imported with INCLude from third-party programs. Both work correctly with the standard library as well as the user element. Therefore, a potential bug in INCLude can be ruled out.

2. Partitioning done with v8.5.2 is slightly different from v8.6.1. E.g. v8.5.2 prints EREGions in the partitioned files, which are missing in v8.6.1.

3. A series of identical BVPs with an increasing number of grains (=material tags) was run. When grains (material tags) > 999, the user element (solid3d with ndf=4) throws the following error:
Code: [Select]
     Material Number1000: Element Type: user           : ELMT =***
     Element Material Set =   1
  *ERROR* ELMLIB: Element:     0, type number*** input, isw =  1
 RANK =   0
The FEAP standard library element (3d elastic orthotropic) does not throw this error but simply stops converging.

4. For grains (material tags) <= 999, the standard library and user elements work correctly regardless of the number of DOFs.

Somehow the argument jel to program/elmlib.f seems to be corrupted when material tags > 999. One possible explanation as to how this corruption leads to divergence is that elements with material tags > 999 have garbage properties. This I haven't tested yet. 

Prof. S. Govindjee

« Reply #34 on: June 24, 2021, 12:25:46 PM »
If you look in the 8.6 partition files you should find EREGions defined (just not in all of them).  I'll have a look at the jel issue.

Prof. S. Govindjee

« Reply #35 on: June 24, 2021, 12:38:09 PM »
You should change format statement 4000 in elmlib.f,  i3 --> i5 so we can see what is actually in jel.
Note, jel should fit in an i3 format.  When using user elements there are only up to 50 user elements allowed, elmt01 through elmt50.  FEAP's internal elements have negative numbers for jel.

The other thing to note is that the problem in point 3 is occurring early on.  isw is equal to 1, so you are still in the input stage.
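
A side note on the *** in the error output: Fortran fills an integer field with asterisks when the value does not fit the width of its edit descriptor, which is why widening i3 to i5 makes the actual value of jel visible. The Python sketch below only mimics that behavior; fortran_int is a hypothetical helper written for illustration and is not part of FEAP.

```python
def fortran_int(value, width):
    """Mimic Fortran Iw output: right-justify, or fill with '*' on overflow."""
    text = str(value)
    if len(text) > width:
        # The value does not fit, so Fortran prints asterisks instead.
        return "*" * width
    return text.rjust(width)


print(repr(fortran_int(50, 3)))    # ' 50'
print(repr(fortran_int(1000, 3)))  # '***'
print(repr(fortran_int(1000, 5)))  # ' 1000'
```

Any jel that shows up as *** under i3 therefore has more than three characters, which is already outside the range of valid user element numbers.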

Prof. S. Govindjee

« Reply #36 on: June 24, 2021, 12:55:29 PM »
Can you post a sample of what the material cards look like in the partitioned files for material 1000 or higher?

Prof. S. Govindjee

« Reply #37 on: June 24, 2021, 01:11:00 PM »
Another question:  How many parameters are you saving into d( ) or ud( ) in your user element?

Prof. S. Govindjee

« Reply #38 on: June 24, 2021, 01:26:56 PM »
Partial good news.  I have created a small problem with a 32x32 mesh and 1024 material sets that fails in the way you are seeing using FEAP elements (i.e. no convergence).  This will help debugging from our side.

Prof. S. Govindjee

« Reply #39 on: June 24, 2021, 01:33:25 PM »
For the record, here are the files.  Serial files and a 4-partitioning of them.

Prof. S. Govindjee

« Reply #40 on: June 24, 2021, 01:40:53 PM »
I have also been able to show that this is not an issue of partitioning.  If I make a single partition (parallel) input file it also fails, and using a direct solver I get a zero pivot error.

Prof. S. Govindjee

« Reply #41 on: June 24, 2021, 01:47:51 PM »
I see the problem  :(

The format statement for writing out the material cards to the parallel files (also for creating flat files from serial FEAP) is incorrect.  When you have large numbers of materials there is not enough room and two fields get jammed together.

I will work up a patch later today.

Prof. S. Govindjee

« Reply #42 on: June 24, 2021, 02:08:33 PM »
Try this.

Edit program/pmatin.f and change format statement 2008 from
Code: [Select]
2008  format(2x,a15,1x,15i4:/(16i4))
to
Code: [Select]
2008  format(2x,a15,1x,15i5:/(16i4))
This is a quick fix (I hope).

Note you will need to rebuild the FEAP archive and then rebuild your serial and parallel feap executables.
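
To see why the fields collide, here is a minimal Python sketch that mimics the first record of format statement 2008 before and after the patch. The helpers fortran_int and material_card are hypothetical and written only for this illustration, the field values (0, 1000, 1, 2, 3) are assumed for the example, and the sketch assumes the card is later read back by splitting on blanks.

```python
def fortran_int(value, width):
    """Mimic Fortran Iw output: right-justify, or fill with '*' on overflow."""
    text = str(value)
    return text.rjust(width) if len(text) <= width else "*" * width


def material_card(name, values, width):
    """Mimic the first record of format(2x,a15,1x,15iW).

    Assumes name is held in a blank-padded 15-character variable and that
    at most 15 integers are written, so no continuation record is needed.
    """
    fields = "".join(fortran_int(v, width) for v in values)
    return "  " + name.ljust(15) + " " + fields


values = [0, 1000, 1, 2, 3]  # assumed example values

old = material_card("SOLId", values, 4)  # original i4 fields
new = material_card("SOLId", values, 5)  # patched i5 fields

print(old.split())  # ['SOLId', '01000', '1', '2', '3']
print(new.split())  # ['SOLId', '0', '1000', '1', '2', '3']
```

With i4, a four-digit value fills its field completely and runs into the preceding one, so a blank-delimited read sees one field fewer. With i5, every field keeps at least one leading blank as long as the values stay below 10000.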

arktik

« Reply #43 on: June 24, 2021, 02:13:49 PM »
Thank you Prof. Govindjee for the prompt confirmation and the solution :). I will test it tomorrow.

By the way, the input cards (printed in the partitioned input files) for the FEAP element (number 1000) are:
Code: [Select]
MATErial    1000
  SOLId              01000   1   2   3
    ELAStic MODUli 6
    1.76000e+05 9.10000e+04 6.80000e+04 0.00000e+00 0.00000e+00 0.00000e+00
    9.10000e+04 1.75000e+05 6.80000e+04 0.00000e+00 0.00000e+00 0.00000e+00
    6.80000e+04 6.80000e+04 2.20000e+05 0.00000e+00 0.00000e+00 0.00000e+00
    0.00000e+00 0.00000e+00 0.00000e+00 8.50000e+04 0.00000e+00 0.00000e+00
    0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 7.20000e+04 0.00000e+00
    0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 7.20000e+04
    VECTOr ORTHotropic 4.76997e-01 8.42490e-01 2.50369e-01 -3.85284e-01 -5.55979e-02 9.21122e-01
and similarly for the user element:
Code: [Select]
MATErial    1000
  user               11000   1   2   3
    ka,mu,sy,ac,ee,ld,et,hh,sn,om,cr

I noticed the jammed material tag number. 
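
For anyone who wants to check existing partitioned files for this problem, a rough Python sketch follows. It assumes, based only on the two cards posted above, that the material number on the MATErial line should reappear as its own blank-delimited field on the next line. The name find_jammed_cards is a hypothetical helper, not a FEAP utility, and the sample text is made up to match the posted layout.

```python
def find_jammed_cards(text):
    """Return material numbers whose element card lacks the number as a separate field."""
    lines = text.splitlines()
    jammed = []
    for i, line in enumerate(lines[:-1]):
        fields = line.replace(",", " ").split()
        is_mate = (
            len(fields) == 2
            and fields[0].upper().startswith("MATE")
            and fields[1].isdigit()
        )
        if not is_mate:
            continue
        # Skip the element type name, then look for the material number
        # as its own field (assumption drawn from the posted cards).
        card = lines[i + 1].replace(",", " ").split()
        if fields[1] not in card[1:]:
            jammed.append(int(fields[1]))
    return jammed


sample = """MATErial    1000
  SOLId              01000   1   2   3
MATErial     999
  SOLId              0 999   1   2   3
"""

print(find_jammed_cards(sample))  # [1000]
```

This is only a heuristic: it flags a card when the material number cannot be found as a standalone field, so it should be read as a hint to inspect the file rather than as proof of corruption.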

Prof. S. Govindjee

« Reply #44 on: June 24, 2021, 03:25:46 PM »
In version 8.5 we were not writing this information out in this way and that is probably why it was working in the older version.