Author Topic: Convergence problems with v8.6 (very slow compared to v8.5)

arktik

  • Jr. Member
  • **
  • Posts: 46
Convergence problems with v8.6 (very slow compared to v8.5)
« on: April 04, 2021, 06:01:57 AM »
Dear FEAP Admin,

The convergence rate of parallel FEAP v8.6 (8.6.1i) appears to be worse than that of v8.5 (8.5.2i). I checked with different boundary value problems (purely mechanical). Here is what I found:
  • With PETSc OFF (without partitioning), both versions give identical convergence rates.
  • With PETSc ON (with partitioning), v8.6 gives a lower convergence rate, i.e. a much higher residual norm for the same number of iterations. v8.5 is not affected.
  • The solution accuracy is not affected in either case.
The check was performed with the original source code, without compiling in any user-defined modifications. For your reference, I am attaching the test examples used for the above conclusions. Please let us know what is happening and how it can be resolved.

For one of the more complex problems (~2000 material tags), which is not included in the attached zip archive, v8.6 diverged to NaN, whereas v8.5 gave the expected results.

Additional Info:

v8.5 is installed with the following major dependencies: a) GCC-7.3.1, b) OpenMPI-3.1.1, c) PETSc-3.11.1
v8.6 is installed with the following major dependencies: a) GCC-7.3.1, b) OpenMPI-4.0.4, c) PETSc-3.13.2


Sincerely

Prof. S. Govindjee

  • Administrator
  • FEAP Guru
  • *****
  • Posts: 1164
Re: Convergence problems with v8.6 (very slow compared to v8.5)
« Reply #1 on: April 05, 2021, 12:46:33 AM »
Thanks for the sample files.  We will have a look.

However, one quick question.  Do you know if the partitionings differ between 8.5 and 8.6?

arktik

« Reply #2 on: April 05, 2021, 01:28:38 AM »
Thank you, Prof. Govindjee, for the quick response. I am not sure I understand what you mean by "if partitioning is different".  Are you referring to this topic: http://feap.berkeley.edu/forum/index.php?topic=2436.0 ? That has already been taken care of in the tests above.

In the tested examples, the partitioning is done with
Code:
GRAPh NODE <nproc>
OUTDomains AIJ 1

as shown in each example input file (and explained in section 1.3.1 of parmanual_86.pdf). I think this should lead to identical partitions for both versions, but I am not certain.

Prof. S. Govindjee

« Reply #3 on: April 05, 2021, 01:54:09 AM »
Yes, that is precisely the question.  Do you know if the partitioned input files contain the same distribution of nodes?

arktik

« Reply #4 on: April 05, 2021, 06:58:03 AM »
I checked the output files. The partitioning is identical, i.e. the number of nodal points and the number of elements in each of the partitioned files are the same for both versions (for all examples).
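Equal node and element counts per partition do not by themselves show that the same nodes ended up in the same partitions, which is what Reply #3 asks about. A minimal sketch of a stricter check is below. It assumes the global node numbers of each partition have already been extracted from the partitioned input files into Python sets; the function name and the sample data are hypothetical and no FEAP file parsing is shown.

```python
from typing import Dict, Set


def compare_partitions(a: Dict[int, Set[int]],
                       b: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """Return, per partition id, the global node numbers found in only one run."""
    diffs: Dict[int, Set[int]] = {}
    for pid in sorted(set(a) | set(b)):
        # Symmetric difference: nodes assigned to this partition in one run only
        mismatch = a.get(pid, set()) ^ b.get(pid, set())
        if mismatch:
            diffs[pid] = mismatch
    return diffs


# Hypothetical data: each partition holds 3 nodes in both runs,
# but nodes 3 and 4 are swapped between the partitions.
v85 = {1: {1, 2, 3}, 2: {4, 5, 6}}
v86 = {1: {1, 2, 4}, 2: {3, 5, 6}}

same_counts = all(len(v85[p]) == len(v86[p]) for p in v85)
diffs = compare_partitions(v85, v86)
print("same counts:", same_counts)
print("differences:", diffs)
```

An empty result from `compare_partitions` would indicate that the node distribution, and not only the counts, matches between the two runs.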

Prof. S. Govindjee

« Reply #5 on: April 05, 2021, 11:48:23 AM »
Thank you for that diagnostic.  If the partitionings are the same then it is hard to imagine why the convergence behavior is so different.
Are you sure that both versions have been compiled with the same options (debugging or not; and same level of optimization)?

If those points are the same, then can you post the output of the petsc log and ksp monitor (for one of your examples)?

arktik

« Reply #6 on: April 06, 2021, 07:43:50 AM »
I have attached the KSP monitor output and the log view for both versions, for ex2 only. I am a bit baffled that nothing stands out in these PETSc-generated reports, as they are almost identical. Both versions were compiled with similar options/flags (as seen in the log views).

Prof. S. Govindjee

« Reply #7 on: April 06, 2021, 05:14:38 PM »
Your KSP monitor logs look to be about the same (as do your overall PETSc logs).  I was under the impression that you were seeing different iterative convergence with the KSP solver?

arktik

« Reply #8 on: April 06, 2021, 11:31:07 PM »
Sorry for the confusion. By slower rate of convergence, I meant the values printed by FEAP in its own log files, e.g. those starting with Lxxx_0001 and so on. The KSP monitor and PETSc log files apparently show similar behavior for both versions.

Prof. S. Govindjee

« Reply #9 on: April 07, 2021, 12:51:10 PM »
Ok.  Thanks for the clarification.  Can you post the L-files for your ver85 and ver86 runs?

arktik

« Reply #10 on: April 07, 2021, 11:52:23 PM »
For each of the example files above, the L-files for v85 and v86 are attached. In the meantime, I also tested whether the different OpenMPI and PETSc versions used in the testing above could be the source of the problem. That is not the case: with v85 and v86 compiled against identical PETSc and OpenMPI versions, the problem still exists.

Prof. S. Govindjee

« Reply #11 on: April 08, 2021, 12:38:18 AM »
Thanks.
I'll have a deeper look.

JStorm

  • Sr. Member
  • ****
  • Posts: 250
« Reply #12 on: April 08, 2021, 06:19:38 AM »
The number of Newton iterations is the same in your examples.
The only difference that I have noticed is that the final residuum norm with FEAP86 is always larger than with FEAP85.
But convergence was accepted by FEAP based on the energy norm, which was small enough with both versions.

It would be interesting to know why FEAP85 has achieved a better residuum norm in all three examples at all steps.
However, the convergence behaviour looks ok to me.
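To illustrate how an energy-based acceptance test can be satisfied while the residual norm is still comparatively large, here is a toy sketch. It assumes a test of the form E_i <= tol * E_0 with E = |du * R|; the scalar cubic spring model, the parameter values, and the function name are illustrative assumptions, not FEAP code.

```python
def newton_energy(f=10.0, k=1.0, c=1.0, tol=1e-16, max_iter=25):
    """Newton iteration on k*u + c*u**3 = f with an energy-based stopping test."""
    u = 0.0
    e0 = None
    history = []  # (iteration, |residual|, energy) per iteration
    for it in range(max_iter):
        r = f - (k * u + c * u**3)   # residual
        kt = k + 3.0 * c * u**2      # tangent stiffness
        du = r / kt                  # Newton increment
        energy = abs(du * r)         # energy of the increment
        if e0 is None:
            e0 = energy              # reference energy from the first iteration
        history.append((it, abs(r), energy))
        u += du
        if energy <= tol * e0:       # relative energy test
            return u, history, True
    return u, history, False


u, history, converged = newton_energy()
for it, rnorm, energy in history:
    print(f"iter {it:2d}  |R| = {rnorm:.3e}  E = {energy:.3e}")
print("converged:", converged, " u =", u)
```

Because the energy scales roughly with the square of the residual near the solution, a relative energy tolerance of 1e-16 in this sketch only requires the residual to drop by about eight orders of magnitude. Two runs can therefore both pass the energy test while ending with different residual norms.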

Prof. S. Govindjee

« Reply #13 on: April 08, 2021, 01:56:41 PM »
If one looks at the energy norms from the two codes, they are essentially identical, sometimes even to 15 digits.  The issue with the residual norm appears to be related to some changes that have been made to the serial code which are not quite correctly implemented in the parallel code.  A patch is being developed.

arktik

« Reply #14 on: April 08, 2021, 10:12:56 PM »
Thanks, Prof. Govindjee, for the assessment. Yes, I also noticed that the energy norms were almost identical. As I mentioned, for a more complex problem (the tested examples were very simplified cases), ver86 simply fails to converge. I therefore suspect that it is not just the calculation of the residuum norm that is incorrect, but that the tangent modulus itself is involved?! Please let me know if you also want the more complicated test case with ~2000 material tags (> 6x10^6 DOFs).