Looking at the information you have provided, I see that the run is getting off to a bad start before it even attempts a solution.
You can see that the very first residual you compute already differs between the two programs, and this should not be the case, though I will point out that the technical details of how those residuals are computed differ between 8.5 and 8.6. Most likely the tangents are different too.
To help narrow this down, it would be useful to know:
(1) the exact lines you are using to partition your equations
(2) what happens if you use just one material instead of 2000
(3) whether you can start serial FEAP on your problem and run FORM to get the expected residual. Note that this is the residual one expects to see in parallel if you have output the parallel files using OUTD,AIJ and not OUTD,AIJ,1.