Merge pull request #3792 from RosettaCommons/aleaverfay/jd3_fix_archive_crosstalk_deadlock
Fix deadlock bug in archive-to-archive communication in JD3
Previously, if archive 1 finished outputting all of its results,
the master node could assign it to retrieve a result from
archive 2, which was not done outputting all of its results.
Archive 1 would then send a message to archive 2 with MPI_Send
and block until archive 2 responded. While it was waiting, new
output work could get assigned to archive 1. When archive 2 got
back to the master node, the master node would assign it to
retrieve a result from archive 1. At that point archives 1 and 2
are each blocked in an MPI_Send to the other, and neither can
post the matching MPI_Recv.
This is deadlock.
An MPI_Send is not guaranteed to return until the matching
MPI_Recv has been posted on the remote rank.
Oops.
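
The failure mode described above can be reproduced in miniature
without MPI. The sketch below is an illustration only, not JD3
code: `Rank`, `send`, `recv`, and both driver functions are
hypothetical names, and Python threads stand in for the two archive
processes. It assumes the send behaves like a rendezvous (it
returns only once the peer has received), and it uses a timeout so
the demo terminates where a real MPI_Send would hang.

```python
import queue
import threading


class Rank:
    """Hypothetical stand-in for an MPI rank with rendezvous-style sends."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()

    def send(self, dest, payload, timeout):
        """Block until dest receives payload; return False on timeout."""
        received = threading.Event()
        dest.inbox.put((payload, received))
        return received.wait(timeout)

    def recv(self, timeout):
        """Take one message from the inbox and release its sender."""
        payload, received = self.inbox.get(timeout=timeout)
        received.set()
        return payload


def both_send_first(timeout=0.2):
    """Both archives send to each other and nobody receives: deadlock."""
    a1, a2 = Rank("archive1"), Rank("archive2")
    completed = {}

    def run(me, other):
        completed[me.name] = me.send(other, "retrieve result", timeout)

    threads = [
        threading.Thread(target=run, args=(a1, a2)),
        threading.Thread(target=run, args=(a2, a1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed


def one_sends_one_receives(timeout=1.0):
    """Only archive 1 sends while archive 2 is free to receive."""
    a1, a2 = Rank("archive1"), Rank("archive2")
    outcome = {}

    def sender():
        outcome["sent"] = a1.send(a2, "retrieve result", timeout)

    def receiver():
        outcome["received"] = a2.recv(timeout)

    threads = [threading.Thread(target=sender),
               threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome
```

In `both_send_first` neither send can complete, because each
thread is stuck in `send` and never reaches a `recv`. In
`one_sends_one_receives` the send completes as soon as the peer
receives, which is the situation the fix is meant to guarantee.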
The solution is simple: do not allow archives to talk to each
other until the very end of the simulation, when it is possible
to guarantee that no work will arrive for archive 1 while it
waits to hear from archive 2.