Merge pull request #4238 from RosettaCommons/aleaverfay/sjq_ilj_construction_bug
Fixing bug w/ MSRS following JD3 refactor (Issue #4234).
The MRSJobQueen had operated with the knowledge/assumption that the
job_node index stored in the StandardInnerLarvalJob referred to the
preliminary job node from which the later job node descended. The
When the JD3 refactor went through back in January, the JobExtractor
began using the job-node index stored in the InnerLarvalJob to record
the node of origin for a job. The logic for when you declare one job node
complete and move on to the next job node depended on the proper labeling
of InnerLarvalJobs with their node of origin. However, multistage rosetta
scripts was not labeling ILJ's with the job node but instead with the
preliminary job node. This meant that multistage rosetta script jobs
stopped working; it seems likely that everyone who has tried to use MSRS
since January would have been unable to get it to run properly.
(Thanks @odaunc for bringing this bug to Jack's and my attention!)
It turns out that the integration tests for multistage_rosetta_scripts
were not failing before because they were run using the
VanillaJobDistributor, and the VanillaJobDistributor is so simple
that it does not use the JobExtractor at all.
In order to test JD3 applications in single threaded code,
I'm making the VJD as similar as possible to the other JDs
(Thanks for this suggestion, @JackMaguire ).
It was a good thing I did this! It turns out that MRSJQ has/had
a different idea of what the JobDAG would look like from the SJQ.
The MRSJQ thinks that an n-stage protocol should have n nodes in
the JobDAG. The SJQ thinks that if there are m input Poses, that
the first m nodes in the JobDAG will represent one input Pose
apiece. The VJD-version of the MRS integration test was exiting
with an assertion failure.
There's actually a one-line solution to this inconsistency/discrepancy,
in job-dag philosophy, which is that when the MRSJQ calls the
SJQ::create_and_init_inner_larval_job_from_prelim
for the very first stage, the MRSJQ should pass "1" for the
job-node index. (This "fix" is relative to earlier in this PR,
because I am the one to have put the wrong value for this parameter.)
This means that the InnerLarvalJobs that the SJQ constructs for the
preliminary-job-node will have a different job-node index than the
InnerLavalJobs that the MRSJQ constructs and that will actually be used
to construct LarvalJobs. This discrepancy likely has no consequences,
but should be noted.
This PR causes several cosmetic integration test changes as all JD3
applications now will use the JobExtractor and the JobExtractor is
more verbose in its interactions with the JobQueen.