Merge pull request #5573 from RosettaCommons/vmullig/citation_info_in_docs
Ensure that the CitationManager exports citation information in the auto-generated documentation
@jadolfbr suggested this a while back, and I think it's a good idea.
UPDATE: The act of doing this revealed a number of bugs and issues with local movers setting global state. Many of these are corrected in this PR -- see notes at the bottom.
Tasks:
- [x] Ensure that this happens for movers.
- [x] Ensure that this happens for filters.
- [x] Ensure that this happens for residue selectors.
- [x] Ensure that this happens for task operations.
- [x] Ensure that this happens for residue level task operations.
- [x] Ensure that this happens for packer palettes.
- [x] Ensure that this happens for simple metrics.
- [x] Ensure that this happens for constraint generators.
- [x] Switch author info to put e-mail addresses in square brackets instead of pointy brackets, since the XSD doesn't like additional pointy brackets.
- [x] Update the auto-generated documentation.
- [x] Add an integration test for the `-output_schema` flag of the `rosetta_scripts` application.
- Add integration test to find movers, filters, etc. that (a) do not call `xsd_type_definition_w_attributes` or `xsd_type_definition_w_attributes_and_repeatable_subelements` and (b) do not manually include their own citation info. (Do this by checking the xsd of each mover/filter/taskop, etc. for the citation string.) --> At some point in the future.
TODO:
- [x] In `BackboneTorsionSampler` mover, move scorefunction initialization to parse_my_tag/apply time so that we don't get unnecessary scorefunction loading during instantiation.
- [x] Same with `BackboneTorsionPerturbation` mover.
- [x] Figure out why some header information in PDB files is no longer printing and restore it.
- [x] Fix cppcheck error while I'm at it.
- [x] Fix failing integration tests --> remaining failure seems to be in `HybridizeProtocol`.
- [x] Check that the failure-to-run cases were fixed.
- [x] Check changes.
- [x] Fix unit test failure.
- [x] Pull request #5581 should be merged before this one.
- [x] Fix scripts that are failing validation.
This addresses issue #5557.
**Update**:
So it turns out that a bunch of movers and filters cannot be instantiated with default options, which prevents their `CitationManager` information from being accessed. In hindsight, I'm kind of wishing I had made the `provide_citation_info()` function static, though that would have prevented configuration-specific citation information from being written out (e.g. if mover X only uses filter Y, and cites it, if option Z is set by the user in a RosettaScripts script). I've had to clean up the following movers and filters:
- `ErraserMinimizerMover` (Wanted to create a scorefunction that doesn't exist on initialization. I don't know whether there's some specialized protocol that provides this scorefunction, so I've put this in a try/catch block. The mover now loads the scorefunction with the nearest name on failure, and prints a big red warning.)
- `DumpSingleResidueRotamers` (Failed on initialization if the user didn't provide configuration with a custom flag. I'm moving the failure to apply-time. Yes, it would be better for this mover not to expect a commandline flag for configuration.)
- `BackboneTorsionSampler` and `BackboneTorsionPerturbation` (Both wanted to create talaris2013 as their scorefunction, and crashed when instantiated if the restore talaris behaviour flag wasn't passed. I've put this in a catch/throw block, and now I print a big red warning and then get the default scorefunction if talaris initialization fails.)
- `GALigandDock` (If the right scorefunction flags aren't passed, on initialization the GALigandDock mover tries to create an object that is unable to read from database on initialization. I'm moving this to an apply-time failure, and cleaning up some inefficient apply-time object copying in the process.)
- `GlycanDockProtocol` (On instantiation, this mover reads from the global options and fails if not set properly. Not only does this prevent setup from RosettaScripts, it also prevents -info from working. I'm moving the checks of correct configuration to apply time. Yes, it would be better for this mover not to expect a commandline flag for configuration.)
- `AsyncMPITemperingBase` and `ParallelTempering` (Throws on instantiation in non-MPI builds. I'm moving this to a throw on first use.)
- `LoopHashDiversifier` (Tried to load its loop hash library on initialization, based on paths set on the commandline. If no path is set, this would fail. I'm moving this to an apply-time loading.)
- `CaIrmsdFilter`, `IrmsdFilter`, `LrmsdFilter`, `FnonnatFilter`, and `FnatFilter` (Tried to set native pose on initialization, and would fail if no -in:file:native commandline flag was supplied. I've switched this to an apply-time check.)
- `ParallelBetaPairingPreferenceFilter` (Tries to initialize itself from a file that doesn't exist. I can't fix that, but I can at least make it happen at apply time instead of on instantiation.)
- `BCLFragmentBaseMover` and `BCLFragmentMutateMover` were checking for the `extras=bcl` build in their constructors. They now check in their `apply()` and `parse_my_tag()` functions.
- Certain movers, namely the `Tomponents` framework (or anything using the `StructureDataFactory`), `ligand_docking`, and the `Matcher`, required `-preserve_header` to be set to `true`. They were silently switching this internally, altering everything that happened downstream by changing the global state if they were ever instantiated. This is a big no-no. I've switched this to a parse- or apply-time failure with an informative error message saying that the flag is required in cases in which there wasn't an easy way to get around the requirement, and in one case, to a scheme that uses a _local_ options setting instead of changing the _global_ options setting.
Many integration tests will change, since initialization order is different. I've checked that there's no substantial change to output that isn't expected, though.