Merge pull request #1047 from RosettaCommons/xrw_packer_palette
XRW -- PackerPalettes: Let noncanonicals share the TaskOperation conventions of canonicals
After three years, this PR is finally in master!
This pull request was originally part of the 2016 Chemical XRW, which aimed to expand Rosetta's chemical functionality greatly, with the goal of letting Rosetta read in and operate on the entire PDB (including all noncanonical entries).
~~This pull request uses stuff developed for the MirrorSymmetry branch, and must be merged after pull request #1104.~~ (Merged a long time ago.)
- [x] <b>THIS PULL REQUEST NEEDS TO HAVE `vmullig/xrw_packer_palette` MERGED INTO IT AFTER COMPILATION ERRORS ARE FIXED IN THAT BRANCH.</b>
<b>Background and Rationale:</b>
TaskOperations modify the default packer behaviour, and are commutative (i.e. the order of application doesn't matter). To keep the default behaviour of designing with all twenty canonical amino acids (and nothing else), TaskOperations can therefore only turn canonical amino acids OFF and noncanonical amino acids ON. This makes any mixed canonical/noncanonical design a real pain in the neck -- indeed, it has required the addition of some commutativity-breaking resfile commands. We'd like to remedy that.
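To illustrate why operations that act in only one direction stay order-independent, here is a toy Python model. It is not Rosetta code: `restrict_to`, `turn_on`, and `apply_all` are hypothetical names, and the allowed types at one position are simply modelled as a set. Operations that can only remove types behave like set intersection, so any order gives the same result; once operations on the same set may both add and remove types, the order starts to matter.

```python
from functools import reduce
from itertools import permutations

# Toy stand-in for the twenty canonical amino acids (one-letter codes).
CANONICALS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def restrict_to(allowed_subset):
    """Hypothetical OFF-only operation: it can remove types, never add them."""
    allowed_subset = frozenset(allowed_subset)
    return lambda allowed: allowed & allowed_subset


def turn_on(extra_types):
    """Hypothetical ON operation acting on the same set of allowed types."""
    extra_types = frozenset(extra_types)
    return lambda allowed: allowed | extra_types


def apply_all(start, operations):
    """Apply the operations to the starting set, in the order given."""
    return reduce(lambda allowed, op: op(allowed), operations, start)


# Three OFF-only operations: every ordering yields the same final set.
off_only = [restrict_to("AVILMFWY"), restrict_to("FWYHKR"), restrict_to("ACDEFW")]
results_off_only = {apply_all(CANONICALS, order) for order in permutations(off_only)}

# Mixing an OFF and an ON operation on one set: the two orderings disagree.
mixed = [restrict_to("AV"), turn_on(["ORN"])]
results_mixed = {apply_all(CANONICALS, order) for order in permutations(mixed)}

print(len(results_off_only), len(results_mixed))
```

In this sketch the OFF-only orderings collapse to a single result, while the mixed pair produces two different results depending on which operation runs first.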
<b>Description of Pull Request:</b>
This pull request changes the convention for noncanonicals, so that they share the convention for canonicals (i.e. TaskOperations can only turn ResidueTypes OFF, be they canonical or noncanonical). However, we want to preserve (a) commutativity, and (b) the default behaviour of Rosetta for users who are not using noncanonicals.
We are therefore introducing the concept of a <b>PackerPalette</b>. The PackerPalette tells the packer, "These are all of the possible ResidueTypes and VariantTypes that you're allowed to use for design. All of these are ON by default, unless you're passed TaskOperations that turn some of these OFF at some positions." If a user does not provide a PackerPalette, the default palette is the twenty canonical amino acids, preserving the classic behaviour of Rosetta (so the only people who have to learn new rules are noncanonical designers, and the new rules are nicer for them).
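The palette idea can be sketched with a small Python model. This is an illustration only, assuming a palette is just a set of base type names; `make_palette` and `allowed_types` are hypothetical helpers and do not correspond to functions in the Rosetta codebase.

```python
# Toy model: the default palette is the twenty canonical amino acids.
CANONICAL_BASE_TYPES = frozenset([
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
])


def make_palette(additional_base_types=()):
    """Hypothetical helper: canonicals plus any extra base types the user names."""
    return CANONICAL_BASE_TYPES | frozenset(additional_base_types)


def allowed_types(palette, restrictions):
    """Everything in the palette starts ON; each restriction can only turn types OFF.

    Each restriction is the set of types that one operation leaves ON.
    """
    allowed = set(palette)
    for keep in restrictions:
        allowed &= set(keep)
    return allowed


default_palette = make_palette()          # classic behaviour: canonicals only
custom_palette = make_palette(["ORN"])    # a noncanonical designer adds ornithine

restrictions = [{"ALA", "ORN", "LYS"}, {"ORN", "LYS", "TRP"}]
print(sorted(allowed_types(custom_palette, restrictions)))
print(sorted(allowed_types(default_palette, restrictions)))
```

With the custom palette, `ORN` survives the restrictions alongside `LYS`; with the default palette, the same restrictions can never introduce `ORN`, so users who ignore palettes see only canonical types.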
<b>[Packer?]Tasks:</b>
- [x] Create the PackerPalette class (in same core level as TaskOperation class?).
- [x] Write the default apply() function.
- [x] Let actual PackerPalettes derive from the base class -- so create a ~~BasicPackerPalette~~ CustomBaseTypePackerPalette class.
- [x] Implement the ~~BasicPackerPalette~~ CustomBaseTypePackerPalette parse_my_tag() function.
- [x] Integration test for the ~~BasicPackerPalette~~ CustomBaseTypePackerPalette class.
- [x] Documentation for the ~~BasicPackerPalette~~ CustomBaseTypePackerPalette class.
- [x] Create DefaultPackerPalette class, too (no options for that).
- ~~Integration test for the DefaultPackerPalette class.~~ --> Not necessary. Every integration test that calls the packer now invokes the DefaultPackerPalette.
- [x] Documentation for the DefaultPackerPalette class.
- [x] Implement factory architecture for PackerPalettes.
- [x] Finish PackerPalette::decide_what_to_do_with_base_type().
- [x] Check how I'm using ResidueTypeFinder::variants() and ResidueTypeFinder::disallow_variants(). I fear that the two calls to it (one for terminal variant types, one for non-terminal) step on one another, with the second one overwriting the first.
- Think about the AND/OR logic here a bit. If I'm designing with N-methyl-phosphotyrosine (i.e. with two variants), do I want to allow all four possible combinations of those variants (tyrosine, N-methyl-tyrosine, phosphotyrosine, and N-methyl-phosphotyrosine)? --> Put off for the next pull request, which will add a `CustomVariantTypePackerPalette`.
- [x] Create a RosettaScripts-accessible user interface for setting up PackerPalettes.
  - [x] A PACKER_PALETTE section in a RosettaScript.
  - [x] Utility functions for parsing defined PackerPalettes in mover or filter parse_my_tag functions.
  - [x] Document this!
  - [x] Add this to the template-printing function in the RosettaScripts app.
- [x] Create a commandline-accessible user interface for setting up a default CustomBaseTypePackerPalette.
  - [x] Document this.
- [x] Modify the appropriate packer initialization functions to accept a PackerPalette.
  - [x] Ensure that they default to a PackerPalette that preserves the old default Rosetta behaviour.
- [x] Continue at the TODO FIRST lines in ResidueLevelTask_.cc. Move all of the commented-out logic to the initialization function in the PackerPalette base class.
- [x] Modify movers and filters that invoke the packer so that they accept (and parse from XML) a PackerPalette.
- [x] Deprecate the RESET resfile command.
- [x] Refine the EMPTY resfile command to be a non-commutativity-breaking command.
- [x] Deprecate the NC resfile command.
- [x] Deprecate the PIKRNA resfile command.
- [x] Ensure that deprecated commands (RESET, EMPTY, NC, PIKRNA) produce an appropriate, <i>informative</i> error message explaining what the user needs to do to get his/her protocol working again.
- [x] Remove NCAA support from LayerDesign (too hard to maintain).
  - [x] Remove this in the LayerDesign documentation, too.
- [x] Update other protocols that need NCAA packing to use the new convention.
- [x] Ensure that all canonical design continues to work as before, with no modification.
- [x] Unit tests for `PackerPalette` functions.
- [x] Integration tests for new noncanonical design workflow.
- ~~Add base ResidueType lookup to ResidueType class. (Make it fast at the expense of memory by storing the string for the base ResidueType).~~ --> Moved to pull request #1050.
- [x] Tweak the PIKAA syntax to allow restricting to canonicals plus noncanonicals. (Syntax is, for example, "PIKAA FAMILYVWX[601]X[ORN]", where the brackets contain full base names).
  - [x] Integration test.
  - [x] Documentation.
- [x] Beauty.
- [x] Documentation.
- [x] Delete temporary text file with pull request description.
- [x] Switch temporary output to debug output, or delete it outright.
- [x] XSD functions for PackerPalettes.
- [x] Add PackerPalette behaviors for carbohydrates (provided by @JWLabonte).
- [x] ~~Add general way to check that backbones are compatible. (@JWLabonte).~~ We will do this later, but see #3744.
- [x] Read default base `ResidueType`s from the database and remove hardcoding of names. (@JWLabonte).
- Write a `CustomVariantTypePackerPalette`. --> Put off to a future pull request.
- Add support for selecting residues with certain properties to the `CustomBaseTypePackerPalette`. --> Put off to a future pull request.
- Re-enable unit test for peptoid design from sarcosine (in `core/pack/PeptoidDesignTests.cxxtest.hh`) and ensure that this is possible. --> Put off to a future pull request.
- [x] Debug the small subset of performance tests that get slower. --> Done. They no longer get slower.
- [x] Debug the subset of integration tests that get slower.
- [x] Serialize.
The above task list will be expanded and fleshed out as we figure out exactly how to do what we want to do here. Documentation is in pull request RosettaCommons/documentation#2.
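As an illustration of the extended PIKAA syntax mentioned in the task list (for example, "PIKAA FAMILYVWX[601]X[ORN]"), here is a minimal Python sketch of how such a residue string could be split into one-letter canonical codes and bracketed full base names. It is not the Rosetta resfile parser: `parse_pikaa_residues` is a hypothetical function, and rejecting a bare `X` without brackets is an assumption made for this sketch.

```python
ONE_LETTER_CODES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def parse_pikaa_residues(spec):
    """Split a PIKAA residue string into (canonical codes, noncanonical base names).

    Canonicals are single letters; noncanonicals are written X[FULL_BASE_NAME].
    """
    canonicals, noncanonicals = [], []
    pos = 0
    while pos < len(spec):
        ch = spec[pos]
        if ch == "X" and pos + 1 < len(spec) and spec[pos + 1] == "[":
            close = spec.find("]", pos + 2)
            if close == -1:
                raise ValueError(f"unclosed bracket starting at position {pos}")
            name = spec[pos + 2:close]
            if not name:
                raise ValueError(f"empty base name at position {pos}")
            noncanonicals.append(name)
            pos = close + 1
        elif ch in ONE_LETTER_CODES:
            canonicals.append(ch)
            pos += 1
        else:
            # Assumption for this sketch: anything else, including a bare X, is an error.
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    return canonicals, noncanonicals


canonicals, noncanonicals = parse_pikaa_residues("FAMILYVWX[601]X[ORN]")
print(canonicals, noncanonicals)
```

For the example string, the sketch separates the eight canonical letters from the two bracketed base names `601` and `ORN`.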