Merge pull request #84 from RosettaCommons/roccomoretti/rotamer_refactor
Rotamer Generation Refactor
This is a project I've been working on to clean up the way rotamer libraries get generated. Currently, there's an ad-hoc system based on various bits of data in the ResidueType, along with various if/else statements in RotamerLibrary as well as RotamerSet. This pull request implements a scheme to replace that with a more unified and extendable system.
* The various loose bits of rotamer specification data in the ResidueType are replaced with a single OP to a RotamerLibrarySpecification object (in src/core/chemical/rotamers/). Generally, each type of SingleResidueRotamerLibrary will have its own RotamerLibrarySpecification subtype, each with its own set of data.
* RotamerLibrarySpecification objects are created either from the current rotamer lines in .params files, or from a new ROTAMERS line, which has the format ROTAMERS <tag> <data>, where <tag> is a string specifying which RotamerLibrarySpecification class to instantiate and <data> is in a class-specific, newline-terminated format.
* To facilitate adding new rotamer library types, RotamerLibrarySpecifications have a (RosettaScripts-like) factory system. This not only allows "easy" addition of new rotamer types, but also allows you to place the implementation anywhere in the library hierarchy.
* SingleResidueRotamerLibraries are now instantiated based on the type of RotamerLibrarySpecification that's in the ResidueType.
* This is also a factory-based system, keyed on the same tag from the RotamerLibrarySpecification class: each RotamerLibrarySpecification class has a corresponding SingleResidueRotamerLibrary, more or less. It's flexible enough that you can do one-to-many or many-to-one if you want to.
* The SingleResidueRotamerLibrary factory functionality has been excised from the RotamerLibrary class into a new SingleResidueRotamerLibraryFactory class. This makes core::pack::dunbrack::RotamerLibrary strictly a class for handling the Dunbrack rotamer libraries.
* This also entailed splitting off the CenRot library functionality. The upshot is that a separate flag is no longer needed to enable CenRot rotamers: the library is lazily loaded automatically.
* Functionality in core/pack/dunbrack/ was also split between core/pack/dunbrack/ and core/pack/rotamers/. core/pack/dunbrack/ should contain the functionality which is specific to handling Dunbrack-based rotamer libraries, and core/pack/rotamers/ holds code which is more generic and not necessarily related to the Dunbrack rotamer libraries.
* The previous SingleResidueRotamerLibrary caching scheme has been replaced by a new write-once, software-transactional-memory-inspired one, which should be more threadsafe.
* The way SingleLigandRotamerLibrary's data are stored has changed. Previously the library was read and stored as Residue objects. That introduced an implicit cross dependency between the RotamerLibraries and ResidueTypes/ResidueTypeSets, which breaks down if the underlying ResidueType is deleted, or if there are multiple ResidueTypeSets with different ResidueTypes of the same name. Instead of storing Residues, the SingleLigandRotamerLibrary now represents the input pdb as an atom name/coordinate map, and the vector of Rotamer objects is recreated each time it's needed, just as it is for the Dunbrack libraries. As a benefit, this simplifies the enzdes/metal ion covalent residue patching code a bit.
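To illustrate the tag-based factory idea from the bullets above, here is a minimal standalone sketch. It is not the code in this pull request: SpecFactory, PDBRotamerSpec, register_creator, from_params_line, the "PDB" tag, and the file name are all hypothetical names chosen for illustration, and std::shared_ptr stands in for Rosetta's OP type. It only shows how a ROTAMERS <tag> <data> line could be dispatched to a creator registered under <tag>.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the specification base class.
class RotamerLibrarySpecification {
public:
    virtual ~RotamerLibrarySpecification() = default;
    virtual std::string tag() const = 0;
};

// Hypothetical subtype: treats the class-specific <data> as a file name.
class PDBRotamerSpec : public RotamerLibrarySpecification {
public:
    explicit PDBRotamerSpec(std::string const & data) : filename_(data) {}
    std::string tag() const override { return "PDB"; }
    std::string const & filename() const { return filename_; }
private:
    std::string filename_;
};

using SpecOP = std::shared_ptr<RotamerLibrarySpecification>;
using SpecCreator = std::function<SpecOP(std::string const &)>;

// Hypothetical factory: maps a tag string to a creator function.
class SpecFactory {
public:
    // New specification types register themselves here, so the factory
    // itself never needs an if/else chain over known types.
    void register_creator(std::string const & tag, SpecCreator creator) {
        assert(creator);  // an empty std::function cannot create anything
        creators_[tag] = std::move(creator);
    }

    // Look up the creator for `tag` and hand it the class-specific data.
    SpecOP create(std::string const & tag, std::string const & data) const {
        auto it = creators_.find(tag);
        if (it == creators_.end()) {
            throw std::runtime_error("Unknown rotamer tag: " + tag);
        }
        return it->second(data);
    }

    // Parse a line of the form "ROTAMERS <tag> <data>"; everything after
    // the tag (minus leading whitespace) is passed through as <data>.
    SpecOP from_params_line(std::string const & line) const {
        std::istringstream in(line);
        std::string keyword, tag;
        if (!(in >> keyword >> tag) || keyword != "ROTAMERS") {
            throw std::runtime_error("Not a ROTAMERS line: " + line);
        }
        std::string data;
        std::getline(in >> std::ws, data);
        return create(tag, data);
    }

private:
    std::map<std::string, SpecCreator> creators_;
};
```

With a creator registered under "PDB", from_params_line("ROTAMERS PDB lig_conformers.pdb") would return a PDBRotamerSpec holding that file name, and an unregistered tag throws. The same lookup-by-tag pattern could sit on the SingleResidueRotamerLibrary side, keyed on the tag the specification reports.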
I have also started to remove the implicit dependence of the rotamer library on the AA enum. Instead, canonical amino acids now have an explicit ROTAMER_AA line which specifies the Dunbrack rotamer library to use. This removes the "magic" that previously happened based on the ResidueTypeSet identity, whereby AA LYS in fa_standard magically got Dunbrack rotamers but AA LYS in centroid didn't. Now AA LYS in fa_standard gets Dunbrack rotamers because the params file says that it should, and AA LYS in centroid doesn't because the params file says that it doesn't.
Certain types (RNA, DNA, etc.) still base rotamers off of AA type in an ugly if/else clause in RotamerSet. Changing this is a future TODO. (Preferably one which lands on someone else's plate.)
Cosmetic tracer changes expected on the following 25 integration tests:
Enzrevert_xml coupled_moves cstfile_to_theozyme_pdb enzdes extract_atomtree_diffs hybridization inverse_rotamer_remodel kinemage_grid_output ligand_database_io ligand_dock_7cpa ligand_dock_grid ligand_dock_script ligand_motif_design ligand_water_docking match_1n9l mp_relax_w_ligand orbitals relax_w_allatom_cst residue_data_resource sdf_reader startfrom_file validate_database write_mol_file -- fiber_diffraction_fad fold_and_dock
The last two look worse than they are due to extra debugging output.
I'm also seeing a ~10% slowdown on the fa_dun* scoreterm performance tests, though not really on the full-scale scoring/minimization tests. It's slightly concerning, but I don't think it's worth putting off the merge for.