Merge pull request #1508 from RosettaCommons/vmullig/support_nmethyl
Adding support for N-methyl amino acids
# **Background and Rationale:**
N-methyl amino acids can be used during peptide synthesis to confer various desirable properties on a synthetic peptide. N-methylation removes a hydrogen bond donor, in many cases promoting solubility in hydrophobic environments and helping with membrane permeability. It also greatly alters the conformational preferences of an amino acid residue.
![nmethylpeptide](https://cloud.githubusercontent.com/assets/4205776/17455648/a2aa4124-5b73-11e6-8f32-9e7c63ae7ef2.png)
We want to be able to design with N-methyl amino acids, and this pull request is intended to add support for:
- Modelling N-methyl amino acid geometry (adding a patch for N-methylation).
- Properly scoring N-methyl amino acid-containing peptides (permitting the loading of custom Ramachandran and p_aa_pp tables for N-methyl amino acids).
Following the implementation of PackerPalettes (pull request #1047), a future pull request will add:
- Support for designing with N-methyl amino acids (_i.e._ getting the packer to decide whether to N-methylate a position or not).
**Tasks (note that many of these changes have already been merged into master in incremental merges):**
- [x] Add an N-methylation VariantType.
- [x] Add a patch for this.
- [x] Figure out a way to allow custom rama_prepro maps to be specified for the patched amino acids.
- [x] Figure out a way to allow custom rotamer libraries to be specified for the patched amino acids.
- [x] Debug the rotamer libraries -- there's something wrong there.
- [x] Add support for lazily loading custom scoring tables.
- [x] For _rama_prepro_.
- ~~Think about thread-safety.~~ <i>Put off to a future pull request. The lazily loaded objects are in ScoringManager, which contains a bunch of stuff that needs to be made thread-safe. It's all in one place, in any case.</i>
- [x] Expand mainchain scoring tables to permit N-dimensional data (for future support for beta-amino acids).
- [x] Add support for specifying the dimension, resolution, and grid offset in the database file (so that we're no longer hardcoding 5-degree bins and whether we have grid points in bin centres or on edges).
- [x] Add support for comments in the database file.
- [x] Comment the database files explaining the columns.<br/>
~~\- Possibly add support for variable-resolution mainchain torsion tables (_e.g._ with finer sampling in alpha-helix and beta-strand wells). This has been talked about in the past, and now might be a time to do it.~~ _Beyond the scope of this pull request, though it should be easier to do now._
- [x] For all of the above, add a MainchainTorsionScoringTable class.
- [x] Support symmetrization of gly/ACHIRAL tables.
- [x] Support polycubic interpolation.<br/>
- ~~Support linear interpolation.~~ <i>Put off to a future pull request, since </i>rama_prepro<i> uses exclusively polycubic interpolation.</i>
- [x] Unit tests for _rama_prepro_ scoring with N-methylation --> RamaPrePro tests AND cyclic geometry tests.
- [x] Unit tests for _fa_dun_ scoring and rotamer generation with N-methylation --> Covered in cyclic geometry tests and protocols::cyclic_peptide::N_methylation tests.
- [x] Beauty.
- [x] Documentation.
- [x] RAMA_PRE_PRO lines in params files.
- ~~Rotamer lines in patch files.~~ --> We have no documentation for patch files. Grr.
- ~~RamaPrePro lines in patch files.~~
- [x] Figure out how to add sampling tables / CDFs for arbitrary mainchain potentials.
- [x] Add function to produce random mainchain torsion vectors biased by probability distribution.
- [x] Unit test.
- [x] Debug why we don't seem to be using the N-methyl rama map currently.
- [x] Add option to use rama_prepro instead of rama tables in GenKIC when sampling.
- [x] Document this.
- [x] Unit test.
- [x] Integration test.
- [x] Add rama_prepro GenKIC filter.
- [x] Document this.
- [x] Add to unit test.
- [x] Add to integration test.
- [x] Add option to GenKIC to correct polymer bond-dependent atoms.
- [x] Document this.
- [x] Turn this on in simple_cycpep_predict.
- [x] Add support for N-methylation to _simple_cycpep_predict_.
- [x] Add option (true by default) to use rama_prepro for sampling instead of rama.
- [x] Document the new option (rama_prepro instead of rama).
- [x] Interface for specifying N-methylated positions.
- [x] Documentation.
- [x] Integration test.
- [x] Use rama_prepro cutoff energy instead of rama cutoff energy.
- [x] Move tryptophan rotamers into database for unit tests.
- [x] Move the rest of the rotamers into the rotamer library.
- [x] Unit test with N-methyl-tryptophan rotamers. --> Covered in protocols::cyclic_peptide::N_methylation tests.
- [x] Move generic N-methyl-L-alpha-amino acid rama table into database.
- [x] Cyclic N-methyl amino acid scoring unit test.
- [x] Regular NMe.
- [x] For beta_nov15 NMe.
- [x] For beta_nov16 NMe.
- [x] Non-NMe beta_nov16 unit test while I'm at it.
- [x] Two-chain NMe.
- [x] Two-chain beta_nov15 NMe.
- [x] Two-chain beta_nov16 NMe.
- [x] Non-NMe beta_nov16 two-chain unit test while I'm at it.
- [x] Mirror image N-methyl amino acid scoring unit test.
- [x] Note to self: @twcraven will regenerate the N-methyl map (and add maps for pre-proline, and for sarcosine and sarcosine pre-pro). I need to add and test those.
- [x] Need sarcosine.
- [x] Need sarcosine_prepro.
- [x] Sarcosine unit tests.
- [x] Fix problem with rebuilding polymer bond-dependent atoms (methyl group).
- [x] Document new option ("update_polymer_bond_dependent_atoms") in ModifyVariantTypeMover.
- [x] Unit test for recursive identification of connection-dependent atoms.
- [x] Fix issue with rebuilding polymer bond dependent methyl group when it's at a terminus.
- [x] Figure out why first output in simple_cycpep_predict_nmethyl integration test has crooked N-methyl group.
- [x] OK, figured it out: it's the packer. When rotamers are constructed, backbone atom coordinates are preserved, but the N-methyl group is not considered to be a set of backbone atoms. Add recursive check for polymer bond dependencies in Residue::place to correct this.
- [x] Add option to MutateResidue mover to update polymer bond-dependent atoms.
- [x] Document this.
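
Several of the mainchain-table tasks above (grid points in bin centres or on edges, symmetrization of achiral tables, and random mainchain torsion vectors biased by a probability distribution) can be illustrated with a minimal Python sketch. This is not the Rosetta C++ implementation: every function name and the flat row-major table layout are hypothetical, symmetrization is assumed to mean averaging each bin with its point-inverted counterpart, and sampling simply returns the grid point of the chosen bin.

```python
import bisect
import itertools
import random


def bin_angle(index, resolution, centred):
    """Grid-point angle in degrees for one axis, starting at -180.

    centred=True puts the grid point at the bin centre, False on the lower edge.
    """
    offset = 0.5 if centred else 0.0
    return -180.0 + (index + offset) * resolution


def unravel(flat_index, dims):
    """Convert a row-major flat index into one bin index per torsion."""
    indices = []
    for size in reversed(dims):
        flat_index, i = divmod(flat_index, size)
        indices.append(i)
    return tuple(reversed(indices))


def ravel(indices, dims):
    """Inverse of unravel: bin indices back to a row-major flat index."""
    flat = 0
    for i, size in zip(indices, dims):
        flat = flat * size + i
    return flat


def symmetrize(probs, dims, centred):
    """Average each bin with the bin at the negated torsion angles.

    For bin-centred grids the mirror of bin i is n - 1 - i; for edge grids
    it is (n - i) mod n. Assumes each axis spans the full 360 degrees.
    """
    shift = 1 if centred else 0
    out = []
    for flat, p in enumerate(probs):
        idx = unravel(flat, dims)
        mirror = tuple((n - i - shift) % n for i, n in zip(idx, dims))
        out.append(0.5 * (p + probs[ravel(mirror, dims)]))
    return out


def build_cdf(probs):
    """Cumulative distribution over the flattened, normalized table."""
    total = sum(probs)
    cdf = list(itertools.accumulate(p / total for p in probs))
    cdf[-1] = 1.0  # guard against floating-point rounding
    return cdf


def sample_torsions(cdf, dims, resolution, centred, rng=random):
    """Draw one mainchain torsion vector, biased by the table's probabilities."""
    flat = bisect.bisect_right(cdf, rng.random())
    return tuple(bin_angle(i, resolution, centred) for i in unravel(flat, dims))
```

Because the table is addressed through `dims`, the same functions work for two torsions (phi, psi) or for the three mainchain torsions of a beta-amino acid; only the length of `dims` changes.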
Also:
- [x] Updating HydroxylTorsionPotential for D-amino acids.
- [x] Add params files for noncanonical amino acids found in cyclosporin A.
- [x] Add AIB params file.
For a future pull request:
- Update _rama_ as _rama_prepro_ was updated.
- Update _p_aa_pp_ as _rama_prepro_ was updated.