Merge pull request #3612 from RosettaCommons/BYachnin/mhc-epitope-new
mhc_epitope scoreterm, a flexible, packer-compatible deimmunization scoreterm
This is the second attempt, after addressing a release-mode unit test failure in PR #3390.
This PR introduces a new scoreterm, mhc_epitope, which can be used to identify and remove T-cell epitopes from proteins. It is packer-compatible, using the "design guidance scoreterm" machinery introduced by @vmullig and @asford.
Thanks to @vmullig for writing the original code that we copied, and for the thorough review!
The code was developed by @cbaileykellogg and me. The simplest mode turns on the scoreterm with the "ProPred" prediction matrices to remove epitopes (MHCEpitopePredictorMatrix). The threshold for what counts as an epitope (i.e. how aggressively to de-immunize) can be tuned using either raw scores or relative scores.
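To make the matrix-based mode concrete, here is a minimal Python sketch of thresholded 9-mer scoring. It is not the Rosetta implementation: the function names, the toy matrix values, and the allele name are all hypothetical, and only the raw-score threshold is shown (the relative-score mode is not modeled).

```python
from typing import Dict, List

PEPTIDE_LEN = 9  # assumption: the matrices score 9-residue windows

# One matrix = one dict per peptide position, mapping residue -> score.
Matrix = List[Dict[str, float]]

# Hypothetical toy matrix; real prediction matrices have a score for
# every residue at every position.
TOY_MATRICES: Dict[str, Matrix] = {
    "toy_allele": [{"L": 2.0, "F": 1.5}] + [{} for _ in range(7)] + [{"K": 1.0}],
}


def score_peptide(matrix: Matrix, peptide: str) -> float:
    """Sum the position-specific scores; residues not listed score 0."""
    return sum(column.get(aa, 0.0) for column, aa in zip(matrix, peptide))


def count_epitope_hits(sequence: str,
                       matrices: Dict[str, Matrix],
                       raw_threshold: float) -> int:
    """Count (window, allele) pairs whose raw score meets the threshold."""
    hits = 0
    for start in range(len(sequence) - PEPTIDE_LEN + 1):
        peptide = sequence[start:start + PEPTIDE_LEN]
        for matrix in matrices.values():
            if score_peptide(matrix, peptide) >= raw_threshold:
                hits += 1
    return hits
```

In this sketch, lowering `raw_threshold` flags more windows as epitopes, which corresponds to de-immunizing more aggressively.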
Alternatively, the user can pre-compute epitope scores for a pre-defined set of sequences using more sophisticated prediction tools (e.g. NetMHC and IEDB). These scores should be stored in a SQL database, which the MHCEpitopePredictorExternal class reads. We will be providing some scripts in the tools repo to help users generate these SQL databases, along with corresponding PSSMs to limit design space with task ops.
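As a rough illustration of the pre-computed approach, the following Python sketch stores peptide scores in an in-memory SQLite database and looks them up. The table name, column names, and the `unseen_score` fallback are assumptions made for this example; they are not the schema or behavior of MHCEpitopePredictorExternal.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database of pre-computed peptide scores (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE epitopes (peptide TEXT PRIMARY KEY, score REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO epitopes (peptide, score) VALUES (?, ?)",
        [("LAAAAAAAK", 3.0), ("FAAAAAAAK", 2.5)],
    )
    conn.commit()
    return conn


def lookup_score(conn: sqlite3.Connection,
                 peptide: str,
                 unseen_score: float = 100.0) -> float:
    """Return the stored score, or a penalty for peptides not in the database.

    The penalty for unseen peptides is an assumption of this sketch.
    """
    row = conn.execute(
        "SELECT score FROM epitopes WHERE peptide = ?", (peptide,)
    ).fetchone()
    return row[0] if row is not None else unseen_score
```

Because only pre-scored peptides are present in such a database, restricting the design space to sequences that were actually scored is what makes the lookup meaningful.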
The scoreterm can also be applied as a constraint using a constraint mover, allowing specific regions to be targeted using residue selectors. Each constraint can use a different predictor class (e.g. a general MHCEpitopePredictorMatrix in the scorefunction and an MHCEpitopePredictorExternal in specific regions chosen with residue selectors) and different settings (thresholds, etc.).
The scoreterm should behave well with symmetric proteins, ligands, non-canonical amino acids, and multi-chain systems.
INTEGRATION TEST CHANGES:
`mhc_epitope` fails because it is new
53 tests fail because of cosmetic addition of `EnergyMethodOptions::show: mhc_epitope_setup_files:` to output pdb files (we have added this as a new EnergyMethodOption): backbonegridsampler_multiresidue, bundlegridsampler, bundlegridsampler_copy_pitch, bundlegridsampler_design, bundlegridsampler_design_nstruct_mode, bundlegridsampler_epsilon, bundlegridsampler_z0_offset, bundlegridsampler_z1_offset, coupled_moves, ligand_dock_ensemble, mp_find_interface, mp_mutate_relax, mp_mutate_repack, oligourea_predict, pepspec, perturb_helical_bundle, remodel, remodel_disulfides, remodel_helical_repeat, scaffold_matcher, simple_cycpep_predict, simple_cycpep_predict_angle, simple_cycpep_predict_anglelength, simple_cycpep_predict_cartesian, simple_cycpep_predict_cispro, simple_cycpep_predict_cterm_isopeptide_lariat, simple_cycpep_predict_cterm_isopeptide_lariat_tailless, simple_cycpep_predict_design, simple_cycpep_predict_nterm_isopeptide_lariat, simple_cycpep_predict_nterm_isopeptide_lariat_tailless, simple_cycpep_predict_octahedral_metal, simple_cycpep_predict_settings, simple_cycpep_predict_sidechain_isopeptide, simple_cycpep_predict_sidechain_isopeptide_reverse, simple_cycpep_predict_square_planar_metal, simple_cycpep_predict_square_pyramidal_metal, simple_cycpep_predict_symm_gly, simple_cycpep_predict_symmetric_sampling, simple_cycpep_predict_tbmb, simple_cycpep_predict_terminal_disulfide, simple_cycpep_predict_terminal_disulfide_internal_permutations, simple_cycpep_predict_terminal_disulfide_tails, simple_cycpep_predict_tetrahedral_metal, simple_cycpep_predict_tetrahedral_metal_asp, simple_cycpep_predict_tma, simple_cycpep_predict_trigonal_planar_metal, simple_cycpep_predict_trigonal_pyramidal_metal, simple_grafting_movers, supercharge, sweep_respair_energies, test_energy_method_options, zinc_heterodimer, zinc_homodimer_design
3 tests fail because of cosmetic addition of `INSERT INTO "score_types" VALUES(1,376,'mhc_epitope');` (and re-ordering of other score_types) in output dump files (we have added this as a new score_type): database_jd2_compact_io, database_jd2_io, features
hotspot_hashing is broken in master