Merge pull request #4335 from RosettaCommons/vmullig/int_to_size_in_string_util
Fixing bug that was preventing read of some large text files.
In our string utilities, we have a function that reads in a text file and returns a string with the file's contents. This function reserves space based on the number of characters read in. Unfortunately, this count was being stored in a standard signed int, so long text files could cause overflow and a negative value being passed to string::reserve(). This fixes the problem by switching the count to a platform::Size, which guarantees that the largest count that can be stored is greater than or equal to the maximum size of string that you could store.
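As a rough illustration of the failure mode (a minimal sketch, not the actual Rosetta utility): the function names below are hypothetical, and std::size_t stands in for platform::Size on the assumption that the two behave alike. The first helper shows what a negative int count becomes once it reaches a size_t parameter such as the one string::reserve() takes; the second keeps the count unsigned throughout.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the value reserve() would effectively receive if the
// character count were held in a signed int. A negative count wraps around
// to a huge unsigned request.
std::size_t reserve_request_from_int( int count ) {
    return static_cast< std::size_t >( count );
}

// Hypothetical stand-in for the file-reading utility. The running count is a
// std::size_t, so it cannot go negative for very long inputs.
// Note: every line is written back with a trailing newline.
std::string read_stream_to_string( std::istream & in ) {
    std::vector< std::string > lines;
    std::size_t total_chars = 0; // unsigned, same type reserve() expects
    std::string line;
    while ( std::getline( in, line ) ) {
        total_chars += line.size() + 1; // +1 for the newline
        lines.push_back( line );
    }
    std::string out;
    out.reserve( total_chars );
    for ( std::string const & l : lines ) {
        out += l;
        out += '\n';
    }
    assert( out.size() == total_chars );
    return out;
}
```

Calling reserve_request_from_int( -1 ) yields the largest value a std::size_t can hold, which is the kind of request the old int-based count could produce for very long files.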
notify author
notify list [rosetta-logs@googlegroups.com]
Merge pull request #4290 from RosettaCommons/roccomoretti/bugprone-integer-division_fix
Fix clang-tidy bugprone-integer-division
Dividing two integers in C++ gives a (truncated) integer result. This may not be what you want. Clang-tidy has a check that detects an integer division whose result is then used as a floating-point value. This flags a fair number of examples in Rosetta, some of which are obvious bugs.
I've attempted to fix the instances which clang-tidy flags, mainly by converting them to actual floating point results. This may not be the best way to fix some of these, so feel free to adjust such that the intent is clearer. See https://github.com/RosettaCommons/main/pull/4290 for detailed discussion.
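For reference, a minimal sketch of the pattern being flagged and the kind of fix applied. The function names are made up for illustration and do not come from the Rosetta source.

```cpp
#include <cassert>

// Flagged pattern: both operands are ints, so the division truncates
// before the result is converted to double.
double fraction_truncated( int hits, int total ) {
    assert( total != 0 );
    return hits / total;
}

// One possible fix: convert an operand first so the division itself is
// carried out in floating point.
double fraction_real( int hits, int total ) {
    assert( total != 0 );
    return static_cast< double >( hits ) / total;
}
```

With hits = 1 and total = 2, the first version returns 0.0 and the second returns 0.5. Where truncation is actually intended, an explicit cast on the result makes that intent visible.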
notify author
notify list [rosetta-logs@googlegroups.com]
Merge pull request #4327 from RosettaCommons/rfalford12/find-optimal-hydrophobic-thk
Adding application for finding the lowest energy hydrophobic thickness
notify author
notify list [rosetta-logs@googlegroups.com]
Merge pull request #4325 from RosettaCommons/vmullig/tweak_simple_cycpep_predict_sci_test_cutoffs
Tweak simple_cycpep_predict sci test cutoffs a wee bit.
Some of these are a bit too lax at the moment. Based on the last half-dozen or so runs, I'm making them a little bit more stringent.
notify author
notify list [rosetta-logs@googlegroups.com]
Merge pull request #4322 from RosettaCommons/BYachnin/fix_quality_clang_tidy_tests
Fixing the ubuntu.clang.code_quality.clang_tidy test breakages introduced by my recent merge.
Here's the broken test: https://b3.graylab.jhu.edu/test/562923
Here's the test for this revision: https://b3.graylab.jhu.edu/test/563115
notify author
notify list [rosetta-logs@googlegroups.com]
Merge pull request #4295 from RosettaCommons/vmullig/rocco_fix_to_hbond_geom
Picking Rocco's fix to hbonds_geom.cc out of roccomoretti/bugprone-integer-division_fix.
Description:
In pull request #4290, Rocco has identified a number of places in which we have likely been making division errors by assuming that an integer divided by an integer gives a floating-point number (which it does not). These are bugs that need to be fixed. Unfortunately, at least one likely has implications for scoring or minimization. This is the change most likely to cause unit, integration, and scientific test changes, so I think it makes sense to test it and merge it entirely separately. I want to be sure that this is a change that we can make without re-calibrating the whole scoring function.
Notes on the current scientific test failures and changes:
-- make_fragments, RosettaCM, and glycan_structure_prediction are all failures-to-run because they're not yet in the current master.
-- dock_glycans and mp_symdock fail in the same way as they do currently in master.
-- Looking at fast_relax_5iter (which passes), I notice no remarkable qualitative difference from current master. (There are no large changes in the scores or score ranges, for example).
-- Looking at antibody_snugdock (which passes), there are big jumps in the discrimination score, but I gather that this happens from run to run anyways. The plots look qualitatively similar.
-- The simple_cycpep_predict test does a lot of sampling. Again, the E vs. RMSD plots look qualitatively very similar.
-- All tests passing in master pass in this branch.
Based on the scientific test results (https://b3.graylab.jhu.edu/revision/commits/13892), I'm pretty convinced that this change to scoring is sufficiently benign that we don't need to worry about recalibrating everything or about it invalidating scientific performance.
notify author
notify list [rosetta-logs@googlegroups.com]
Merge pull request #4315 from RosettaCommons/dimaio/fix_hybrid_max_contig_insertion_option
Fixing the behavior of max_contig_insertion in HybridizeMover.
notify author
notify list [rosetta-logs@googlegroups.com]
Integrate nmer into mhc_epitope to allow packer-compatibility
The "established" mechanism for de-immunizing proteins in Rosetta is the NMerSVMEnergy class implemented by @indigogo. In contrast to MHCEpitopeEnergy, NMer is not inherently packer-compatible.
This PR essentially adds a new MHCEpitopePredictor that allows nmer to be packer-compatible, using the framework we established with mhc_epitope. While the SVM-based scoring is rather slow using the current SVMs, the common framework could still be used to score proteins outside of the context of the packer. In addition, if faster SVMs are added (or generated by the user), they could be used in a packer-compatible manner.
In order to implement this, a few changes needed to be made to both the MHCEpitopeEnergy class and the NMer classes. Testing has not shown any differences in behavior resulting from these changes. (The failure for mhc_epitope_nmer_preload is expected, as the test was changed to introduce nmer stuff. The failure for mhc_epitope is cosmetic.)
In addition to the nmer integration itself, this PR also makes some under-the-hood changes:
-MHCEpitopeEnergy now handles Predictors that work using a core + potentially missing overhang definition of their peptides, where the "overhang" regions could hang off the end of the peptide chain. This is different from cases where the entire peptide is "core," meaning that some peptides at the end of the chain would not be counted. NetMHCII dealt with this internally, but nmer does not, so it was needed for equivalent treatment. Extendable to other Predictors that use this strategy.
-Disk reads are now all handled by the ScoringManager, which should improve performance in certain cases.
-Some things got moved around to address the split of `core.3` which took place some months ago.
Thanks to @indigogo for originally writing the nmer code, and @vmullig for review (and particularly suggesting the ScoringManager improvements).
notify author
notify list [rosetta-logs@googlegroups.com]