Adding files for the soon-to-be-released RFdiffusion video tutorial (#452)
The materials were created by Diego Lopez Mateos, Matthew Hvasta, and
Kush Narang for the Tutorial Hackathon track of the 2026 Megathon event.
feat: add inference.empty_cache_per_design flag to reduce CUDA allocator fragmentation (#451)
## Problem
When running RFdiffusion with variable-length contigs (e.g.
`contigmap.contigs=[A1-469/0 1-50]`) over hundreds or thousands of
designs, per-worker VRAM grows steadily from ~7 GB to 10–13 GB per
process. This limits how many workers can run in parallel on a single
GPU before exhausting VRAM.
Root cause: PyTorch's CUDA caching allocator accumulates fragmented
memory blocks across designs. With variable-length contigs each design
allocates differently-sized tensors; freed blocks are cached but cannot
be reused for different-sized allocations, causing steady VRAM growth.
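The fragmentation mechanism can be illustrated without a GPU. The sketch below is a hypothetical, heavily simplified model of a size-bucketed caching allocator (not PyTorch's actual implementation): freed blocks are cached by exact size, so variable-sized requests never hit the cache and reserved memory only grows until the cache is emptied.

```python
from collections import defaultdict

class ToyCachingAllocator:
    """Toy model: freed blocks are cached and only reused for requests
    of exactly the same size, a rough analogue of cached CUDA blocks
    failing to match new allocation sizes."""

    def __init__(self):
        self.cache = defaultdict(int)  # block size -> cached block count
        self.reserved = 0              # total bytes held from the device

    def alloc(self, size):
        if self.cache[size] > 0:
            self.cache[size] -= 1      # cache hit: reuse a freed block
        else:
            self.reserved += size      # cache miss: reserve new device memory

    def free(self, size):
        self.cache[size] += 1          # returns to the cache, not the device

    def empty_cache(self):
        # release all cached-but-unused blocks back to the device
        for size, count in self.cache.items():
            self.reserved -= size * count
        self.cache.clear()

alloc = ToyCachingAllocator()
# Variable-length designs: every iteration requests a different size,
# so no freed block is ever reused and reserved memory grows steadily.
for size in [100, 120, 90, 150, 110]:
    alloc.alloc(size)
    alloc.free(size)
print(alloc.reserved)  # 570: each design reserved fresh memory
alloc.empty_cache()
print(alloc.reserved)  # 0: cached blocks released
```

With fixed-size requests the cache is hit every iteration and reserved memory stays flat, which is why the flag below is only useful for variable-length contigs.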
## Fix
Add an optional `inference.empty_cache_per_design` flag (default
`False`, opt-in) that calls `torch.cuda.empty_cache()` at the end of
each design iteration. This releases all unused cached CUDA memory
blocks back to the CUDA memory manager, keeping each worker near its
initial VRAM footprint for the full run.
### Changes
**`config/inference/base.yaml`**
```yaml
write_trajectory: True
empty_cache_per_design: False # NEW
```
**`scripts/run_inference.py`** — after the trajectory/PDB write block,
before `log.info`:
```python
if conf.inference.empty_cache_per_design and torch.cuda.is_available():
    torch.cuda.empty_cache()
log.info(f"Finished design in {(time.time()-start_time)/60:.2f} minutes")
```
## Measured impact
Tested on NVIDIA RTX 5090 32 GB running a long PPI campaign with
variable-length contigs:
| Setting | Per-worker VRAM (steady-state) |
|---------|-------------------------------|
| Without fix | 8–13 GB (grows over run) |
| With `empty_cache_per_design=True` | ~5.2 GB (stable) |
This allowed raising the number of parallel workers from 3 to 5 on a 32
GB GPU.
## Why opt-in
`torch.cuda.empty_cache()` adds a small per-design overhead (~1–2 ms)
and is only beneficial for long runs with variable-length contigs. For
short runs or fixed-length designs there is no fragmentation issue, so
the default remains `False` to preserve existing behavior.
## Testing
All 20 applicable tests in `tests/test_diffusion.py` pass with this
change. The one skipped test (`design_ppi_scaffolded`) fails due to a
missing `ppi_scaffolds/` directory in the test fixture — a pre-existing
issue unrelated to this PR.
## Notes
- Placement is after both the PDB write (`writepdb`) and the optional
trajectory block — every consumer of `denoised_xyz_stack` /
`px0_xyz_stack` has already finished before the cache is cleared.
- This does not affect memory held by live tensors — only frees
cached-but-unused blocks.
- Compatible with all existing RFdiffusion design modes (PPI, motif
scaffolding, unconditional).
Updated "Fixed issues with designing in scaffoldguided mode" original PR 386 (#426)
The original PR for this was #386 from
[OrangeCatzhang](https://github.com/OrangeCatzhang). This PR is to fix
the error "AttributeError: 'bool' object has no attribute
'scaffold_list'" when running in scaffoldguided mode.
The first error is fixed by passing the full composed config object
(`conf`) into `BlockAdjacency` instead of passing the `scaffoldguided`
sub-node. `BlockAdjacency` expects the full config and accesses
`conf.scaffoldguided.<fields>` internally, so passing the sub-node caused
`self.conf.scaffoldguided` to resolve to the nested boolean field
(`scaffoldguided.scaffoldguided`), which produced the `AttributeError`
when the code tried to read `.scaffold_list`. Passing the full `conf`
fixes that mismatch.
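The mismatch can be reproduced in miniature with `SimpleNamespace` stand-ins for the Hydra config nodes (the `scaffolds()` method here is illustrative, not the exact `BlockAdjacency` API):

```python
from types import SimpleNamespace

# Stand-in for the composed config: the scaffoldguided sub-node itself
# contains a boolean field also named `scaffoldguided`.
conf = SimpleNamespace(
    scaffoldguided=SimpleNamespace(
        scaffoldguided=True,             # boolean toggle
        scaffold_list=["1abc", "2xyz"],  # what BlockAdjacency wants to read
    )
)

class BlockAdjacency:
    def __init__(self, conf):
        self.conf = conf

    def scaffolds(self):
        # BlockAdjacency reads conf.scaffoldguided.<fields> internally
        return self.conf.scaffoldguided.scaffold_list

# Correct: pass the full composed config object
print(BlockAdjacency(conf).scaffolds())

# Bug: passing the sub-node makes self.conf.scaffoldguided resolve to
# the nested boolean, so reading .scaffold_list raises AttributeError
try:
    BlockAdjacency(conf.scaffoldguided).scaffolds()
except AttributeError as e:
    print(e)  # 'bool' object has no attribute 'scaffold_list'
```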
The other fix adds initialization of `cyclic_reses` to
`ScaffoldedSampler`. I have slightly updated the original PR's approach
to avoid code duplication: a helper function on the `Sampler` class is
now called from both `Sampler` and `ScaffoldedSampler` to initialize
`cyclic_reses`. I also removed the original PR's changes to the
`scaffoldguided` flag, so the CLI stays the same.
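The deduplication pattern looks roughly like this (a sketch; the helper name and constructor signatures are illustrative, not the exact RFdiffusion code, though `cyclic_reses` is the real attribute):

```python
class Sampler:
    def __init__(self, conf=None):
        self.conf = conf
        self._init_cyclic_reses()

    def _init_cyclic_reses(self):
        # Shared helper so the attribute is set identically wherever
        # a sampler is constructed.
        self.cyclic_reses = []

class ScaffoldedSampler(Sampler):
    def __init__(self, conf=None):
        # ScaffoldedSampler has its own setup path and previously never
        # initialized cyclic_reses; calling the shared helper fixes the
        # missing attribute without duplicating the initialization code.
        self.conf = conf
        self._init_cyclic_reses()
```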
Fix workflow tests that are failing (#410)
This PR updates the tests so that every example runs and a failing
example causes the corresponding test to fail as well. Changes include:
- Reformatting design_macrocyclic_binder.sh and
design_macrocyclic_monomer.sh to be submitted correctly by
test_diffusion.py
- Reducing the total length in design_tetrahedral_oligos.sh to reduce
run time of this test
- Changes to test_diffusion.py and main.yml so the examples can run in
parallel chunks, and so that if an example errors out, the test does not
pass.
Currently design_ppi_scaffolded, design_timbarrel, and
design_ppi_flexible_peptide_with_secondarystructure_specification are
failing; these should be addressed in separate, future PRs.
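Splitting the example scripts into parallel chunks can be done with a small strided helper like the one below (a hypothetical sketch; the actual chunking logic in test_diffusion.py and main.yml may differ):

```python
def chunk(items, n_chunks, index):
    """Return the `index`-th of `n_chunks` slices of `items`.
    Striding keeps chunk sizes balanced even when len(items) is not
    divisible by n_chunks, and every item lands in exactly one chunk."""
    return items[index::n_chunks]

examples = [
    "design_ppi.sh",
    "design_tetrahedral_oligos.sh",
    "design_macrocyclic_binder.sh",
    "design_macrocyclic_monomer.sh",
    "design_unconditional.sh",
]

# Each CI job runs one chunk; together the chunks cover every example
# exactly once, and any failing script fails its own job.
for i in range(2):
    print(i, chunk(examples, 2, i))
```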
Retain chain and residue numbering in RFdiffusion (#348)
A number of issues (e.g., #103, #171,
https://github.com/RosettaCommons/RFdiffusion/issues/312, #315) have
mentioned that RFdiffusion will change the chain IDs and residue
numbering of the input structure. The designed chain ends up as chain
"A", and the fixed chain(s) end up as chain "B". The numbering is also
reset to start at 1. This can be particularly problematic in cases where
comparisons to structures are needed, as well as multi-chain situations
where all of the chains get fused.
Inspired by @GCS-ZHN's comment and solution referenced in Issue #103,
I've modified the code to maintain chain and residue numbering. In
particular:
- Chains that are not "designable" will retain their original chain ID
letters and residue numbers.
- Chains that are partially fixed (e.g., motif re-scaffolding) will
retain their original chain ID letters. Residues will be re-numbered
from 1 to the length of the chain. (It was not clear to me what the
"correct" behaviour of chain residue numbering should be, given that the
length of the chain and the position of any fixed residues might
change.)
- Chains that are being fully generated de novo will be assigned the
first available chain ID in the alphabet not used by any other chain.
Residues will be numbered from 1 to the length of the chain.
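The assignment rule for fully de novo chains can be sketched as follows (an illustrative helper, not the exact function in the PR):

```python
import string

def first_unused_chain_id(used_ids):
    """Return the first chain letter A-Z not already taken by any chain."""
    for letter in string.ascii_uppercase:
        if letter not in used_ids:
            return letter
    raise ValueError("no free chain ID left")

# A fixed target keeps its original IDs (here "H" and "L"); a fully
# de novo chain gets the first free letter.
print(first_unused_chain_id({"H", "L"}))  # A
print(first_unused_chain_id({"A", "B"}))  # C
```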
Update README.md (#365)
Update README to point to new [documentation
resource](https://sites.google.com/omsf.io/rfdiffusion/overview).
I also added some text in the Installation section to 1) specify that
Sergey O.'s colab notebook only contains some of the features of
RFdiffusion and 2) note that there is now a Rosetta Commons-maintained
Docker image.