Merge pull request #1301 from RosettaCommons/aleaverfay/aleaverfay/standard_job_queen_jobdef_interface
JD3 Standard Job Queen XSD Interface
This branch completes the XML Schema integration into JD3 by working out an interface between the StandardJobQueen and her subclasses which allows for the specification of the XML Schema of the job-definition file.
Per-job options may be specified on the command line, in a <Common> block at the top of a job-definition XML file, or in the <Job> block of a particular set of jobs in the job-definition file. The StandardJobQueen must be told about all of the options that could be specified -- she will hold a std::list of OptionKeys -- and she initializes an OptionCollection which contains only the options that she has been told about. When a derived JobQueen wants to initialize a job, she should read from the OptionCollection the StandardJobQueen hands her. If she would otherwise call a function that reads from the command line, then she should instead call a function that reads from an OptionCollection object.
For example, the PackerTask now has an initialize_from_option_collection function that takes an OptionCollection const &. If you call the PackerTask's initialize_from_command_line() function, it passes the global OptionCollection (that is, the basic::options::option global variable) to the initialize_from_option_collection function. The PackerTask is responsible for listing all of the OptionKeys that are used to read from the input OptionCollection in its "list_options_read" method. Derived JobQueens that want to allow the user to initialize a PackerTask from the command line should call the PackerTask::list_options_read function to get that list of options and then hand that list to the StandardJobQueen. E.g., in the FixbbJobQueen in src/apps/pilot/andrew/fixbb_jd3, there are these calls:
    FixbbJobQueen()
    {
        utility::options::OptionKeyList opts;
        core::scoring::list_read_options_in_get_score_function( opts );
        core::pack::task::PackerTask::list_options_read( opts );
        core::pack::task::operation::ReadResfile::list_options_read( opts );
        add_options( opts );
        add_option( basic::options::OptionKeys::minimize_sidechains );
        add_option( basic::options::OptionKeys::min_pack );
        add_option( basic::options::OptionKeys::off_rotamer_pack );
    }
As a result, the XML Schema for the fixbb_jd3 application lists all of the options that are read by get_score_function, the PackerTask's initialize_from_command_line / initialize_from_option_collection, and the ReadResfile task operation.
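The pattern described above can be sketched in isolation. The following is a minimal illustration only, not the Rosetta implementation: OptionKey, OptionKeyList, OptionCollection, global_options, ToyTask and ToyJobQueen are hypothetical stand-ins built on standard containers, and the option names are placeholders. It shows a class that advertises the keys it reads, a command-line initializer that delegates to the OptionCollection initializer, and a queen that hands out a collection restricted to the keys she was told about.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the real Rosetta types.
using OptionKey = std::string;
using OptionKeyList = std::list< OptionKey >;
using OptionCollection = std::map< OptionKey, std::string >;

// Plays the role of the global option collection (basic::options::option).
OptionCollection & global_options() {
    static OptionCollection options;
    return options;
}

// Toy class following the PackerTask pattern described in the text.
class ToyTask {
public:
    // Advertise every key that initialize_from_option_collection reads.
    static void list_options_read( OptionKeyList & keys ) {
        keys.push_back( "ex1" );
        keys.push_back( "ex2" );
    }

    // All option reading happens here, from whatever collection is passed in.
    void initialize_from_option_collection( OptionCollection const & options ) {
        ex1_ = is_true( options, "ex1" );
        ex2_ = is_true( options, "ex2" );
    }

    // The command-line version only forwards the global collection.
    void initialize_from_command_line() {
        initialize_from_option_collection( global_options() );
    }

    bool ex1() const { return ex1_; }
    bool ex2() const { return ex2_; }

private:
    static bool is_true( OptionCollection const & options, OptionKey const & key ) {
        auto iter = options.find( key );
        return iter != options.end() && iter->second == "true";
    }

    bool ex1_ = false;
    bool ex2_ = false;
};

// Toy queen: remembers which keys she was told about and filters on them.
class ToyJobQueen {
public:
    void add_options( OptionKeyList const & keys ) { known_.insert( keys.begin(), keys.end() ); }
    void add_option( OptionKey const & key ) { known_.insert( key ); }

    // Copy only the registered keys out of the source collection.
    OptionCollection job_options( OptionCollection const & source ) const {
        OptionCollection result;
        for ( auto const & entry : source ) {
            if ( known_.count( entry.first ) != 0 ) result.insert( entry );
        }
        return result;
    }

private:
    std::set< OptionKey > known_;
};

// Small driver mirroring the constructor calls shown above.
void example_usage() {
    ToyJobQueen queen;
    OptionKeyList keys;
    ToyTask::list_options_read( keys );
    queen.add_options( keys );
    queen.add_option( "min_pack" );

    OptionCollection source = { { "ex1", "true" }, { "min_pack", "true" }, { "unregistered", "true" } };
    OptionCollection per_job = queen.job_options( source );
    assert( per_job.size() == 2 );                   // ex1 and min_pack survive
    assert( per_job.count( "unregistered" ) == 0 );  // never registered, so dropped

    ToyTask task;
    task.initialize_from_option_collection( per_job );
    assert( task.ex1() );
    assert( !task.ex2() );
}
```

Keeping all option reads inside the OptionCollection-based initializer means the command-line path and the per-job path cannot drift apart; only the collection being passed in differs.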
The StandardJobQueen asks the derived JobQueen to flesh out the subtags that can be defined in the <Common> and <Job> tags. To do so, the derived JobQueen will interact with the XMLSchemaDefinition and XMLSchemaComplexTypeGenerator classes. The FixbbJobQueen allows the user to define TaskOperations and ScoreFunctions using the canonical RosettaScripts interface, and then to use those TaskOperations and ScoreFunctions in initializing a PackRotamersMover. These elements of the job-definition file are optional, though, so the user can get a vanilla version of fixbb without using the RosettaScripts-like interface.
The StandardJobQueen expects jobs to be defined by input Poses -- not all JobQueens will want this, of course, but JD2 expected this, so that's where we start with JD3. Input poses come from PoseInputters; currently, only a PDBPoseInputter is implemented. Within a job-definition file, each <Job> element must specify an <Input> subelement which itself will contain a single subelement naming a particular pose inputter -- e.g. a <PDB filename=...> element specifying the PDBPoseInputter. If a PoseInputter specifies more than one Pose, then all of the input poses will be run through the protocol using the same per-job specification. E.g. an eventual SilentPoseInputter could be used to specify all the Poses in a silent file, or only a handful of them listed by their tags.
The optional <Output> element can be specified to name the output poses. For example, you might run 1ubq.pdb through two different protocols specified in the same job-definition file and not want their outputs to overwrite one another. You would then name the outputs from the first job block "first_1ubq_foo_0001.pdb" and the outputs from the second job block "second_1ubq_bar_0001.pdb"; this would be accomplished with
<Output>
<PDB filename_pattern="first_$_foo"/>
</Output>
and
<Output>
<PDB filename_pattern="second_$_bar"/>
</Output>
where the $ is replaced by the input-tag for the pose, and the nstruct index is appended to the end of the filename.
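The naming rule can be illustrated with a short sketch. This is not the code in this branch: output_filename is a hypothetical helper, and the details it fills in (only the first $ is substituted, the nstruct index is joined with an underscore and zero-padded to four digits, and a ".pdb" extension is added) are assumptions inferred from the example filenames above.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper, not part of Rosetta. Builds an output filename from a
// filename_pattern, the input tag of the pose, and the nstruct index.
// Assumed from the examples: "_" separator, 4-digit zero padding, ".pdb" suffix.
std::string output_filename(
    std::string const & pattern,
    std::string const & input_tag,
    int nstruct_index
) {
    assert( nstruct_index >= 0 );

    // Substitute the input tag for the first '$' in the pattern, if present.
    std::string name = pattern;
    std::string::size_type pos = name.find( '$' );
    if ( pos != std::string::npos ) name.replace( pos, 1, input_tag );

    // Append the zero-padded nstruct index and the extension.
    std::ostringstream out;
    out << name << '_' << std::setfill( '0' ) << std::setw( 4 ) << nstruct_index << ".pdb";
    return out.str();
}
```

Under those assumptions, output_filename( "first_$_foo", "1ubq", 1 ) yields "first_1ubq_foo_0001.pdb" and output_filename( "second_$_bar", "1ubq", 1 ) yields "second_1ubq_bar_0001.pdb", so the two job blocks no longer collide.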