pydra.engine.core module
Basic processing graph elements.
- class pydra.engine.core.TaskBase(name: str, audit_flags: AuditFlag = AuditFlag.NONE, cache_dir=None, cache_locations=None, inputs: str | File | Dict | None = None, cont_dim=None, messenger_args=None, messengers=None, rerun=False)
Bases: object
A base structure for the nodes in the processing graph.
Tasks are a generic compute step from which both elementary tasks and Workflow instances inherit.
- DEFAULT_COPY_COLLATION = 0
- SUPPORTED_COPY_MODES = 15
- property cache_dir
Get the location of the cache directory.
- property cache_locations
Get the list of cache sources.
- property can_resume
Whether the task accepts checkpoint-restart.
- property checksum
Calculate the unique checksum of the task. It is used to create a specific directory name for each task that is run, and to create the node checksums needed for graph checksums (before the tasks have inputs, etc.).
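As a rough illustration of how a checksum over a task definition can yield a stable directory name, the sketch below hashes a task name together with its inputs. The helper `toy_checksum` and its naming scheme are assumptions made for this example only; they do not reproduce pydra's own hashing.

```python
import hashlib
import json


def toy_checksum(task_name, inputs):
    """Return a deterministic name derived from a task name and its inputs.

    Hypothetical helper: it illustrates the idea only, not pydra's hashing.
    """
    # Sort the keys so that identical inputs always serialize identically.
    payload = json.dumps({"name": task_name, "inputs": inputs}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{task_name}_{digest}"


a = toy_checksum("add_two", {"x": 1})
b = toy_checksum("add_two", {"x": 1})
c = toy_checksum("add_two", {"x": 2})
print(a == b)  # identical inputs map to the same name
print(a == c)  # different inputs map to a different name
```

Because the name depends only on the task name and its inputs, it can be recomputed later to locate a previously written directory.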
- checksum_states(state_index=None)
Calculate a checksum for a specific state or for all of the states of the task. Replaces lists in the input fields with the specific values for the states. Used to recreate the names of the task directories.
- Parameters:
state_index – TODO
- combine(combiner: List[str] | str, overwrite: bool = False)
Combine inputs parameterized by one or more previous tasks.
- Parameters:
combiner (list[str] or str) – the
overwrite (bool) – whether to overwrite an existing combiner on the node
**kwargs (dict[str, Any]) – values for the task that will be “combined” before they are provided to the node
- Returns:
self – a reference to the task
- Return type:
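The general idea behind a combiner can be sketched in plain Python: per-state results produced by a split are gathered back into lists along the combined field. The data layout and the helper `toy_combine` below are hypothetical and only mimic the concept; they are not pydra's implementation.

```python
from itertools import product

# Hypothetical split inputs: an outer product over two fields.
xs = [1, 2]
ys = [10, 20]

# One result per state, stored with the input values that produced it.
results = [{"x": x, "y": y, "out": x + y} for x, y in product(xs, ys)]


def toy_combine(results, fields, combiner):
    """Gather outputs into lists along the `combiner` field.

    Hypothetical helper: outputs are grouped by the values of the
    fields that are NOT being combined.
    """
    remaining = [name for name in fields if name != combiner]
    grouped = {}
    for res in results:
        key = tuple(res[name] for name in remaining)
        grouped.setdefault(key, []).append(res["out"])
    return grouped


combined = toy_combine(results, fields=["x", "y"], combiner="y")
print(combined)
```

Here four states collapse into two groups, one per value of `x`, each holding the outputs for every value of `y`.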
- property cont_dim
- property done
Check whether the task has been finalized and all outputs are stored.
- property errored
Check whether the task has raised an error.
- property generated_output_names
Get the names of the outputs generated by the task. If the spec doesn’t have a generated_output_names method, output_names is used instead. The result depends on the inputs provided to the task.
- get_input_el(ind)
Collect all inputs required to run the node (for specific state element).
- help(returnhelp=False)
Print class help.
- property lzout
- property output_dir
Get the filesystem path where outputs will be written.
- property output_names
Get the names of the outputs from the task’s output_spec (not everything has to be generated, see generated_output_names).
- pickle_task()
Pickle the task with its full inputs.
- result(state_index=None, return_inputs=False)
Retrieve the outcomes of this particular task.
- Parameters:
state_index (int) – index of the element for a task with a splitter and multiple states
return_inputs (bool or str) – if True or “val”, the result is returned together with the values of the input fields; if “ind”, the result is returned together with the indices of the input fields
- Returns:
result – the result of the task
- Return type:
- set_state(splitter, combiner=None)
Set a particular state on this task.
- Parameters:
splitter – TODO
combiner – TODO
- split(splitter: str | List[str] | Tuple[str, ...] | None = None, overwrite: bool = False, cont_dim: dict | None = None, **inputs)
Run this task parametrically over lists of split inputs.
- Parameters:
splitter (str or list[str] or tuple[str] or None) – the fields to split over. If splitting over multiple fields, lists of fields are interpreted as outer products and tuples as inner products. If None, the fields to split are taken from the keyword-argument names.
overwrite (bool, optional) – whether to overwrite an existing split on the node, by default False
cont_dim (dict, optional) – container dimensions for specific inputs, used in the splitter. If an input name is not in cont_dim, its values are assumed to have a container dimension of 1, so only the outermost dimension will be used for splitting.
**inputs – fields to split over; they will automatically be wrapped in a StateArray object and passed to the node inputs
- Returns:
self – a reference to the task
- Return type:
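The difference between the two splitter forms described above can be illustrated with the standard library alone. The field names `a` and `b` and their values are made up for this sketch, and the code only mirrors the pairing rules; it does not call pydra.

```python
from itertools import product

# Hypothetical input values for two fields named "a" and "b".
a = [1, 2]
b = [10, 20]

# A list of fields, e.g. ["a", "b"], means an outer product:
# every value of "a" is paired with every value of "b".
outer = list(product(a, b))

# A tuple of fields, e.g. ("a", "b"), means an inner product:
# values are paired element-wise, so both lists need the same length.
inner = list(zip(a, b))

print(outer)
print(inner)
```

With two values per field, the outer product yields four states, while the inner product yields two.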
- property uid
The unique id number for the task. It is used to create unique names for Slurm scripts etc. without the need to compute the checksum.
- property version
Get version of this task structure.
- class pydra.engine.core.Workflow(name, audit_flags: AuditFlag = AuditFlag.NONE, cache_dir=None, cache_locations=None, input_spec: List[str] | Dict[str, Type[Any]] | SpecInfo | None = None, cont_dim=None, messenger_args=None, messengers=None, output_spec: List[str] | Dict[str, type] | SpecInfo | BaseSpec | None = None, rerun=False, propagate_rerun=True, **kwargs)
Bases: TaskBase
A composite task with structure of computational graph.
- property checksum
Calculate the unique checksum of the task. It is used to create a specific directory name for each task that is run, and to create the node checksums needed for graph checksums (before the tasks have inputs, etc.).
- create_connections(task, detailed=False)
Add and connect a particular task to existing nodes in the workflow.
- create_dotfile(type='simple', export=None, name=None, output_dir=None)
Create a dotfile for the graph and optionally export it to other formats.
- property graph_sorted
Get a sorted graph representation of the workflow.
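Assuming that "sorted" here refers to a dependency order, in which every node appears after the nodes it depends on, the idea can be sketched with the standard-library `graphlib` module (Python 3.9 or later). The node names and the dependency mapping are hypothetical, and this is not how pydra builds its graph.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each node maps to the set of nodes it depends on.
dependencies = {
    "load": set(),
    "filter": {"load"},
    "stats": {"filter"},
    "report": {"filter", "stats"},
}

# static_order() yields the nodes so that dependencies always come first.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```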
- property lzin
- property nodes
Get the list of node names.
- set_output(connections: Tuple[str, LazyField] | List[Tuple[str, LazyField]])
Set the outputs of the workflow by linking them with lazy outputs of tasks.
- Parameters:
connections (tuple[str, LazyField] or list[tuple[str, LazyField]] or None) – single or list of tuples linking the name of the output to a lazy output of a task in the workflow.
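Since connections may be given either as a single tuple or as a list of tuples, a caller-side view of the two accepted shapes can be sketched as follows. The helper `normalize_connections` is hypothetical, and plain strings stand in for the lazy outputs so that the example runs without pydra.

```python
def normalize_connections(connections):
    """Return a list of (output_name, source) tuples.

    Hypothetical helper: accepts a single 2-tuple or a list of 2-tuples.
    """
    if isinstance(connections, tuple) and len(connections) == 2:
        return [connections]
    if isinstance(connections, list):
        return list(connections)
    raise TypeError("connections must be a tuple or a list of tuples")


# Strings are placeholders for lazy outputs of tasks in a workflow.
single = normalize_connections(("out", "add.lzout.out"))
many = normalize_connections(
    [("out1", "add.lzout.out"), ("out2", "mult.lzout.out")]
)
print(single)
print(many)
```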
- pydra.engine.core.is_lazy(obj)
Check whether an object has any field that is a LazyField.
- pydra.engine.core.is_task(obj)
Check whether an object looks like a task.