pydra.engine.core module

Basic processing graph elements.

class pydra.engine.core.TaskBase(name: str, audit_flags: AuditFlag = AuditFlag.NONE, cache_dir=None, cache_locations=None, inputs: str | File | Dict | None = None, cont_dim=None, messenger_args=None, messengers=None, rerun=False)

Bases: object

A base structure for the nodes in the processing graph.

Tasks are a generic compute step from which both elementary tasks and Workflow instances inherit.

DEFAULT_COPY_COLLATION = 0

SUPPORTED_COPY_MODES = 15

audit_flags: AuditFlag = 0

What to audit. Available flags: AuditFlag.

property cache_dir: Get the location of the cache directory.

property cache_locations: Get the list of cache sources.

property can_resume: Whether the task accepts checkpoint-restart.

property checksum: Calculates the unique checksum of the task. Used to create specific directory name for task that are run; and to create nodes checksums needed for graph checksums (before the tasks have inputs etc.)

checksum_states(state_index=None)

Calculate a checksum for a specific state or for all of the states of the task. Replaces lists in the input fields with the specific values for each state. Used to recreate the names of the task directories.

Parameters:: state_index – TODO

combine(combiner: List[str] | str, overwrite: bool = False)

Combine inputs parameterized by one or more previous tasks.

Parameters:

combiner (list[str] or str) – the
overwrite (bool) – whether to overwrite an existing combiner on the node
**kwargs (dict[str, Any]) – values for the task that will be “combined” before they are provided to the node

Returns:

self – a reference to the task

Return type:

TaskBase
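The idea of combining can be sketched without pydra. The example below assumes that a combiner collapses the states produced by a split over the named field, leaving one group of outputs per value of the remaining fields. `sketch_combine` and the state dictionaries are hypothetical and are not part of the pydra API.

```python
from collections import defaultdict
from itertools import product

a_values = [1, 2]
b_values = [10, 20]

# One record per state of a hypothetical task that adds a and b
# (states formed as the outer product of the two fields).
states = [{"a": a, "b": b, "out": a + b} for a, b in product(a_values, b_values)]


def sketch_combine(states, combiner, fields):
    """Group per-state outputs over `combiner`, keyed by the remaining split fields."""
    remaining = [f for f in fields if f != combiner]
    grouped = defaultdict(list)
    for state in states:
        key = tuple(state[f] for f in remaining)
        grouped[key].append(state["out"])
    return dict(grouped)


combined = sketch_combine(states, combiner="b", fields=["a", "b"])
```

Here combining over `b` leaves two groups, one for each value of `a`, each holding the outputs for all values of `b`.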

property cont_dim

property done: Check whether the tasks has been finalized and all outputs are stored.

property errored: Check if the task has raised an error

property generated_output_names: Get the names of the outputs generated by the task. If the spec doesn’t have generated_output_names method, it uses output_names. The results depends on the input provided to the task

get_input_el(ind): Collect all inputs required to run the node (for a specific state element).

help(returnhelp=False): Print class help.

property lzout

property output_dir: Get the filesystem path where outputs will be written.

property output_names: Get the names of the outputs from the task’s output_spec (not everything has to be generated, see generated_output_names).

pickle_task(): Pickling the tasks with full inputs

result(state_index=None, return_inputs=False)

Retrieve the outcomes of this particular task.

Parameters:

state_index (int) – index of the element for a task with a splitter and multiple states
return_inputs (bool or str) – if True or “val”, the result is returned together with the values of the input fields; if “ind”, the result is returned together with the indices of the input fields

Returns:

result – the result of the task

Return type:

Result
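The difference between the “val” and “ind” modes of return_inputs can be sketched as follows. This is a stand-in, not pydra code: `sketch_result`, the `(inputs, output)` tuple shape, and the sample data are assumptions chosen only to show values versus indices.

```python
# Hypothetical task split over x = [3, 5]; each state doubles its input.
x_values = [3, 5]
outputs = [value * 2 for value in x_values]


def sketch_result(state_index=None, return_inputs=False):
    """Return one output (or all of them), optionally paired with input values or indices."""

    def one(i):
        out = outputs[i]
        if return_inputs is True or return_inputs == "val":
            return ({"x": x_values[i]}, out)  # input values
        if return_inputs == "ind":
            return ({"x": i}, out)  # input indices
        return out

    if state_index is None:
        return [one(i) for i in range(len(outputs))]
    return one(state_index)


all_results = sketch_result()
second = sketch_result(state_index=1)
with_values = sketch_result(state_index=0, return_inputs="val")
with_indices = sketch_result(state_index=1, return_inputs="ind")
```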

set_state(splitter, combiner=None)

Set a particular state on this task.

Parameters:

splitter – TODO
combiner – TODO

split(splitter: str | List[str] | Tuple[str, ...] | None = None, overwrite: bool = False, cont_dim: dict | None = None, **inputs)

Run this task parametrically over lists of split inputs.

Parameters:

splitter (str or list[str] or tuple[str] or None) – the fields to split over. If splitting over multiple fields, lists of fields are interpreted as outer products and tuples as inner products. If None, the fields to split are taken from the keyword-argument names.
overwrite (bool, optional) – whether to overwrite an existing split on the node, by default False
cont_dim (dict, optional) – container dimensions for specific inputs, used in the splitter. If an input name is not in cont_dim, the input value is assumed to have a container dimension of 1, so only the outermost dimension is used for splitting.
**split_inputs – fields to split over; these are automatically wrapped in a StateArray object and passed to the node inputs

Returns:

self – a reference to the task

Return type:

TaskBase
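The splitter semantics described above can be sketched in plain Python. The code is an illustration under stated assumptions, not pydra's implementation: outer products are modelled with `itertools.product`, inner products with `zip`, and the hypothetical helper `flatten` mimics how a container dimension greater than 1 might expose nested elements for splitting.

```python
from itertools import product

x = [1, 2]
y = [10, 20]

# List splitter ["x", "y"]: outer product, every x is paired with every y.
outer_states = [{"x": xi, "y": yi} for xi, yi in product(x, y)]

# Tuple splitter ("x", "y"): inner product, values are paired element-wise.
inner_states = [{"x": xi, "y": yi} for xi, yi in zip(x, y)]


def flatten(value, dim):
    """Hypothetical helper: flatten the `dim` outermost levels of nested lists."""
    items = [value]
    for _ in range(dim):
        items = [element for item in items for element in item]
    return items


nested = [[1, 2], [3, 4]]
# Container dimension 1: only the outermost list is split (2 states, each a list).
states_dim1 = flatten(nested, 1)
# Container dimension 2: both levels are split (4 scalar states).
states_dim2 = flatten(nested, 2)
```

With two values per field, the outer product yields four states and the inner product two.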

property uid: the unique id number for the task It will be used to create unique names for slurm scripts etc. without a need to run checksum

property version: Get version of this task structure.

class pydra.engine.core.Workflow(name, audit_flags: AuditFlag = AuditFlag.NONE, cache_dir=None, cache_locations=None, input_spec: List[str] | Dict[str, Type[Any]] | SpecInfo | None = None, cont_dim=None, messenger_args=None, messengers=None, output_spec: List[str] | Dict[str, type] | SpecInfo | BaseSpec | None = None, rerun=False, propagate_rerun=True, **kwargs)

Bases: TaskBase

A composite task with the structure of a computational graph.

add(task)

Add a task to the workflow.

Parameters:: task (TaskBase) – The task to be added.

property checksum: Calculates the unique checksum of the task. Used to create specific directory name for task that are run; and to create nodes checksums needed for graph checksums (before the tasks have inputs etc.)

create_connections(task, detailed=False)

Add and connect a particular task to existing nodes in the workflow.

Parameters:

task (TaskBase) – The task to be added.
detailed (bool) – If True, add_edges_description is run for self.graph to add detailed descriptions of the connections (input/output field names)

create_dotfile(type='simple', export=None, name=None, output_dir=None): Create a dotfile for the graph and optionally export it to other formats.

property graph_sorted: Get a sorted graph representation of the workflow.
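Assuming that “sorted” here means dependency order (each task appears after the tasks it depends on), the idea can be sketched with the standard library's `graphlib`. The task names and the dependency mapping are hypothetical, and this is not how pydra builds its graph internally.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key maps to the set of tasks it depends on.
dependencies = {
    "load": set(),
    "filter": {"load"},
    "stats": {"filter"},
    "report": {"filter", "stats"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
```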

property lzin

property nodes: Get the list of node names.

set_output(connections: Tuple[str, LazyField] | List[Tuple[str, LazyField]])

Set the outputs of the workflow by linking them with lazy outputs of tasks.

Parameters:: connections (tuple[str, LazyField] or list[tuple[str, LazyField]] or None) – a single tuple or a list of tuples, each linking the name of an output to a lazy output of a task in the workflow.
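The linking of output names to lazy outputs can be sketched with a small stand-in. `SketchLazyField`, `sketch_set_output`, and the sample results are hypothetical and only illustrate the idea of storing a reference that is resolved once the tasks have run; they are not the pydra classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchLazyField:
    """Hypothetical stand-in for a lazy output: a reference to a value, not the value."""

    task_name: str
    field: str

    def resolve(self, results):
        # Look the value up only once the task results exist.
        return results[self.task_name][self.field]


def sketch_set_output(connections):
    """Accept a single (name, lazy field) tuple or a list of such tuples."""
    if isinstance(connections, tuple):
        connections = [connections]
    return dict(connections)


wf_outputs = sketch_set_output(
    [
        ("total", SketchLazyField("add", "out")),
        ("scaled", SketchLazyField("mult", "out")),
    ]
)

# Results that would become available after the tasks have run.
task_results = {"add": {"out": 5}, "mult": {"out": 20}}
resolved = {name: lazy.resolve(task_results) for name, lazy in wf_outputs.items()}
```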

pydra.engine.core.is_lazy(obj): Check whether an object has any field that is a lazy field.

pydra.engine.core.is_task(obj): Check whether an object looks like a task.

pydra.engine.core.is_workflow(obj): Check whether an object is a Workflow instance.