Warning: This document is for the development version of Pydra: A simple dataflow engine with scalable semantics. The main version is master.

pydra.engine.core module

Basic processing graph elements.

class pydra.engine.core.TaskBase(name: str, audit_flags: AuditFlag = AuditFlag.NONE, cache_dir=None, cache_locations=None, inputs: str | File | Dict | None = None, cont_dim=None, messenger_args=None, messengers=None, rerun=False)

Bases: object

A base structure for the nodes in the processing graph.

Tasks are a generic compute step from which both elemntary tasks and Workflow instances inherit.

audit_flags: AuditFlag = 0

AuditFlag.

Type:

What to audit – available flags

property cache_dir

Get the location of the cache directory.

property cache_locations

Get the list of cache sources.

property can_resume

Whether the task accepts checkpoint-restart.

property checksum

Calculates the unique checksum of the task. Used to create specific directory name for task that are run; and to create nodes checksums needed for graph checksums (before the tasks have inputs etc.)

checksum_states(state_index=None)

Calculate a checksum for the specific state or all of the states of the task. Replaces lists in the inputs fields with a specific values for states. Used to recreate names of the task directories,

Parameters:

state_index – TODO

combine(combiner, overwrite=False)

Combine inputs parameterized by one or more previous tasks.

Parameters:
  • combiner – TODO

  • overwrite (bool) – TODO

property cont_dim
property done

Check whether the tasks has been finalized and all outputs are stored.

property errored

Check if the task has raised an error

property generated_output_names

Get the names of the outputs generated by the task. If the spec doesn’t have generated_output_names method, it uses output_names. The results depends on the input provided to the task

get_input_el(ind)

Collect all inputs required to run the node (for specific state element).

help(returnhelp=False)

Print class help.

property output_dir

Get the filesystem path where outputs will be written.

property output_names

Get the names of the outputs from the task’s output_spec (not everything has to be generated, see generated_output_names).

pickle_task()

Pickling the tasks with full inputs

result(state_index=None, return_inputs=False)

Retrieve the outcomes of this particular task.

Parameters:
  • state_index (:obj: int) – index of the element for task with splitter and multiple states

  • return_inputs (:obj: bool, str) – if True or “val” result is returned together with values of the input fields, if “ind” result is returned together with indices of the input fields

Return type:

result

set_state(splitter, combiner=None)

Set a particular state on this task.

Parameters:
  • splitter – TODO

  • combiner – TODO

split(splitter, overwrite=False, cont_dim=None, **kwargs)

Run this task parametrically over lists of split inputs.

Parameters:
  • splitter – TODO

  • overwrite (bool) – TODO

  • cont_dim (dict) – Container dimensions for specific inputs, used in the splitter. If input name is not in cont_dim, it is assumed that the input values has a container dimension of 1, so only the most outer dim will be used for splitting.

property uid

the unique id number for the task It will be used to create unique names for slurm scripts etc. without a need to run checksum

property version

Get version of this task structure.

class pydra.engine.core.Workflow(name, audit_flags: AuditFlag = AuditFlag.NONE, cache_dir=None, cache_locations=None, input_spec: List[str] | SpecInfo | None = None, cont_dim=None, messenger_args=None, messengers=None, output_spec: SpecInfo | BaseSpec | None = None, rerun=False, propagate_rerun=True, **kwargs)

Bases: TaskBase

A composite task with structure of computational graph.

add(task)

Add a task to the workflow.

Parameters:

task (TaskBase) – The task to be added.

property checksum

Calculates the unique checksum of the task. Used to create specific directory name for task that are run; and to create nodes checksums needed for graph checksums (before the tasks have inputs etc.)

create_connections(task, detailed=False)

Add and connect a particular task to existing nodes in the workflow.

Parameters:
  • task (TaskBase) – The task to be added.

  • detailed (bool) – If True, add_edges_description is run for self.graph to add a detailed descriptions of the connections (input/output fields names)

create_dotfile(type='simple', export=None, name=None, output_dir=None)

creating a graph - dotfile and optionally exporting to other formats

property graph_sorted

Get a sorted graph representation of the workflow.

property nodes

Get the list of node names.

set_output(connections)

Write outputs.

Parameters:

connections – TODO

pydra.engine.core.is_lazy(obj)

Check whether an object has any field that is a Lazy Field

pydra.engine.core.is_task(obj)

Check whether an object looks like a task.

pydra.engine.core.is_workflow(obj)

Check whether an object is a Workflow instance.