Warning: This document is for the development version of Pydra: A simple dataflow engine with scalable semantics. The main version is master.

pydra.engine.specs module

Task I/O specifications.

class pydra.engine.specs.BaseSpec

Bases: object

The base dataclass specs for all inputs and outputs.

check_fields_input_spec()

Check fields from the input spec based on their metadata.

For example, check that xor and requires constraints are fulfilled and that a value is provided for every mandatory field.
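
The kind of metadata-driven check described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Pydra's implementation: `ExampleInputs`, `check_fields`, and the exact meaning given to the `mandatory`, `xor`, and `requires` keys are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any, List, Optional


@dataclass
class ExampleInputs:
    # Hypothetical fields; the metadata keys mirror the terms used in the prose.
    in_file: Optional[str] = field(default=None, metadata={"mandatory": True})
    mask: Optional[str] = field(default=None, metadata={"xor": ("threshold",)})
    threshold: Optional[float] = field(default=None, metadata={"xor": ("mask",)})
    out_name: Optional[str] = field(default=None, metadata={"requires": ("in_file",)})


def check_fields(spec: Any) -> List[str]:
    """Return a list of problems; an empty list means the spec passed."""
    problems = []
    for fld in fields(spec):
        value = getattr(spec, fld.name)
        meta = fld.metadata
        if meta.get("mandatory") and value is None:
            problems.append(f"{fld.name} is mandatory but not set")
        if value is None:
            continue  # xor/requires only matter for fields that were provided
        for other in meta.get("xor", ()):
            if getattr(spec, other) is not None:
                problems.append(f"{fld.name} is mutually exclusive with {other}")
        for other in meta.get("requires", ()):
            if getattr(spec, other) is None:
                problems.append(f"{fld.name} requires {other}")
    return problems
```

Collecting every problem in a list, rather than raising on the first one, is a design choice of this sketch so that all violations can be reported together.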

check_metadata()

Check contained metadata.

collect_additional_outputs(inputs, output_dir, outputs)

Get additional outputs.

copyfile_input(output_dir)

Copy the file pointed to by a File input.

property hash

hash_changes()

Detects any changes in the hashed values between the current inputs and the previously calculated values.
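
As a rough illustration of comparing current inputs against previously calculated hashes, the sketch below uses `hashlib` and `json`. The helper names `hash_value` and `changed_fields` are hypothetical, and the hashing scheme is an assumption, not the one Pydra uses.

```python
import hashlib
import json
from typing import Any, Dict, List


def hash_value(value: Any) -> str:
    """Hash a JSON-serialisable value deterministically."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def changed_fields(inputs: Dict[str, Any], previous: Dict[str, str]) -> List[str]:
    """Return names of fields whose current hash differs from the stored one."""
    return [
        name
        for name, value in inputs.items()
        if previous.get(name) != hash_value(value)
    ]


inputs = {"x": [1, 2, 3], "y": "a"}
stored = {name: hash_value(value) for name, value in inputs.items()}
inputs["x"].append(4)  # mutate an input in place after its hash was stored
print(changed_fields(inputs, stored))
```

A check of this kind is useful for catching inputs that were mutated in place after their hash was first computed.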

retrieve_values(wf, state_index: int | None = None)

Get values contained by this spec.

template_update()

Update template.

class pydra.engine.specs.FunctionSpec

Bases: BaseSpec

Specification for a task that wraps a Python function.

check_metadata()

Check the metadata for fields in input_spec and fields.

Also sets the default values when available and needed.

class pydra.engine.specs.LazyField(*, name: str, field: str, type: Type[T] | Any, splits=_Nothing.NOTHING, cast_from: Type[Any] | None = None)

Bases: Generic[T]

Lazy fields implement promises.
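
The promise idea can be sketched in a few lines: the object records where a value will come from and looks it up only when asked. `Promise`, `resolve`, and the `results` mapping are hypothetical names for illustration and do not reflect how LazyField is implemented.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Promise:
    """Hypothetical stand-in for a lazy field: it names a value, it does not hold one."""

    name: str   # name of the node expected to produce the value
    field: str  # name of the field on that node

    def resolve(self, results: Dict[str, Dict[str, Any]]) -> Any:
        # The lookup happens only here, once the upstream node has run.
        return results[self.name][self.field]


promise = Promise(name="add_two", field="out")
results = {"add_two": {"out": 5}}  # filled in after the node has executed
value = promise.resolve(results)
```

Because the promise holds only names, it can be created and passed between nodes before any computation has taken place.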

cast(new_type: Type[T] | Any) LazyField

“Casts” the lazy field to a new type.

Parameters:

new_type (type) – the type to cast the lazy-field to

Returns:

cast_field – a copy of the lazy field with the new type

Return type:

LazyField
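
Returning a copy with a new type, rather than modifying the field in place, might look like the following sketch. `TypedPromise` is a hypothetical class, and recording the original type in `cast_from` on the first cast is an assumption made here for illustration.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class TypedPromise:
    name: str
    field: str
    type: Any
    cast_from: Optional[Any] = None

    def cast(self, new_type: Any) -> "TypedPromise":
        # Keep the first recorded original type if the field was cast before.
        original = self.cast_from if self.cast_from is not None else self.type
        # replace() builds a new instance, so the receiver is left untouched.
        return replace(self, type=new_type, cast_from=original)


p = TypedPromise(name="node", field="out", type=int)
q = p.cast(str)
```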

cast_from: Type[Any] | None
field: str
name: str
classmethod sanitize_splitter(splitter: str | Tuple[str, ...], strip_previous: bool = True) Tuple[Tuple[str, ...], ...]

Converts the splitter spec into a consistent tuple[tuple[str, …], …] form used in LazyFields.
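
A minimal sketch of the normalisation, assuming only the two input shapes named in the signature (a str or a tuple of str). `normalize_splitter` is a hypothetical helper; it leaves out `strip_previous`, whose behaviour this reference does not describe.

```python
from typing import Tuple, Union


def normalize_splitter(
    splitter: Union[str, Tuple[str, ...]]
) -> Tuple[Tuple[str, ...], ...]:
    """Wrap a str or a tuple of str into the tuple[tuple[str, ...], ...] shape."""
    if isinstance(splitter, str):
        splitter = (splitter,)  # a single field name becomes a 1-tuple
    return (tuple(splitter),)   # one splitter, wrapped in an outer tuple


print(normalize_splitter("a"))
print(normalize_splitter(("a", "b")))
```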

split(splitter: str | Tuple[str, ...]) LazyField

“Splits” the lazy field over an array of nodes by replacing the sequence type of the lazy field with StateArray, to signify that it will be “split” across multiple nodes.

Parameters:

splitter (str or ty.Tuple[str, …] or ty.List[str]) – the splitter to append to the list of splitters

splits: FrozenSet[Tuple[Tuple[str, ...], ...]]
type: Type[T] | Any

class pydra.engine.specs.LazyIn(task: core.TaskBase)

Bases: LazyInterface

class pydra.engine.specs.LazyInField(*, name: str, field: str, type: Type[T] | Any, splits=_Nothing.NOTHING, cast_from: Type[Any] | None = None)

Bases: LazyField[T]

attr_type = 'input'
get_value(wf: Workflow, state_index: int | None = None) Any

Return the value of a lazy field.

Parameters:
  • wf (Workflow) – the workflow the lazy field references

  • state_index (int, optional) – the state index of the field to access

Returns:

value – the resolved value of the lazy-field

Return type:

Any

class pydra.engine.specs.LazyInterface(task: core.TaskBase)

Bases: object

class pydra.engine.specs.LazyOut(task: core.TaskBase)

Bases: LazyInterface

class pydra.engine.specs.LazyOutField(*, name: str, field: str, type: Type[T] | Any, splits=_Nothing.NOTHING, cast_from: Type[Any] | None = None)

Bases: LazyField[T]

attr_type = 'output'
get_value(wf: Workflow, state_index: int | None = None) Any

Return the value of a lazy field.

Parameters:
  • wf (Workflow) – the workflow the lazy field references

  • state_index (int, optional) – the state index of the field to access

Returns:

value – the resolved value of the lazy-field

Return type:

Any

class pydra.engine.specs.MultiInputObj(iterable=(), /)

Bases: list, Generic[T]

class pydra.engine.specs.MultiOutputType

Bases: object

class pydra.engine.specs.Result(*, output: Any | None = None, runtime: Runtime | None = None, errored: bool = False)

Bases: object

Metadata regarding the outputs of processing.

errored: bool
get_output_field(field_name)

Get the value of an output field. Used by get_values in Workflow.

Parameters:

field_name (str) – Name of field in LazyField object

output: Any | None
runtime: Runtime | None

class pydra.engine.specs.Runtime(*, rss_peak_gb: float | None = None, vms_peak_gb: float | None = None, cpu_peak_percent: float | None = None)

Bases: object

Represent run time metadata.

cpu_peak_percent: float | None

Peak CPU consumption, in percent.

rss_peak_gb: float | None

Peak consumption of physical RAM, in GB.

vms_peak_gb: float | None

Peak consumption of virtual memory, in GB.

class pydra.engine.specs.RuntimeSpec(*, outdir: str | None = None, container: str | None = 'shell', network: bool = False)

Bases: object

Specification for a task.

From CWL:

InlineJavascriptRequirement
SchemaDefRequirement
DockerRequirement
SoftwareRequirement
InitialWorkDirRequirement
EnvVarRequirement
ShellCommandRequirement
ResourceRequirement

InlineScriptRequirement
container: str | None
network: bool
outdir: str | None

class pydra.engine.specs.ShellOutSpec(*, return_code: int, stdout: str, stderr: str)

Bases: object

Output specification of a generic shell process.

collect_additional_outputs(inputs, output_dir, outputs)
generated_output_names(inputs, output_dir)

Returns a list of all outputs that will be generated by the task, taking into account the task inputs and the requires list of the output fields. TODO: should be in all Output specs?

return_code: int

The process’ exit code.

stderr: str

The process’ standard error.

stdout: str

The process’ standard output.
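
The three fields correspond to what any shell process reports on exit. The sketch below collects them with the standard `subprocess` module; `ShellResult` and `run` are hypothetical names, and this is not how Pydra populates ShellOutSpec.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class ShellResult:
    return_code: int  # the process' exit code
    stdout: str       # text written to standard output
    stderr: str       # text written to standard error


def run(cmd: List[str]) -> ShellResult:
    # capture_output collects both streams; text=True decodes them to str.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return ShellResult(proc.returncode, proc.stdout, proc.stderr)


# Use the current interpreter as a portable stand-in for a shell command.
script = "import sys; print('out'); print('err', file=sys.stderr); sys.exit(3)"
result = run([sys.executable, "-c", script])
```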

class pydra.engine.specs.ShellSpec(*, executable: str | List[str], args: str | List[str] | None = None)

Bases: BaseSpec

Specification for a process invoked from a shell.

args: str | List[str] | None
check_metadata()

Check the metadata for fields in input_spec and fields.

Also sets the default values when available and needed.

executable: str | List[str]
retrieve_values(wf, state_index=None)

Parse output results.

class pydra.engine.specs.SpecInfo(*, name: str, fields: List[Tuple] = _Nothing.NOTHING, bases: Sequence[Type[BaseSpec]] = _Nothing.NOTHING)

Bases: object

Base data structure for metadata of specifications.

bases: Sequence[Type[BaseSpec]]

Keeps track of specification inheritance. Should be a tuple containing at least one BaseSpec.

fields: List[Tuple]

List of names of fields (can be inputs or outputs).

name: str

A name for the specification.
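
To show how a name, a list of field tuples, and a tuple of base classes can together describe a spec class, here is a sketch using `dataclasses.make_dataclass`. All names (`BaseSpecLike`, `SpecInfoLike`, `build_spec`) are hypothetical, and using the standard dataclasses module to build the class is an assumption made only for this illustration.

```python
from dataclasses import dataclass, field, make_dataclass
from typing import List, Sequence, Tuple


@dataclass
class BaseSpecLike:
    """Hypothetical base class playing the role of BaseSpec."""


@dataclass
class SpecInfoLike:
    name: str
    fields: List[Tuple] = field(default_factory=list)
    bases: Sequence[type] = (BaseSpecLike,)


def build_spec(info: SpecInfoLike) -> type:
    # Each entry in info.fields is (name, type) or (name, type, Field).
    return make_dataclass(info.name, info.fields, bases=tuple(info.bases))


info = SpecInfoLike(
    name="Inputs",
    fields=[("x", int), ("y", float, field(default=1.0))],
)
Inputs = build_spec(info)
obj = Inputs(x=2)
```

Keeping the description separate from the generated class means the same structure can describe either inputs or outputs, and the class is only built when it is needed.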

class pydra.engine.specs.StateArray(iterable=(), /)

Bases: List[T]

An array of values from, or to be split over, multiple nodes of the same task (see TaskBase.split()). Used in type-checking to differentiate between list types and values for multiple nodes.
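
The role of a marker type like this can be sketched as follows: a list subclass carries no extra behaviour, but lets code tell "one value per node" apart from "a single list-valued input". `SplitValues` and `describe` are hypothetical names used only for this example.

```python
from typing import Any, List, TypeVar

T = TypeVar("T")


class SplitValues(List[T]):
    """Hypothetical marker type: a list whose items belong to separate nodes."""


def describe(value: Any) -> str:
    # Test the marker subclass first, because it is also a list.
    if isinstance(value, SplitValues):
        return "one item per node"
    if isinstance(value, list):
        return "a single list-valued input"
    return "a scalar input"


print(describe(SplitValues([1, 2, 3])))
print(describe([1, 2, 3]))
```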

class pydra.engine.specs.TaskHook(*, pre_run_task: Callable = <function donothing>, post_run_task: Callable = <function donothing>, pre_run: Callable = <function donothing>, post_run: Callable = <function donothing>)

Bases: object

Callable task hooks.

post_run: Callable
post_run_task: Callable
pre_run: Callable
pre_run_task: Callable
reset()
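
The hook pattern shown in the signature (callables that default to a no-op) can be sketched as below. `Hooks` and `do_nothing` are hypothetical, and the assumption that resetting restores every hook to the no-op default is not stated in this reference.

```python
from dataclasses import dataclass, fields
from typing import Callable


def do_nothing(*args, **kwargs) -> None:
    """No-op default, so callers never need to test whether a hook is set."""


@dataclass
class Hooks:
    pre_run_task: Callable = do_nothing
    post_run_task: Callable = do_nothing
    pre_run: Callable = do_nothing
    post_run: Callable = do_nothing

    def reset(self) -> None:
        # Assumption: reset restores every hook to the no-op default.
        for fld in fields(self):
            setattr(self, fld.name, do_nothing)


calls = []
hooks = Hooks(pre_run=lambda task: calls.append(("pre", task)))
hooks.pre_run("t1")   # recorded by the custom hook
hooks.post_run("t1")  # no-op default
hooks.reset()
hooks.pre_run("t2")   # no-op again after reset
```
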
pydra.engine.specs.attr_fields(spec, exclude_names=())
pydra.engine.specs.donothing(*args, **kwargs)
pydra.engine.specs.path_to_string(value)

Convert paths to strings.
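
A conversion of this kind might look like the sketch below. `to_string` is a hypothetical helper; recursing into lists and passing other values through unchanged are assumptions, since this reference does not say which input shapes path_to_string handles.

```python
from pathlib import Path
from typing import Any


def to_string(value: Any) -> Any:
    """Convert Path objects to str, recursing into lists; leave other values alone."""
    if isinstance(value, Path):
        return str(value)
    if isinstance(value, list):
        return [to_string(item) for item in value]
    return value


print(to_string(Path("data") / "sub-01.nii"))
print(to_string([Path("a.txt"), "b.txt", 3]))
```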