Python-tasks

Python tasks are Python functions that are parameterised in a separate step before they are executed or added to a workflow.

Define decorator

The simplest way to define a Python task is to decorate a function with pydra.compose.python.define.

[1]:
from pydra.compose import python


# Note that we use PascalCase because the object returned by the decorator is actually a class
@python.define
def MyFirstTask(a, b):
    """Sample function for testing"""
    return a + b

The resulting task-definition class can then be parameterised (instantiated) and executed.

[2]:
# Instantiate the task, setting all parameters
my_first_task = MyFirstTask(a=1, b=2.0)

# Execute the task
outputs = my_first_task()

print(outputs.out)
3.0
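To illustrate why the decorated name behaves like a class, here is a minimal plain-Python sketch of the two-step pattern (parameterise first, execute later). It does not use Pydra and makes no claim about how Pydra is implemented; define_sketch, Task and Outputs are hypothetical names introduced only for this illustration.

```python
import inspect
from collections import namedtuple


def define_sketch(func):
    """Hypothetical stand-in for a task decorator: returns a class, not a function."""
    signature = inspect.signature(func)
    Outputs = namedtuple("Outputs", ["out"])  # a single output field named "out"

    class Task:
        def __init__(self, **kwargs):
            # Step 1: parameterise. bind() raises TypeError if an argument is
            # missing or unexpected, so mistakes surface before execution.
            self.params = dict(signature.bind(**kwargs).arguments)

        def __call__(self):
            # Step 2: execute the wrapped function with the stored parameters.
            return Outputs(out=func(**self.params))

    Task.__name__ = func.__name__
    return Task


@define_sketch
def AddSketch(a, b):
    return a + b


task = AddSketch(a=1, b=2.0)  # parameterised, but nothing has run yet
print(task().out)  # prints 3.0
```

Because the decorator returns a class, the same definition can be instantiated many times with different parameters, which is the reason for the PascalCase naming.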

By default, the output field of a function with a single output is named out. To give it a different name, or when there are multiple output fields, pass the outputs argument to python.define.

[3]:
@python.define(outputs=["c", "d"])
def NamedOutputTask(a, b):
    """Sample function for testing"""
    return a + b, a - b


named_output_task = NamedOutputTask(a=2, b=1)

outputs = named_output_task()

print(outputs)
NamedOutputTaskOutputs(c=3, d=1)

The input and output field attributes are extracted automatically from the function, and can be augmented with explicit attributes via the inputs and outputs arguments.

[4]:
@python.define(
    inputs={"a": python.arg(allowed_values=[1, 2, 3]), "b": python.arg(default=10.0)},
    outputs={
        "c": python.out(type=float, help="the sum of the inputs"),
        "d": python.out(type=float, help="the difference of the inputs"),
    },
)
def AugmentedTask(a, b):
    """Sample function for testing"""
    return a + b, a - b

Type annotations

If provided, type annotations are included in the task, and are checked at the time of parameterisation.

[5]:
from pydra.compose import python


@python.define
def MyTypedTask(a: int, b: float) -> float:
    """Sample function for testing"""
    return a + b


try:
    # 1.5 is not an integer so this should raise a TypeError
    my_typed_task = MyTypedTask(a=1.5, b=2.0)
except TypeError as e:
    print(f"Type error caught: {e}")
else:
    assert False, "Expected a TypeError"

# While 2 is an integer, it can be implicitly coerced to a float
my_typed_task = MyTypedTask(a=1, b=2)
Type error caught: Incorrect type for field in 'a' field of MyTypedTask interface : 1.5 is not of type <class 'int'> (and cannot be coerced to it)
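The idea of checking arguments against annotations at parameterisation time, with int accepted where float is expected, can be sketched in plain Python. This is only an illustration under simplifying assumptions (plain classes as annotations, a single int-to-float coercion rule), not Pydra's actual type checker; check_args_sketch is a hypothetical helper.

```python
from typing import get_type_hints


def check_args_sketch(func, **kwargs):
    """Hypothetical helper: validate keyword arguments against func's annotations.

    Only handles annotations that are plain classes (int, float, str, ...).
    """
    hints = get_type_hints(func)
    checked = {}
    for name, value in kwargs.items():
        expected = hints.get(name)
        if expected is None or isinstance(value, expected):
            # Unannotated, or already the right type: accept as-is
            checked[name] = value
        elif expected is float and isinstance(value, int):
            # Lossless widening: an int can stand in for a float
            checked[name] = float(value)
        else:
            raise TypeError(f"{name}={value!r} is not of type {expected}")
    return checked


def add(a: int, b: float) -> float:
    return a + b


print(check_args_sketch(add, a=1, b=2))  # b is coerced to 2.0

try:
    check_args_sketch(add, a=1.5, b=2.0)  # 1.5 cannot be treated as an int
except TypeError as e:
    print(f"Type error caught: {e}")
```

Validating when the arguments are supplied, rather than when the function runs, means a mistyped parameter is reported before any work is scheduled.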

Docstring parsing

Instead of explicitly providing help strings and output names in the inputs and outputs arguments, you can describe the inputs and/or outputs in the function's docstring, in reST, Google or NumpyDoc style. The descriptions are then extracted and included in the corresponding input and output fields.

[6]:
from pydra.utils import print_help


@python.define(outputs=["c", "d"])
def DocStrExample(a: int, b: float) -> tuple[float, float]:
    """Example python task with help strings pulled from doc-string

    Args:
        a: First input
            to be inputted
        b: Second input

    Returns:
        c: Sum of a and b
        d: Product of a and b
    """
    return a + b, a * b


print_help(DocStrExample)
------------------------------------
Help for Python task 'DocStrExample'
------------------------------------

Inputs:
- a: int
    First input to be inputted
- b: float
    Second input
- function: Callable[]; default = DocStrExample()

Outputs:
- c: float
    Sum of a and b
- d: float
    Product of a and b
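To give a feel for what extracting help strings from a docstring involves, the following is a rough sketch of a parser for the Google style only. It is an assumption-laden illustration, not the parser Pydra uses; parse_google_docstring_sketch is a hypothetical helper, and it will misread continuation lines that themselves look like "name: text".

```python
import inspect
import re


def parse_google_docstring_sketch(func):
    """Hypothetical helper: collect 'name: help' entries under Args: and Returns:."""
    sections = {"Args": {}, "Returns": {}}
    current = None  # dict of the section currently being read
    last = None  # most recent entry name, used for continuation lines
    for line in inspect.getdoc(func).splitlines():
        stripped = line.strip()
        if stripped.endswith(":") and stripped[:-1] in sections:
            # Section header such as "Args:" or "Returns:"
            current, last = sections[stripped[:-1]], None
        elif current is None or not stripped:
            # Summary text before any section, or a blank line
            continue
        elif match := re.match(r"^(\w+):\s*(.*)$", stripped):
            # New entry of the form "name: help text"
            last = match.group(1)
            current[last] = match.group(2)
        elif last is not None:
            # Continuation line: append to the previous entry
            current[last] += " " + stripped
    return sections


def doc_str_example(a, b):
    """Example function with a Google-style docstring

    Args:
        a: First input
            to be inputted
        b: Second input

    Returns:
        c: Sum of a and b
        d: Product of a and b
    """
    return a + b, a * b


parsed = parse_google_docstring_sketch(doc_str_example)
print(parsed["Args"])
print(parsed["Returns"])
```

Note how the continuation line under a is joined onto the first line of its help text, matching the "First input to be inputted" entry in the help output above.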

Wrapping external functions

Like all decorators, python.define is just a function, so it can also be used to convert a function that is defined separately into a Python task.

[7]:
import numpy as np

NumpyCorrelate = python.define(np.correlate)

numpy_correlate = NumpyCorrelate(a=[1, 2, 3], v=[0, 1, 0.5])

outputs = numpy_correlate()

print(outputs.out)
[3.5]
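The point that a decorator is just a function applies to any Python decorator, and a small Pydra-free sketch makes the equivalence explicit. The shout decorator and the two functions it wraps are hypothetical examples; the only claim is that the @ syntax and a direct call produce the same kind of result.

```python
import functools


def shout(func):
    """Hypothetical decorator: upper-cases the wrapped function's return value."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()

    return wrapper


# Form 1: decorator syntax applied where the function is defined
@shout
def greet(name):
    return f"hello {name}"


# Form 2: a function defined elsewhere, wrapped afterwards by a direct call
def farewell(name):
    return f"goodbye {name}"


farewell_loud = shout(farewell)

print(greet("ada"))  # prints HELLO ADA
print(farewell_loud("ada"))  # prints GOODBYE ADA
```

The second form is the one to reach for when the function lives in a third-party library and cannot be edited to add a decorator line.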

As with decorated functions, input and output fields can be explicitly augmented via the inputs and outputs arguments.

[8]:
import numpy as np

NumpyCorrelate = python.define(np.correlate, outputs=["correlation"])

numpy_correlate = NumpyCorrelate(a=[1, 2, 3], v=[0, 1, 0.5])

outputs = numpy_correlate()

print(outputs.correlation)
[3.5]