pipelines.tasks package#

class pipelines.tasks.ModelTask(config: Dict[str, Any] | None = None)#

Bases: Task

The base task class to use when iterating over many model instances.

class Meta#

Bases: object

model: str | None#

The model class to use when fetching objects from the database

get_iterator()#

Returns an iterator to run the task over. By default this returns the result of get_queryset.

get_queryset(*args, **kwargs)#

Returns a queryset containing all items for the model provided in the Meta class. If model is not defined on the Meta class, this method must be overridden; otherwise an ImproperlyConfigured error will be raised.
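The fallback between get_iterator and get_queryset can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `ImproperlyConfigured` is a local stand-in for Django's exception, `FakeManager` and `FakeModel` stand in for a Django manager and model, `SketchModelTask` and `RowTask` are hypothetical names, and `Meta.model` is set to a class here although the real attribute is documented as a string.

```python
class ImproperlyConfigured(Exception):
    """Local stand-in for django.core.exceptions.ImproperlyConfigured."""


class FakeManager:
    """Stand-in for a Django manager; all() returns a list, not a queryset."""

    def __init__(self, rows):
        self._rows = rows

    def all(self):
        return list(self._rows)


class FakeModel:
    """Stand-in for a Django model class."""

    objects = FakeManager(["row-1", "row-2"])


class SketchModelTask:
    """Hypothetical class mirroring the documented ModelTask behaviour."""

    class Meta:
        model = None  # the real attribute is documented as `str | None`

    def get_queryset(self):
        model = getattr(self.Meta, "model", None)
        if model is None:
            # No model configured and get_queryset not overridden.
            raise ImproperlyConfigured("Set Meta.model or override get_queryset()")
        return model.objects.all()

    def get_iterator(self):
        # By default, iterate over whatever get_queryset returns.
        return self.get_queryset()


class RowTask(SketchModelTask):
    class Meta:
        model = FakeModel  # assumption: a class is used to keep the sketch short


rows = list(RowTask().get_iterator())
```

A subclass that needs a filtered set of objects would override get_queryset and leave get_iterator untouched.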

static get_serializable_task_object(obj)#

Serializes a Django model object so that it can be stored on the task result object.

If obj is not None the object will be stored as a dictionary containing pk, app_label and model_name so that the object can be retrieved from the database.

Parameters:

obj – The object to serialize
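The dictionary shape described above might look like the sketch below. `serialize_model_object` is a hypothetical helper, not the library's code, and `fake_user` is a stand-in for a saved model instance. Returning None for a None input is an assumption, since the documentation only describes the non-None case.

```python
from types import SimpleNamespace


def serialize_model_object(obj):
    """Hypothetical helper mirroring the documented dictionary shape."""
    if obj is None:
        return None  # assumption: the None case is not documented
    return {
        "pk": obj.pk,
        "app_label": obj._meta.app_label,
        "model_name": obj._meta.model_name,
    }


# Stand-in for a saved Django model instance.
fake_user = SimpleNamespace(
    pk=7,
    _meta=SimpleNamespace(app_label="auth", model_name="user"),
)

serialized = serialize_model_object(fake_user)
```

With these three values, the original object could later be re-fetched in a Django project, for example through `django.apps.apps.get_model(app_label, model_name)` followed by a lookup on `pk`.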

class pipelines.tasks.Task(config: Dict[str, Any] | None = None)#

Bases: Registrable, ClassWithAppConfigMeta

The base task class that all custom tasks should extend.

ConfigType#

alias of TaskConfig

InputType: Type[BaseModel] | None = None#

The pydantic class used to validate task input data.

clean_config(config: Dict[str, Any])#

Validates the supplied config against the defined ConfigType

Parameters:

config – The config to check

clean_input_data(input_data: Dict[str, Any])#

Validates the supplied input data against the defined InputType

Parameters:

input_data – The input data to check

classmethod get_id()#

Generates the task's id based on the app label and class name.

classmethod get_iterator()#

Returns an iterator for multiple instances of the task to be started with. If no iteration is required, None should be returned.
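One way a runner could consume this return value is sketched below. How the library actually schedules task instances is not described here, so `run_over_iterator` is a hypothetical function written only to show the None versus iterator distinction. `SingleTask` and `PerRegionTask` are illustrative stand-ins.

```python
def run_over_iterator(task_cls, run_one):
    """Hypothetical runner: call run_one once, or once per iterated object."""
    iterator = task_cls.get_iterator()
    if iterator is None:
        # No iteration required: a single instance with no object.
        return [run_one(None)]
    return [run_one(obj) for obj in iterator]


class SingleTask:
    @classmethod
    def get_iterator(cls):
        return None  # no iteration required


class PerRegionTask:
    @classmethod
    def get_iterator(cls):
        return iter(["eu", "us"])  # one task instance per region


single = run_over_iterator(SingleTask, lambda obj: f"ran with {obj}")
fanned = run_over_iterator(PerRegionTask, lambda obj: f"ran with {obj}")
```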

static get_serializable_task_object(obj)#

Converts the object to a JSON-serializable version to be stored in the database.

Parameters:

obj – The object to store

pipeline_task: str#

The name of the attribute this task is assigned to on the pipeline. This is set via __init_subclass__ on Pipeline.

run(pipeline_id: str, run_id: str, cleaned_data: BaseModel | None)#

The task's business logic. Custom tasks must implement this method with their own logic.

Parameters:
  • pipeline_id – The id of the registered pipeline the task is part of

  • run_id – The id of the current pipeline run

  • cleaned_data – The cleaned input data (based on InputType)
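A custom task overriding run might look roughly like the sketch below. To keep it runnable without the library installed, `Task` here is a stand-in exposing only the documented run signature, and `GreetingInput` is a dataclass standing in for a pydantic model. In a real project the base class would come from pipelines.tasks and the input type would extend pydantic's BaseModel. `GreetTask` and the returned message are purely illustrative, and whether run is expected to return a value is not stated in this reference.

```python
from dataclasses import dataclass
from typing import Optional


class Task:
    """Stand-in base class exposing only the documented run() signature."""

    InputType = None

    def run(self, pipeline_id, run_id, cleaned_data):
        raise NotImplementedError


@dataclass
class GreetingInput:
    """Stand-in for a pydantic model; a real InputType would extend BaseModel."""

    name: str


class GreetTask(Task):
    InputType = GreetingInput

    def run(
        self,
        pipeline_id: str,
        run_id: str,
        cleaned_data: Optional[GreetingInput],
    ):
        # Business logic goes here; this sketch only builds a message.
        name = cleaned_data.name if cleaned_data is not None else "world"
        return f"[{pipeline_id}/{run_id}] hello {name}"


message = GreetTask().run("demo-pipeline", "run-1", GreetingInput(name="Ada"))
```

Handling `cleaned_data is None` explicitly mirrors the documented signature, where the cleaned input may be absent when no InputType is defined.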

start(task_result: TaskResult, reporter: PipelineReporter)#

Starts the task running. The status of the task result will be updated on starting and finishing.

Parameters:
  • task_result – The task result object representing this task.

  • reporter – The pipeline reporter to write status changes to.

class pipelines.tasks.TaskConfig(**extra_data: Any)#

Bases: BaseModel

The base class that all task configs should extend.

class Config#

Bases: object

extra = 'allow'#

Submodules#