pipelines.tasks.base module#
- exception pipelines.tasks.base.ConfigValidationError(task: Task, msg: str)#
Bases:
TaskError
Error raised when there is one or more errors in the task config
- exception pipelines.tasks.base.InputValidationError(task: Task, msg: str)#
Bases:
TaskError
Error raised when there is one or more errors in the task input data
- class pipelines.tasks.base.ModelTask(config: Dict[str, Any] | None = None)#
Bases:
Task
The base task class to use when iterating over many model instances.
- class Meta#
Bases:
object
- model: str | None#
The model class to use when fetching objects from the database
- get_iterator()#
Returns an iterator to run the task over. By default this returns the result of
get_queryset
.
- get_queryset(*args, **kwargs)#
Returns a queryset containing all items for the model provided in the meta. If
model
is not defined on theMeta
class this method must be overridden otherwise anImproperlyConfigured
error will be raised.
- static get_serializable_task_object(obj)#
Serializes an django model object so that it can be stored on the task result object.
If
obj
is notNone
the object will be stored as a dictionary containingpk
,app_label
andmodel_name
so that the object can be retrieved from the database.- Parameters:
obj – The object to serialize
- class pipelines.tasks.base.Task(config: Dict[str, Any] | None = None)#
Bases:
Registrable
,ClassWithAppConfigMeta
The base task class that all tasks should extend to implement your own tasks.
- ConfigType#
pydantic class used to validate the task arguments
alias of
TaskConfig
- InputType: Type[BaseModel] | None = None#
pydantic class used to validate task input data
- clean_config(config: Dict[str, Any])#
Validates the supplied config against the defined
ConfigType
- Parameters:
config – The config to check
- clean_input_data(input_data: Dict[str, Any])#
Validates the supplied input data against the defined
InputType
- Parameters:
input_data – The config to check
- classmethod get_id()#
Generates the tasks id based in the app label and class name.
- classmethod get_iterator()#
Returns an iterator for multiple instances of the task to be started with. If no iteration is required,
None
should be returned.
- static get_serializable_task_object(obj)#
Converts the object to a json serializable version to be stored in the db.
- Parameters:
obj – The object to store
- pipeline_object: Any | None#
- pipeline_task: str#
The attribute this tasks is named against. This is set via
__init_subclass__
on Pipeline
- run(pipeline_id: str, run_id: str, cleaned_data: BaseModel | None)#
The tasks business logic. When writing custom tasks this needs to be implemented with your own logic.
- Parameters:
pipeline_id – The id of the registered pipeline the task is part of
run_id – The id of the current pipeline run
cleaned_data – The cleaned input data (based on
InputType
)
- start(task_result: TaskResult, reporter: PipelineReporter)#
Starts the task running. The status of the task result will be updated on starting and finishing.
- Parameters:
task_result – The task result object representing this task.
reporter – The pipeline reporter to write status changes to.
- task_object: Any | None#
- class pipelines.tasks.base.TaskConfig(**extra_data: Any)#
Bases:
BaseModel
Base class all task configs should be built from