gigl.src.inference.v1.lib.transforms.batch_generator#
Attributes#
Classes#
A function object used by a transform with custom processing. |
Module Contents#
- class gigl.src.inference.v1.lib.transforms.batch_generator.BatchProcessorDoFn(batch_generator_fn)[source]#
Bases:
apache_beam.DoFn
A function object used by a transform with custom processing.
The ParDo transform is such a transform. The ParDo.apply method will take an object of type DoFn and apply it to all elements of a PCollection object.
In order to have concrete DoFn objects one has to subclass from DoFn and define the desired behavior (start_bundle/finish_bundle and process) or wrap a callable object using the CallableWrapperDoFn class.
- process(element)[source]#
Method to use for processing elements.
This is invoked by
DoFnRunner
for each element of a inputPCollection
.The following parameters can be used as default values on
process
arguments to indicate that a DoFn accepts the corresponding parameters. For example, a DoFn might accept the element and its timestamp with the following signature:def process(element=DoFn.ElementParam, timestamp=DoFn.TimestampParam): ...
The full set of parameters is:
DoFn.ElementParam
: element to be processed, should not be mutated.DoFn.SideInputParam
: a side input that may be used when processing.DoFn.TimestampParam
: timestamp of the input element.DoFn.WindowParam
:Window
the input element belongs to.DoFn.TimerParam
: auserstate.RuntimeTimer
object defined by the spec of the parameter.DoFn.StateParam
: auserstate.RuntimeState
object defined by the spec of the parameter.DoFn.KeyParam
: key associated with the element.DoFn.RestrictionParam
: aniobase.RestrictionTracker
will be provided here to allow treatment as a SplittableDoFn
. The restriction tracker will be derived from the restriction provider in the parameter.DoFn.WatermarkEstimatorParam
: a function that can be used to track output watermark of SplittableDoFn
implementations.
- Parameters:
element (List[RawBatchType]) – The element to be processed
*args – side inputs
**kwargs – other keyword arguments.
- Returns:
An Iterable of output elements or None.
- Return type:
Iterable[InferenceBatchType]