gigl.src.inference.v1.lib.transforms.batch_generator#

Attributes#

Classes#

BatchProcessorDoFn

A function object used by a transform with custom processing.

Module Contents#

class gigl.src.inference.v1.lib.transforms.batch_generator.BatchProcessorDoFn(batch_generator_fn)[source]#

Bases: apache_beam.DoFn

A function object used by a transform with custom processing.

The ParDo transform is such a transform. The ParDo.apply method will take an object of type DoFn and apply it to all elements of a PCollection object.

In order to have concrete DoFn objects one has to subclass from DoFn and define the desired behavior (start_bundle/finish_bundle and process) or wrap a callable object using the CallableWrapperDoFn class.

process(element)[source]#

Method to use for processing elements.

This is invoked by DoFnRunner for each element of a input PCollection.

The following parameters can be used as default values on process arguments to indicate that a DoFn accepts the corresponding parameters. For example, a DoFn might accept the element and its timestamp with the following signature:

def process(element=DoFn.ElementParam, timestamp=DoFn.TimestampParam):
  ...

The full set of parameters is:

  • DoFn.ElementParam: element to be processed, should not be mutated.

  • DoFn.SideInputParam: a side input that may be used when processing.

  • DoFn.TimestampParam: timestamp of the input element.

  • DoFn.WindowParam: Window the input element belongs to.

  • DoFn.TimerParam: a userstate.RuntimeTimer object defined by the spec of the parameter.

  • DoFn.StateParam: a userstate.RuntimeState object defined by the spec of the parameter.

  • DoFn.KeyParam: key associated with the element.

  • DoFn.RestrictionParam: an iobase.RestrictionTracker will be provided here to allow treatment as a Splittable DoFn. The restriction tracker will be derived from the restriction provider in the parameter.

  • DoFn.WatermarkEstimatorParam: a function that can be used to track output watermark of Splittable DoFn implementations.

Parameters:
  • element (List[RawBatchType]) – The element to be processed

  • *args – side inputs

  • **kwargs – other keyword arguments.

Returns:

An Iterable of output elements or None.

Return type:

Iterable[InferenceBatchType]

batch_generator_fn[source]#
gigl.src.inference.v1.lib.transforms.batch_generator.InferenceBatchType[source]#
gigl.src.inference.v1.lib.transforms.batch_generator.RawBatchType[source]#