gigl.src.data_preprocessor.lib.ingest.reference

Classes

DataReference

Contains a URI string to the data reference, and provides a means of yielding instance dicts via a Beam PTransform.

EdgeDataReference

DataReference which stores edge data.

NodeDataReference

DataReference which stores node data.

Module Contents

class gigl.src.data_preprocessor.lib.ingest.reference.DataReference

Bases: abc.ABC

Contains a URI string to the data reference, and provides a means of yielding instance dicts via an Apache Beam PTransform.

A single DataReference is currently assumed to have data relevant to a single node or edge type. A single DataReference cannot currently house mixed-type data.

abstract yield_instance_dict_ptransform(*args, **kwargs)

Returns a PTransform whose expand method returns a PCollection of InstanceDicts, which can subsequently be ingested and transformed via TensorFlow Transform.

TODO: extend to support multiple edge types in the same table.

Return type:

gigl.src.data_preprocessor.lib.types.InstanceDictPTransform
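The abstract-method contract described above can be illustrated with a minimal stand-in sketch. It deliberately imports neither gigl nor Apache Beam: `SketchDataReference`, `InMemoryDataReference`, `_FAKE_TABLES`, and the `mem://` URI are hypothetical names introduced only for illustration, the use of dataclasses is an assumption, and a plain callable returning a list of dicts stands in for the Beam PTransform that the real method returns.

```python
import abc
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Stand-in aliases (assumptions): the real method returns an
# InstanceDictPTransform, not a plain callable.
InstanceDict = Dict[str, Any]
InstanceDictReader = Callable[[], List[InstanceDict]]


@dataclass(frozen=True)
class SketchDataReference(abc.ABC):
    """Hypothetical stand-in for DataReference: a URI plus an abstract reader factory."""

    reference_uri: str

    @abc.abstractmethod
    def yield_instance_dict_ptransform(self, *args, **kwargs) -> InstanceDictReader:
        """Return an object that, when run, produces instance dicts."""


# Hypothetical in-memory "storage", keyed by URI.
_FAKE_TABLES = {
    "mem://users": [{"user_id": 1, "age": 30}, {"user_id": 2, "age": 41}],
}


@dataclass(frozen=True)
class InMemoryDataReference(SketchDataReference):
    """Concrete subclass: resolves its URI against the in-memory tables."""

    def yield_instance_dict_ptransform(self, *args, **kwargs) -> InstanceDictReader:
        def read_rows() -> List[InstanceDict]:
            # Copy each row so callers cannot mutate the backing table.
            return [dict(row) for row in _FAKE_TABLES[self.reference_uri]]

        return read_rows


ref = InMemoryDataReference(reference_uri="mem://users")
rows = ref.yield_instance_dict_ptransform()()
```

Because the method is abstract, instantiating `SketchDataReference` directly raises `TypeError`; only subclasses that implement the method can be constructed. In a real pipeline the returned object would be applied to a Beam pipeline rather than called directly.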

reference_uri: str

class gigl.src.data_preprocessor.lib.ingest.reference.EdgeDataReference

Bases: DataReference, abc.ABC

DataReference which stores edge data.

dst_identifier: str | None = None
edge_type: gigl.src.common.types.graph_data.EdgeType
edge_usage_type: gigl.src.common.types.graph_data.EdgeUsageType
src_identifier: str | None = None

class gigl.src.data_preprocessor.lib.ingest.reference.NodeDataReference

Bases: DataReference, abc.ABC

DataReference which stores node data.

identifier: str | None = None
node_type: gigl.src.common.types.graph_data.NodeType
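To make the field layout of the node and edge references concrete, here is a small stand-in sketch. Every `Sketch*` class, the `example://` URIs, and the column names are hypothetical; the real `NodeType`, `EdgeType`, and `EdgeUsageType` live in `gigl.src.common.types.graph_data` and may be shaped differently. The sketch also assumes that the identifier fields name the columns holding node IDs, which this page does not state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import NamedTuple, Optional


class SketchEdgeType(NamedTuple):
    """Hypothetical stand-in for EdgeType."""

    src_node_type: str
    relation: str
    dst_node_type: str


class SketchEdgeUsageType(Enum):
    """Hypothetical stand-in for EdgeUsageType."""

    MAIN = "main"


@dataclass(frozen=True)
class SketchNodeDataReference:
    reference_uri: str
    node_type: str  # stand-in for NodeType
    identifier: Optional[str] = None  # assumed: name of the node ID column


@dataclass(frozen=True)
class SketchEdgeDataReference:
    reference_uri: str
    edge_type: SketchEdgeType
    edge_usage_type: SketchEdgeUsageType
    src_identifier: Optional[str] = None  # assumed: source node ID column
    dst_identifier: Optional[str] = None  # assumed: destination node ID column


# One reference per node type and one per edge type, since a single
# reference is assumed to hold data for exactly one type.
user_nodes = SketchNodeDataReference(
    reference_uri="example://users_table",
    node_type="user",
    identifier="user_id",
)
follows_edges = SketchEdgeDataReference(
    reference_uri="example://follows_table",
    edge_type=SketchEdgeType("user", "follows", "user"),
    edge_usage_type=SketchEdgeUsageType.MAIN,
    src_identifier="follower_id",
    dst_identifier="followee_id",
)
```

A graph with several node or edge types would carry one such reference per type, each pointing at its own table.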