gigl.common.utils.gcs
Attributes
Classes
GcsUtils: Utility class for interacting with Google Cloud Storage (GCS).
Module Contents
- class gigl.common.utils.gcs.GcsUtils(project=None)
Utility class for interacting with Google Cloud Storage (GCS).
Initialize the GcsUtils instance.
- Parameters:
project (Optional[str]) – The GCP project ID. Defaults to None.
- add_bucket_lifecycle_rule_with_prefix(gcs_path, days_to_expire, should_delete_irrelevant_lifecycle_rules=False)
- Parameters:
gcs_path (gigl.common.GcsUri)
days_to_expire (int)
should_delete_irrelevant_lifecycle_rules (bool) – Defaults to False.
- Return type:
None
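For orientation, an age-based delete rule scoped to a prefix looks roughly like the following in the GCS lifecycle configuration format. This is a minimal sketch, not the GcsUtils implementation: the helper name build_delete_rule is hypothetical, and it is an assumption that days_to_expire maps to the rule's age condition and that the path below the bucket maps to matchesPrefix.

```python
from typing import Any, Dict


def build_delete_rule(gcs_path: str, days_to_expire: int) -> Dict[str, Any]:
    """Hypothetical helper: build a GCS lifecycle rule for one prefix.

    Assumption: objects under the prefix are deleted once they are
    `days_to_expire` days old. GcsUtils may build its rule differently.
    """
    scheme = "gs://"
    if not gcs_path.startswith(scheme):
        raise ValueError(f"Not a GCS path: {gcs_path}")
    # Drop the scheme, then split "bucket/prefix" at the first slash.
    _bucket, _, prefix = gcs_path[len(scheme):].partition("/")
    return {
        "action": {"type": "Delete"},
        "condition": {"age": days_to_expire, "matchesPrefix": [prefix]},
    }


rule = build_delete_rule("gs://bucket-name/tmp/", days_to_expire=7)
print(rule)
```

The dictionary mirrors one entry of a bucket's lifecycle rule list; applying it to a real bucket requires credentials and is left out of the sketch.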
- close_upload_delete_and_push_to_gcs(local_file_handle, gcs_file_path)
- Parameters:
local_file_handle (TextIO)
gcs_file_path (gigl.common.GcsUri)
- Return type:
None
- copy_gcs_path(src_gcs_path, dst_gcs_path)
- Parameters:
src_gcs_path (gigl.common.GcsUri)
dst_gcs_path (gigl.common.GcsUri)
- count_blobs_in_gcs_path(gcs_path, suffix=None)
- Parameters:
gcs_path (gigl.common.GcsUri)
suffix (Optional[str])
- Return type:
int
- delete_files(gcs_files)
- Parameters:
gcs_files (Iterable[Union[gigl.common.GcsUri, google.cloud.storage.Blob]])
- Return type:
None
- delete_files_in_bucket_dir(gcs_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
- Return type:
None
- delete_gcs_file_if_exist(gcs_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
- Return type:
None
- does_gcs_file_exist(gcs_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
- Return type:
bool
- download_file_from_gcs(gcs_path, dest_file_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
dest_file_path (gigl.common.LocalUri)
- Return type:
None
- download_file_from_gcs_to_temp_file(gcs_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
- Return type:
tempfile._TemporaryFileWrapper
- download_files_from_gcs_paths_to_local_dir(gcs_paths, local_path_dir)
- Parameters:
gcs_paths (List[gigl.common.GcsUri])
local_path_dir (gigl.common.LocalUri)
- Return type:
None
- download_files_from_gcs_paths_to_local_paths(file_map)
Downloads files from GCS paths to local paths.
- Parameters:
file_map (Dict[gigl.common.GcsUri, gigl.common.LocalUri]) – Mapping of GCS path -> local path.
- static get_bucket_and_blob_path_from_gcs_path(gcs_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
- Return type:
Tuple[str, str]
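The split this method performs can be pictured with plain strings. The sketch below is a hypothetical stand-in, not the library's code: split_gcs_path is an invented name, it takes a str rather than a GcsUri, and the (bucket, blob path) ordering of the result is inferred from the method name.

```python
from typing import Tuple


def split_gcs_path(gcs_path: str) -> Tuple[str, str]:
    """Hypothetical stand-in: split "gs://bucket/blob/path" into its parts."""
    scheme = "gs://"
    if not gcs_path.startswith(scheme):
        raise ValueError(f"Not a GCS path: {gcs_path}")
    # Everything up to the first slash is the bucket; the rest is the blob path.
    bucket, _, blob_path = gcs_path[len(scheme):].partition("/")
    return bucket, blob_path


bucket, blob_path = split_gcs_path("gs://bucket-name/dir/file1.txt")
print(bucket, blob_path)
```

A path that names only a bucket yields an empty blob path in this sketch; how GcsUtils treats that case is not documented here.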
- list_uris_with_gcs_path_pattern(gcs_path, suffix=None, pattern=None)
List GCS URIs with a given suffix or pattern.
Ex: given the following objects:
gs://bucket-name/dir/file1.txt
gs://bucket-name/dir/foo.txt
gs://bucket-name/dir/file.json
list_uris_with_gcs_path_pattern(gcs_path=gs://bucket-name/dir, suffix=".txt") -> [gs://bucket-name/dir/file1.txt, gs://bucket-name/dir/foo.txt]
list_uris_with_gcs_path_pattern(gcs_path=gs://bucket-name/dir, pattern="file.*") -> [gs://bucket-name/dir/file1.txt, gs://bucket-name/dir/file.json]
- Parameters:
gcs_path (GcsUri) – The GCS path to list URIs from.
suffix (Optional[str]) – The suffix to filter URIs by. If None (the default), then no filtering on suffix will be done.
pattern (Optional[str]) – The regex to filter URIs by. If None (the default), then no filtering on the pattern will be done.
- Returns:
A list of GCS URIs that match the given suffix or pattern.
- Return type:
List[GcsUri]
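The filtering semantics of the example can be reproduced offline with plain strings. This is a minimal sketch under stated assumptions, not the method's implementation: filter_uris is a hypothetical helper, the regex is searched anywhere in the full URI, and when both filters are given a URI must pass both. The real method may match differently, for example against the blob name only.

```python
import re
from typing import List, Optional


def filter_uris(
    uris: List[str],
    suffix: Optional[str] = None,
    pattern: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: keep URIs that pass the optional filters."""
    matcher = re.compile(pattern) if pattern is not None else None
    kept = []
    for uri in uris:
        # A suffix of None means no suffix filtering.
        if suffix is not None and not uri.endswith(suffix):
            continue
        # Assumption: the regex may match anywhere in the URI.
        if matcher is not None and matcher.search(uri) is None:
            continue
        kept.append(uri)
    return kept


uris = [
    "gs://bucket-name/dir/file1.txt",
    "gs://bucket-name/dir/foo.txt",
    "gs://bucket-name/dir/file.json",
]
by_suffix = filter_uris(uris, suffix=".txt")
by_pattern = filter_uris(uris, pattern="file.*")
print(by_suffix)
print(by_pattern)
```

With neither filter set, the sketch returns every URI unchanged, matching the documented defaults of None.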
- read_from_gcs(gcs_path)
- Parameters:
gcs_path (gigl.common.GcsUri)
- Return type:
str
- upload_files_to_gcs(local_file_path_to_gcs_path_map, parallel=True)
Upload files from local paths to their corresponding GCS paths.
- upload_from_filelike(gcs_path, filelike, content_type='application/octet-stream')
Uploads a file-like object to GCS.
A "filelike" object is one that satisfies the typing.IO interface, e.g., it provides read(), write(), etc. The prototypical example is the object returned by open(), but we also use io.BytesIO as an in-memory buffer, which satisfies the typing.IO interface as well.
- Parameters:
gcs_path (GcsUri) – The GCS path to upload the file to.
filelike (IO[AnyStr]) – The file-like object to upload.
content_type (str) – The content type of the file. Defaults to "application/octet-stream".
- Return type:
None
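As an illustration of the in-memory case, the sketch below prepares an io.BytesIO buffer that could be passed as filelike. The helper make_upload_buffer is hypothetical. Writing leaves the stream position at the end of the buffer, and whether upload_from_filelike rewinds it is not stated here, so the sketch rewinds explicitly. The upload call itself is left as a comment because it needs GCS credentials and a real bucket.

```python
import io
from typing import IO


def make_upload_buffer(payload: bytes) -> IO[bytes]:
    """Hypothetical helper: build a rewound in-memory buffer."""
    buffer = io.BytesIO()
    buffer.write(payload)
    # After write() the position is at the end; rewind so a reader
    # starts from the first byte.
    buffer.seek(0)
    return buffer


buffer = make_upload_buffer(b'{"status": "ok"}')

# Hypothetical call, where gcs_path would be a GcsUri:
# GcsUtils().upload_from_filelike(gcs_path, buffer, content_type="application/json")

# Stand-in for the upload: read the buffer the way a consumer would.
first_read = buffer.read()
print(first_read)
```

Once the buffer has been consumed, a second read returns nothing, so the same buffer must be rewound again before it is reused.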
- upload_from_string(gcs_path, content)
- Parameters:
gcs_path (gigl.common.GcsUri)
content (str)
- Return type:
None