libreary.adapters package
Submodules
libreary.adapters.BaseAdapter module
-
class
libreary.adapters.BaseAdapter.BaseAdapter
Bases: object
Class definition for the base adapter for libre-ary resource adapters.
Use this mostly as a reference for what other adapters need to implement.
An adapter allows LIBREary to save copies of digital objects to different places across cyberspace. Working with many adapters in concert, one should be able to save sufficient copies to the places they want them.
-
delete() → None
Delete a copy of a resource from this adapter, and delete the corresponding entry in the copies table.
:param r_id - UUID of the resource to delete
-
get_actual_checksum(r_id: str) → str
Return an exact checksum of a resource, without relying on the metadata db.
If possible, this should be done with no file I/O.
-
retrieve() → str
Retrieve a copy of a resource from this adapter.
retrieve assumes that the file can be stored in the output_dir; AdapterManager will always verify that this is the case.
Returns the path to the resource.
May overwrite files in the output_dir.
:param r_id - UUID of the resource to retrieve
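As a rough illustration of the interface described above, the sketch below implements the three methods on a toy adapter that keeps copies in a dictionary. It does not import or subclass the real BaseAdapter; the class name InMemoryAdapter, the store helper, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


class InMemoryAdapter:
    """Hypothetical adapter that keeps copies in a dict (illustration only)."""

    def __init__(self, output_dir: str):
        self.output_dir = output_dir
        self._copies = {}  # r_id -> (file name, bytes); stands in for a real backend

    def store(self, r_id: str, name: str, data: bytes) -> None:
        # Hypothetical helper so the example has something to retrieve.
        self._copies[r_id] = (name, data)

    def retrieve(self, r_id: str) -> str:
        # Write the copy into output_dir and return its path; may overwrite.
        name, data = self._copies[r_id]
        path = os.path.join(self.output_dir, name)
        with open(path, "wb") as f:
            f.write(data)
        return path

    def get_actual_checksum(self, r_id: str) -> str:
        # Computed from the stored bytes, not from any metadata record.
        _, data = self._copies[r_id]
        return hashlib.sha256(data).hexdigest()  # hash choice is an assumption

    def delete(self, r_id: str) -> None:
        # A real adapter would also remove the row in the copies table.
        del self._copies[r_id]
```

A real adapter would replace the dictionary with calls to its storage backend and keep the copies table in sync on delete.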
libreary.adapters.drive module
-
class
libreary.adapters.drive.GoogleDriveAdapter(config: dict)
Bases: object
An adapter allows LIBREary to save copies of digital objects to different places across cyberspace. Working with many adapters in concert, one should be able to save sufficient copies to the places they want them.
GoogleDriveAdapter allows you to store objects in Google Drive.
-
delete(r_id: str) → None
Delete a copy of a resource from this adapter, and delete the corresponding entry in the copies table.
:param r_id - UUID of the resource to delete
-
get_actual_checksum(r_id: str, delete_after_download: bool = True) → str
Return an exact checksum of a resource, without relying on the metadata db.
The docstring refers to a deep parameter that does not appear in the signature shown here: when it is not set, the checksum tag given to Google Drive on ingestion is trusted; if True, the file is retrieved and the checksum recomputed.
-
get_google_client() → None
Build a Google Drive client object.
Note that this uses an OAuth flow, so you’ll need to run it from a computer with a web browser you can use.
Store the credentials JSON file at the path you set in config["adapter"]["credentials_file"]. A token will be stored at config["adapter"]["token_file"].
If you are running LIBREary on a headless server, I recommend getting a token first and saving the token file on the server, so that you don’t need to mess around with headless browsers.
Get the credentials JSON file from here: https://developers.google.com/drive/api/v3/quickstart/python?authuser=3
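For the headless-server advice above, it can help to check ahead of time whether a token file is already in place. The following is a minimal sketch, assuming the two config keys named in the docstring; the helper name needs_browser_flow is hypothetical, and whether the real adapter performs this check in the same way is not confirmed by the documentation.

```python
import os
import tempfile


def needs_browser_flow(config: dict) -> bool:
    """Return True when no saved token file exists yet (hypothetical helper)."""
    token_file = config["adapter"]["token_file"]
    return not os.path.isfile(token_file)


# Example config shape; both paths are placeholders.
example_config = {
    "adapter": {
        "credentials_file": "credentials.json",
        "token_file": os.path.join(tempfile.gettempdir(), "libreary-demo-token-missing"),
    }
}
print(needs_browser_flow(example_config))
```

If this returns True on a server without a browser, generate the token on another machine first and copy the token file over.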
-
load_metadata(r_id: str) → List[List[str]]
Get a summary of information about a resource. That summary includes:
id, path, levels, file name, checksum, object uuid, description
This method trusts the metadata database. There should be a separate method to verify the metadata db, so that we know we can trust this info.
:param r_id - UUID of the resource you’d like to learn about
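Because load_metadata returns plain lists of strings, callers may find it easier to work with named fields. The sketch below wraps rows of that shape in a namedtuple; the field order follows the summary listed above, while the Python field names, the label_rows helper, and the sample row are assumptions for illustration.

```python
from collections import namedtuple

# Field order follows the summary in the docstring; the names are assumptions.
CopyMetadata = namedtuple(
    "CopyMetadata",
    ["id", "path", "levels", "file_name", "checksum", "object_uuid", "description"],
)


def label_rows(rows):
    """Wrap each row of a load_metadata-style result in a named record."""
    return [CopyMetadata(*row) for row in rows]


# Made-up sample data in the List[List[str]] shape.
rows = [["1", "/tmp/copies", "low", "report.pdf", "abc123", "uuid-0001", "demo"]]
record = label_rows(rows)[0]
print(record.file_name, record.checksum)
```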
-
retrieve(r_id: str) → str
Retrieve a copy of a resource from this adapter.
retrieve assumes that the file can be stored in the output_dir; AdapterManager will always verify that this is the case.
Returns the path to the resource.
May overwrite files in the output_dir.
:param r_id - UUID of the resource to retrieve
libreary.adapters.local module
-
class
libreary.adapters.local.LocalAdapter(config: dict)
Bases: object
An adapter allows LIBREary to save copies of digital objects to different places across cyberspace. Working with many adapters in concert, one should be able to save sufficient copies to the places they want them.
LocalAdapter is a basic adapter which saves files to a local directory specified in the adapter’s config.
Later in this project’s plan, the LocalAdapter will be used for ingesting the master copies, and will probably also be a commonly used adapter in its own right.
It’s also very nice to use for testing, as saving files is easy(ish) to debug, doesn’t cost any money (unlike a cloud service), and has no configuration difficulty.
-
delete(r_id: str) → None
Delete a copy of a resource from this adapter, and delete the corresponding entry in the copies table.
:param r_id - UUID of the resource to delete
-
get_actual_checksum(r_id: str) → str
Returns an exact checksum of a resource, without relying on the metadata db.
If possible, implementations of get_actual_checksum should do no file I/O. In the case of LocalAdapter, we’re able to do this without copying files around.
:param r_id - the resource we want the checksum of
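One way to compute a checksum of a local file without copying it is to read it in place, chunk by chunk. This is a minimal sketch of that idea rather than LocalAdapter's actual code; the function name file_checksum and the choice of SHA-256 are assumptions.

```python
import hashlib
import os
import tempfile


def file_checksum(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in place, reading fixed-size chunks."""
    digest = hashlib.sha256()  # algorithm is an assumption, not taken from LIBREary
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even for large files, and nothing is written to disk.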
-
load_metadata(r_id: str) → List[List[str]]
Get a summary of information about a resource. That summary includes:
id, path, levels, file name, checksum, object uuid, description
This method trusts the metadata database. There should be a separate method to verify the metadata db, so that we know we can trust this info.
:param r_id - UUID of the resource you’d like to learn about
-
retrieve(r_id: str) → str
Retrieve a copy of a resource from this adapter.
retrieve assumes that the file can be stored in the output_dir; AdapterManager will always verify that this is the case.
Returns the path to the resource.
May overwrite files in the output_dir.
:param r_id - UUID of the resource to retrieve
libreary.adapters.s3 module
-
class
libreary.adapters.s3.S3Adapter(config: dict)
Bases: object
An adapter allows LIBREary to save copies of digital objects to different places across cyberspace. Working with many adapters in concert, one should be able to save sufficient copies to the places they want them.
S3Adapter allows users to store objects in AWS S3.
-
create_session() → boto3.session.Session
Create a session.
First, we look in self.key_file for a path to a JSON file with the credentials. The key file should have 'AWSAccessKeyId' and 'AWSSecretKey'.
Next, we look at self.profile for a profile name and try to use the Session call to automatically pick up the keys for that profile from the user’s default keys file, ~/.aws/config.
Finally, boto3 will look for the keys in environment variables:
AWS_ACCESS_KEY_ID: The access key for your AWS account.
AWS_SECRET_ACCESS_KEY: The secret key for your AWS account.
AWS_SESSION_TOKEN: The session key for your AWS account. This is only needed when you are using temporary credentials. The AWS_SECURITY_TOKEN environment variable can also be used, but is only supported for backwards compatibility. AWS_SESSION_TOKEN is supported by multiple AWS SDKs besides Python.
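The lookup order described above can be sketched as a small function that decides which keyword arguments to hand to boto3.session.Session. This is an illustration under stated assumptions, not the adapter's own code: the helper name session_kwargs is hypothetical, and boto3 itself is deliberately not imported so the snippet runs without it.

```python
import json
import os
import tempfile


def session_kwargs(key_file=None, profile=None):
    """Return keyword arguments for boto3.session.Session (hypothetical helper)."""
    if key_file is not None and os.path.isfile(key_file):
        # Step 1: explicit key file with 'AWSAccessKeyId' and 'AWSSecretKey'.
        with open(key_file) as f:
            creds = json.load(f)
        return {
            "aws_access_key_id": creds["AWSAccessKeyId"],
            "aws_secret_access_key": creds["AWSSecretKey"],
        }
    if profile is not None:
        # Step 2: let boto3 read the named profile from the user's AWS files.
        return {"profile_name": profile}
    # Step 3: no explicit settings; boto3 falls back to environment variables.
    return {}
```

With boto3 installed, the result could be used as `boto3.session.Session(**session_kwargs(key_file, profile))`; an empty dict leaves credential discovery entirely to boto3.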
-
delete(r_id: str) → None
Delete a copy of a resource from this adapter, and delete the corresponding entry in the copies table.
:param r_id - UUID of the resource to delete
-
get_actual_checksum(r_id: str, delete_after_download: bool = True) → str
Returns an exact checksum of a resource, without relying on the metadata db.
If possible, implementations of get_actual_checksum should do no file I/O. For S3, we need to download the file and compute the checksum manually.
:param r_id - the resource we want the checksum of
:param delete_after_download - True if the downloaded file should be deleted after the checksum is calculated
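The download-then-checksum pattern, including the delete_after_download switch, might look roughly like the sketch below. The download argument is a hypothetical callable standing in for the S3 transfer, and SHA-256 is an assumed hash; neither is taken from the S3Adapter source.

```python
import hashlib
import os
import tempfile


def checksum_download(download, dest_path: str, delete_after_download: bool = True) -> str:
    """Fetch a copy via `download`, hash it, and optionally remove it.

    `download` is a hypothetical callable that writes the object to dest_path.
    """
    download(dest_path)
    try:
        with open(dest_path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()  # hash choice is an assumption
    finally:
        # Clean up the temporary copy even if hashing failed.
        if delete_after_download and os.path.exists(dest_path):
            os.remove(dest_path)
    return digest
```

Passing delete_after_download=False keeps the downloaded file on disk, which is useful when the caller wants to reuse it after verifying the checksum.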
-
load_metadata(r_id: str) → List[List[str]]
Get a summary of information about a resource. That summary includes:
id, path, levels, file name, checksum, object uuid, description
This method trusts the metadata database. There should be a separate method to verify the metadata db, so that we know we can trust this info.
:param r_id - UUID of the resource you’d like to learn about
-
retrieve(r_id: str) → str
Retrieve a copy of a resource from this adapter.
retrieve assumes that the file can be stored in the output_dir; AdapterManager will always verify that this is the case.
Returns the path to the resource.
May overwrite files in the output_dir.
:param r_id - UUID of the resource to retrieve
Module contents
The package namespace lists the three concrete adapters again. Each entry is documented identically to the corresponding submodule entry above, with the same methods and docstrings.
-
class
libreary.adapters.S3Adapter(config: dict)
Bases: object
See libreary.adapters.s3.S3Adapter above.
-
class
libreary.adapters.LocalAdapter(config: dict)
Bases: object
See libreary.adapters.local.LocalAdapter above.
-
class
libreary.adapters.GoogleDriveAdapter(config: dict)
Bases: object
See libreary.adapters.drive.GoogleDriveAdapter above.