Getting Started - GCP

Overview

You'll provision infrastructure that ultimately looks as follows:

This includes:

  • Cloud Functions

  • Service Accounts

  • Secret Manager Secrets, to hold pseudonymization salt, encryption keys, and data source API keys

  • Cloud Storage Buckets (GCS), if using psoxy to sanitize bulk file data, such as CSVs

NOTE: if you're connecting to Google Workspace as a data source, you'll also need to provision Service Account Keys and activate Google Workspace APIs.

Prerequisites

  • a Google Project

    • we recommend a dedicated GCP project for your deployment, to provide an implicit security boundary around your infrastructure as well as simplify monitoring/cleanup

  • a GCP (Google) user or Service Account with permissions to provision Service Accounts, Secrets, Storage Buckets, Cloud Functions, and enable APIs within that project. eg:

  • the following APIs enabled in the project: (via GCP Console)

  • additional APIs enabled in the project. Using the Service Usage API above, our Terraform will attempt to enable these; however, activation sometimes takes a few minutes, and some of these APIs are required to read your existing infrastructure prior to apply, so you may experience errors. To pre-empt those, we suggest ensuring the following are enabled:

Terraform State Backend

You'll also need a secure backend location for your Terraform state (such as a GCS or S3 bucket). It need not be in the same host platform/project/account to which you are deploying the proxy, as long as the Google/AWS user you are authenticated as when running Terraform has permissions to access it.

Some options:

  • GCS : https://developer.hashicorp.com/terraform/language/settings/backends/gcs

  • S3 : https://developer.hashicorp.com/terraform/language/settings/backends/s3

Alternatively, you may use the local file system, but this is not recommended for production use, as your Terraform state may contain secrets such as API keys, depending on the sources you connect.

See: https://developer.hashicorp.com/terraform/language/settings/backends/local
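
As an illustration of the GCS option above, a minimal backend configuration might look like the following sketch. The bucket name and prefix are hypothetical placeholders, not values the proxy requires; the bucket must already exist and be accessible to the identity you authenticate as when running Terraform.

```hcl
terraform {
  backend "gcs" {
    # Hypothetical bucket name; replace with an existing bucket you control.
    bucket = "example-terraform-state-bucket"

    # Hypothetical prefix; state objects are stored under this path in the bucket.
    prefix = "psoxy/terraform-state"
  }
}
```

After adding or changing a backend block, run `terraform init` so Terraform can initialize the backend (and, if applicable, offer to migrate existing state).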

Bootstrap

For some help in bootstrapping a GCP environment, see also: infra/modules/gcp-bootstrap/README.md

The module psoxy-constants is a dependency-free module that provides lists of GCP roles, etc., needed for bootstrapping a GCP project in which your proxy instances will reside.

Example

The https://github.com/Worklytics/psoxy-example-gcp repo provides an example configuration for hosting proxy instances in GCP. You can use that template, following its Usage docs, to get started.

Security Considerations

  • the 'Service Account' approach described in the prerequisites is preferable to giving a Google user account IAM roles to administer your infrastructure directly. You can pass this Service Account's email address to Terraform by setting the gcp_terraform_sa_account_email variable. Your machine's/environment's CLI must be authenticated as a GCP entity which can impersonate this Service Account, and likely create tokens as it (Service Account Token Creator role).

  • using a dedicated GCP project is superior to using a shared project, as it provides an implicit security boundary around your infrastructure as well as simplifying monitoring/cleanup. The IAM roles specified in the prerequisites must be granted at the project level, so any non-Proxy infrastructure within the GCP project that hosts your proxy instances will be accessible to the user / service account who's managing the proxy infrastructure.
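
As a sketch of the Service Account approach, the variable named above could be set in a terraform.tfvars file along these lines. The email address is a hypothetical placeholder, and this assumes your configuration declares gcp_terraform_sa_account_email as an input variable.

```hcl
# terraform.tfvars
# Hypothetical Service Account email; replace with the account Terraform should impersonate.
gcp_terraform_sa_account_email = "terraform@example-project.iam.gserviceaccount.com"
```

The identity your CLI is authenticated as still needs permission to impersonate this Service Account, as noted above.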
