If you're using Terraform Cloud or Enterprise, here are a few things to keep in mind.
NOTE: this is tested only for gcp; for aws YMMV, and in particular we expect Microsoft 365 sources will not work properly, given how those are authenticated.
Prereqs:
git/java/maven, as described here https://github.com/Worklytics/psoxy#required-software-and-permissions
for testing, you'll need the CLI of your host environment (eg, AWS CLI, GCloud CLI, Azure CLI) as well as npm/NodeJS installed on your local machine
After authenticating your terraform CLI to Terraform Cloud/enterprise, you'll need to:
Create a Project in Terraform Cloud; and a workspace within the project.
Clone one of our example repos and run the ./init
script to initialize your terraform.tfvars
for Terraform Cloud. This will also put a bunch of useful tooling on your machine.
3. Commit the bundle that was output by the ./init
script to your repo:
Change the terraform backend main.tf
to point to your Terraform Cloud rather than be local
remove backend
block from main.tf
add a cloud
block within the terraform
block in main.tf
(obtain content from your Terraform Cloud)
run terraform init
to migrate the initial "local" state to the remote state in Terraform Cloud
You'll have to authenticate your Terraform Cloud with Google / AWS / Azure, depending on the cloud you're deploying to / data sources you're using.
If you're using Terraform Cloud or Enterprise, our convention of writing "TODOs" to the local file system might not work for you.
To address this, we've updated most of our examples to also output todo values as Terraform outputs, todos_1
, todos_2
, etc.
To get them nicely on your local machine, something like the following:
get an API token from your Terraform Cloud or Enterprise instance (eg, https://developer.hashicorp.com/terraform/cloud-docs/users-teams-organizations/api-tokens).
set it as an env variable, as well as the host:
run a curl command using those values to get each todos:
If you have terraform
CLI auth'd against your Terraform Cloud or Enterprise instance, then you might be able to avoid the curl-hackery above, and instead use the following:
(This approach should also work with Terraform CLI running with backend
, rather than cloud
)
As Terraform Cloud runs remotely, the test tool we provide for testing your deployment will not be available by default on your local machine. You can install it locally and adapt the suggestions from the todos_2
output variable of your terraform run to test your deployment from your local machine or another environment. See testing.md for details.
If you have run our init
script locally (as suggested in 'Getting Started') then the test tool should have been installed (likely at .terraform/modules/psoxy/tools/
). You will need to update everything in todos_2.md
to point to this path for those test commands to work.
If you need to directly install/re-install it, something like the following should work:
Done with your Psoxy deployment?
Terraform makes it easy to clean up when you're through with Psoxy, of you wish to rebuild everything from scratch.
First, a few caveats:
this will NOT undo any changes outside of Terraform, even those we instructed you to perform via TODO -
files that Terraform may have generated.
be careful with anything you created outside of Terraform and later imported into Terraform, such as GCP project / AWS account themselves. If you DON'T want to destroy these, do terraform state rm <resource>
(analogue of the import) for each.
Do the following to destroy your Psoxy infra:
open you main.tf
of your terraform confriguation; remove ALL blocks that aren't terraform
, or provider
. You'll be left with ~30 lines that looks like the following.
NOTE: do not edit your terraform.tfvars
file or remove any references to your AWS / Azure / GCP accounts; Terraform needs be authenticated and know where to delete stuff from!
run terraform apply
. It'll prompt you with a plan that says "0 to create, 0 to modify" and then some huge number of things to destroy. Type 'yes' to apply it.
That's it. It should remove all the Terraform infra you created.
if you want to rebuild from scratch, revert your changes to main.tf
(git checkout main.tf
) and then terraform apply
again.
By default, the Terraform examples provided by Worklytics install a NodeJS-based tool for testing your proxy deployments.
Full documentation of the test tool is available here. And the code is located in the tools
directory of the Psoxy repository.
Wherever you run this test tool from, your AWS or GCloud CLI must be authenticated as an entity with permissions to invoke the Lambda functions / Cloud functions that you deployed for Psoxy.
If you're testing the bulk cases, the entity must be able to read/write to the cloud storage buckets created for each of those bulk examples.
If you're running the Terraform examples in a different location from where you wish to run tests, then you can install the tool alone:
Clone the Psoxy repo to your local machine:
From within that clone, install the test tool:
Get specific test commands for your deployment
If you set the todos_as_outputs
variable to true
, your Terraform apply run should contain todo2
output variable with testing instructions.
If you set todos_as_local_files
variable to true
, your Terraform apply run should contain local files named TODO 2 ...
with testing instructions.
In both cases, you will need to replace the test tool path included there with the path to your installation.
Example commands of the primary testing tool: "Psoxy Test Calls"
If you used and approach other than Terraform, or did not directly use our Terraform examples, you may not have the testing examples or the test tool installed on your machine.
In such a case, you can install the test tool manually by following steps 1+2 above, and then can review the documentation on how to use it from your machine.
This document describes how to migrate your deployment from one cloud provider to another, or one project/account to another. It does not cover migrating between proxy versions.
Use cases:
move from a dev
account to a prod
account (Account / Project Migration)
move from a "shared" account to a "dedicated" account (Account / Project Migration)
move from AWS --> GCP, and vice versa (Provider Migration)
Some data/infrastructure MUST, or at least SHOULD be preserved during your migration. Below is an enumeration of both cases.
Data, such as configuration values, can generally be copied; you just need to make a new copy of it in the new environment managed by the new Terraform configuration.
Some infrastructure, such as API Clients, will be moved; eg, the same underlying resource will continue to exist, it will just be managed by the new Terraform configuration instead of the old one. This is the more tedious case, as you must both import
this infrastructure to your new configuration and then rm
(remove) it from your old configuration, rather than having it be destroy
ed when you teardown the old configuration. You should carefully review every terraform apply
, including terraform destroy
commands, to ensure that infrastructure you intend to move is not destroyed, or replaced (eg, terraform sees it as tainted, and does a destroy
+ create
within a single apply
operation).
What you MUST copy:
SALT
value. This is a secret used to generate the pseudonyms. If this is lost/destroyed, you will be unable to link any data pseudonymized with the original salt to data you process in the future.
NOTE: the underlying resource to preserve is actually a random_password
resource, not an SSM parameter / GCP Secret - because those simply are being filled from the terraform random_password
resource; if you import parameter/secret, but not the random_password
, Terraform will generate a new value and overwrite the parameter/secret.
as of v0.4.35 examples, the terraform resource ID for this value is expected to be module.psoxy.module.psoxy.random_password.pseudonym_salt
; if not, you can search for it with terraform state list | grep random_password
value for PSEUDONYMIZE_APP_IDS
. This value, if set to true
will have the proxy use a rule set that pseudonymizes identifiers issued by source applications themselves in some cases where these identifiers aren't inherently PII - but the association could be considered discoverable.
value for EMAIL_CANONICALIZATION
. prior to v0.4.52, this default was in effect STRICT
; so if your original deployment was built on a version prior to this, you should explicitly set this value to STRICT
in your new configuration (likely email_canonicalization
variable in terraform modules)
any custom sanitization rules that you've set, either in your Terraform configuration or directly as the value of a RULES
environment variable, SSM Parameter, or GCP Secret.
historical sanitized files for any bulk connectors, if you wish to continue to have this data analyzed by Worklytics. (eg, everything from all your -sanitized
buckets)
NOTE: you do NOT need to copy the ENCRYPTION_KEY
value; rotation of this value should be expected by clients.
What you SHOULD move:
API Clients. Whether generated by Terraform or not, the "API Client" for a data source must typically be authorized by a data source administrator to grant it access to the data source. As such, if you destroy the client, or lose its id, you'll need to coordinate with the administrator again to recreate it / obtain the configuration information.
as of v0.4.35
, Google Workspace and Microsoft 365 API clients are managed directly by Terraform, so these are important to preserve.
What you SHOULD copy:
API Client Secrets, if generated outside of Terraform. If you destroy/lose these values, you'll need to contact the data source administrator to obtain new versions.
Prior to beginning your migration, you should make a list of what existing infrastructure and/or configuration values you intend to move/copy.
The following is a rough guide on the steps you need to take to migrate your deployment.
Salt value. If using an example forked from our template repos at v0.4.35
or later, you can find the output
block in your main.tf
for pseudonym_salt
, uncomment it, run terraform apply
. You'll then be able to obtain the value with: terraform output --raw pseudonym_salt
On macOS, you can copy the value to your clipboard with: terraform output --raw pseudonym_salt | pbcopy
Microsoft 365 API client, if any:
Find the resource ids: terraform state list | grep "\.azuread_application\."
For each, obtain it's objectId
: terraform state show 'module.psoxy.module.msft-connection["azure-ad"].azuread_application.connector'
Prepare import command for each client for your new configuration, eg: terraform import 'module.psoxy.module.msft-connection["azure-ad"].azuread_application.connector' '<objectId>'
Google Workspace API clients, if any:
Find the resource ids: tf state list | grep 'google_service_account\.connector-sa'
For each, obtain its unique_id
: terraform state show 'module.worklytics_connectors_google_workspace.module.google_workspace_connection["gdirectory"].google_service_account.connector-sa'
Prepare import command for each client for your new configuration, eg: terraform import 'module.worklytics_connectors_google_workspace.module.google_workspace_connection["gdirectory"].google_service_account.connector-sa' '<unique_id>'
Create a new Terraform configuration from scratch; run terraform init
there (if you begin with one of our examples, our init
script does this). Use the terraform.tfvars
of your existing configuration as a guide for what variables to set, copying over any needed values.
Run a provisional terraform plan
and review.
Run the imports you prepared in Phase 1, if all appear OK, run another terraform plan
and review (comparing to the old one).
Optionally, run terraform plan -out=plan.out
to create a plan file; if you send this, along with all the *.tf
/*.tfvars
files to Worklytics, we can review it and confirm that it is correct.
Run terraform apply
to create the new infrastructure; re-confirm that the plan is not re-creating any API clients/etc that you intended to preserve
Via AWS / GCP console, or CLIs, move the values of any secrets/parameters that you intend to by directly reading the values from your old account/project, and copying them into the new account/project
Look at the TODO 3
files/output variables for all your connectors. Make a mapping between the old values and the new values. Send this to Worklytics. It should include for each the proxy URLs, AWS Role to use, and any other values that are changing.
Wait for confirmation that Worklytics has migrated all your connections to the new values. This may take 1-2 days.
Remove references to any API Clients you migrated in Phase 1:
eg, terraform state rm 'module.psoxy.module.msft-connection["azure-ad"].azuread_application.connector'
run terraform destroy
in the old configuration. Carefully review the plan before confirming.
if you're using Google Workspace sources, you may see destruction of google_project_service
resources; if you allow these to be destroyed, these APIS will be disabled; if you are using the same GCP project in your other configuration, you should run terraform apply
there again to re-enable them.
You may also destroy any API clients/etc that are managed outside of Terraform and which you did not migrate to the new environment.
You may clean up any configuration values, such as SSM Parameters / GCP Secrets to customize the proxy rules sets, that you may have created in your old host environment.