
AWS

Getting Started

Overview

You'll provision the following to host Psoxy in AWS:

Lambda Functions
IAM Roles and Policies
Systems Manager Parameter Store Parameters
CloudWatch Log Groups
S3 buckets, if using the 'bulk' mode to sanitize file data (such as CSVs)
Cognito Pools and Identities, if connecting to Microsoft 365 (Azure AD) data sources

The diagram below provides an architecture overview of the 'REST' and 'Bulk' use-cases.

Prerequisites

An AWS Account in which to deploy Psoxy. We strongly recommend you provision one specifically to host Psoxy, as this will create an implicit security boundary, reduce possible conflicts with other infra configured in the account, and simplify eventual cleanup.
You will need the numeric AWS Account ID for this account, which you can find in the AWS Console.
If your AWS organization enforces Service Control Policies, ensure that these allow the AWS components required by Psoxy, or exempt the AWS Account in which you will deploy Psoxy from these policies.
If your organization uses any sort of security control enforcement mechanism, you may have to disable or provide exceptions to those controls for your initial deployment. Generally, those controls can then be implemented later by extending our examples. Our protips page provides some guidance on how to extend the base examples to meet more extreme requirements.
A sufficiently privileged AWS Role. You must have an IAM Role within the AWS account with sufficient privileges to (AWS managed policy examples in parentheses):
- create IAM roles + policies (eg, IAMFullAccess)
- create and update Systems Manager Parameters (eg, AmazonSSMFullAccess)
- create and manage Lambdas (eg, AWSLambda_FullAccess)
- create and manage S3 buckets (eg, AmazonS3FullAccess)
- create CloudWatch Log groups (eg, CloudWatchFullAccess)
(Yes, the use of AWS Managed Policies results in a role with many privileges; that's why we recommend you use a dedicated AWS account to host the proxy, which is NOT shared with any other use case.)
You will need the ARN of this role.
NOTE: if you're connecting to Microsoft 365 (Azure AD) data sources, you'll also need permissions to create AWS Cognito Identity Pools and add Identities to them, such as arn:aws:iam::aws:policy/AmazonCognitoPowerUser. Some AWS Organizations have Service Control Policies in place that deny this by default, even if you have an IAM role that allows it at an account level.
An authenticated AWS CLI in your provisioning environment. Your environment (eg, shell/etc from which you'll run terraform commands) must be authenticated as an identity that can assume that role. (see next section for tips on options for various environments you can use)
Eg, if your Role is arn:aws:iam::123456789012:role/PsoxyProvisioningRole, the following should work:

aws sts assume-role --role-arn arn:aws:iam::123456789012:role/PsoxyProvisioningRole --role-session-name tf_session

If not, use `aws sts get-caller-identity` to confirm how your CLI is authenticated.
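As a rough sketch of the check described above, the helpers below build the role ARN from an account ID and role name, then ask STS whether the current CLI identity can assume it. The function names `psoxy_role_arn` and `check_can_assume` are hypothetical, not part of Psoxy or the AWS CLI; only `aws sts assume-role` and its options are real CLI calls.

```shell
# Build the ARN of a provisioning role from an account ID and a role name.
# Fails if the account ID is not a 12-digit number.
psoxy_role_arn() {
  account_id="$1"
  role_name="$2"
  case "$account_id" in
    *[!0-9]*|"") echo "account ID must be numeric" >&2; return 1 ;;
  esac
  if [ "${#account_id}" -ne 12 ]; then
    echo "account ID must be 12 digits" >&2
    return 1
  fi
  printf 'arn:aws:iam::%s:role/%s\n' "$account_id" "$role_name"
}

# Ask STS whether the current CLI identity can assume the role.
# Requires network access and an authenticated AWS CLI.
check_can_assume() {
  role_arn="$1"
  aws sts assume-role \
    --role-arn "$role_arn" \
    --role-session-name tf_session \
    --query 'AssumedRoleUser.Arn' \
    --output text
}

# Example (not executed here):
# check_can_assume "$(psoxy_role_arn 123456789012 PsoxyProvisioningRole)"
```

If the call succeeds, it prints the ARN of the assumed-role session; an AccessDenied error usually means the role's trust policy does not list your current identity.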

Provisioning Environment

To provision AWS infra, you'll need the aws-cli installed and authenticated on the environment where you'll run terraform.

Here are a few options:

Your Local Machine or a VM/Container Outside AWS

Generate an AWS Access Key for your AWS User.
Run aws configure in a terminal on the machine you plan to use, and configure it with the key you generated in step one.

NOTE: this could even be a GCP Cloud Shell, which may simplify auth if you wish to connect your Psoxy instance to Google Workspace as a data source.

EC2 Instance

If your organization prefers NOT to authorize the AWS CLI on individual laptops and/or outside AWS, provisioning Psoxy's required infra from an EC2 instance may be an option.

provision an EC2 instance (or request that your IT/dev ops team provision one for you). We recommend a micro instance with an 8GB disk, running ubuntu (not Amazon Linux; if you choose that or something else, you may need to adapt these instructions). Be sure to create a PEM key to access it via SSH (unless your AWS Organization/account provides some other ssh solution).
associate the Role above with your instance (see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html)
connect to your instance,

# avoid ssh complaints about permissions on your key
chmod 400 ~/psoxy-access-key.pem

ssh -i ~/psoxy-access-key.pem ubuntu@{PUBLIC_IPV4_DNS_OF_YOUR_EC2_INSTANCE}

Whichever environment you choose, follow general prereq installation, and, when ready, continue with README.

Terraform State Backend

S3

You'll also need a backend location for your Terraform state (such as an S3 bucket). It can be in any AWS account, as long as the AWS role that you'll use to run Terraform has read/write access to it.

See https://developer.hashicorp.com/terraform/language/settings/backends/s3 for more details.
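A minimal sketch of generating an S3 backend block, assuming a state bucket that already exists. The helper name `s3_backend_config` and the bucket, key, and region values in the example are placeholders, not values used by Psoxy.

```shell
# Print a Terraform S3 backend block for the given bucket, state key, and region.
s3_backend_config() {
  bucket="$1"
  key="$2"
  region="$3"
  cat <<EOF
terraform {
  backend "s3" {
    bucket = "${bucket}"
    key    = "${key}"
    region = "${region}"
  }
}
EOF
}

# Example: write the block to backend.tf in your Terraform configuration directory.
# (bucket name, key, and region below are placeholders)
# s3_backend_config my-psoxy-tf-state psoxy/terraform.tfstate us-east-1 > backend.tf
```

After adding or changing a backend block, `terraform init` needs to be re-run so Terraform can configure the backend.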

Local File System

Alternatively, you may use a local file system, but this is not recommended for production use - as your Terraform state may contain secrets such as API keys, depending on the sources you connect.

See https://developer.hashicorp.com/terraform/language/settings/backends/local

Bootstrap

The module psoxy-constants is a dependency-free module that provides lists of AWS managed policies, etc needed for bootstrapping an AWS account in which your proxy instances will reside.

Getting Started

Once you've fulfilled the prereqs, including having your terraform deployment environment, backend, and AWS account prepared, we suggest you use our AWS example template repo:

https://github.com/Worklytics/psoxy-example-aws

Follow the 'Usage' instructions there to continue.

Authentication and Authorization

This page provides an overview of how proxy authenticates and confirms authorization of clients (Worklytics tenants).

Authentication

Each Worklytics tenant operates as a unique GCP service account within Google Cloud. GCP issues an identity token for this service account to processes running in the tenant, which the tenant then uses to authenticate against AWS.

No secrets or keys need to be exchanged between Worklytics and your AWS instance. The integrity of the authentication is provided by the signature of the identity token provided by GCP, which AWS verifies against Google's public certificates.

Annotating the diagram for the above case, with specific components for Worklytics-->Proxy case:

In the above, the AWS resource you're allowing access to is an AWS IAM role, which your Worklytics tenant assumes; it can then access S3 or invoke Lambda functions.

Authorization

Within your AWS account, you create an IAM role, with a role assumption policy that allows your Worklytics tenant's GCP Service Account (identified by a numeric ID you obtain from the Worklytics portal) to assume the role.

This assumption policy will have a statement similar to the following, where the value of the aud claim is the numeric ID of your Worklytics tenant's GCP Service Account:

Colloquially, this allows a web identity federated from accounts.google.com where Google has asserted the claim that aud == 12345678901234567890123456789 to assume the role.
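For illustration, a trust policy of this kind for Google web identity federation typically has the shape printed by the helper below. This is a sketch based on standard IAM syntax; the exact statement generated by the Psoxy modules may differ, and the function name `trust_policy_json` is hypothetical.

```shell
# Print an IAM role trust policy that lets a Google-federated identity
# with the given `aud` claim assume the role.
trust_policy_json() {
  aud="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "accounts.google.com" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": { "accounts.google.com:aud": "${aud}" }
      }
    }
  ]
}
EOF
}

# Example, using the placeholder numeric ID from the text above:
# trust_policy_json 12345678901234567890123456789
```

The `StringEquals` condition is what restricts assumption to a single service account; without it, any Google identity could assume the role.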

Then you use this AWS IAM role as the principal in the AWS IAM policies you define to authorize it to invoke your proxy instances via their function URLs (API connectors) or to read from their sanitized output buckets (bulk data connectors).

See: https://github.com/Worklytics/psoxy/blob/v0.4.40/infra/modules/aws/main.tf#L81-L102

Getting Started with Cloud Shell

YMMV; as of June 2023, AWS's 1GB limit on Cloud Shell persistent storage is too low for real-world proxy deployments, which typically require installing the gcloud CLI / Azure CLI to connect to sources.

So use your local machine, or a VM/container elsewhere in AWS (EC2, AWS Cloud9, etc).

clone the repo

git clone https://github.com/Worklytics/psoxy.git

add the following lines to your ~/.bashrc. (AWS Cloud Shell preserves only your HOME directory across sessions, so add any commands that modify/install things outside to your .bashrc)


# install Maven (and, via dependency, java)
sudo yum -y install maven

# GCloud SDK (if using Google Workspace data sources)
# The next line updates PATH for the Google Cloud SDK.
if [ -f '/home/cloudshell-user/google-cloud-sdk/path.bash.inc' ]; then . '/home/cloudshell-user/google-cloud-sdk/path.bash.inc'; fi

# The next line enables shell command completion for gcloud.
if [ -f '/home/cloudshell-user/google-cloud-sdk/completion.bash.inc' ]; then . '/home/cloudshell-user/google-cloud-sdk/completion.bash.inc'; fi

Then source ~/.bashrc, to execute the above.

install Terraform

git clone https://github.com/tfutils/tfenv.git ~/.tfenv
mkdir -p ~/bin
ln -s ~/.tfenv/bin/* ~/bin/
tfenv install
tfenv use latest

if using Google Workspace data sources, install Google Cloud CLI and authenticate.

curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-435.0.1-linux-x86_64.tar.gz
tar -xvf google-cloud-cli-435.0.1-linux-x86_64.tar.gz
sudo ./google-cloud-sdk/install.sh
rm google-cloud-cli-435.0.1-linux-x86_64.tar.gz

if using Microsoft 365 data sources, install Azure CLI and authenticate.

https://docs.microsoft.com/en-us/cli/azure/install-azure-cli

You should now be ready for the general instructions in the README.md.

Other stuff

If default NodeJS tooling doesn't work for you, legacy testing tools use python/awscurl, installed via pip. See example below:


# install AWS Curl (used for testing)
sudo yum -y install pip
pip install awscurl

AWS Development

Prereqs

Required:

AWS CLI

Optional:

AWS SAM CLI (macOS) for local testing, if desired
awscurl for direct testing of deployed AWS lambda from a terminal

Build

Maven build produces a zip file.

Build core library
From java/impl/aws/:

mvn clean package

Run Locally

Locally, you can test the function's behavior when invoked on a JSON payload (but not how the API gateway will map HTTP requests to that JSON payload):

https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-invoke.html

Deploy to AWS

We recommend deploying your Psoxy code into AWS using the terraform modules found in [infra/modules/](../../infra/modules/) for AWS. These modules both provision the required AWS infrastructure and deploy the built binaries for Psoxy as lambdas in the target account.

Example configurations using those modules can be found in `infra/examples/`.

You'll ultimately provision infrastructure represented in green in the following diagram:

![AWS data flow](./2022-02 Psoxy Data Flow.png)

See infra/modules/aws/ for more information.

Encryption Keys in AWS

As of June 2023, the following resources provisioned by Psoxy modules support use of CMEKs:

Lambda function environment variables
SSM Parameters
CloudWatch Log Groups
S3 Buckets

Pre-existing Key

The psoxy-example-aws example provides a project_aws_key_arn variable, that, if provided, will be set as the encryption key for these resources. A few caveats:

The AWS principal your Terraform is running as must have permissions to encrypt/decrypt with the key (it needs to be able to read/write the lambda env, ssm params, etc)
The key should be in the same AWS region you're deploying to.
CloudWatch must be able to use the key, as described in AWS CloudWatch docs

In example-dev/aws-all/kms-cmek.tf, we provide a bunch of lines that you can uncomment to use encryption on S3 and properly set key policy to support S3/CloudWatch use.

For production use, you should adapt the key policy to your environment and scope as needed to follow your security policies, such as principle of least privilege.
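To illustrate the CloudWatch caveat above, the helper below prints a key policy statement of the kind AWS documents for letting CloudWatch Logs use a KMS key in a given region and account. It is a sketch under those assumptions, not the policy from kms-cmek.tf; the function name `cloudwatch_logs_key_statement` is hypothetical.

```shell
# Print a KMS key policy statement that allows CloudWatch Logs in the given
# region to use the key for log groups in the given account.
cloudwatch_logs_key_statement() {
  region="$1"
  account_id="$2"
  cat <<EOF
{
  "Sid": "AllowCloudWatchLogs",
  "Effect": "Allow",
  "Principal": { "Service": "logs.${region}.amazonaws.com" },
  "Action": [
    "kms:Encrypt*",
    "kms:Decrypt*",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:Describe*"
  ],
  "Resource": "*",
  "Condition": {
    "ArnLike": {
      "kms:EncryptionContext:aws:logs:arn": "arn:aws:logs:${region}:${account_id}:*"
    }
  }
}
EOF
}

# Example: cloudwatch_logs_key_statement us-east-1 123456789012
```

The statement would be added to the `Statement` array of the key's policy; narrowing the `ArnLike` pattern to specific log groups is one way to scope it down further.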

Provisioning a Key


resource "aws_kms_key" "key" {
  description             = "KMS key for Psoxy"
  enable_key_rotation     = true
  is_enabled              = true
}

# then replace all use of `var.project_aws_key_arn` with `aws_kms_key.key.arn` in your `main.tf`

More options

If you need more granular control of CMEK by resource type, review the main.tf and variables exposed by the aws-host module for some options.

Protips

Some ideas on how to support scenarios and configuration requirements beyond what our default examples show:

Encryption Keys

see encryption-keys.md

Tagging ALL infra created by your Terraform Configuration

If you're using our AWS example, it should support a default_tags variable.

You can add the following in your terraform.tfvars file to set tags on all resources created by the example configuration:

default_tags = {
  Vendor = "Worklytics"
}

If you're not using our AWS example, you will need to modify the aws provider block in your configuration to add a default_tags block. Example shown below:

See: [https://registry.terraform.io/providers/hashicorp/aws/latest/docs#default_tags]

provider "aws" {
  region = var.aws_region

  assume_role {
    role_arn = var.aws_assume_role_arn
  }

  default_tags {
    tags = {
      Vendor = "Worklytics"
    }
  }

  allowed_account_ids = [
    var.aws_account_id
  ]
}

Extensibility

To support extensibility, our Terraform examples/modules output the IDs/names of the major resources they create, so that you can compose them with other Terraform resources.

Buckets

The aws-host module outputs bulk_connector_instances; a map of id => instance for each bulk connector. Each of these has two attributes that correspond to the names of its related buckets:

sanitized_bucket_name
input_bucket_name

So in our AWS example, you can use these to enable logging. For example, you could do something like this (YMMV; syntax etc should be tested):

locals {
  id_of_bucket_to_store_logs = "{YOUR_BUCKET_ID_HERE}"
}

# access logging for each bulk connector's sanitized (output) bucket
resource "aws_s3_bucket_logging" "sanitized_logging" {
  for_each = module.psoxy.bulk_connector_instances

  bucket = each.value.sanitized_bucket_name

  target_bucket = local.id_of_bucket_to_store_logs
  target_prefix = "psoxy/${each.key}/"
}

# access logging for each bulk connector's input bucket
resource "aws_s3_bucket_logging" "input_logging" {
  for_each = module.psoxy.bulk_connector_instances

  bucket = each.value.input_bucket_name

  target_bucket = local.id_of_bucket_to_store_logs
  target_prefix = "psoxy/${each.key}/"
}

Analogous approaches can be used to configure versioning, replication, etc.

Note that encryption, lifecycle, and public_access_block are set by the Worklytics-provided modules, so you may have conflicts if you also try to set those outside.

Lambda Execution Role

beta - released from v0.4.50; YMMV, and may be subject to change.

The terraform modules we provide provision execution roles for each lambda function, and by default attach the appropriate AWS Managed Policy to each.

Specifically, this is AWSLambdaBasicExecutionRole, unless you're using a VPC - in which case it is AWSLambdaVPCAccessExecutionRole (https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AWSLambdaVPCAccessExecutionRole.html).

For organizations that don't allow use of AWS Managed Policies, you can use the aws_lambda_execution_role_policy_arn variable to pass in an alternative which will be used INSTEAD of the AWS Managed Policy.

Guides

Using API Gateway (V2) - alpha
Lambdas on a VPC
Using a Least-Privileged Provisioning Role - brittle! YMMV.

Using API Gateway

Some organizations require use of API Gateway. This is not the default approach for Psoxy since AWS added support for Lambda Function URLs (March 2022), which are a simpler and more direct way to expose lambdas via HTTPS.

Nonetheless, should you wish to use API Gateway we provide beta support for this. It is needed if you wish to put your Lambda functions on a VPC (See lambdas-on-vpc.md).

In particular:

IAM policy that allows api gateway methods to be invoked by the proxy caller role is defined once, using wildcards, and exposes GET/HEAD/POST methods for all resources. While methods are further constrained by routes and the proxy rules themselves, this could be another enforcement point at the infrastructure level - at expense of N policies + attachments in your terraform plan instead of 1.
proxy instances exposed as lambda function urls have 55s timeout, but API Gateway seems to support 30s as max - so this may cause timeouts in certain APIs

Usage

Prerequisites:

Add the following to your terraform.tfvars file:

Then terraform apply should create the API Gateway-related resources, including policies/etc, and destroy lambda function URLs (if you've previously applied with use_api_gateway=false, which is the default).

API Gateway v1 - not supported, but FWIW

If you wish to use API Gateway V1, you will not be able to use the flag above. Instead, you'll have to do something like the following:

Additionally, you'll need to set a different handler class to be invoked instead of the default (co.worklytics.psoxy.Handler; it should be co.worklytics.psoxy.APIGatewayV1Handler). This can be done in Terraform or by modifying configuration via AWS Console.

Using AWS Secrets Manager

By default, Psoxy uses AWS Systems Manager Parameter Store to store secrets; this simplifies configuration and minimizes costs. However, you may want to use AWS Secrets Manager to store secrets due to organization policy.

In such a case, you can add the following to your terraform.tfvars file:

secrets_store_implementation = "aws-secrets-manager"

This will alter the behavior of the Terraform modules so that everything considered a secret is stored in and loaded from AWS Secrets Manager instead of AWS Systems Manager Parameter Store. Note that Parameter Store is still used for non-secret configuration information, such as proxy rules, etc.

Changes will also be made to AWS IAM Policies, to allow lambda function execution roles to access Secrets Manager as needed.

If any secrets are managed outside of Terraform (such as API keys for certain connectors), you will need to grant access to relevant secrets in Secrets Manager to the principals that will manage these.

Lookup Tables

If you use Psoxy to send pseudonymized data to Worklytics and later wish to re-identify the data that you export from Worklytics to your premises, you'll need a lookup table in your data warehouse to JOIN with that data.

Populating the lookup_table_builders variable will generate another version of your HRIS data (aside from the one exposed to Worklytics) which you can then import back to your data warehouse.

To enable it, add the following to your terraform.tfvars file:

In sanitized_accessor_role_names, add the name of whatever AWS role that the principal running ingestion of your lookup table from S3 to your data warehouse will assume. You can add additional role names as needed. Alternatively, you can use an IAM policy created outside of our Terraform module to grant access to the lookup table CSVs within the S3 bucket.

After you apply this configuration, the lookup table will be generated in an S3 bucket. The S3 bucket will be shown in the Terraform output:

Use the bucket name shown in your output to build an import pipeline to your data warehouse.

Every time a new HRIS snapshot is uploaded to the hris input bucket, TWO copies of it will be created: a sanitized copy in the bucket accessible to Worklytics, and the lookup variant in the lookup bucket referenced above (not accessible to Worklytics).

The lookup table CSV file will have the following columns: EMPLOYEE_ID,EMPLOYEE_ID_ORIG

If you load this into your Data Warehouse, you can JOIN it with the data you export from Worklytics.

Then the following query will give re-identified aggregate data:

The employeeId column in the result set will be the original employee ID from your HRIS system.
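To make the join concrete without a data warehouse, here is a small sketch that performs the same re-identification on two local CSV files with awk. It assumes simple CSVs (no quoted fields containing commas), that the export's first column holds the pseudonymized employee ID, and a hypothetical export column `meetingHours`; the function name `reidentify` is also hypothetical.

```shell
# Join a Worklytics export (first column = pseudonymized employee ID) with the
# lookup table (EMPLOYEE_ID,EMPLOYEE_ID_ORIG), replacing each pseudonym with
# the original employee ID. Rows without a match in the lookup are dropped.
reidentify() {
  lookup_csv="$1"
  export_csv="$2"
  awk -F',' -v OFS=',' '
    NR == FNR { if (FNR > 1) orig[$1] = $2; next }  # lookup file: pseudonym -> original ID
    FNR == 1  { print; next }                       # export file: keep the header row
    ($1 in orig) { $1 = orig[$1]; print }           # export file: swap in the original ID
  ' "$lookup_csv" "$export_csv"
}

# Example with made-up data:
#   lookup.csv            export.csv
#   EMPLOYEE_ID,EMPLOYEE_ID_ORIG   employeeId,meetingHours
#   abc123,E-1            abc123,10
# reidentify lookup.csv export.csv
```

In a warehouse, the equivalent is an inner join of the export on the lookup table's EMPLOYEE_ID column, selecting EMPLOYEE_ID_ORIG in its place.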

Security and Privacy Considerations

If your HRIS employee ID column is considered PII, then the lookup table and any re-identified data exports you use it to produce should be handled as Personal data, according to your policies, as these now reference readily identifiable Natural Persons.

If you wish to limit re-identification to a subset of your data, you can use additional columns present in your HRIS csv to do so, for example:

Advanced

Within the lookup_table_builders map, you can specify the following fields:

input_connector_id - usually hris; this corresponds to whatever bulk connector you want to build the lookup table for.
rules - this follows the rules structure for the bulk connector case. The example above is suited for HRIS data following the schema expected by Worklytics. If you modify this, be sure to review our documentation or contact support to ensure you don't break your lookup table.

Lambdas on a VPC

beta - This is now available for customer-use, but may still change in backwards incompatible ways.

Our aws-host module provides a vpc_config variable to specify the VPC configuration for the lambdas that our Terraform modules will create, analogous to the vpc_config block supported by the AWS lambda terraform resource.

Some caveats:

API connectors on a VPC must be exposed via API Gateway rather than Function URLs (our Terraform modules will make this change for you).
VPC must be configured such that your lambda has connectivity to AWS services including S3, SSM, and CloudWatch Logs; this is typically done by adding a VPC Endpoint for each service.
VPC must allow any API connector to connect to data source APIs via HTTPS (eg 443); usually these APIs are on the public internet, so this means egress to public internet.
VPC must allow your API gateway to connect to your lambdas.

The requirements above MAY require you to modify your VPC configuration and/or the security groups to support proxy deployment. The example we provide in our vpc.tf should fulfill this if you adapt it; or you can use it as a reference to adapt your existing VPC.

To put the lambdas created by our terraform example under a VPC, please follow one of the approaches documented in the next sections.

Usage - Bring-your-own VPC

If you have an existing VPC, you can use it with the vpc_config variable by hard coding the ids of the pre-existing resources (provisioned outside the scope of your proxy's terraform configuration).

module "psoxy" {
  # lines above omitted ...

  vpc_config = {
    vpc_id             = "vpc-0a1b2c3d4e5f67890"
    security_group_ids = ["sg-0a1b2c3d4e5f67890"]
    subnet_ids         = ["subnet-0a1b2c3d4e5f67890"]
  }
}

Usage - with `vpc.tf`

If you don't have a pre-existing VPC that you wish to use, our AWS example repo includes a vpc.tf file at the top level. This file has a bunch of commented-out Terraform resource blocks that can serve as examples for creating the minimal VPC + associated infra. Review and uncomment to meet your use case.

Prerequisites:

the AWS principal (user or role) you're using to run Terraform must have permissions to manage VPCs, subnets, and security groups. The AWS managed policy AmazonVPCFullAccess provides this.
all pre-requisites for the api-gateways (see api-gateway.md)

NOTE: if you provide vpc_config, the value you pass for use_api_gateway_v2 will be ignored; using a VPC requires API Gateway v2, so the value of this flag is overridden to true.

Add the following to "psoxy" module in your main.tf (or uncomment if already present):

module "psoxy" {
  # lines above omitted ...

  vpc_config = {
    vpc_id             = aws_default_vpc.default.id
    security_group_ids = [aws_default_security_group.default.id]
    subnet_ids         = [aws_default_subnet.default.id]
  }
}

Uncomment the relevant lines in vpc.tf in the same directory, and modify as you wish. This file pulls the default VPC/subnet/security group for your AWS account under terraform.

Alternatively, you can modify vpc.tf to provision a non-default VPC/subnet/security group, and reference those from your main.tf - subject to the caveats above.

See the following terraform resources that you'll likely need:

Troubleshooting

Check your CloudWatch logs for the lambda. The proxy lambda will time out in the INIT phase if SSM Parameter Store or your secret store implementation (AWS Secrets Manager, Vault) is not reachable.

Some potential causes of this:

DNS failure - it's going to look up the SSM service by domain; if the DNS zone for the SSM endpoint you've provisioned is not published on the VPC, this will fail; similarly, if the endpoint wasn't configured on a subnet - then it won't have an IP to be resolved.
if the IP is resolved, you should see failure to connect to it in the logs (timeouts); check that your security groups for lambda/subnet/endpoint allow bidirectional traffic necessary for your lambda to retrieve data from SSM via the REST API.
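One way to check the two causes above is from a host inside the same VPC and subnets as the lambda (for example, a temporary EC2 instance; that setup is an assumption, not something the Psoxy modules provision). The helper names below are hypothetical; `getent` and `curl` are standard tools on most Linux images.

```shell
# Regional hostname of the SSM API endpoint.
ssm_endpoint_host() {
  printf 'ssm.%s.amazonaws.com\n' "$1"
}

# Check 1: does the hostname resolve from this host?
check_ssm_dns() {
  host="$(ssm_endpoint_host "$1")"
  if getent hosts "$host" >/dev/null 2>&1; then
    echo "resolved: $host"
  else
    echo "NOT resolved: $host" >&2
    return 1
  fi
}

# Check 2: can this host open an HTTPS connection to it within 5 seconds?
# Any HTTP response counts as reachable; a timeout suggests a security group
# or routing problem.
check_ssm_https() {
  host="$(ssm_endpoint_host "$1")"
  curl -sS --max-time 5 -o /dev/null "https://${host}/"
}

# Example (run from inside the VPC):
# check_ssm_dns us-east-1 && check_ssm_https us-east-1
```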

Switching back from using a VPC

Terraform with aws provider doesn't seem to play nice with lambdas/subnets; the subnet can't be destroyed w/o destroying the lambda, but terraform seems unaware of this and will just wait forever.

So:

destroy all your lambdas (terraform state list | grep aws_lambda_function; then terraform destroy --target= for each, remembering to quote each address as needed)
destroy the subnet: terraform destroy --target=aws_subnet.main
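The first step can be scripted roughly as follows. This sketch only prints the destroy commands for review rather than running them; the function name `lambda_destroy_commands` is hypothetical. It matches `aws_lambda_function.` exactly, so related resources such as `aws_lambda_function_url` are not listed.

```shell
# Read Terraform resource addresses (one per line) on stdin and print a
# destroy command for each Lambda function. Addresses are single-quoted so
# that brackets and double quotes survive the shell.
lambda_destroy_commands() {
  grep -E '(^|\.)aws_lambda_function\.' | while IFS= read -r addr; do
    printf "terraform destroy -target='%s'\n" "$addr"
  done
}

# Example (prints commands; copy and run them one at a time):
# terraform state list | lambda_destroy_commands
```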

References

https://docs.aws.amazon.com/lambda/latest/dg/foundation-networking.html
https://docs.aws.amazon.com/lambda/latest/dg/configuration-vpc.html

Least-Privileged Provisioning Role

beta - we don't commit to maintaining this under our versioning policy; minor proxy iterations may require changes to the privileges required in the least-privileged role.

This is a guide about how to create a role for provisioning psoxy infrastructure in AWS, following the principle of least-privilege at permission-level, rather than policy-level.

Eg, as of v0.4.55 of the proxy, our docs provide guidance on using an AWS role to provision your psoxy infrastructure using the least-privileged set of AWS managed policies possible. A stronger standard would be to use a custom IAM policy rather than AWS managed policy, with the least-privileged set of permissions required.

Additionally, you can specify resource constraints to improve security within a shared AWS account. (However, we do not recommend or officially support deployment into a shared AWS account. We recommend deploying your proxy instances in an isolated AWS account to provide an implicit security boundary by default, as an additional layer of protection beyond those provided by our proxy modules.)

We provide an example IAM policy document in our psoxy-constants module that you can use to create an IAM policy in AWS. You can do this outside Terraform, by copying the JSON from that policy, or via Terraform.

AWS Troubleshooting

Tips and tricks for using AWS to host the proxy.

Who are you?

# figure out how your AWS CLI is authenticated
# (NOTE: this is also the only AWS API cmd that will work regardless of your IAM setup; asking AWS
# who it believes you are doesn't require any permissions)
aws sts get-caller-identity

# figure out if the identity you're authenticated as can assume target role
aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/InfraAdmin \
--role-session-name TestSession \
--output json

If the above doesn't seem to work as expected, some ideas in the next section may help.

Your AWS Organization uses SSO via Okta or some similar provider

Options:

execute terraform via
find credentials output by your SSO helper (eg, aws-okta) then fill the AWS CLI env variables yourself:

ls ~/.aws/cli/cache/


...

export AWS_ACCESS_KEY_ID="xxxxxxxxxxxxxxx"
export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxx"
export AWS_SESSION_TOKEN="xxxxxxxxxxxxxxx"

if your SSO helper fills default AWS credentials file but simply doesn't set the env vars, you may be able to export the profile to AWS_PROFILE, eg


export AWS_PROFILE="production"
terraform plan

# or just inline

AWS_PROFILE="production" terraform plan

References: https://discuss.hashicorp.com/t/using-credential-created-by-aws-sso-for-terraform/23075/7

Your AWS User has MFA

Options:

execute terraform via
use a script such as to get short-lived key+secret for your user.
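As a sketch of the second option, the helpers below use `aws sts get-session-token` to exchange an MFA code for short-lived credentials and print them as `export` lines. The function names are hypothetical, and the MFA device ARN format assumes a virtual MFA device named after your IAM user; check the actual ARN in the IAM console.

```shell
# ARN of a virtual MFA device, assuming it is named after the IAM user.
mfa_device_arn() {
  printf 'arn:aws:iam::%s:mfa/%s\n' "$1" "$2"
}

# Request short-lived credentials using an MFA code and print them as
# `export` statements. Requires an authenticated AWS CLI.
mfa_session_exports() {
  serial="$1"
  token_code="$2"
  aws sts get-session-token \
    --serial-number "$serial" \
    --token-code "$token_code" \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text |
  awk '{ printf "export AWS_ACCESS_KEY_ID=%s\nexport AWS_SECRET_ACCESS_KEY=%s\nexport AWS_SESSION_TOKEN=%s\n", $1, $2, $3 }'
}

# Example (account ID, user name, and code are placeholders):
# eval "$(mfa_session_exports "$(mfa_device_arn 123456789012 my-user)" 123456)"
```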

Logs via CloudWatch

via Web Console

Log into AWS web console
navigate to the AWS account that hosts your proxy instance (you may need to assume a role in that account)
then the region in that account in which your proxy instance is deployed. (default us-east-1)
then search or navigate to the AWS Lambdas feature, and find the specific one you wish to debug
find the tabs for Monitoring then within that, Logging, then click "go to Cloud Watch"

via CLI

Unless your AWS CLI is auth'd as a user who can review logs, first auth it for such a role.

You can do this with a new profile, or setting env variables as follows:

export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s" \
$(aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/MyAssumedRole \
--role-session-name MySessionName \
--query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
--output text))

Then, you can do a series of commands as follows:

aws logs describe-log-streams --log-group-name /aws/lambda/psoxy-azure-ad
aws logs get-log-events --log-group-name /aws/lambda/psoxy-azure-ad --log-stream-name [VALUE_FROM_LAST_COMMAND]
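The two commands can be chained so that the most recent log stream is picked automatically. This is a sketch using standard `aws logs` options; the function names are hypothetical, and it assumes the log group follows the `/aws/lambda/<function name>` convention shown above.

```shell
# Log group name for a Lambda function.
lambda_log_group() {
  printf '/aws/lambda/%s\n' "$1"
}

# Print the messages from the most recently active log stream of a function.
# Requires an AWS CLI authenticated as a principal that can read logs.
latest_log_events() {
  group="$(lambda_log_group "$1")"
  stream="$(aws logs describe-log-streams \
    --log-group-name "$group" \
    --order-by LastEventTime \
    --descending \
    --limit 1 \
    --query 'logStreams[0].logStreamName' \
    --output text)"
  aws logs get-log-events \
    --log-group-name "$group" \
    --log-stream-name "$stream" \
    --query 'events[].message' \
    --output text
}

# Example: latest_log_events psoxy-azure-ad
```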

Errors in Terraform apply

error creating Lambda Function URL

Something like the following:

Error: error creating Lambda Function URL (psoxy-outlook-mail): ResourceConflictException: Failed to create function url config for [functionArn = arn:aws:lambda:us-east-1:123456789012:function:psoxy-outlook-mail]. Error message:  FunctionUrlConfig exists for this Lambda function
│ {
│   RespMetadata: {
│     StatusCode: 409,
│     RequestID: "dfb1452c-df84-4231-946f-b97deb695ca9"
│   },
│   Message_: "Failed to create function url config for [functionArn = arn:aws:lambda:us-east-1:123456789012:function:psoxy-outlook-mail]. Error message:  FunctionUrlConfig exists for this Lambda function",
│   Type: "User"
│ }
│
│   with module.psoxy-msft-connector["outlook-mail"].aws_lambda_function_url.lambda_url,
│   on ../../modules/aws-psoxy-rest/main.tf line 26, in resource "aws_lambda_function_url" "lambda_url":
│   26: resource "aws_lambda_function_url" "lambda_url" {

Your Terraform state is inconsistent. Run something like the following, adapted for your connector:

terraform import module.psoxy-msft-connector\[\"outlook-mail\"\].aws_lambda_function_url.lambda_url psoxy-outlook-mail

NOTE: you likely need to change outlook-mail if your error is with a different data source. The \ chars are needed to escape the double-quotes/brackets in your bash command.
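As an alternative to backslash escaping, the resource address can be wrapped in single quotes. The hypothetical helper below prints the arguments for `terraform import` in that form for a given module name, connector ID, and function name.

```shell
# Print the `terraform import` arguments for a connector's function URL,
# with the resource address single-quoted instead of backslash-escaped.
function_url_import_args() {
  module_name="$1"
  connector_id="$2"
  function_name="$3"
  printf "'module.%s[\"%s\"].aws_lambda_function_url.lambda_url' %s\n" \
    "$module_name" "$connector_id" "$function_name"
}

# Example:
# function_url_import_args psoxy-msft-connector outlook-mail psoxy-outlook-mail
# Paste the printed arguments after `terraform import`.
```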

Permissions Errors

error reading SSM Parameters

Something like the following:

Error loading class co.worklytics.psoxy.Handler: missing config. no value for PSOXY_SALT: java.lang.Error
java.lang.Error: missing config. no value for PSOXY_SALT

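Check:

the SSM parameter exists in the AWS account
the SSM parameter can be read by the lambda's execution role (eg, it has an attached IAM policy that allows the SSM parameter to be read). You can test this with the IAM policy simulator, setting 'Role' to your lambda's execution role, 'Service' to 'AWS Systems Manager', 'Action' to 'Get Parameter' and 'Resource' to the SSM parameter's ARN.
the SSM parameter can be decrypted by the lambda's execution role (if it's encrypted with a KMS key)

Our Terraform examples should provide all of the above for you, but it's worth double-checking.

Setting IS_DEVELOPMENT_MODE to "true" in the Lambda's Env Vars via the console can enable some additional logging with detailed SSM error messages that will be helpful; but note that some of these errors will be expected in certain configurations.

If those are present, yet the error persists, it's possible that you have some org-level security constraint/policy preventing SSM parameters from being used / read. For example, you have a "default deny" policy set for SSM GET actions/etc. In such a case, you need to add the execution roles for each lambda as exceptions to such policies (find these under AWS --> IAM --> Roles).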
