# Deployment Guide
This guide provides instructions for deploying and managing the FedRAMP High Event-Driven Data Mesh infrastructure.
## Prerequisites

- AWS Account with appropriate permissions
- Terraform 1.0+
- AWS CLI configured with appropriate credentials
- kubectl configured for Kubernetes access (if using EKS)
- Databricks CLI configured with workspace credentials
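
Before starting, it can help to confirm each tool is installed and authenticated. A minimal check, assuming the standard CLIs are on your PATH:

```bash
# Verify prerequisite tooling and credentials
terraform version                # expect 1.0 or later
aws sts get-caller-identity      # confirms the AWS CLI credentials resolve
kubectl version --client         # only needed if deploying to EKS
databricks --version             # confirms the Databricks CLI is installed
```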
## Infrastructure Deployment

### 1. Initialize Terraform

- Clone the repository:
```bash
git clone https://github.com/frocore/fedramp-data-mesh.git
cd fedramp-data-mesh
```

- Initialize Terraform:
```bash
cd platform/infrastructure/terraform
terraform init -backend-config=environments/dev/backend.tfvars
```
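
The backend configuration file's contents are not shown here; if you need to add one for a new environment, it is just a few key/value settings for the Terraform remote state backend. A minimal sketch, assuming an S3 backend with DynamoDB state locking (all names are placeholders, not values from the repository):

```bash
# Hypothetical backend settings for a new environment (all values are placeholders)
mkdir -p environments/staging
cat > environments/staging/backend.tfvars <<'EOF'
bucket         = "fedramp-data-mesh-terraform-state-staging"
key            = "platform/terraform.tfstate"
region         = "us-east-1"
dynamodb_table = "fedramp-data-mesh-terraform-locks"
encrypt        = true
EOF
```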
### 2. Configure Environment Variables

Create a `.env` file with the necessary environment variables:

```bash
# AWS Configuration
export AWS_REGION=us-east-1
export AWS_PROFILE=fedramp-data-mesh

# Databricks Configuration
export DATABRICKS_ACCOUNT_ID=your-account-id
export DATABRICKS_ACCOUNT_USERNAME=your-username
export DATABRICKS_ACCOUNT_PASSWORD=your-password
```
Source the environment variables:
```bash
source .env
```
### 3. Deploy Infrastructure

- Plan the deployment:
```bash
terraform plan -var-file=environments/dev/terraform.tfvars
```
- Apply the changes:
```bash
terraform apply -var-file=environments/dev/terraform.tfvars
```
- Take note of the outputs, which include important information about the deployed resources.
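
The Terraform outputs can also be captured for reference in later steps:

```bash
# List all outputs, or save them as JSON for later reference
terraform output
terraform output -json > dev-outputs.json
```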
### 4. Deploy Kubernetes Components

- Configure kubectl to connect to the newly created EKS cluster:
```bash
aws eks update-kubeconfig --name fedramp-data-mesh-eks-dev --region us-east-1
```
- Deploy Kubernetes components:
```bash
cd ../kubernetes
kubectl apply -f namespace.yaml
kubectl apply -k schema-registry
kubectl apply -k kafka-connect
kubectl apply -k monitoring
```
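
A quick way to confirm the components came up, assuming the namespace created by `namespace.yaml` is named `fedramp-data-mesh` (check the manifest for the actual name):

```bash
# Verify pods and services in the data mesh namespace (namespace name is assumed)
kubectl get pods -n fedramp-data-mesh
kubectl get svc -n fedramp-data-mesh
```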
## Data Product Deployment

### 1. Create Domain Catalogs

- Log in to Databricks:
```bash
databricks configure --token
```
- Create catalogs for each domain:
```bash
# Create Project Management catalog
databricks unity-catalog catalogs create \
  --name project_management \
  --comment "Project Management domain catalog"

# Create Financials catalog
databricks unity-catalog catalogs create \
  --name financials \
  --comment "Financials domain catalog"
```
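
To confirm the catalogs were created, the same CLI command group should be able to list them (assuming a `list` subcommand parallel to `create`):

```bash
# List Unity Catalog catalogs (assumes the CLI's list subcommand mirrors create)
databricks unity-catalog catalogs list
```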
### 2. Deploy Kafka Connectors

- Configure Kafka Connect for source databases:
```bash
# Create Projects source connector
curl -X POST -H "Content-Type: application/json" \
  --data @domains/project-management/producers/project-state/connector-config.json \
  http://kafka-connect.fedramp-data-mesh.example.com:8083/connectors
```
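
After registering a connector, the Kafka Connect REST API can confirm it is running. The connector name below is a placeholder; the actual name is set in `connector-config.json`:

```bash
# List registered connectors
curl http://kafka-connect.fedramp-data-mesh.example.com:8083/connectors

# Check status of a specific connector ("projects-source" is a placeholder name)
curl http://kafka-connect.fedramp-data-mesh.example.com:8083/connectors/projects-source/status
```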
### 3. Deploy Spark Jobs

- Create a Databricks job for each processor:
```bash
# Create Project State Processor job
databricks jobs create --json @domains/project-management/processors/spark/job-config.json
```
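
The create command returns a `job_id`, which can then be used to trigger a run or look the job up later:

```bash
# Trigger a run of the newly created job (substitute the returned job_id)
databricks jobs run-now --job-id <JOB_ID>

# List jobs to find IDs later
databricks jobs list
```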
## Monitoring and Operations

### 1. Set Up Monitoring

- Configure CloudWatch Dashboards:
```bash
aws cloudwatch create-dashboard \
  --dashboard-name FedRAMP-DataMesh-Overview \
  --dashboard-body file://monitoring/cloudwatch-dashboards/overview.json
```

- Set up alerts:

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name DataMesh-Kafka-HighLag \
  --alarm-description "Alert when Kafka consumer lag is too high" \
  --metric-name "kafka-consumer-lag" \
  --namespace "AWS/MSK" \
  --statistic Average \
  --period 300 \
  --threshold 1000 \
  --comparison-operator GreaterThanThreshold \
  --dimensions "Name=ClusterName,Value=fedramp-data-mesh-kafka-dev" \
  --evaluation-periods 2 \
  --alarm-actions ${SNS_TOPIC_ARN}
```
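
To verify that the dashboard and alarm exist after creation:

```bash
# Confirm the dashboard and alarm were created
aws cloudwatch list-dashboards --dashboard-name-prefix FedRAMP-DataMesh
aws cloudwatch describe-alarms --alarm-names DataMesh-Kafka-HighLag
```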
### 2. Regular Maintenance

- Rotate encryption keys:
```bash
# Enable automatic rotation for the S3 KMS key
aws kms enable-key-rotation --key-id ${S3_KMS_KEY_ID}
```
- Update Kafka configurations:
```bash
aws kafka update-cluster-configuration \
  --cluster-arn ${KAFKA_CLUSTER_ARN} \
  --current-version ${CURRENT_CLUSTER_VERSION} \
  --configuration-info file://kafka-config-updates.json
```
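
The `${CURRENT_CLUSTER_VERSION}` value above can be read from the cluster itself, for example:

```bash
# Read the current MSK cluster version used by update-cluster-configuration
CURRENT_CLUSTER_VERSION=$(aws kafka describe-cluster \
  --cluster-arn ${KAFKA_CLUSTER_ARN} \
  --query 'ClusterInfo.CurrentVersion' \
  --output text)
```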
- Patch Kubernetes components:
```bash
kubectl apply -k schema-registry
```
## Backup and Disaster Recovery

### 1. Backup Strategy

- S3 data is automatically versioned and cross-region replicated
- Kafka topics should be configured with a replication factor of 3
- Critical configurations are stored in version control
- Database backups are automated through AWS Backup
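
These settings can be spot-checked from the CLI; the bucket name below is a placeholder for whichever data bucket Terraform created:

```bash
# Spot-check backup posture (bucket name is a placeholder)
aws s3api get-bucket-versioning --bucket fedramp-data-mesh-data-dev
aws s3api get-bucket-replication --bucket fedramp-data-mesh-data-dev
aws backup list-backup-plans
```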
### 2. Disaster Recovery

- In case of a region failure, follow these steps:
  - Activate standby infrastructure in the secondary region
  - Update DNS to point to the secondary region
  - Ensure all credentials and configurations are available
- Test DR procedures regularly:
```bash
# Run DR test script
./scripts/dr-test.sh
```
## Security Operations

### 1. Access Management

- Rotate credentials regularly:
```bash
# Rotate service account credentials
./scripts/rotate-credentials.sh
```
- Review access:
```bash
# Generate access report
./scripts/access-review.sh > access-review-$(date +%Y-%m-%d).txt
```
### 2. Security Monitoring

- Check GuardDuty findings:
```bash
aws guardduty list-findings --detector-id ${GUARDDUTY_DETECTOR_ID}
```
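
`list-findings` requires a detector ID; one way to look it up in the current region:

```bash
# Look up the GuardDuty detector ID for the current region
GUARDDUTY_DETECTOR_ID=$(aws guardduty list-detectors --query 'DetectorIds[0]' --output text)
```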
- Run security scans:
```bash
# Run infrastructure security scan
./scripts/security-scan.sh
```
## Troubleshooting

### Common Issues

#### 1. Kafka Connection Issues

- Check security groups
- Verify credentials
- Check network connectivity
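
For the connectivity checks, one approach (assuming MSK; the broker hostname and port below are placeholders):

```bash
# Retrieve the broker connection string for the cluster
aws kafka get-bootstrap-brokers --cluster-arn ${KAFKA_CLUSTER_ARN}

# Test TCP reachability to a broker endpoint (hostname and port are placeholders)
nc -zv b-1.fedramp-data-mesh-kafka-dev.example.amazonaws.com 9094
```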
#### 2. Databricks Job Failures

- Check job logs
- Verify access to S3
- Check schema compatibility issues
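
Job run details can be pulled from the CLI, assuming the legacy Databricks CLI `runs` command group; the run ID comes from the failing job run:

```bash
# Fetch details and output for a specific run (substitute the run ID)
databricks runs get --run-id <RUN_ID>
databricks runs get-output --run-id <RUN_ID>
```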
#### 3. Schema Evolution Errors

- Verify schema compatibility
- Check for breaking changes
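
Compatibility settings and registered schema versions can be inspected through the Schema Registry REST API; the hostname, port, and subject name below are placeholders:

```bash
# Global compatibility level (hostname is a placeholder)
curl http://schema-registry.fedramp-data-mesh.example.com:8081/config

# Registered versions for a subject ("project-state-value" is a placeholder)
curl http://schema-registry.fedramp-data-mesh.example.com:8081/subjects/project-state-value/versions
```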
For more detailed troubleshooting, refer to the Troubleshooting Guide.