2025-02-15
Cloud Migration Checklist: Planning Your Move to AWS/GCP
A practical, step-by-step checklist for planning and executing a cloud migration without losing your mind (or your data).
Before You Start
Cloud migration is one of those projects that seems straightforward until you're in the middle of it. The difference between a smooth migration and a nightmare usually comes down to preparation. This checklist is based on lessons learned from dozens of migrations — the gotchas, the things that break at 2 AM, and the shortcuts that aren't actually shortcuts.
Phase 1: Assessment
Before writing a single line of Terraform, you need to understand what you're working with.
Inventory Everything
- Servers: What's running, how much CPU/RAM/disk, what OS version
- Databases: Type, version, size, replication setup, backup schedule
- Storage: File systems, object storage, total data volume
- Networking: VPNs, firewalls, load balancers, DNS configuration
- External services: Third-party APIs, SaaS integrations, CDN configuration
- Scheduled jobs: Cron jobs, batch processes, data pipelines
This step is tedious but critical. You will discover servers nobody knew about. You will find cron jobs that haven't run in years but might still be important. Document everything.
Map Dependencies
Draw a diagram of how your services talk to each other. For each service, document:
- What it depends on (databases, APIs, message queues)
- What depends on it (other services, external consumers)
- Communication protocols (HTTP, gRPC, direct DB connections)
- Latency sensitivity (can it tolerate cross-region calls?)
This dependency map drives your migration order. Services with fewer dependencies migrate first.
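One way to turn the dependency map into a first-pass ordering is to count each service's connections and sort. The sketch below is a heuristic, not a full planning tool. The service names and the `DEPENDS_ON` map are hypothetical, only direct connections are counted, and data stores appear only as dependency targets because they are scheduled separately.

```python
from collections import defaultdict

# Hypothetical dependency map: service -> things it calls directly.
# Data stores and queues appear only as targets, not as keys.
DEPENDS_ON = {
    "web": ["api", "auth"],
    "api": ["orders-db", "auth", "queue"],
    "auth": ["users-db"],
    "report-job": ["orders-db"],
}


def coupling_scores(depends_on):
    """Return {service: (dependency_count, dependent_count)}."""
    dependents = defaultdict(int)
    for deps in depends_on.values():
        for dep in deps:
            dependents[dep] += 1
    return {
        service: (len(deps), dependents[service])
        for service, deps in depends_on.items()
    }


def migration_candidates(depends_on):
    """Sort services so those with the fewest dependencies come first.

    Ties are broken by the number of dependents, then by name, so the
    result is deterministic.
    """
    scores = coupling_scores(depends_on)
    return sorted(scores, key=lambda s: (scores[s][0], scores[s][1], s))


print(migration_candidates(DEPENDS_ON))
```

With the sample map this prints `['report-job', 'auth', 'web', 'api']`. Treat the result as a starting point for discussion; latency sensitivity and business risk still need a human decision.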
Establish Baselines
Before migrating, measure your current performance:
- Response times: P50, P95, P99 for all endpoints
- Error rates: By service and by endpoint
- Resource utilization: CPU, memory, disk I/O, network
- Traffic patterns: Peak hours, seasonal variations
You'll need these numbers to validate that the migration didn't degrade performance.
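If your monitoring stack doesn't already report percentiles, they are simple to compute from raw latency samples. This is a minimal sketch using the nearest-rank method. The function names are illustrative, and your monitoring tool may use a different interpolation, so compare like with like.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def latency_baseline(samples_ms):
    """Summarize one endpoint's latency samples as P50/P95/P99."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

Store the result per endpoint alongside the date and traffic level it was captured at, so the post-migration comparison uses a comparable window.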
Phase 2: Architecture Design
Choose Your Target Architecture
Don't just "lift and shift" your existing architecture to the cloud. This is your chance to make improvements — but be strategic about what you change.
Lift and shift (rehost): Move VMs as-is to cloud equivalents. Fastest to execute but doesn't leverage cloud-native benefits.
Replatform: Minor adjustments like moving to managed databases (RDS, Cloud SQL) instead of self-managed. Good balance of speed and improvement.
Refactor: Redesign for cloud-native patterns (containers, serverless, managed services). Most beneficial but most time-consuming.
Our recommendation: replatform for the initial migration, then refactor incrementally once you're in the cloud.
Design for Failure
Cloud environments fail differently from on-premises ones. Design for it:
- Use multiple availability zones for redundancy
- Implement health checks and auto-healing
- Design services to be stateless where possible
- Use managed services to offload operational burden
- Plan for AZ failures, region failures, and service outages
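To make "health checks and auto-healing" concrete, here is a small sketch of the threshold logic behind most health checks: an instance is marked unhealthy only after several consecutive failed probes, and healthy again only after several consecutive successes. The class name and default thresholds are assumptions for illustration. In practice the load balancer or instance group does this for you.

```python
class HealthTracker:
    """Tracks probe results and flips state only after consecutive results."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._failures = 0
        self._successes = 0

    def record(self, probe_ok):
        """Record one probe result and return the current health state."""
        if probe_ok:
            self._successes += 1
            self._failures = 0
            if not self.healthy and self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self.healthy and self._failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy
```

The thresholds trade detection speed against flapping: a single slow response should not trigger a replacement, but three in a row probably should.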
Infrastructure as Code
Define everything in Terraform, Pulumi, or your IaC tool of choice before the migration. This gives you:
- A reviewable, version-controlled infrastructure definition
- The ability to spin up identical environments for testing
- A repeatable way to roll back infrastructure changes (stateful resources such as databases still need their own rollback plan)
- Documentation that stays in sync with reality
Phase 3: Security & Compliance
IAM Strategy
Design your IAM structure before creating any resources:
- Use separate accounts/projects for dev, staging, and production
- Implement least-privilege access with role-based policies
- Enable MFA for all human users
- Use service accounts with scoped permissions for applications
- Set up audit logging from day one
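As an illustration of least privilege, the sketch below builds an AWS IAM policy document that grants read-only access to a single S3 bucket and nothing else. The bucket name and function name are hypothetical. The equivalent on GCP would be a narrowly scoped role binding on the bucket.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build an IAM policy document allowing read-only access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading is granted on the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


policy = read_only_bucket_policy("example-app-assets")  # hypothetical bucket
print(json.dumps(policy, indent=2))
```

Note what is absent: no wildcard actions, no wildcard resources. Starting narrow and widening on demand is much easier than auditing broad grants later.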
Network Security
- Design your VPC with proper CIDR planning (leave room to grow)
- Segment networks by environment and sensitivity
- Use private subnets for databases and internal services
- Implement security groups/firewall rules with deny-by-default
- Set up VPN or private connectivity for hybrid scenarios
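CIDR planning is easy to sanity-check in code before anything is created. This sketch uses Python's standard `ipaddress` module to carve a VPC range into equal subnets and report how many are left for growth. The `10.0.0.0/16` range and the subnet names are assumptions for the example.

```python
import ipaddress


def plan_subnets(vpc_cidr, new_prefix, names):
    """Assign consecutive /new_prefix subnets to the given names.

    Returns (assignments, spare), where spare is the number of
    unallocated subnets left for future growth.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = list(vpc.subnets(new_prefix=new_prefix))
    if len(names) > len(subnets):
        raise ValueError("not enough address space for the requested subnets")
    assignments = dict(zip(names, subnets))
    return assignments, len(subnets) - len(names)


plan, spare = plan_subnets(
    "10.0.0.0/16", 20, ["public-a", "public-b", "private-a", "private-b"]
)
for name, net in plan.items():
    print(f"{name}: {net} ({net.num_addresses} addresses)")
print(f"spare /20 blocks: {spare}")
```

Also check that the range does not overlap your on-premises networks, or private connectivity during the hybrid phase will be painful.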
Data Protection
- Enable encryption at rest for all storage (databases, object storage, volumes)
- Enable encryption in transit (TLS everywhere)
- Plan your key management strategy (cloud KMS vs. self-managed)
- Implement backup strategy with tested restore procedures
- Document data residency requirements (region restrictions)
Phase 4: Migration Execution
Set Up the Landing Zone
Before migrating any workloads:
- Create accounts/projects with proper organizational hierarchy
- Configure networking (VPCs, subnets, peering, DNS)
- Set up IAM roles and policies
- Enable logging and monitoring
- Configure backup policies
- Validate security controls
Migration Order
Migrate in this order to minimize risk:
1. Stateless services with few dependencies (lowest risk)
2. Internal tools that can tolerate downtime
3. Read replicas of databases (test without affecting production)
4. Application tier (with database connections pointing to existing databases)
5. Databases (the scariest part, so do this last)
Database Migration
Database migration deserves special attention. There are three common options, each with a different trade-off between downtime and complexity:
Continuous replication: Set up replication from source to target, let it sync, then cut over. Best option when available (AWS DMS, GCP Database Migration Service).
Dump and restore: Take a backup, restore to the new database, update connection strings. Requires downtime proportional to database size.
Dual-write: Application writes to both old and new databases during transition. Complex but zero-downtime. Only use if replication isn't an option.
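Whichever option you choose, verify that source and target actually match before cutting over. The sketch below compares row counts and a content hash per table. It uses in-memory SQLite databases as stand-ins, and the helper names are hypothetical. Against real engines you would use their own drivers, and you should check that both sides return the same Python types, because the hash is built from each row's `repr`.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, ordered by key.

    `table` and `key_column` are interpolated into the SQL text, so they
    must come from trusted configuration, never from user input.
    """
    digest = hashlib.sha256()
    count = 0
    query = f'SELECT * FROM "{table}" ORDER BY "{key_column}"'
    for row in conn.execute(query):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def tables_match(source, target, table, key_column):
    """True when both databases hold identical rows for the table."""
    return table_fingerprint(source, table, key_column) == table_fingerprint(
        target, table, key_column
    )


def demo_db(rows):
    """Create a throwaway in-memory database standing in for a real one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn
```

For very large tables, hash in key ranges rather than in one pass, so a mismatch points at a specific slice instead of the whole table.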
The Cutover
Plan your cutover like a military operation:
- Runbook: Step-by-step document for every action during cutover
- Rollback plan: Specific steps to revert if something goes wrong
- Communication plan: Who to notify before, during, and after
- Timing: Choose a low-traffic window
- Team: Everyone involved should be online and on a call
- Monitoring: Extra dashboards and alerts during the cutover period
Practice the cutover in staging at least once. Ideally twice.
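A runbook and its rollback plan can share one structure: every step carries its own undo. The following is a minimal sketch of that idea, with hypothetical names (`Step`, `run_cutover`). On the first failure it rolls back the completed steps in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[], None]
    rollback: Callable[[], None]


def run_cutover(steps: List[Step], log: Callable[[str], None] = print) -> bool:
    """Apply steps in order. On failure, undo completed steps in reverse.

    Returns True if every step succeeded, False if a rollback happened.
    """
    completed: List[Step] = []
    for step in steps:
        try:
            log(f"apply: {step.name}")
            step.apply()
            completed.append(step)
        except Exception as exc:
            log(f"failed: {step.name} ({exc})")
            for done in reversed(completed):
                log(f"rollback: {done.name}")
                done.rollback()
            return False
    return True
```

Even if the real cutover is run by hand, writing it in this shape forces you to answer "how do we undo this?" for every step before the night of the migration.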
Phase 5: Validation
Smoke Tests
Immediately after cutover:
- All endpoints return expected responses
- Authentication works
- Database reads and writes succeed
- Scheduled jobs execute
- Integrations with external services work
- Email/notification systems function
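The checklist above works best as a script you can rerun on demand. Here is a small sketch of a runner, assuming each check is a zero-argument function that returns `True` on success. The check names in the usage note are placeholders for your own HTTP, database, and integration probes.

```python
from typing import Callable, Dict


def run_smoke_tests(checks: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run every check and return {name: "ok" | "fail" | "error: ..."}.

    All checks run even if an earlier one fails, so a single report
    shows everything that is broken.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception as exc:
            results[name] = f"error: {exc}"
    return results


def all_passed(results: Dict[str, str]) -> bool:
    """True only when every check reported "ok"."""
    return all(status == "ok" for status in results.values())
```

You might register entries such as `"auth login"` or `"db write"` and gate the "cutover complete" announcement on `all_passed` returning `True`.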
Performance Validation
Compare against your baselines:
- Response times are within acceptable range
- Error rates haven't increased
- Resource utilization is as expected
- No memory leaks or connection pool exhaustion
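Comparing against baselines is easier with an explicit tolerance than by eyeballing dashboards. This sketch flags any metric that got worse by more than a chosen margin. The 10% default and the metric names are assumptions, and it expects metrics where lower is better.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current value exceeds baseline by > tolerance.

    Both arguments map metric name -> value, where lower is better
    (latency in ms, error rate, and so on). Metrics missing from
    `current` are reported too, since a missing metric is a finding.
    """
    regressions = {}
    for metric, base in baseline.items():
        now = current.get(metric)
        if now is None:
            regressions[metric] = "missing after migration"
        elif now > base * (1 + tolerance):
            regressions[metric] = f"{base} -> {now}"
    return regressions


baseline = {"p50_ms": 40, "p95_ms": 120, "error_rate": 0.002}
current = {"p50_ms": 42, "p95_ms": 150}
print(find_regressions(baseline, current))
```

Here `p50_ms` is within tolerance, `p95_ms` is flagged, and `error_rate` is reported as missing.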
Security Validation
- All network access controls are in place
- No unintended public exposure
- Encryption at rest and in transit verified
- IAM policies are correct
- Audit logging is capturing events
Phase 6: Post-Migration
Optimization
Once stable in the cloud, optimize:
- Right-size instances based on actual utilization
- Implement auto-scaling for variable workloads
- Evaluate reserved instances or committed use discounts
- Move appropriate workloads to spot/preemptible instances
- Enable cloud-native monitoring and alerting
Decommission Old Infrastructure
Don't rush this. Keep old infrastructure running (but not serving traffic) for at least 2-4 weeks. You'll be glad you did when you discover something you forgot to migrate.
Document Everything
Update your documentation to reflect the new architecture:
- Network diagrams
- Runbooks and playbooks
- On-call procedures
- Disaster recovery plan
- Cost monitoring and alerts
Common Pitfalls
- Underestimating data transfer time: Moving terabytes takes longer than you think. Start early.
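- Forgetting about DNS TTL: Resolvers cache the old record for up to its current TTL, so lower your TTLs at least one full TTL period before the cutover. Otherwise the DNS change will not propagate quickly.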
- Not testing backups: A backup you haven't restored isn't a backup.
- Skipping the staging rehearsal: If you haven't practiced the migration end-to-end, you're not ready.
- Trying to migrate and refactor at the same time: Migrate first, optimize later.
- Ignoring costs: Cloud resources start costing money immediately. Set up billing alerts before you start.
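Two of these pitfalls come down to arithmetic you can do up front. The sketch below estimates transfer time for a given link and works out the earliest safe cutover after lowering a DNS TTL. The 0.7 efficiency factor is a placeholder assumption, not a measurement, so substitute the throughput you actually observe.

```python
from datetime import datetime, timedelta

TB = 10**12  # decimal terabyte, in bytes


def transfer_hours(data_bytes, link_mbps, efficiency=0.7):
    """Estimate hours to move data_bytes over a link of link_mbps megabits/s.

    `efficiency` is the assumed fraction of the link usable for the
    transfer (protocol overhead, competing traffic).
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    seconds = (data_bytes * 8) / (link_mbps * 1_000_000 * efficiency)
    return seconds / 3600


def earliest_cutover(ttl_lowered_at, old_ttl_seconds):
    """Resolvers may keep the old record for up to the old TTL after the
    change, so the lowered TTL is only fully in effect after that."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


print(f"{transfer_hours(10 * TB, 1000, efficiency=1.0):.1f} hours at line rate")
print(f"{transfer_hours(10 * TB, 1000):.1f} hours at 70% efficiency")
```

Even at full line rate, 10 TB over a 1 Gbps link takes about 22 hours, which is why the bulk copy should start days before the cutover window rather than inside it.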
Planning a cloud migration? We've guided dozens of teams through successful migrations to AWS and GCP. Let's plan yours together.