Route 53 Failover Configuration: My Implementation Notes

Project Overview

Implemented a fault-tolerant routing solution using Amazon Route 53 for a café website. The goal was to ensure continuous service availability by automatically failing over to a backup instance if the primary server becomes unavailable.

Starting Environment:

CloudFormation-provisioned infrastructure with:
- Two EC2 instances with LAMP stack and café website
- Primary instance (CafeInstance1) in us-west-2a
- Secondary instance (CafeInstance2) in us-west-2b
Both instances pre-configured with:
- Full LAMP stack installation
- Identical café website deployment
- Public subnet placement for accessibility

Project Objectives:

Implement automated health monitoring for the primary website
Configure email-based alerting for downtime
Set up DNS-based failover routing
Ensure seamless user experience during failover events

Environment Validation

AWS Console Navigation:

Accessed AWS details:
- Clicked "Details" at top of page
- Selected "Show" for AWS credentials
- Copied down all instance information
Navigating to EC2:
- Opened AWS Management Console
- Used search bar to find and select "EC2"
- Selected "Instances" from left navigation pane

Instance Verification:

Retrieved essential information from CloudFormation outputs:
- IP addresses for both instances
- Primary and secondary website URLs
Performed initial testing:
- Accessed both café websites independently
- Verified server information displayed correct AZs
- Tested functionality by placing sample orders
- Confirmed order timestamps reflected each server's time zone
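
The CloudFormation lookup above can also be scripted. A minimal sketch, assuming boto3 is installed and credentials are configured; the output key names and addresses in the sample are placeholders I made up, not the keys the lab stack actually exposes.

```python
def outputs_to_dict(stack_description):
    """Flatten the Outputs list of one describe_stacks entry into {key: value}."""
    return {
        item["OutputKey"]: item["OutputValue"]
        for item in stack_description.get("Outputs", [])
    }


def fetch_stack_outputs(stack_name, region="us-west-2"):
    """Query CloudFormation for real; needs boto3 and valid credentials."""
    import boto3  # imported lazily so outputs_to_dict works without it

    client = boto3.client("cloudformation", region_name=region)
    stack = client.describe_stacks(StackName=stack_name)["Stacks"][0]
    return outputs_to_dict(stack)


# Hand-written sample shaped like one entry of the describe_stacks response.
# Key names and IP addresses are hypothetical.
sample_stack = {
    "Outputs": [
        {"OutputKey": "PrimaryWebsiteURL", "OutputValue": "http://192.0.2.10/cafe"},
        {"OutputKey": "SecondaryWebsiteURL", "OutputValue": "http://192.0.2.20/cafe"},
    ]
}

urls = outputs_to_dict(sample_stack)
print(urls["PrimaryWebsiteURL"])
```

Keeping the flattening step separate from the API call makes it easy to check without touching AWS.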

Route 53 Health Check Configuration

Console Navigation:

Accessed Route 53:
- Opened Services menu in AWS Console
- Searched for and selected "Route 53"
- Ignored IAM-related warning messages (expected in lab environment)
- Selected "Health checks" from left navigation

Primary Endpoint Monitoring:

Name: Primary-Website-Health
Monitor Type: Endpoint
Endpoint Type: IP address (CafeInstance1 public IP)
Path: /cafe
Request Interval: Fast (10 seconds)
Failure Threshold: 2
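
For reference, a rough boto3 equivalent of the console settings above. This is a sketch, not what I ran in the lab: the HTTP protocol and port 80 are assumptions (the notes only record the path, interval, and threshold), and the helper names are hypothetical.

```python
import uuid


def build_health_check_request(ip_address, path="/cafe", interval=10,
                               failure_threshold=2, port=80):
    """Build keyword arguments for Route 53 create_health_check."""
    if interval not in (10, 30):
        raise ValueError("Route 53 request interval must be 10 or 30 seconds")
    if not 1 <= failure_threshold <= 10:
        raise ValueError("failure threshold must be between 1 and 10")
    return {
        # CallerReference must be unique per create request.
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "IPAddress": ip_address,
            "Port": port,            # assumption: plain HTTP on port 80
            "Type": "HTTP",          # assumption: protocol not recorded in the notes
            "ResourcePath": path,
            "RequestInterval": interval,
            "FailureThreshold": failure_threshold,
        },
    }


def create_health_check(ip_address, name="Primary-Website-Health"):
    """Create the health check and tag it with a name; needs boto3 and credentials."""
    import boto3

    client = boto3.client("route53")
    response = client.create_health_check(**build_health_check_request(ip_address))
    health_check_id = response["HealthCheck"]["Id"]
    # The name shown in the console is stored as a "Name" tag.
    client.change_tags_for_resource(
        ResourceType="healthcheck",
        ResourceId=health_check_id,
        AddTags=[{"Key": "Name", "Value": name}],
    )
    return health_check_id


# 192.0.2.10 is a documentation placeholder, not the real instance IP.
request = build_health_check_request("192.0.2.10")
print(request["HealthCheckConfig"])
```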

Alert System Setup:

Created new SNS topic for notifications
Configured email alerts for health check failures
Under the advanced configuration, selected the fast (10-second) request interval and a failure threshold of 2 for quicker detection
Verified email subscription through confirmation link
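
The alerting setup could be approximated in code as an SNS topic plus a CloudWatch alarm on the health check's HealthCheckStatus metric. A sketch under assumptions: the alarm name, the 60-second period, and the single evaluation period are my choices, not values from the lab. Route 53 health check metrics are published in us-east-1, so the alarm and topic are created there.

```python
def build_alarm_request(health_check_id, topic_arn,
                        alarm_name="Primary-Website-Health-Alarm"):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": alarm_name,  # hypothetical name
        "Namespace": "AWS/Route53",
        "MetricName": "HealthCheckStatus",  # 1 = healthy, 0 = unhealthy
        "Dimensions": [{"Name": "HealthCheckId", "Value": health_check_id}],
        "Statistic": "Minimum",
        "Period": 60,               # assumption
        "EvaluationPeriods": 1,     # assumption
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [topic_arn],
    }


def create_alert(health_check_id, email, topic_name="Primary-Website-Health"):
    """Create topic, email subscription, and alarm; needs boto3 and credentials."""
    import boto3

    sns = boto3.client("sns", region_name="us-east-1")
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    # The recipient still has to click the confirmation link in the email.
    sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    cloudwatch.put_metric_alarm(**build_alarm_request(health_check_id, topic_arn))
    return topic_arn


# Placeholder identifiers for illustration only.
alarm = build_alarm_request(
    "00000000-0000-0000-0000-000000000000",
    "arn:aws:sns:us-east-1:123456789012:Primary-Website-Health",
)
print(alarm["ComparisonOperator"], alarm["Threshold"])
```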

DNS Configuration

Accessing Route 53 DNS Settings:

Navigation steps:
- In Route 53 console, selected "Hosted zones" from left navigation
- Located my unique domain name (format: XXXXXX_XXXXXXXXXX.vocareum.training)
- Selected domain to view existing records
- Used "Create record" button for new entries

Hosted Zone Setup:

Used provided domain: XXXXXX_XXXXXXXXXX.vocareum.training
Preserved existing NS and SOA records
Implemented failover routing strategy

Primary A Record:

Record Name: www
Type: A (IPv4 address)
TTL: 15 seconds
Routing Policy: Failover (Primary)
Target: CafeInstance1 IP
Health Check: Primary-Website-Health
Record ID: FailoverPrimary

Secondary A Record:

Record Name: www
Type: A (IPv4 address)
TTL: 15 seconds
Routing Policy: Failover (Secondary)
Target: CafeInstance2 IP
Health Check: None
Record ID: FailoverSecondary
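
The two records above map onto a single Route 53 change batch. A minimal sketch, assuming boto3; the zone name, IP addresses, and health check ID in the example are placeholders, and UPSERT is my choice so the script can be re-run safely.

```python
def failover_record(zone_name, ip_address, role, set_identifier,
                    health_check_id=None, ttl=15):
    """Build one UPSERT change for a failover A record named www.<zone_name>."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": f"www.{zone_name}",
        "Type": "A",
        "SetIdentifier": set_identifier,  # the "Record ID" field in the console
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip_address}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


def build_change_batch(zone_name, primary_ip, secondary_ip, health_check_id):
    """Primary record carries the health check; secondary has none."""
    return {
        "Comment": "Failover routing for the cafe website",
        "Changes": [
            failover_record(zone_name, primary_ip, "PRIMARY",
                            "FailoverPrimary", health_check_id),
            failover_record(zone_name, secondary_ip, "SECONDARY",
                            "FailoverSecondary"),
        ],
    }


def apply_change_batch(hosted_zone_id, change_batch):
    """Submit the batch; needs boto3 and credentials."""
    import boto3

    client = boto3.client("route53")
    return client.change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=change_batch
    )


# Placeholder values for illustration only.
batch = build_change_batch("example.com", "192.0.2.10", "192.0.2.20", "hc-placeholder")
for change in batch["Changes"]:
    rrset = change["ResourceRecordSet"]
    print(rrset["SetIdentifier"], rrset["Failover"], rrset.get("HealthCheckId"))
```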

Failover Testing Process

Console Navigation for Testing:

Stopping primary instance:
- Returned to EC2 console via Services menu
- Selected CafeInstance1 from instances list
- Used Instance state menu → Stop instance
- Confirmed action in Stop instance dialog
Monitoring failover:
- Returned to Route 53 via Services menu
- Selected Health checks from left navigation
- Found Primary-Website-Health check
- Accessed Monitoring tab in lower pane

Testing Steps:

Initial validation:
- Accessed website through Route 53 domain
- Confirmed traffic routing to primary instance
- Verified correct AZ display (us-west-2a)
Failover simulation:
- Initiated controlled shutdown of CafeInstance1
- Monitored health check status changes
- Observed time to "Unhealthy" status
- Received and verified email notification
Failover verification:
- Refreshed café website after status change
- Confirmed automatic switch to us-west-2b
- Validated full functionality on secondary instance
- Monitored how long resolvers kept returning the cached primary IP before switching to the secondary
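
Instead of refreshing the browser, the switch can be watched by polling DNS. A small sketch using only the standard library; the function name and default poll settings are hypothetical. The resolver and sleep function are injectable so the loop can be exercised without real lookups. Note that the local resolver may cache answers on its own schedule, so observed timings are approximate.

```python
import socket
import time


def watch_resolution(hostname, resolve=socket.gethostbyname, polls=12,
                     delay=5.0, sleep=time.sleep):
    """Poll DNS and return (poll_index, ip) each time the answer changes.

    A failed lookup is recorded as ip=None.
    """
    changes = []
    last = object()  # sentinel so the first answer is always recorded
    for index in range(polls):
        try:
            ip = resolve(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            ip = None
        if ip != last:
            changes.append((index, ip))
            last = ip
        if index < polls - 1:
            sleep(delay)
    return changes


# Simulated answers: primary for two polls, then the secondary takes over.
fake_answers = iter(["192.0.2.10", "192.0.2.10", "192.0.2.20"])
observed = watch_resolution(
    "www.example.com",
    resolve=lambda name: next(fake_answers),
    polls=3,
    delay=0,
    sleep=lambda seconds: None,
)
print(observed)  # [(0, '192.0.2.10'), (2, '192.0.2.20')]
```

For a real run, call watch_resolution with the www record of the hosted zone and the default resolver while the primary instance is being stopped.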

Summary: Implementation Insights

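10-second request interval with a failure threshold of 2 flags an outage after about 20 seconds of consecutive failed checks, at the cost of more sensitivity to brief blips than the standard 30-second interval
15-second TTL keeps resolver caches short so failover takes effect quickly, but it increases DNS query volume and may need adjustment in production
Clients keep using the cached primary IP until the TTL expires, so the switch is not instantaneous once the health check turns unhealthy
SNS email notifications arrived as expected when the health check failed
Secondary record works without a health check; Route 53 treats a record with no associated health check as always healthy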
Implementation provides automated failover with minimal service disruption
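
As a back-of-the-envelope check on the timing, the settings used here can be combined into a rough failover estimate. This is a simplification I am assuming for illustration: it ignores how Route 53 aggregates results across its health checkers and any resolver that holds records longer than the TTL.

```python
def estimated_failover_seconds(interval, failure_threshold, ttl):
    """Rough estimate: time to detect the failure plus time for caches to expire."""
    detection = interval * failure_threshold  # consecutive failed checks
    return detection + ttl


# Values from this implementation: 10 s interval, threshold 2, 15 s TTL.
print(estimated_failover_seconds(10, 2, 15))  # 35
# Same threshold and TTL with the standard 30 s interval, for comparison.
print(estimated_failover_seconds(30, 2, 15))  # 75
```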

This implementation demonstrates how Route 53 can provide robust failover capabilities for web applications. The combination of health checks, DNS routing, and notification systems creates a reliable high-availability solution that can handle instance failures with minimal impact on service availability.