How can I troubleshoot issues with my weighted routing policy in Route 53?

I get unexpected results when testing the DNS resolution for a weighted routing policy in Amazon Route 53.

Short description

Suppose that you created four weighted text (TXT) records with the name "weighted.awsexampledomain.com". Each record has a Time to Live (TTL) of 300 seconds, and the weights are configured as follows:

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health check associated
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health check associated
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health check associated
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health check associated

This configuration is referenced in the following examples.

Resolution

Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you're using the most recent AWS CLI version.

Test your weighted routing policy to identify the issue

Send a large number of queries (10,000 or more) to test your weighted routing policy, because a small sample doesn't reflect the configured proportions reliably. Test the DNS resolution from multiple locations, or query the authoritative name servers directly to bypass resolver caches. Use the following scripts to send repeated DNS queries for your domain name.

Send DNS queries using the recursive resolver:

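#!/bin/bash
# Replace <domain-name>, <type>, and RecursiveResolver_IP with your own values.
for i in {1..10000}
do
  # +short prints only the record value.
  domain=$(dig <domain-name> <type> @RecursiveResolver_IP +short)
  echo "$domain" >> RecursiveResolver_results.txt
done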

Send DNS queries directly to the authoritative name servers:

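#!/bin/bash
# Replace <domain-name>, <type>, and AuthoritativeNameserver_IP with your own values.
for i in {1..10000}
do
  # +short prints only the record value.
  domain=$(dig <domain-name> <type> @AuthoritativeNameserver_IP +short)
  echo "$domain" >> AuthoritativeNameServer_results.txt
done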

Example output, with the results counted by using awk, sort, and uniq:

$ for i in {1..10000}; do domain=$(dig weighted.awsexampledomain.com. TXT @172.16.173.64 +short); echo -e  "$domain" >> RecursiveResolver_results.txt; done
$ awk ' " " ' RecursiveResolver_results.txt | sort | uniq -c
1344 "Record with Weight 20"
3780 "Record with Weight 50"
4876 "Record with Weight 70"
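
To compare counts like these against the configured weights, it can help to convert them to percentages. The following is a minimal sketch, not part of the original procedure: the function name summarize_results is hypothetical, and it assumes that the results file holds one answer per line, as written by the scripts above.

```shell
#!/bin/bash
# summarize_results FILE: count each distinct answer in FILE and print
# "count percent answer", most frequent first. Blank lines (queries that
# returned no answer) are skipped.
summarize_results() {
  awk 'NF { count[$0]++; total++ }
       END {
         for (a in count)
           printf "%d %.1f%% %s\n", count[a], 100 * count[a] / total, a
       }' "$1" |
    sort -rn
}
```

For example, summarize_results RecursiveResolver_results.txt run on the sample results above would report about 48.8%, 37.8%, and 13.4% for the records with weights 70, 50, and 20.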

Use your test results to troubleshoot your specific issue

Issue: Endpoint resources of the weighted records aren't receiving the expected traffic ratio.

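Route 53 sends traffic to resources based on the weight assigned to the record as a proportion of the total weight for all records. For the example records, the total weight is 140, so the expected shares are about 14% (20/140), 36% (50/140), and 50% (70/140). However, intermediate DNS resolvers cache the DNS responses for the duration of the record TTL. Until the cached response expires, every client that uses the same resolver is directed to the same endpoint, so the traffic ratio that you observe through a caching resolver can differ from the configured weights.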
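
As a rough illustration of the proportion described above, the following sketch computes each record's expected share from a list of weights. The function name expected_share is hypothetical, and the sketch assumes that all records are healthy. When every weight is zero, it falls back to equal shares.

```shell
#!/bin/bash
# expected_share WEIGHT...: print each weight's share of the total weight.
# Assumes all records are healthy. If every weight is zero, each record
# gets an equal share.
expected_share() {
  local total=0 w
  for w in "$@"; do total=$((total + w)); done
  for w in "$@"; do
    awk -v w="$w" -v t="$total" -v n="$#" \
      'BEGIN { printf "weight %d: %.1f%%\n", w, (t ? 100 * w / t : 100 / n) }'
  done
}
```

For the example configuration, expected_share 0 20 50 70 prints 0.0%, 14.3%, 35.7%, and 50.0%.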

Example

You query against the caching DNS resolver 192.168.1.2:

$ for i in {1..10000}; do domain=$(dig weighted.awsexampledomain.com. TXT @192.168.1.2 +short); echo -e  "$domain" >> CachingResolver_results.txt; done

$ awk ' " " ' CachingResolver_results.txt | sort | uniq -c
3561 "Record with Weight 20"
1256 "Record with Weight 50"
5183 "Record with Weight 70"

Notice that the preceding results don't match the configured weights because of caching at the recursive DNS resolver.
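
One way to check whether a resolver is answering from its cache is to look at the TTL in the answer. A value below the configured 300 seconds that counts down between queries suggests a cached response. The following is a sketch under that assumption. The function names answer_ttl and remaining_ttl are hypothetical, and the parsing assumes the usual "name TTL class type data" layout of dig +noall +answer output.

```shell
#!/bin/bash
# answer_ttl: read "dig +noall +answer" output on stdin and print the TTL
# (second field) of each answer line. Comment lines that start with ";"
# are ignored.
answer_ttl() {
  awk 'NF >= 5 && $1 !~ /^;/ { print $2 }'
}

# remaining_ttl NAME TYPE RESOLVER_IP: print the TTL that the resolver
# returns for the record.
remaining_ttl() {
  dig "$1" "$2" @"$3" +noall +answer | answer_ttl
}
```

For example, you might run remaining_ttl weighted.awsexampledomain.com. TXT 192.168.1.2 a few times, several seconds apart, and compare the values.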

Issue: Some weighted records aren't returned.

Example

Some health checks are failing:

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health Check Success
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health Check Success
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health Check Success

$ for i in {1..10000}; do domain=$(dig weighted.awsexampledomain.com. TXT @192.168.1.2 +short); echo -e  "$domain" >> HealthCheck_results.txt; done

$ awk ' " " ' HealthCheck_results.txt | sort | uniq -c
3602 "Record with Weight 20"
6398 "Record with Weight 70"

In this example, the "Record with Weight 50" isn't returned by Route 53 because its health check is failing.
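
To confirm which health checks are failing, you can ask Route 53 for the status that its health checkers report. The following is a minimal sketch that wraps the AWS CLI get-health-check-status command. The function name health_check_status is hypothetical, and you must supply the ID of the health check that's associated with the record.

```shell
#!/bin/bash
# health_check_status ID: print the status that each Route 53 health
# checker Region reports for the health check with the given ID.
health_check_status() {
  aws route53 get-health-check-status \
    --health-check-id "$1" \
    --query 'HealthCheckObservations[].[Region,StatusReport.Status]' \
    --output text
}
```

Each output line shows a health checker Region followed by the status that it reported.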

Issue: All weighted records are unhealthy.

Even if none of the records in a group of records are healthy, Route 53 must still provide a response to the DNS queries. However, there's no basis for choosing one record over another. In this case, Route 53 considers all the records in the group to be healthy. One record is selected based on the routing policy and the values that you specify for each record.

Example

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health Check Fail

$ for i in {1..10000}; do domain=$(dig weighted.awsexampledomain.com. TXT @205.251.194.16 +short); echo -e  "$domain" >> All_UnHealthy_results.txt; done

$ awk ' " " ' All_UnHealthy_results.txt | sort | uniq -c
1446 "Record with Weight 20"
3554 "Record with Weight 50"
5000 "Record with Weight 70"

In this example, Route 53 considered all records healthy (Fail Open). Route 53 responded to the DNS requests based on the configured proportions. "Record with Weight 0" isn't returned because its weight is zero.

Note: If you assign nonzero weights to some records and zero weights to others, then health checks work the same as when all records have nonzero weights, with the following exceptions:

  • Route 53 initially considers only the healthy nonzero weighted records, if any.
  • If all nonzero records are unhealthy, then Route 53 considers the healthy zero weighted records.
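
The selection rules described above can be sketched as a small model. This is an illustration of the documented behavior under stated assumptions, not Route 53's implementation. The function name candidate_shares and the "label weight status" input format are hypothetical.

```shell
#!/bin/bash
# candidate_shares: read "label weight status" lines on stdin, where status
# is "healthy" or "unhealthy", and print "label percent" for each record
# that could be returned under the rules described above.
candidate_shares() {
  awk '
    { label[NR] = $1; weight[NR] = $2; healthy[NR] = ($3 == "healthy") }
    $3 == "healthy" { any_healthy = 1 }
    END {
      # Fail open: if nothing is healthy, treat every record as healthy.
      for (i = 1; i <= NR; i++) ok[i] = (any_healthy ? healthy[i] : 1)
      # Healthy nonzero-weighted records are considered first.
      for (i = 1; i <= NR; i++) if (ok[i] && weight[i] > 0) total += weight[i]
      for (i = 1; i <= NR; i++) {
        if (!ok[i]) continue
        if (total > 0 && weight[i] > 0)
          printf "%s %.1f%%\n", label[i], 100 * weight[i] / total
        else if (total == 0)
          zero[++n] = label[i]
      }
      # Only zero-weighted candidates remain: equal probability.
      for (i = 1; i <= n; i++) printf "%s %.1f%%\n", zero[i], 100 / n
    }'
}
```

For example, piping the lines "w0 0 healthy", "w20 20 healthy", "w50 50 unhealthy", and "w70 70 unhealthy" into candidate_shares prints only "w20 100.0%".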

Example

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health Check Pass
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health Check Pass
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health Check Fail

$ for i in {1..10000}; do domain=$(dig weighted.awsexampledomain.com. TXT @192.168.1.2 +short); echo -e  "$domain" >> HealthCheck_results.txt; done

$ awk ' " " ' HealthCheck_results.txt | sort | uniq -c
10000 "Record with Weight 20"

In this example, Route 53 doesn't consider the record with weight 0 because a healthy nonzero-weighted record is available. Route 53 returns zero-weighted records only when all nonzero-weighted records are unhealthy.

If you set the same weight for all records in a group, including a "Weight" of zero for every record, then traffic is routed to all healthy resources with equal probability.

Related information

Choosing a routing policy

How Amazon Route 53 chooses records when health checking is configured

AWS OFFICIAL
Updated a year ago