Oracle Cloud DNS Blackhole Fix

DNS Latency Resolution in Multi-Node Subnets

In Oracle Cloud Infrastructure (OCI) environments, DNS latency occurs when multiple Kubernetes nodes occupy a single subnet. The issue is asymmetric: traffic between different VCNs or tenancies functions correctly due to mandatory SNAT at the Dynamic Routing Gateway (DRG) level, whereas intra-subnet traffic between nodes fails or lags significantly.

Root Cause Analysis

When nodes reside in the same subnet, the CNI typically avoids SNAT to maintain Pod IP visibility. Consequently, a Pod on Node A attempting to reach CoreDNS on Node B uses raw Pod IP addresses from the 10.244.x.x range.

  1. Intra-subnet traffic: The OCI Virtual Cloud Network (VCN) is unaware of the 10.244.x.x Pod CIDR blocks by default.

  2. Routing failure: Packets without a specific route in the VCN Route Table match the 0.0.0.0/0 rule and are forwarded to the Internet Gateway.

  3. Blackhole effect: The Internet Gateway drops private Pod traffic. This causes 5-second UDP timeouts until the resolver attempts a secondary nameserver or the request eventually fails.
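The routing decision described above can be reproduced offline. The following is a minimal sketch, not an OCI tool: it checks whether an address falls inside a CIDR block, using a hypothetical subnet of 10.0.0.0/24 and a hypothetical Pod IP of 10.244.1.15. When the Pod IP lies outside every specific route, only the 0.0.0.0/0 rule is left to match.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  printf '%s\n' "$1" | {
    IFS=. read -r a b c d
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
  }
}

# Return 0 if the address in $1 falls inside the CIDR block in $2, 1 otherwise.
ip_in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # A /0 prefix matches every address.
  if [ "$bits" -eq 0 ]; then return 0; fi
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Hypothetical values, chosen only for illustration.
VCN_CIDR="10.0.0.0/24"
POD_IP="10.244.1.15"

if ip_in_cidr "$POD_IP" "$VCN_CIDR"; then
  echo "$POD_IP is covered by $VCN_CIDR"
else
  echo "$POD_IP is outside $VCN_CIDR; only 0.0.0.0/0 matches"
fi
```

Substituting the real subnet CIDR and a real Pod IP shows whether a given packet would fall through to the default rule.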

Technical Solution

The resolution requires synchronizing the OCI L3 routing table with the Kubernetes overlay. This applies to both the node host and the pods, as the host resolver suffers from the same routing gaps.

IMPORTANT
OCI VNIC security prevents using an instance as a routing target unless specifically configured. You must modify the VNIC attributes before updating the route tables.

  1. Disable Source/Destination Check: Each VNIC in the subnet must have the Skip Source/Destination Check attribute enabled. Without this, OCI will reject any route rule targeting a private IP.

  2. Static Route Implementation: For every node in the subnet, add a rule to the VCN Route Table that sends the node's Pod CIDR to the node's private IP.

# Example Route Table Entries
# Destination: [Node B Pod CIDR] -> Target Type: Private IP -> Target: [Node B Private IP]
# Destination: [Node C Pod CIDR] -> Target Type: Private IP -> Target: [Node C Private IP]
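The list of entries can be generated rather than typed by hand. The sketch below assumes each node's Pod CIDR is published in `.spec.podCIDR`, which holds only when the cluster allocates per-node CIDRs (some CNIs manage addresses differently). The function name `format_route_entries` and the sample addresses are hypothetical.

```shell
# Read "<pod-cidr> <node-private-ip>" pairs on stdin and print one
# route-table entry per node, in the same shape as the examples above.
format_route_entries() {
  local cidr ip
  while read -r cidr ip; do
    # Skip blank lines and nodes that have no Pod CIDR assigned.
    if [ -z "$cidr" ] || [ -z "$ip" ]; then continue; fi
    printf 'Destination: %s -> Target Type: Private IP -> Target: %s\n' "$cidr" "$ip"
  done
}

# With cluster access, the pairs can be pulled from the node objects:
# kubectl get nodes -o jsonpath='{range .items[*]}{.spec.podCIDR}{" "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' | format_route_entries

# Offline demonstration with hypothetical addresses:
printf '%s\n' "10.244.1.0/24 10.0.0.11" "10.244.2.0/24 10.0.0.12" | format_route_entries
```

The output is only a worksheet. Creating the rules in the OCI Console or through automation is a separate step that this sketch does not cover.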

Testing and Verification

Run the following loop to measure latency. This command can be executed directly on the host or inside any pod, as the routing issue affects the entire node stack.

for i in {1..10}; do
  curl -4 -w "Attempt $i | DNS: %{time_namelookup}s | TCP: %{time_connect}s | TLS: %{time_appconnect}s | Total: %{time_total}s\n" \
  -o /dev/null -s https://ghcr.io/v2/
  sleep 1
done
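To spot the timeouts without reading every line, the loop output can be filtered. This is a small sketch that assumes the exact "DNS: <seconds>s" field layout produced by the command above. The helper name `flag_slow_dns` and the 1.0-second cutoff are assumptions, not values from OCI or curl.

```shell
# Print every attempt whose DNS lookup time meets or exceeds a threshold.
# Input: lines in the "Attempt N | DNS: 0.004s | ..." format shown above.
flag_slow_dns() {
  local threshold="${1:-1.0}"   # seconds; 1.0 is an assumed cutoff
  awk -v t="$threshold" '
    match($0, /DNS: [0-9.]+s/) {
      # Extract the number between "DNS: " and the trailing "s".
      v = substr($0, RSTART + 5, RLENGTH - 6)
      if (v + 0 >= t + 0) print "SLOW: " $0
    }'
}

# Hypothetical sample output: one healthy lookup, one 5-second timeout.
printf '%s\n' \
  "Attempt 1 | DNS: 0.004s | TCP: 0.012s | TLS: 0.045s | Total: 0.120s" \
  "Attempt 2 | DNS: 5.012s | TCP: 5.020s | TLS: 5.060s | Total: 5.140s" \
  | flag_slow_dns 1.0
```

In practice the measurement loop would be piped straight into `flag_slow_dns`. Once the routes are in place, the filter should print nothing.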

CoreDNS Deployment Strategy

To ensure high availability during updates and prevent transient DNS failures, use a conservative rolling update strategy.

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0
    maxSurge: 1

NOTE
By setting maxUnavailable: 0, you ensure that the old DNS pods are only terminated after the new replicas pass their readinessProbe.
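One way to apply these settings to a running cluster is a patch. The sketch below only prints the patch body. The kubectl lines are left commented out because they assume the Deployment is named coredns and lives in the kube-system namespace, which varies by distribution. The function name `coredns_strategy_patch` is hypothetical.

```shell
# Emit a patch body carrying the rolling update settings above.
coredns_strategy_patch() {
  cat <<'EOF'
{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxUnavailable":0,"maxSurge":1}}}}
EOF
}

# Print the patch for review before applying it.
coredns_strategy_patch

# With cluster access (Deployment name and namespace are assumptions):
# kubectl -n kube-system patch deployment coredns -p "$(coredns_strategy_patch)"
# kubectl -n kube-system rollout status deployment coredns
```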

CAUTION
Manual route management is required for every new node added to the subnet. Failure to update the OCI Route Table will result in immediate DNS degradation for the new node.
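Because the routes are maintained by hand, a periodic drift check helps catch a node that was added without one. The following is a minimal sketch that compares two plain-text lists, one CIDR per line. How those lists are exported from Kubernetes and from the OCI Route Table is left open, and the helper name `missing_routes` and all addresses are hypothetical.

```shell
# Print every Pod CIDR (file $1, one per line) that has no matching
# destination in the route table export (file $2, one per line).
missing_routes() {
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
  # grep exits non-zero when nothing is printed, so mask that status.
  grep -Fxv -f "$2" "$1" || true
}

# Hypothetical data: three nodes, but only two routes present.
pod_cidrs=$(mktemp)
routes=$(mktemp)
printf '%s\n' 10.244.1.0/24 10.244.2.0/24 10.244.3.0/24 > "$pod_cidrs"
printf '%s\n' 10.244.1.0/24 10.244.2.0/24 > "$routes"

missing_routes "$pod_cidrs" "$routes"   # prints 10.244.3.0/24
rm -f "$pod_cidrs" "$routes"
```

Any line printed by the check is a Pod CIDR whose traffic would still fall through to the default route.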