High-Availability VPN Cluster Deployment: Redundant Link Design with Keepalived and IPsec

4/28/2026 · 3 min

Introduction

In modern enterprise networks, VPNs are critical for connecting remote sites and mobile users. However, a single point of failure can disrupt the entire VPN service, leading to business losses. By deploying a high-availability VPN cluster using Keepalived for virtual IP (VIP) failover and IPsec for encrypted tunnels, you can significantly enhance network reliability and security.

Architecture Design

Components

  • Keepalived: Implements VRRP for VIP management and health checks. When the primary node fails, the backup node automatically takes over the VIP, ensuring service continuity.
  • IPsec: Provides data encryption and authentication, supporting IKEv1/IKEv2 protocols for site-to-site or remote access scenarios.
  • Cluster Nodes: At least two servers, configured as MASTER and BACKUP roles.

Network Topology

[Internet] <--> [VIP: 203.0.113.10] <--> [Node1 (MASTER): 10.0.0.1]
                                     <--> [Node2 (BACKUP): 10.0.0.2]

The VIP exposes the VPN service externally, while internal nodes communicate via private IPs. Keepalived monitors the IPsec process; upon primary failure, the VIP floats to the backup node.

Deployment Steps

1. Environment Preparation

  • OS: Ubuntu 22.04 LTS or CentOS 7+
  • Install packages: strongswan (IPsec) and keepalived
  • Ensure network connectivity between nodes, and allow UDP ports 500 and 4500 (IKE and NAT-T), ESP (IP protocol 50), and VRRP (IP protocol 112, multicast address 224.0.0.18)
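
The firewall requirements can be written down as a short list of rules. The sketch below is only an illustration: it assumes iptables is the firewall in use, and print_fw_rules is a hypothetical helper that prints the rules for review instead of applying them. ESP (IP protocol 50) and VRRP (IP protocol 112) are given numerically so the rules do not depend on protocol names being defined on the host.

```shell
#!/bin/sh
# print_fw_rules [PEER_NODE]
# Print (do not apply) iptables rules for the traffic the cluster needs.
# PEER_NODE is the other cluster node's private IP (assumed default: 10.0.0.2).
print_fw_rules() {
    peer_node="${1:-10.0.0.2}"
    # IKE negotiation and NAT traversal
    echo "iptables -A INPUT -p udp --dport 500 -j ACCEPT"
    echo "iptables -A INPUT -p udp --dport 4500 -j ACCEPT"
    # ESP payload (IP protocol 50), used when traffic is not UDP-encapsulated
    echo "iptables -A INPUT -p 50 -j ACCEPT"
    # VRRP advertisements (IP protocol 112) from the peer node
    echo "iptables -A INPUT -p 112 -s ${peer_node} -d 224.0.0.18 -j ACCEPT"
}
```

On Node1, print_fw_rules 10.0.0.2 prints four rules that can be reviewed and then applied by hand or through your usual firewall tooling; on Node2, pass 10.0.0.1 instead.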

2. Configure IPsec

Edit /etc/ipsec.conf (/etc/strongswan/ipsec.conf with the CentOS EPEL package) and add the connection parameters, for example:

conn site-to-site
    left=10.0.0.1
    leftsubnet=192.168.1.0/24
    right=203.0.113.20
    rightsubnet=192.168.2.0/24
    auto=start

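Note: Both nodes use the same IPsec configuration except for left, which is set to each node's own IP (10.0.0.2 on Node2). The remote peer should target the VIP (203.0.113.10), and both nodes need the same identity (leftid) and the same pre-shared key or certificate so that the peer accepts either node after a failover.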
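
Because the two nodes differ only in the local address, the connection block can be generated from a single template. The following is a minimal sketch; render_conn is a hypothetical helper, and the subnets and peer address are simply the example values from above.

```shell
#!/bin/sh
# render_conn LEFT_ADDR
# Print the example conn block with LEFT_ADDR as the node's local address.
render_conn() {
    left_addr="$1"
    cat <<EOF
conn site-to-site
    left=${left_addr}
    leftsubnet=192.168.1.0/24
    right=203.0.113.20
    rightsubnet=192.168.2.0/24
    auto=start
EOF
}
```

For example, render_conn 10.0.0.2 prints the block for Node2; redirect the output to a scratch file and review it before merging it into the node's ipsec.conf.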

3. Configure Keepalived

Primary node /etc/keepalived/keepalived.conf:

vrrp_script chk_ipsec {
    script "/usr/bin/pgrep -x charon"  # Check strongSwan process
    interval 2
    weight -20
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1234  # Placeholder; choose your own value (at most 8 characters), identical on both nodes
    }
    virtual_ipaddress {
        203.0.113.10/24 dev eth0
    }
    track_script {
        chk_ipsec
    }
}

The backup node uses the same configuration except for state BACKUP and priority 90. The virtual_router_id and auth_pass values must match the primary.
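
Since only two values change, the backup file can be derived from the primary one. This is a sketch under the assumption that the primary file uses exactly the state MASTER and priority 100 lines shown above; make_backup_conf is a hypothetical helper name.

```shell
#!/bin/sh
# make_backup_conf
# Read the primary keepalived.conf on stdin and print the backup variant
# (state BACKUP, priority 90) on stdout. All other lines pass through unchanged.
make_backup_conf() {
    awk '
        $1 == "state"    && $2 == "MASTER" { sub(/MASTER/, "BACKUP") }
        $1 == "priority" && $2 == "100"    { sub(/100/, "90") }
        { print }
    '
}
```

Usage: make_backup_conf < /etc/keepalived/keepalived.conf > keepalived.conf.backup, then copy the result to the backup node as /etc/keepalived/keepalived.conf.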

4. Start Services

# On Ubuntu 22.04 the ipsec.conf-based unit may be named strongswan-starter instead of strongswan.
systemctl enable strongswan keepalived
systemctl start strongswan keepalived

Verify VIP binding with ip addr show eth0; the VIP 203.0.113.10 should be listed on the primary node only.
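
The VIP check can also be scripted, for instance for use in monitoring. The sketch below assumes the one-line output format of ip -o -4 addr show, where each address appears as "inet ADDRESS/PREFIX"; has_vip is a hypothetical helper. It compares the address exactly, so 203.0.113.1 does not match 203.0.113.10.

```shell
#!/bin/sh
# has_vip VIP
# Succeed if the `ip -o -4 addr show` output on stdin lists VIP as an address.
has_vip() {
    vip="$1"
    awk -v vip="$vip" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    # Strip the prefix length before comparing
                    split($(i + 1), parts, "/")
                    if (parts[1] == vip) found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage: ip -o -4 addr show dev eth0 | has_vip 203.0.113.10 && echo "VIP is bound on this node".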

Failover Testing

  1. Stop IPsec on the primary node: systemctl stop strongswan
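  2. Check the Keepalived logs with tail -f /var/log/syslog (/var/log/messages on CentOS, or journalctl -u keepalived -f); the backup node should log a transition to MASTER state and take over the VIP.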
  3. Attempt to connect to the VIP from a remote site and verify the VPN tunnel is established.
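
Why step 1 triggers a failover comes down to priority arithmetic: when the chk_ipsec script fails, its weight of -20 is added to the node's priority. The sketch below models only this negative-weight case with the values from the example configuration; effective_priority is a hypothetical helper, not part of Keepalived.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT SCRIPT_OK
# A failing check adds WEIGHT (a negative number) to the base priority;
# a passing check leaves the base priority unchanged.
effective_priority() {
    base="$1"; weight="$2"; script_ok="$3"
    if [ "$script_ok" = "yes" ]; then
        echo "$base"
    else
        echo $((base + weight))
    fi
}

primary=$(effective_priority 100 -20 no)   # charon stopped on the primary
backup=$(effective_priority 90 -20 yes)    # charon still running on the backup
if [ "$backup" -gt "$primary" ]; then
    echo "backup wins: $backup > $primary"   # prints "backup wins: 90 > 80"
fi
```

The weight therefore has to be larger than the priority gap between the nodes (10 in this setup); with a weight of -5 the primary would stay at 95 and keep the VIP even with IPsec down.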

Optimization Tips

  • Enhanced Health Checks: Beyond process checks, implement scripts that test IPsec tunnel connectivity.
  • Session Synchronization: IPsec is stateful, so a plain VIP failover drops existing tunnels. strongSwan's High Availability (ha) plugin can synchronize IKE and CHILD SA state between the nodes so that existing connections survive a failover.
  • Monitoring and Alerting: Integrate with Prometheus or Nagios to monitor VIP status and IPsec tunnel counts.
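
As an illustration of an enhanced health check, the sketch below inspects the tunnel state instead of the process list. It assumes the line format printed by ipsec status on ipsec.conf-based strongSwan installs (an "ESTABLISHED" IKE SA line and an "INSTALLED" CHILD SA line, each prefixed with the connection name); verify this against your version before relying on it. tunnel_up is a hypothetical helper.

```shell
#!/bin/sh
# tunnel_up CONN
# Succeed if the `ipsec status` output on stdin shows connection CONN with
# both an established IKE SA and an installed CHILD SA.
tunnel_up() {
    conn="$1"
    awk -v conn="$conn" '
        # IKE SA lines look like:   name[1]: ESTABLISHED 5 minutes ago, ...
        index($1, conn "[") == 1 && $2 == "ESTABLISHED" { ike = 1 }
        # CHILD SA lines look like: name{1}:  INSTALLED, TUNNEL, reqid 1, ...
        index($1, conn "{") == 1 && $2 ~ /^INSTALLED/   { child = 1 }
        END { exit((ike && child) ? 0 : 1) }
    '
}
```

Usage: ipsec status site-to-site | tunnel_up site-to-site. Note that the tunnel is normally up only on the node that holds the VIP, so a check like this would always fail on the backup and lower its priority as well; one option is to run it only where the VIP is bound and keep the process check elsewhere.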

Conclusion

Combining Keepalived with IPsec provides a cost-effective high-availability VPN cluster. This solution is suitable for small to medium enterprises, effectively mitigating single-node failures and ensuring stable remote access.

FAQ

How does Keepalived detect IPsec service health?
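Keepalived runs the script defined in vrrp_script at the configured interval (every 2 seconds here), for example pgrep -x charon to check the strongSwan daemon. If the script exits with a non-zero status, Keepalived applies the weight of -20 to the node's priority: the primary drops from 100 to 80, below the backup's 90, and the VIP moves to the backup.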
Will existing IPsec connections be interrupted during failover?
Yes, by default. IPsec is stateful: Security Associations (SAs) are held in each node's memory and are not synchronized, so failover interrupts existing connections. To minimize disruption, use strongSwan's High Availability (ha) plugin to synchronize SA state between nodes, or enable dead peer detection (DPD) on the remote peer so that it re-establishes the tunnel quickly.
Can IPsec configurations be shared between primary and backup nodes?
Yes, but ensure the 'left' address points to each node's actual IP. It is advisable to use configuration management tools (e.g., Ansible) to sync config files and ensure consistent pre-shared keys or certificates.