I came across an interesting issue whilst configuring a site-to-site VPN connection between a vCloud Director 5.1 customer vDC and a remote customer site.
The requirement was for the VPN to be configured to pass through the perimeter hardware firewalls and terminate on the external interface of the vShield Edge device connected to the vCloud external network. Firewall configuration was done on the vShield device connecting the vDC routed organisation network and the vCloud external network – defining the peer and local network configurations, plus vShield firewall rules to allow traffic through the Edge device to reach the VMs inside the organisation.
We had several problems in this instance: the customer only had a change control window late at night, and could not configure keep-alive on their firewalls to allow persistent troubleshooting of the connection. This isn’t unusual for large corporations.
In making the connection, we saw some communication between the two endpoints, which was then dropped as a failed connection. Further investigation through vShield Manager showed that the tunnel was established but was unable to pass traffic between the two subnets.
Troubleshooting this connection involved a two-step approach:
- Creating a parallel VPN connection from an internal firewall to the same vShield Edge device on the external network.
- Collecting the technical support log files from the Edge device acting as the VPN endpoint to compare the connection phase 1 and phase 2 steps for each VPN.
What we found was information about the DPD (Dead Peer Detection) phase of the connection. This is from the connection to the Checkpoint:
2012-12-06T23:25:00+00:00 vShieldEdge pluto: [authpriv.warning] "<external-vCDnetwork-GW>_<vDC-OrgNet-network>.0/24-<vpn-peer>_<vpn-local-gatewayroute>/1x1" #1: Dead Peer Detection (RFC 3706): not enabled because peer did not advertise it
This appeared multiple times in the vShield log files. Compare it to this (connecting to another hardware vendor's firewall):
2012-12-06T23:25:00+00:00 vShieldEdge pluto: [authpriv.warning] "<external-vCDnetwork-GW>_<vDC-OrgNet-network>.0/24-<vpn-peer>_<vpn-local-gatewayroute>/1x1" #3: received Vendor ID payload [Dead Peer Detection]
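If you have to wade through a large technical support bundle, a small script can do the comparison for you. The following is only a rough sketch in Python, not anything shipped with vShield: it assumes the pluto lines in your logs look like the two excerpts above, and the function name `classify_dpd` and the regular expression are my own inventions for illustration. It reports, per connection and state number, whether the peer advertised DPD.

```python
import re

# Message fragments copied from the two log excerpts above.
DPD_MISSING = "Dead Peer Detection (RFC 3706): not enabled because peer did not advertise it"
DPD_PRESENT = "received Vendor ID payload [Dead Peer Detection]"

# Assumed line layout (based only on the excerpts above):
#   <timestamp> <host> pluto: [<facility>] "<connection name>" #<state>: <message>
# Straight and typographic quotes are both accepted around the connection name.
LINE_RE = re.compile(
    r'^(?P<ts>\S+)\s+(?P<host>\S+)\s+pluto:\s+\[[^\]]+\]\s+'
    r'["“](?P<conn>[^"”]+)["”]\s+#(?P<state>\d+):\s+(?P<msg>.*)$'
)


def classify_dpd(lines):
    """Return a list of (connection, state, status) tuples for DPD-related lines.

    status is "advertised" or "not advertised"; unrelated lines are skipped.
    """
    findings = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a pluto connection line in the assumed layout
        msg = match.group("msg")
        if DPD_MISSING in msg:
            status = "not advertised"
        elif DPD_PRESENT in msg:
            status = "advertised"
        else:
            continue  # pluto line, but nothing to do with DPD
        findings.append((match.group("conn"), int(match.group("state")), status))
    return findings
```

To use it, open the Edge log file from the support bundle and pass the file object (or any list of lines) to `classify_dpd`. A peer that only ever shows up as "not advertised" is the one to look at.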
Now, vShield Edge is an implementation of the Openswan project (see the project homepage), and as yet there is no way to stop vShield requiring DPD in order to establish a connection that passes traffic. (I assume this is by design, due to the requirements in RFC 3706.)
In this case, we were unable to find a workaround that resulted in a successful connection terminating on the vShield Edge device. Instead, we reconfigured the vDC networks for this organisation and terminated the VPN connection on another [hardware] firewall.
This drove me crazy for several hours before I eventually found the root cause of the issue. Hopefully, if you are seeing the same issues with Checkpoints, this will save you some hours of hair-pulling.
Jeremy loves all things technology! Has been in IT for years, loves Macs (but doesn't preach to others about their virtues), loves virtualization (and does shout about its virtues), and sometimes skis, bikes and directs amateur plays!