VPN IPsec down every morning

I want to start by saying that I’m not a FortiGate expert, so sorry if I make any mistakes below…

I have an IPsec VPN between two physically distant sites in hub-and-spoke mode.

The main site (HUB) has a FortiGate 100F (firmware 6.2.9) with multiple spokes around the world, and the second site (Spoke) has a FortiGate 40F (firmware 6.4.7).

On our main FortiGate we have 2 ISPs, so for every spoke we’ve configured 2 IPsec tunnels (one primary and one backup in case the first goes down) that point to the HUB.

Issue:

Every morning, on the second FortiGate, all IPsec tunnels are down for some reason (primary and backup, but internet is OK). Phase 1 is established on the primary tunnel but Phase 2 is down. If I try to bring up every Phase 2 from the GUI, nothing happens.

Meanwhile the main FortiGate seems to be working well with the other established spokes (all except the problematic spoke above).

What I tried:

  • Watching the logs on the second FortiGate: https://kb.fortinet.com/kb/documentLink.do?externalID=FD46611 . My Phase 1 was UP, but Phase 2 was down. I solved it temporarily by manually disabling Phase 1 and then re-enabling it (all from the CLI).
  • Since we have just one PC at the second site, in “Log & Report → Forward Traffic” I watched the logs related to that PC to see what happened while there was internet but no IPsec VPN. I saw no IPsec VPN traffic, only traffic towards the internet.
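For reference, a minimal sketch of how the failing Phase 2 negotiation could be captured on the spoke from the CLI with IKE debugging. The address 203.0.113.10 is a placeholder for the hub's public IP (an assumption, not a value from this setup), and the log-filter syntax shown is the one used on FortiOS 6.x; later releases may name it differently.

```
# Placeholder: replace 203.0.113.10 with the hub's public IP (assumed value)
diagnose debug reset
diagnose vpn ike log-filter clear
diagnose vpn ike log-filter dst-addr4 203.0.113.10
diagnose debug application ike -1
diagnose debug enable

# ...wait for a Phase 2 negotiation attempt, then stop the debug output
diagnose debug disable
diagnose debug reset
```

The output should show whether the spoke attempts a Phase 2 (quick mode) negotiation at all and, if it does, which side rejects it.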

Temporary solutions: (not definitive)

  • Restarting the FortiGate at the second site (the site with the IPsec tunnels down). When it restarts, the primary IPsec tunnel is up and working fine.
  • Disabling the backup tunnel (so only the primary stays up). This solved the issue for 3 days… then the problem returned.
  • Restarting the IPsec tunnel from the CLI.
  • Deliberately misconfiguring the IP of the IPsec tunnel on the second FortiGate in “VPN → IPsec Tunnels”, trying to bring up all Phase 2, then setting the right IP and bringing up all Phase 2 again.
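The CLI restart mentioned in the list above can be sketched as follows. This is one possible way to do it, not necessarily the exact commands used here; HUB-ITALY_BCK is the backup tunnel's Phase 1 name from the sanitized configuration posted further down, so substitute the name of the tunnel that is stuck.

```
# Flush the IKE gateway for one Phase 1 so that it renegotiates from scratch
# (HUB-ITALY_BCK is an example name; use your own Phase 1 name)
diagnose vpn ike gateway flush name HUB-ITALY_BCK
```

This only affects the named tunnel, so it is far less disruptive than rebooting the whole unit.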

Any ideas on how to fix this issue properly? Has anyone had the same problem?

Thanks for reading…

*****************************************************************************

EDIT 22/11/21: SOLUTION

I’d like to share the solution, which may help some people in the future.

The problem was caused by microsecond-long on/off flaps on the link between the office’s cabinet and our ISP’s “headquarters”.

(The ISP technicians talked about a misalignment between our office’s router and their central office.)

This continuous flapping caused the IPsec tunnel to go down after approximately one hour.

The FortiGate GUI went crazy and showed incorrect states (such as UP when the tunnel was DOWN), which differed from the CLI state, which was the right one.

Thank you all for the help and suggestions.
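Since the GUI state could not be trusted here, this is a sketch of CLI checks that show the actual tunnel state. The tunnel name HUB-ITALY_BCK is taken from the sanitized configuration in this thread; replace it with your own Phase 1 name.

```
# One line per tunnel, including how many selectors are up
get vpn ipsec tunnel summary

# Phase 1 (IKE SA) details for a single tunnel
diagnose vpn ike gateway list name HUB-ITALY_BCK

# Phase 2 (IPsec SA) details for the same tunnel
diagnose vpn tunnel list name HUB-ITALY_BCK
```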

Do the hub and spoke phase1 and phase2 lifetimes match? Does the spoke phase2 have “auto-negotiate” set to enable?

I remember a FortiGate bug that caused VPNs to stop processing traffic after some time when NPU offload was active. Can you try disabling NPU offload for the VPN tunnel?
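A read-only way to answer both questions is to print the relevant settings on the hub and on the spoke and compare them. This is only a sketch: `keylife` is the phase1 lifetime and `keylifeseconds` the phase2 lifetime (both in seconds), and settings left at their defaults only appear with `show full-configuration`.

```
# Run on both hub and spoke, then compare the values
show full-configuration vpn ipsec phase1-interface | grep -f keylife
show full-configuration vpn ipsec phase2-interface | grep -f keylife
show full-configuration vpn ipsec phase2-interface | grep -f auto-negotiate
```

The `-f` option prints each match together with the `edit` entry it belongs to, so it is clear which tunnel a value comes from.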

    # use "config vpn ipsec phase1" instead for a policy-based tunnel
    config vpn ipsec phase1-interface
        edit "vpn_name"
            set npu-offload disable
        next
    end

No idea about the issue, but here are some of the things I do, and I don’t have issues:

  • Only use ISPs with a static IP, no dynamic IP, on both ends
  • Change the IPsec settings for phase 1 and 2 to IKEv2 and just AES/SHA 256 and DH group 16
  • Have all tunnels in SD-WAN
  • Have health checks pinging across all tunnels to a loopback interface on the remote site
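As a rough sketch, the IKEv2 / AES256-SHA256 / DH group 16 suggestion would look something like the fragment below. The names in angle brackets are placeholders and must match existing entries, because `edit` with an unknown name creates a new one. The same change has to be made on both hub and spoke. As far as I know, aggressive mode and XAuth are IKEv1-only features, so a tunnel that relies on them would need its authentication reworked before switching.

```
# <phase1-name> and <phase2-name> are placeholders for existing entries
config vpn ipsec phase1-interface
    edit "<phase1-name>"
        set ike-version 2
        set proposal aes256-sha256
        set dhgrp 16
    next
end
config vpn ipsec phase2-interface
    edit "<phase2-name>"
        set proposal aes256-sha256
        set dhgrp 16
    next
end
```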

Can you post a (sanitised) copy of the VPN configuration from both hub and spoke?

Sorry for the delay. I’ve checked. On the spoke side, the IPsec tunnel has auto-negotiate disabled. The lifetime is the same on the HUB and on the Spoke.

You generally don’t disable it except for troubleshooting.

I think I don’t need a static IP on the spoke side (and I actually don’t have one), since the spoke authenticates itself to the HUB with credentials. I will try it as a last resort.

I don’t know much about IKE and Phase 1 proposals, but my settings are different. I will try your suggestion for sure. The HUB has a static IP.

I don’t know how to verify whether the tunnels are in SD-WAN… I searched on Google without success, but I will investigate.

Sorry, but I didn’t understand your last point (loopback interface?). The ping between hub and spoke is stable (from both sides).

Thank you very much for the reply and help.

I’ve downloaded the configuration backup, and this is the phase 1/2 section, which I’ve sanitized.

Is this what you asked for?

    config vpn ipsec phase1-interface
        edit [*****]
            set interface "wan"
            set mode aggressive
            set peertype any
            set net-device enable
            set proposal aes256-sha256
            set localid "SPOKES"
            set xauthtype client
            set authusr [*****]
            set authpasswd [*****]
            set mesh-selector-type subnet
            set remote-gw [*****]
            set psksecret [*****]
        next
        edit "HUB-ITALY_BCK"
            set interface "wan"
            set mode aggressive
            set peertype any
            set net-device enable
            set proposal aes256-sha256
            set localid "SPOKES"
            set xauthtype client
            set authusr [*****]
            set authpasswd [*****]
            set mesh-selector-type subnet
            set remote-gw [*****]
            set psksecret [*****]
        next
    end
    config vpn ipsec phase2-interface
        edit [*****]
            set phase1name [*****]
            set proposal aes256-sha256
            set auto-negotiate enable
            set src-addr-type name
            set dst-addr-type name
            set src-name [*****]
            set dst-name [*****]
        next
        edit [*****]
            set phase1name [*****]
            set proposal aes256-sha256
            set keepalive enable
            set src-addr-type name
            set dst-addr-type name
            set src-name [*****]
            set dst-name [*****]
        next
        edit [*****]
            set phase1name [*****]
            set proposal aes256-sha256
            set keepalive enable
            set src-addr-type name
            set dst-addr-type name
            set src-name [*****]
            set dst-name [*****]
        next
        edit [*****]
            set phase1name [*****]
            set proposal aes256-sha256
            set keepalive enable
            set src-addr-type name
            set dst-addr-type name
            set src-name [*****]
            set dst-name [*****]
        next
    end

auto-negotiate must be set to enable, otherwise a new phase2 won’t be negotiated when the old one times out, which sounds exactly like the problem you are having.
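A minimal sketch of that change on the spoke, to be repeated for each phase2 entry that should stay up. The name in angle brackets is a placeholder and must match an existing phase2 entry, because `edit` with an unknown name creates a new one.

```
# <phase2-name> is a placeholder for an existing phase2 entry
config vpn ipsec phase2-interface
    edit "<phase2-name>"
        set auto-negotiate enable
    next
end
```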

I’ve enabled auto-negotiate on the spoke side now. Thank you very much. I’ll update you if I find that this solved the issue.

This didn’t solve the issue, but it surely helped.

An IT consultant told me that auto-negotiate was disabled because it can cause trouble with dual tunnels (main and backup), but since I’ve disabled the backup tunnel, auto-negotiate should be enabled, as you suggested.