A recent project found me migrating from a pair of Juniper firewalls to an HA pair of ASA 5540s. I have never really used Junipers before, but the configuration seemed relatively straightforward. I could tell how they were defining their NAT statements, VPN tunnel configs, etc. No major issues that I saw, just a lot of cut/paste operations by hand. There were roughly 130 or so NAT statements to be migrated, which was tedious, but not difficult. There were some questions I had on the site-to-site configs, but nothing major. I had everything configured and ready to go for an early Saturday morning cutover. Wake up, get to the DC by 0500, out by 0900 or so, right?
Cutover morning, we get to the DC, plug in the pair, all interfaces come up, failover tests out ok, the interwebs are reachable, etc. etc. So we exit the DC and go into the customer area for additional testing. So there were a few NAT statements that I had transposed the IPs on, or accidentally pasted the previous IP I had on the clipboard. No problem…easy fix. Site to site tunnels came up mostly, except for a few mismatched parameters. Another easy fix. All in all, with the exception of some little stuff, things were going ok.
Here’s where the fun starts.
The network was one flat layer 2 domain…sort of. Gateway (ASA) was x.y.5.1/19, which would include x.y.0.1 through x.y.31.254. The internal scheme was such that domain Windows servers were x.y.10.z/20, and their virtual servers were all x.y.20.z/24. ALL hosts on the network had x.y.5.1 as their gateway. On their previous firewall, the hosts at x.y.10.z were able to communicate with hosts on x.y.20.z without a problem. Can you spot the issue here before I tell you what it is?
If we look at x.y.10.z/20, this includes x.y.0.1 through x.y.15.254. The virtual server “subnet” was x.y.20.z/24, which includes x.y.20.1 through x.y.20.254. The problem showed up in that hosts on the x.y.10.z subnet could not communicate with the x.y.20.z hosts. Since x.y.10.z/20 only includes up to x.y.15.254, the hosts would treat the x.y.20.z addresses as off-net and send those packets to the default gateway. The ASA, for whatever reason (and this is the part I haven’t had a chance to really research; I welcome all input as to why, though), would drop the traffic going to x.y.20.z, even though same-security-traffic permit intra-interface was enabled. Nothing was working, and I was getting frustrated trying to figure out the answer.
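To make the subnet math concrete, here is a minimal sketch using Python's standard ipaddress module. The post masks the first two octets as x.y, so 10.0 is used as a stand-in, and the .50 host addresses are made up for illustration; only the prefix lengths (/19, /20, /24) come from the description above. It only shows what each host considers on-link; it does not explain why the ASA dropped the traffic.

```python
import ipaddress

# Hypothetical stand-in for the masked "x.y" octets in the post.
PREFIX = "10.0"

def host_range(cidr):
    """Return (first_host, last_host) of the network that contains cidr."""
    net = ipaddress.ip_interface(cidr).network
    return str(net[1]), str(net[-2])

def is_on_link(src_cidr, dst_ip):
    """True if a host configured as src_cidr sees dst_ip inside its own subnet."""
    net = ipaddress.ip_interface(src_cidr).network
    return ipaddress.ip_address(dst_ip) in net

gateway = f"{PREFIX}.5.1/19"    # ASA inside interface
windows = f"{PREFIX}.10.50/20"  # hypothetical Windows server
virtual = f"{PREFIX}.20.50/24"  # hypothetical virtual server

print(host_range(gateway))                   # ('10.0.0.1', '10.0.31.254')
print(host_range(windows))                   # ('10.0.0.1', '10.0.15.254')
print(is_on_link(windows, f"{PREFIX}.20.50"))  # False: sent to the gateway
print(is_on_link(virtual, f"{PREFIX}.10.50"))  # False: sent to the gateway
print(is_on_link(gateway, f"{PREFIX}.20.50"))  # True: ASA sees both on one interface
```

In other words, the two server groups each think the other is off-net, while the ASA sees both inside the same /19 on a single interface, so the traffic has to hairpin in and out of that interface.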
I opened a TAC case and the engineer, within a few minutes, said that we needed to enable TCP state bypass. Not something I’ve ever run into before, but ok. I won’t go into the configuration details, because they’re in the link, but for some reason, this worked fine. Based on the Cisco document behind the link, this occurs more frequently with asymmetric routing issues. I still haven’t had a chance to really go back and figure out why this was occurring, but TCP state bypass resolved the issue.
Can anyone out there elaborate on why the ASA would drop that traffic?
EDIT: The subnets listed above are incorrect. The gateway is correct, but the 10.x network is a /21, not a /20, and the 20.x network is a /21 as well.