[OpenIndiana-discuss] weird packet garbling problem
Edward Ned Harvey (openindiana)
openindiana at nedharvey.com
Sat Feb 2 15:32:23 UTC 2013
I am having a really hard time coming up with a plausible explanation for this, other than some kind of kernel bug with openindiana...
I have two systems in the office, Dell PowerEdge SC 1435 (Embedded Broadcom 5721 NIC) and Dell PowerEdge 2950 (Embedded Broadcom 5708 NIC), both running OI 151a5 or newer.
Inside the office, everything works fine. But when I go home and VPN into the office, I ssh or vnc to these two boxes, and I get packet garbling and retransmissions and dropped connections, but *only* on these two machines, and *only* from the vpn connection, and *only* for certain specific types of traffic. Here's an example:
I'm on an ssh prompt. I can type in commands all day and night, it always works fine when I'm typing. (One character at a time, typing via keyboard, I can hold down a key and completely fill the screen, 4320 keystrokes no problem.) But I'm following a procedure, so I'm also pasting commands. Sometimes when I paste commands, I get PuTTY Fatal Error: Incoming packet was garbled on decryption. (Disconnected.)
It's not an MTU thing. (First of all, I checked all the MTUs looking for any problems.) A better clue is that I can paste the same command over and over and over (obviously the same packet size each time) and it only fails after the Nth repetition. For testing purposes, I ssh into the box, and I paste this command:
echo "hello there buddy, whatcha doing" > /dev/null
Obviously nowhere near the MTU size. I keep pasting it over and over, until connection fails. Count how many times I can successfully paste it before failure. Repeat. My results were: 5, 0, 12, 0, 9, 0. Deterministic inputs, nondeterministic outputs. (Well, probably deterministic, but not determined by the inputs that I'm controlling).
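The paste test above could be automated and taken out of PuTTY and ssh entirely. What follows is only a minimal sketch, assuming a plain TCP echo service can be run on the affected box and reached through the VPN. The names serve_echo, recv_exact and count_clean_round_trips are made up for illustration, and the port number is whatever you pick. It sends the same short payload repeatedly and counts how many copies come back byte-identical before the first failure, which is the same number I was counting by hand.

```python
import socket
import threading

# Same short line used in the manual paste test; far below any MTU.
PAYLOAD = b'echo "hello there buddy, whatcha doing" > /dev/null\n'


def serve_echo(listener):
    """Accept one connection on an already-listening socket and echo
    back whatever arrives until the peer closes the connection."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def count_clean_round_trips(host, port, payload=PAYLOAD, limit=1000, timeout=5.0):
    """Send payload up to `limit` times over one TCP connection and return
    how many times it came back byte-identical before the first failure
    (mismatch, timeout, reset, or failure to connect)."""
    ok = 0
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            for _ in range(limit):
                sock.sendall(payload)
                if recv_exact(sock, len(payload)) != payload:
                    break  # payload was altered somewhere on the path
                ok += 1
    except OSError:
        # Timeouts, resets and refused connections all end the run.
        pass
    return ok
```

To use it, the echo side would create a listening socket on the affected box and call serve_echo(listener) once per run, and the VPN client would call count_clean_round_trips(host, port) with that box's address. Running the client once from the VPN and once from another machine on the LAN would give numbers comparable to the 5, 0, 12, 0, 9, 0 above. A plain TCP echo has no decryption step, so a corrupted payload shows up as a mismatch or a stall rather than a fatal error.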
I have a workaround. I ssh into some other machine in the network, and then ssh to the machine in question. Infinite success. Paste the above command until my fingers are tired and I'm satisfied that there's no problem. The problem *only* happens when I ssh (or vnc or whatever) directly to the machine from the vpn client. And obviously, it doesn't happen when I ssh to some other machine from the vpn client (and then ssh to the machine in question).
The only difference between the LAN traffic which works perfectly, and the VPN traffic that's having a problem, is the fact that the VPN traffic needs to go through a router. It's not the router that's messing up the traffic, or else I would expect to see the same problem on a different machine.
It's hard for me to imagine a driver problem that will only affect traffic that requires a router. But maybe. Maybe there's a broadcom driver problem, that doesn't affect LAN traffic but does affect traffic going through a router.
Anyway, I'm at a loss for how to debug further. I suppose I could create a dummy network with a really simple router in between, and see if the problem persists, using a different router and no VPN. Also, if I do that, I'll be able to wireshark both sides, to see what happens. For now, on my VPN, I can only wireshark the OI side of the equation; can't wireshark the traffic at my VPN endpoint.
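Once both sides can be captured, the comparison itself is simple. Here is a rough sketch, assuming the TCP payload of one direction of the same connection has been exported from each capture as raw bytes (for instance by following the TCP stream in Wireshark and saving it as raw data). The function names are hypothetical. The ssh payload is encrypted, but it should still be byte-identical at both ends, so the first differing offset shows where the garbling starts.

```python
def first_difference(sent: bytes, received: bytes):
    """Return the offset of the first byte where the two streams differ,
    or None if they are identical. A length mismatch counts as a
    difference at the end of the shorter stream."""
    for offset, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return offset
    if len(sent) != len(received):
        return min(len(sent), len(received))
    return None


def describe_difference(sent: bytes, received: bytes, context: int = 8) -> str:
    """Summarize the first difference with a few bytes of hex context."""
    offset = first_difference(sent, received)
    if offset is None:
        return "streams are identical"
    lo = max(0, offset - context)
    hi = offset + context
    return "first difference at byte %d: sent %s, received %s" % (
        offset, sent[lo:hi].hex(), received[lo:hi].hex())
```

For example, reading the two exported files with open(path, "rb").read() and passing the results to describe_difference would show whether the bytes that left the client are the bytes that arrived at the OI box.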
I also have one Intel NIC I can stick into one of the machines.