Network problems, mass timeouts
#1
Hey everybody. I run a 32-man Day of Defeat: Source gameserver on a dedicated server hosted at A2ZHost.com. I'm running CentOS.

We have this problem where occasionally 10 to 30 players will timeout from the server all at once. But the weird thing is that those players who don't timeout experience no negative effects whatsoever, and continue playing.

The gameserver doesn't crash or change maps, and the remaining players don't even experience any lag.

I've personally experienced both sides of this while playing on the server. Sometimes I get disconnected with the group, and sometimes I stay on with no problems. Tonight I was monitoring the gameserver in HLSW and was connected to the console remotely over PuTTY when this happened. My PuTTY session froze up as well, and HLSW showed the server as timed out.

Deducing from the above information, this problem is obviously not related directly to the gameserver software itself or to Steam, since other non-Valve-software connections to the dedicated server can go down as well. On top of that, I would think that a problem with the software, hardware, or with A2ZHost's network would result in problems for every player, and would at least show some form of lag to the players who didn't timeout completely.

From those assumptions (which by no means are perfect) it almost makes more sense to blame it on some crucial routing center out there on the internet, since successful and failing server connections seem to be so individualized.

This problem is quite beyond me, although I do have a plan of action. My first step was to ask around on these forums, since I stumbled on them by accident, and since the Steam forums have been about as useful as tits on a bull.

I've also set up a completely fresh copy of the DoD:S gameserver software with no additional mods or anything changed. I was going to shut down the regular server and run this one for a couple of days to see whether it helps, which would point to one of our modifications as the possible culprit. But that's kind of a last resort, because I want to exhaust my other resources (like this forum) before I do that.

Lastly I plan on submitting a serious ticket to A2ZHost asking them to thoroughly check out the network and hardware related to our server. But I don't even want to do that until I've tried out the fresh server for a bit.

The only other things I can add are that the mass timeouts seem to happen most when the server is nearly full, and most often right in the middle of map changes. Which map the server is changing from or to doesn't seem to matter. Even if the maps themselves aren't at fault, it still makes sense for this to happen on map changes, since that's when a great deal of load is put on the server.

Thanks for any insight or help. We've finally got the bugs worked out of this server (it's the first dedicated server we've owned) and now this is the only problem standing between us and greatness ;]

== Matt (a.k.a. Roachy)
http://www.goochassassins.org
#2
It's network problems on their end. Most gameserver hosts use direct peering with internet providers, so that there are direct connections between those providers and the gameserver.

Most probably, this happens when a direct-peer line goes down and everyone connected along that line can no longer reach your gameserver's network. The thing to do would be just what you said: raise a support ticket with A2Z and ask them to check their network the next time the problem happens, to narrow down the cause.
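One rough way to check the "everyone along one line" theory is to see whether the players who dropped share address ranges that the players who stayed do not. Below is a minimal sketch, assuming you can pull the client IPs from your server logs. The function name `group_by_prefix` and the sample addresses are made up for illustration (they are documentation ranges, not real players), and a shared prefix is only a crude stand-in for a shared ISP or route.

```python
import ipaddress
from collections import defaultdict


def group_by_prefix(addresses, prefix_len=16):
    """Group IPv4 addresses by the network of the given prefix length."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False accepts a host address and returns its enclosing network
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(network)].append(addr)
    return dict(groups)


# Hypothetical sample data: players who timed out together
dropped = ["198.51.100.7", "198.51.100.42", "203.0.113.9"]

for network, members in group_by_prefix(dropped, prefix_len=24).items():
    print(network, len(members))
```

If most of the dropped players land in a handful of networks and the survivors are elsewhere, that list is worth attaching to the support ticket.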
#3
Thanks for your response. That's what I figured would have to be the end result. Just to be clear though, A2Z is a dedicated server host, not a gameserver host. Also, if you look at their "Our Network" page, they are part of a giant complex of hosting companies called GNAX, who supposedly have one kick-ass network and are also on a kick-ass power-grid.

So that's why I want to have tested things thoroughly before talking to them.
#4
gameserver/dedicated - same thing, same network principle :p

Quote: "We have this problem where occasionally 10 to 30 players will timeout from the server all at once. But the weird thing is that those players who don't timeout experience no negative effects whatsoever, and continue playing."

This just shouts network problem to me. The best way to look into this is actually to just talk to your host. They should have logs and details for all network areas and usage and could find where the problem happened.
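It helps the host a lot if the ticket includes exact times. A minimal sketch, assuming you run it from a machine outside the datacenter and probe a TCP port that is known to be open on the box (the SSH port, for instance, since the PuTTY session froze too). The names `is_reachable` and `watch` are invented for this example, and a TCP connect test says nothing about the game's own UDP traffic.

```python
import socket
import time
from datetime import datetime, timezone


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks
        return False


def watch(host, port, interval=10.0, checks=6):
    """Probe repeatedly and return a list of (UTC timestamp, reachable) pairs."""
    results = []
    for i in range(checks):
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        results.append((stamp, is_reachable(host, port)))
        if i < checks - 1:
            time.sleep(interval)
    return results
```

Something like `watch("your.server.address", 22, interval=10, checks=360)` would cover an hour. The rows where the second value is False give the outage window to quote to the host.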

I have on occasion come across a problem where some people's pings would jump to 400, stay there for a while and then go back down, which was due to a direct-peer issue. But to totally disconnect everyone... that sounds as if the line was broken somewhere (not literally snapped in two).
#5
Heh, ok well thanks again. I guess I'll go ahead and talk to them. Updates as events warrant ;]