Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
Hey guys,
We've been suffering from high CPU usage for a while now (since the game came out, actually). It's a 64-slot SRCDS server that runs Zombie* variations, i.e. Zombie:Reloaded or, in the past, ZombieMod. SRCDS always manages to pin our core at 100%. We're running both of Monk's addons (Adaptive Usleep and LibFastTime), and they seem to help. However, when 50 or so players join, shit hits the fan and the lag starts. Server FPS drops to an all-time low, and in-game pings shoot way up (A2S_INFO replies are unaffected, surprisingly).
We're running Debian 6 (Squeeze) x86_64 with the stock Linux kernel at the moment (tried Zen-Kernel, it didn't do much; Classic RCU is still broken, so we are twiddling our thumbs):
Kernel HZ: 100 Hz
Dynamic ticks on
TSC as our system's clocksource
Disabled SELinux and all of that other *fun* stuff.
CPU: Intel Xeon X3460 Lynnfield 2.8GHz (runs at 3-3.4GHz with Turbo Boost)
Mainboard: SUPERMICRO X8SIL-F Micro ATX Server Board
HD: Western Digital Caviar Blue WD5000AAKS
Ram: KVR1333D3D8R9S/2G x2
I've mailed a couple of employees at Valve, and they've *forwarded* the information along. On top of that, I was told they do not support any servers over 32 slots, as it's outside their testing domain; 'You're on your own'.
Anyway, if anyone has any ideas on how we can get our CPU usage lower, we're all ears and willing to test practically anything. Fair warning though: we will not use Windows, as signatures change far too often on it, forcing me to fire up IDA to find the new hex values. Symbol names are MUCH nicer. I was also told by numerous people that Linux handles higher slot counts better, which is another reason we are currently using it.
Posts: 18
Threads: 3
Joined: Jan 2010
Reputation:
0
10-17-2010, 05:45 PM
(This post was last modified: 10-17-2010, 05:56 PM by cyberthug.)
(10-17-2010, 04:55 PM)jimmy69 Wrote: Hey guys,
The Xeon X3460 Lynnfield 2.8GHz is the problem. I have had one, and custom kernel or not, your system will not handle a 64-slot server. Maybe half that; 32 might run OK. You can also try one of Terrorkarotte's kernels, here.
First try the 2.6.33.5-zen3-ub-100hz kernel, then try 2.6.33.5-zen3-ub-1000hz. They may work, as they have for me.
Posts: 2,031
Threads: 27
Joined: Nov 2008
Reputation:
17
10-17-2010, 06:17 PM
(This post was last modified: 10-17-2010, 06:18 PM by BehaartesEtwas.)
64 slots are a lot, it usually takes a lot of effort to get it running somehow (and it will never run really smooth). try this:
- reduce the tickrate to 33 (sv_maxcmdrate 33 and sv_maxupdaterate 33, or maybe 34 for both) and match the fps to the tickrate (fps_max 34 or so). since orange box this will not decrease quality (I am now, say, 95% sure of this).
- do not use any tricks like "adaptive usleep" or "LibFastTime". whatever they do, they will probably make things worse in your case. those things are usually designed for high-end, low slot count servers and were created before the OB update.
- try out different kernels that are optimized for maximum throughput, not for minimal latency. i.e. do *not* use RT-patches, but try ZEN with settings for servers.
- run srcds with realtime or fifo scheduling (see my howto, "resched.sh").
- disable all plugins you do not absolutely require.
this is only a start. you will have to play around a lot on your own. also, cyberthug could be right and your cpu cannot handle this. most people already get problems when reaching 32 slots or even lower. in theory your cpu would have to be approximately twice as fast as theirs, which is probably not the case by far... if you happen to think about a new cpu, look for one with the maximum possible performance per core, as srcds is (basically) single threaded...
ah and btw: do not look too much at the cpu usage, quite often it has nothing to do with reality (neither in stats, top, htop or whatever). instead make sure the fps are always stable at (or above) your tickrate.
Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
FPS drops to 10-30, from 937 (with Adaptive Usleep); that's my issue at the moment. LibFastTime helps my CPU usage, and Adaptive Usleep makes my *kernel changes* usable.
Posts: 18
Threads: 3
Joined: Jan 2010
Reputation:
0
10-17-2010, 06:43 PM
(This post was last modified: 10-17-2010, 06:44 PM by cyberthug.)
(10-17-2010, 06:24 PM)jimmy69 Wrote: FPS drops to 10-30, from 937 (Adaptive usleep) that's my issue at the moment. LibFastTime helps my usage, Adaptive Usleep makes my *kernel changes* usable.
Lynnfield is low grade. I have proven it before: even with BEpingboost.c and the kernel at 100 Hz, I couldn't get a stable 500 fps. And with Turbo Boost I'm sure it's struggling to keep up; Xeons aren't meant to boost or overclock in any way. Trust me, I have tried to get low-end servers to run CSS OB at high fps and 32 slots.
Another thing: is this a dedicated server? If so, you have to think about power. The power supply could be a cheapie, meaning the CPU is underpowered. There are many things that could cause this. Just my thoughts.
Posts: 1
Threads: 0
Joined: Oct 2010
Reputation:
0
(10-17-2010, 06:43 PM)cyberthug Wrote: Lynnfield is low grade [...] the power supply could be a cheapie meaning the cpu is under powered [...]
I'm sorry, but that makes no sense at all. How is the Lynnfield architecture considered to be low grade? It's based on the i7's Nehalem architecture. The X3460 is essentially a rebinned Core i7 860. And if they were not meant to be boosted, then why is turbo mode enabled on these CPUs by default?
And the CPU not "getting enough power" is not the issue. That's not the way it works. There is no reason to believe that the power supply is a "cheapie". If it were, the chances of it supplying the server with "dirty power" would be high, and every component could be fried by now. If the CPU were not getting enough power, the server would crash constantly and we would definitely know there was a hardware issue. That is not the case here.
Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
10-17-2010, 07:28 PM
(This post was last modified: 10-17-2010, 08:33 PM by jimmy69.)
I've tried Zen and I've optimized the kernel (haven't touched glibc or anything else), as I said in the initial post. That made all the difference it could. At the moment I'm looking for other advice besides absolutely nuking my server performance at a fake 33 tick (setting those rates won't even bring down my CPU usage one bit, I've tried it). The preloaded modules only seem to help my performance and CPU usage (which is what they were designed to do).
Thank you RayW for bringing common sense to the table. At the moment I'm pretty lost with these replies as they're not remotely helpful, even though that was their intention. The OP hasn't been fully read.
EDIT: My FPS is well above 900 before the CPU gets pinned at around 50 players.
Posts: 18
Threads: 3
Joined: Jan 2010
Reputation:
0
(10-17-2010, 07:28 PM)jimmy69 Wrote: I've tried Zen, I've optimized the Kernel (haven't touched glibc or anything else) as I said in the initial post. This made all the difference that it could, at the moment I'm looking for other advice besides absolutely nuking my server performance @ a fake 33tick (setting those rates wont even bring down my CPU usage one bit, I've tried it). The preloaded modules only seem to help my performance and CPU usage (What they were designed to do).
BehaartesEtwas and I have told you 64 slots are too much
Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
(10-17-2010, 09:41 PM)cyberthug Wrote: BehaartesEtwas and myself have told you 64 slots are to much
I know man. Depending on the map, my FPS starts to drop dramatically due to the core being pinned at around 40-45 players. It drops below the acceptable threshold at around 50-55 players (on some maps this isn't even an issue, and 64 players are just fine).
The point of this thread was to optimize my system and hopefully serve as a guide for others with what they can do since Valve doesn't seem to care the least bit. So far, the replies have been an exact repeat of the initial post.
Posts: 226
Threads: 2
Joined: Aug 2009
Reputation:
1
The engine is only designed to support a maximum of 32 slots. Running 64 slots on the new engine is futile at best; the new code is just way too expensive, and throwing a lower HZ (100) at it is not going to make the game run any better under load. Maybe a little, but nothing you could measure or notice.
You could do some profiling of it with strace to see what it's spending most of its time doing. It's probably eating up CPU in nanosleep, so you'll have to lower what fps_max is set to (64 slots should be an fps of 66 anyway (1:1 fps/tick)).
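To put the 1:1 fps/tick point in numbers, here is a small Python sketch of the time budget per frame. The function name is made up for illustration, and the rates are simply the ones mentioned in this thread (66, the suggested 33, and the reported drop to 10-30 fps).

```python
def frame_budget_ms(rate_hz: float) -> float:
    """Milliseconds available per frame (or tick) at the given rate."""
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / rate_hz


if __name__ == "__main__":
    # Rates mentioned in the thread: 66 (1:1 fps/tick) and the suggested 33.
    for rate in (66, 33):
        print(f"{rate} per second -> {frame_budget_ms(rate):.2f} ms per frame")
    # The reported drop to 10-30 fps means each frame takes longer than
    # the budget for either rate above.
    for fps in (30, 10):
        print(f"{fps} fps -> {frame_budget_ms(fps):.1f} ms per frame")
```

Read this way, a server that falls to 10-30 fps is spending more time on each frame than even a 33 tick setup would allow.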
Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
Ah shoot.
Code:
root@Tramicia:~# strace -cf -p 9042
Process 9042 attached with 5 threads - interrupt to quit
[ Process PID=9054 runs in 32 bit mode. ]
^CProcess 9042 detached
Process 9047 detached
Process 9052 detached
Process 9053 detached
Process 9054 detached
System call usage summary for 32 bit mode:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 58.99    9.085366         238     38148      9834 futex
 24.03    3.701374          59     62575           select
  7.66    1.180000         543      2172           nanosleep
  5.06    0.778574         429      1816           read
  4.06    0.624596        2271       275           fsync
  0.19    0.030000       15000         2         2 restart_syscall
  0.00    0.000258           1       252           getdents
  0.00    0.000221           0    408504           gettimeofday
  0.00    0.000090           0     76016           sendto
  0.00    0.000024           0    121206      3860 recvfrom
  0.00    0.000018           0     37298           recv
  0.00    0.000011           0     35181           send
  0.00    0.000000           0      2102           write
  0.00    0.000000           0       877       441 open
  0.00    0.000000           0       438           close
  0.00    0.000000           0        69           unlink
  0.00    0.000000           0      2768           time
  0.00    0.000000           0       132       132 access
  0.00    0.000000           0         1           rename
  0.00    0.000000           0        22           times
  0.00    0.000000           0         1           ioctl
  0.00    0.000000           0       120           munmap
  0.00    0.000000           0        86           mprotect
  0.00    0.000000           0     11020           _llseek
  0.00    0.000000           0       220           poll
  0.00    0.000000           0       121           mmap2
  0.00    0.000000           0      1221       756 stat64
  0.00    0.000000           0       325           fstat64
  0.00    0.000000           0      1147           fcntl64
  0.00    0.000000           0      8224           clock_gettime
  0.00    0.000000           0         2           socket
  0.00    0.000000           0         2           connect
  0.00    0.000000           0      3859      3859 accept
  0.00    0.000000           0         1           shutdown
  0.00    0.000000           0         5           setsockopt
------ ----------- ----------- --------- --------- ----------------
100.00   15.400532                816208     18884 total
root@Tramicia:~#
Posts: 226
Threads: 2
Joined: Aug 2009
Reputation:
1
Try letting it run for about 10 minutes to get a better sample. Keep in mind that tracing a process will slow it down a little bit.
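If anyone wants to compare a short trace against a longer one, the summary that strace -c prints can be ranked with a few lines of Python. This is only a sketch: parse_strace_summary is a hypothetical helper, and it assumes the six-column layout of the summary posted above, where the errors column may be empty. The sample rows are copied from that first trace.

```python
SAMPLE = """\
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
58.99 9.085366 238 38148 9834 futex
24.03 3.701374 59 62575 select
7.66 1.180000 543 2172 nanosleep
100.00 15.400532 816208 18884 total
"""


def parse_strace_summary(text):
    """Return one dict per syscall row of an `strace -c` summary."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors).
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            row = {
                "pct_time": float(parts[0]),
                "seconds": float(parts[1]),
                "usecs_per_call": int(parts[2]),
                "calls": int(parts[3]),
                "errors": int(parts[4]) if len(parts) == 6 else 0,
                "syscall": parts[-1],
            }
        except ValueError:
            continue  # header or separator line
        rows.append(row)
    return rows


if __name__ == "__main__":
    ranked = sorted(parse_strace_summary(SAMPLE),
                    key=lambda r: r["seconds"], reverse=True)
    for r in ranked:
        print(f'{r["syscall"]:<12} {r["seconds"]:>10.6f}s '
              f'{r["calls"]:>8} calls {r["errors"]:>6} errors')
```

Feeding it two summaries taken at different player counts would make it easier to see which syscalls grow fastest under load.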
Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
10-18-2010, 10:59 AM
(This post was last modified: 10-18-2010, 11:41 AM by jimmy69.)
I noticed; everyone's in-game ping shot up to 500 with 60 clients in game lol >.<
EDIT: 12 or so minutes of tracing (54-60 Real Clients):
Code:
root@Tramicia:~# strace -cf -p 9042
Process 9042 attached with 5 threads - interrupt to quit
[ Process PID=9054 runs in 32 bit mode. ]
^CProcess 9042 detached
Process 9047 detached
Process 9052 detached
Process 9053 detached
Process 9054 detached
System call usage summary for 32 bit mode:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 73.19  132.270169         305    433366    119883 futex
 25.73   46.499326         108    430864           select
  0.44    0.801059          59     13639           nanosleep
  0.38    0.691791        2882       240           fsync
  0.22    0.396512          20     19845           read
  0.03    0.060000       30000         2         2 restart_syscall
  0.00    0.001917           0   3959501           gettimeofday
  0.00    0.001128           0    699857           sendto
  0.00    0.000548           1       468           getdents
  0.00    0.000312           0   1492545     53618 recvfrom
  0.00    0.000115           0    204686           send
  0.00    0.000105           0    211428           recv
  0.00    0.000040           0    111987           clock_gettime
  0.00    0.000016           0     53618     53618 accept
  0.00    0.000012           0     34779           time
  0.00    0.000011           0     67228           _llseek
  0.00    0.000009           0      6332      1404 stat64
  0.00    0.000000           0      2732           write
  0.00    0.000000           0      3452       819 open
  0.00    0.000000           0      2654           close
  0.00    0.000000           0        60           unlink
  0.00    0.000000           0        93        93 access
  0.00    0.000000           0         8         8 mkdir
  0.00    0.000000           0       260           times
  0.00    0.000000           0        16           ioctl
  0.00    0.000000           0      1722           munmap
  0.00    0.000000           0      1700           mprotect
  0.00    0.000000           0         4           flock
  0.00    0.000000           0       225           poll
  0.00    0.000000           0      1722           mmap2
  0.00    0.000000           0      1907           fstat64
  0.00    0.000000           0       944           fcntl64
  0.00    0.000000           0        32           socket
  0.00    0.000000           0        32           connect
  0.00    0.000000           0        16           shutdown
  0.00    0.000000           0        80           setsockopt
------ ----------- ----------- --------- --------- ----------------
100.00  180.723070               7758044    229445 total
root@Tramicia:~#
I hope that's more helpful than the last trace.
EDIT (again): If Userspace = Userland, is there any chance 'kmctrl' is available for public usage?
http://docs.google.com/viewer?url=http://people.summit-servers.com/monk.pdf
Posts: 226
Threads: 2
Joined: Aug 2009
Reputation:
1
It's spending a lot of time in futexes, and each futex call is eating about 300 usecs, hrm. A lot of errors, too.
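For reference, both figures can be re-derived from the futex row of the 12-minute summary. A quick Python check, using only numbers copied from that table (the variable names are just for illustration):

```python
# Numbers copied from the futex row of the 12-minute `strace -c` summary.
futex_seconds = 132.270169
futex_calls = 433366
futex_errors = 119883

# Average time per call, converted from seconds to microseconds.
usec_per_call = futex_seconds / futex_calls * 1_000_000
# Fraction of futex calls that returned an error.
error_ratio = futex_errors / futex_calls

print(f"{usec_per_call:.0f} usecs per futex call")
print(f"{error_ratio:.1%} of futex calls returned an error")
```

The per-call figure agrees with the usecs/call column, and the error share works out to a bit more than a quarter of all futex calls.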
Can you do
strace -o /root/whatever.log -f -p 12345 (whatever the pid is) and look for the futex lines to see what is causing them to error?
kmctrl is just a program that was written to interface with a driver / mode in the kernel, it only works on IA32 systems (no amd64) and it's very beta and it does cause panics which I haven't been able to solve. The patches for all that stuff are about 2k in size, but again, it's all beta.
Posts: 12
Threads: 1
Joined: Oct 2010
Reputation:
0
10-18-2010, 12:18 PM
(This post was last modified: 10-18-2010, 01:56 PM by jimmy69.)
Code:
9047 futex(0xf700e098, FUTEX_WAIT_PRIVATE, 1619845, {0, 49983279} <unfinished ...>
9047 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
9047 futex(0xf700e07c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
9042 <... futex resumed> ) = 0
9047 futex(0xf700e098, FUTEX_WAIT_PRIVATE, 1620049, {0, 80971314} <unfinished ...>
9042 futex(0xf700e098, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0xf700e094, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
9047 <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
9042 <... futex resumed> ) = 0
9047 futex(0xf700e07c, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
9042 futex(0xf700e07c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
9047 <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
9042 <... futex resumed> ) = 0
9047 futex(0xf700e07c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
9042 futex(0xf700e058, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
Grepped futex: http://pastebin.com/UEE10zEP
That sucks about kmctrl.
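A sketch of how the futex results in a log like the excerpt above could be tallied in Python. tally_futex_results is a hypothetical helper, and it assumes the strace -f -o line format shown in that excerpt, where the return value follows the last " = " on a line. The sample lines are copied from the excerpt.

```python
from collections import Counter

SAMPLE_LOG = """\
9047 futex(0xf700e098, FUTEX_WAIT_PRIVATE, 1619845, {0, 49983279} <unfinished ...>
9047 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
9047 futex(0xf700e07c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
9042 <... futex resumed> ) = 0
9047 futex(0xf700e098, FUTEX_WAIT_PRIVATE, 1620049, {0, 80971314} <unfinished ...>
9047 <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
9042 <... futex resumed> ) = 0
"""


def tally_futex_results(log_text: str) -> Counter:
    """Count futex results by errno name; any result other than -1 is "ok"."""
    counts = Counter()
    for line in log_text.splitlines():
        if "futex" not in line or " = " not in line:
            continue  # other syscalls, or calls still marked <unfinished ...>
        result = line.rsplit(" = ", 1)[1].split()
        if not result:
            continue
        if result[0] == "-1" and len(result) > 1:
            counts[result[1]] += 1  # errno name, e.g. EAGAIN
        else:
            counts["ok"] += 1
    return counts


if __name__ == "__main__":
    for name, count in tally_futex_results(SAMPLE_LOG).most_common():
        print(f"{name}: {count}")
```

Going by the futex(2) man page, EAGAIN on a FUTEX_WAIT means the futex word no longer held the expected value when the call was made, and ETIMEDOUT means the timeout expired first, so both can show up in ordinary lock and condition-variable traffic and do not necessarily point to a fault.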