Horrible FPS drops
#16
CPU: QuadCore (Intel® Xeon® CPU X3320 @ 2.50GHz)
RAM: 4 GB

And I can't find CPUSpeed in the BIOS.
Reply
#17
CPUSpeed is a program on Linux, not a BIOS setting. In the BIOS, turn OFF EIST/SpeedStep.

Make *sure* HPET is enabled.
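If you want to verify which clocksource the kernel actually ended up using, you can check from userspace (assuming a kernel recent enough to expose clocksources via sysfs):

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

hpet (or tsc) there is fine; acpi_pm or jiffies usually means a slow timer is in use.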
http://leaf.dragonflybsd.org/~gary

“The two most common elements in the universe are hydrogen and stupidity.”

Reply
#18
cpufreq-set -g performance is set, SpeedStep is turned off, and HPET can't be found in the BIOS.

Any other ideas?
Reply
#19
It could be lots of things; with Linux, it's probably one of 1.5^32 different issues.

As long as the tickrate is steady, I wouldn't worry about FPS anyway.
http://leaf.dragonflybsd.org/~gary

“The two most common elements in the universe are hydrogen and stupidity.”

Reply
#20
Monk Wrote: There is no difference in 32bit to 64bit, as the binaries are still 32bit.

The Linux kernel behaves very differently on 32 and 64 bit! There are lots of differences!

I have written a lot of things in my howto, especially at the end (the "playing around" and "troubleshooting" sections). Did you try all that already? Also you might want to try the kernel I describe there (note: it is 64 bit! This is important!)

Monk Wrote: As long as the tickrate is steady, I wouldn't worry about FPS anyway.
I would. The fps are important for the lag compensation; otherwise there would be no difference between a 100 and a 1000 fps server. But there is, and one difference can be seen very easily: have a look at the ping listed in the output of the "status" rcon command (IIRC it's important to run it on the server, not the client!) while changing the server fps (e.g. using fps_max 0 and fps_max 100). It will rise by about 5 ms when limiting the fps to 100.
This is because the server reads and sends network packets only once per frame. 100 fps means a delay of 10 ms between frames, thus leading to a rise of 5 ms on average (sometimes a packet is received right before a frame, so you get no additional latency; sometimes right after, and you get 10 ms additional latency). Because this varies between 0 ms and 10 ms, it is an even bigger problem: the lag compensation cannot know the real latency and will naturally be imprecise. This problem gets even bigger if the fps themselves are not stable...
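To illustrate the averaging argument, here is a minimal C sketch (not engine code; the uniformly random arrival times are an assumption of mine):

Code:
/* Toy simulation (not engine code): packets arrive at uniformly random
 * offsets within a frame, but the server only reads them at the next
 * frame boundary, so the mean extra delay is half a frame. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const double frame_ms = 10.0;  /* 100 fps -> 10 ms between frames */
    const long packets = 1000000;
    double total_wait_ms = 0.0;

    srand(42);
    for (long i = 0; i < packets; i++) {
        /* packet lands at a random offset inside the current frame */
        double offset_ms = frame_ms * rand() / RAND_MAX;
        /* it is only processed at the next frame boundary */
        total_wait_ms += frame_ms - offset_ms;
    }
    printf("average added latency: %.2f ms\n", total_wait_ms / packets);
    /* prints ~5.00 ms; with frame_ms = 1.0 (1000 fps) it's ~0.5 ms */
    return 0;
}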
http://www.fpsmeter.org
http://wiki.fragaholics.de/index.php/EN:Linux_Optimization_Guide (Linux Kernel HOWTO!)
Do not ask technical questions via PM!
Reply
#21
How do you fix the SourceTV fps drops?

I re-tried it with the new kernel, and now it seems to be "fairly" stable without SourceTV, but with it you get the horrible drops.

I've already got a relay running, but that wouldn't fix anything, since you need to enable SourceTV to make it relay in the first place.
Reply
#22
Oh Monk, I just saw your blog! Looks like you are the real srcds guru :P
Let me know if I can contact you via IRC or something :}
Reply
#23
BehaartesEtwas Wrote:
Monk Wrote: There is no difference in 32bit to 64bit, as the binaries are still 32bit.

The Linux kernel behaves very differently on 32 and 64 bit! There are lots of differences!

I have written a lot of things in my howto, especially at the end (the "playing around" and "troubleshooting" sections). Did you try all that already? Also you might want to try the kernel I describe there (note: it is 64 bit! This is important!)

You don't know what you're talking about. The gameserver BINARIES are 32bit. On a 64bit system, the BINARIES are *EMULATED* via 32bit syscalls. THERE IS NO DIFFERENCE EXCEPT HOW THEY ARE LOADED ON THE STACK. Look at compat.c, specifically the header:

/*
* linux/kernel/compat.c
*
* Kernel compatibililty routines for e.g. 32 bit syscall support
* on 64 bit kernels.
*
* Copyright © 2002-2003 Stephen Rothwell, IBM Corporation
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/

If you look at the code, you will see they are emulated: when a 32bit binary makes a 32bit syscall, glibc calls the 32bit syscall entry in the kernel directly. You do NOT get ANY benefit (no vdso/vsyscall support, nothing!) from running 32bit code on a 64bit system, except maybe a larger memory address space, which is no different from running PAE on 32bit kernels.
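(Easy to check for yourself: run the file utility on the server binary, e.g. file srcds_i686 -- the exact binary name depends on your install -- and it will report something like "ELF 32-bit LSB executable, Intel 80386".)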

Monk Wrote: As long as the tickrate is steady, I wouldn't worry about FPS anyway.
I would. The fps are important for the lag compensation; otherwise there would be no difference between a 100 and a 1000 fps server. But there is, and one difference can be seen very easily: have a look at the ping listed in the output of the "status" rcon command (IIRC it's important to run it on the server, not the client!) while changing the server fps (e.g. using fps_max 0 and fps_max 100). It will rise by about 5 ms when limiting the fps to 100.
This is because the server reads and sends network packets only once per frame. 100 fps means a delay of 10 ms between frames, thus leading to a rise of 5 ms on average (sometimes a packet is received right before a frame, so you get no additional latency; sometimes right after, and you get 10 ms additional latency). Because this varies between 0 ms and 10 ms, it is an even bigger problem: the lag compensation cannot know the real latency and will naturally be imprecise. This problem gets even bigger if the fps themselves are not stable...

Lag compensation was borrowed from quake3, and butchered. Everything in the engine is estimated (prediction error is a part of life). The problem you don't understand is that FPS/pings are *estimated* from:

1.) A syscall that gives a guess of what the wallclock counter is supposed to be
2.) Another syscall that has no accuracy at all, and calls another syscall that has latency depending on what the scheduler is doing

Even if you fix 1.) and 2.) there's no way to really make FPS stable unless you lie to the engine and make it return 999/1000/10000/1000000

Interrupts do not fire at precisely the same time, therefore when nanosleep is called, it may wake up a couple hundred µs late, and you'll see jitter in the FPS.
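A minimal sketch that shows this (my code, not the engine's; assumes Linux and CLOCK_MONOTONIC):

Code:
/* Measure how late a 1 ms nanosleep() actually wakes up.
 * Sketch only -- build with: gcc -O2 -o sleepjitter sleepjitter.c -lrt */
#include <stdio.h>
#include <time.h>

static long long ts_ns(const struct timespec *t)
{
    return (long long)t->tv_sec * 1000000000LL + t->tv_nsec;
}

int main(void)
{
    const struct timespec req = { 0, 1000000 };  /* 1 ms, like a 1000 fps frame */
    struct timespec a, b;
    long long sum_us = 0, worst_us = 0;
    const int iters = 1000;

    for (int i = 0; i < iters; i++) {
        clock_gettime(CLOCK_MONOTONIC, &a);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);
        /* overshoot = actual sleep minus the requested 1 ms */
        long long over_us = (ts_ns(&b) - ts_ns(&a) - 1000000) / 1000;
        sum_us += over_us;
        if (over_us > worst_us)
            worst_us = over_us;
    }
    printf("avg overshoot: %lld us, worst: %lld us\n", sum_us / iters, worst_us);
    return 0;
}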
http://leaf.dragonflybsd.org/~gary

“The two most common elements in the universe are hydrogen and stupidity.”

Reply
#24
Please be a little more careful with your choice of words. I definitely know what I am talking about; I have been developing software for the better part of my life (sometimes very close to the hardware), and I have several years of experience optimizing game servers.

You are right, srcds and hlds are 32-bit software, but they don't run in some emulated environment. They run in native 32-bit mode. x86_64 CPUs can execute both 64 and 32-bit code, so there is no need for any emulation. Linux has both 32 and 64-bit libraries, and the kernel has specific support for loading 32-bit binaries (it has to switch the CPU to 32-bit mode for that process, of course). So far, one could still believe that there is no difference between 32 and 64 bit for the game server.

But: the kernel runs in 64-bit mode (on a 64-bit system), and so does every single system call from a certain point on (all drivers are 64-bit, so if you run e.g. some IO syscall you will end up in 64-bit code sooner or later). Also, most importantly, the whole Linux timer and scheduling system is 64-bit. This is the most important part for game servers, as they completely depend on it.

The problems you describe in the last part of your post are the reasons why I recommend using the RT patches. Hardware interrupts fire very precisely (this is "simple" electronics; a precision of a single µs is not a problem -> that's only the MHz regime, and even 10 ns is not really difficult to reach). The only problem is the software that doesn't recognize them in time, mainly because the CPU is busy with something else. Linux will always handle the hardware ISR at once (it has no other choice; the CPU runs the ISR when it receives the interrupt), but hardware ISRs need to be kept short, so it will only set some flag somewhere in the kernel to be handled later. Then the Linux scheduler will run the "real" software ISR when there is time. The RT patches dramatically reduce the latency between the hardware interrupt and its recognition by the software. If you have a look at the debug facilities of the RT patches, you can see that "maximum" latencies of very few µs are possible. Of course, if heavy disk IO is going on you will occasionally see higher latencies; that's why I wrote "maximum", and that's why the RT people speak of soft realtime.
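If you want to measure this wakeup latency on your own machine, the cyclictest tool (from the rt-tests package) does exactly that. An example invocation of mine, adjust the numbers to taste:

cyclictest -t1 -p80 -n -i1000 -l100000

On a well-configured RT kernel the reported maximum stays in the low microsecond range; on a vanilla kernel under load it can reach milliseconds.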

And oh yes, I am pretty sure that a clock precision of a few µs is possible; just have a look at ntp and what precision can be reached there (keep in mind that that is *absolute* precision, which is much harder to reach than the relative precision we are talking about)!

I am sorry if I missed some of your points, as some parts of your post are a little incomplete... (What do you mean by "make it return 999/1000/10000/1000000"?)


PS: Don't teach me about error estimation, I am a physicist :P
http://www.fpsmeter.org
http://wiki.fragaholics.de/index.php/EN:Linux_Optimization_Guide (Linux Kernel HOWTO!)
Do not ask technical questions via PM!
Reply
#25
What do you recommend I do? Do I leave my massive fps drops as they are then? Is there no way around fixing SourceTV?
Reply
#26
BehaartesEtwas Wrote: Please be a little more careful with your choice of words. I definitely know what I am talking about; I have been developing software for the better part of my life (sometimes very close to the hardware), and I have several years of experience optimizing game servers.

You are right, srcds and hlds are 32-bit software, but they don't run in some emulated environment. They run in native 32-bit mode. x86_64 CPUs can execute both 64 and 32-bit code, so there is no need for any emulation. Linux has both 32 and 64-bit libraries, and the kernel has specific support for loading 32-bit binaries (it has to switch the CPU to 32-bit mode for that process, of course). So far, one could still believe that there is no difference between 32 and 64 bit for the game server.

But: the kernel runs in 64-bit mode (on a 64-bit system), and so does every single system call from a certain point on (all drivers are 64-bit, so if you run e.g. some IO syscall you will end up in 64-bit code sooner or later). Also, most importantly, the whole Linux timer and scheduling system is 64-bit. This is the most important part for game servers, as they completely depend on it.

The problems you describe in the last part of your post are the reasons why I recommend using the RT patches. Hardware interrupts fire very precisely (this is "simple" electronics; a precision of a single µs is not a problem -> that's only the MHz regime, and even 10 ns is not really difficult to reach). The only problem is the software that doesn't recognize them in time, mainly because the CPU is busy with something else. Linux will always handle the hardware ISR at once (it has no other choice; the CPU runs the ISR when it receives the interrupt), but hardware ISRs need to be kept short, so it will only set some flag somewhere in the kernel to be handled later. Then the Linux scheduler will run the "real" software ISR when there is time. The RT patches dramatically reduce the latency between the hardware interrupt and its recognition by the software. If you have a look at the debug facilities of the RT patches, you can see that "maximum" latencies of very few µs are possible. Of course, if heavy disk IO is going on you will occasionally see higher latencies; that's why I wrote "maximum", and that's why the RT people speak of soft realtime.

And oh yes, I am pretty sure that a clock precision of a few µs is possible; just have a look at ntp and what precision can be reached there (keep in mind that that is *absolute* precision, which is much harder to reach than the relative precision we are talking about)!

I am sorry if I missed some of your points, as some parts of your post are a little incomplete... (What do you mean by "make it return 999/1000/10000/1000000"?)


PS: Don't teach me about error estimation, I am a physicist :P

There is no way to have a timer wake up exactly. No way at all. If you're in physics, you'd understand time dilation; clocks are never accurate due to relativistic effects. Even with the best hardware, you'll always need to sync up your clock with another to get a decent sense of time.

The game engine calls usleep(1000) and, regardless of what's going on on the system, the best you can expect is a precision on the order of 10^-5 or 10^-6 s. This is a limit of x86 hardware in general, and probably a limit of Linux. Even with RT you're still running into x86 issues with high interrupt latency.

Interrupts from hardware are only as good as the quartz crystal driving them. If you have high temperatures, the crystal will lose precision over time due to thermal variance, but that doesn't matter much, because interrupt latency is normally high on x86 due to its design. You'd need ARM or something to lower it.

The drivers in a 64-bit system are 64-bit. The userland binaries are emulated in glibc, which calls the 32-bit compat libs inside the kernel. 32-bit libs cannot, and do not, call 64-bit code, due to pointer length and other things. You do not get any benefit at all from running 32-bit code on a 64-bit system. What you fail to understand is that no matter what code is running in the kernel, the syscalls are all 32-bit, and their kernel counterparts are 32-bit. There is zero way a 32-bit binary can call a 64-bit syscall, period. There is a little translation going on with timers and other things. Timers inside the Linux kernel are very complex, and they differ between 32-bit and 64-bit. Native 64-bit binaries benefit a great deal more than 32-bit binaries running in long mode.

Another thing to consider is that the VALVe code is broken. It's not optimized for 64-bit, so you lose an additional register on the stack (%ebx), plus the way it tracks time is pretty crappy too.
http://leaf.dragonflybsd.org/~gary

“The two most common elements in the universe are hydrogen and stupidity.”

Reply
#27
Is there any benefit to running the amd binaries rather than the i686 ones, or are they just as bad as each other?
Reply
#28
Monk Wrote: If you're in physics, you'd understand time dilation; clocks are never accurate due to relativistic effects. Even with the best hardware, you'll always need to sync up your clock with another to get a decent sense of time.
Let's stop that discussion right here; it's getting silly.

My final statement is simply: in my experience, a 64-bit RT kernel works best for our needs, if used correctly.

@Chas: I personally never observed these SourceTV problems; on my system I don't get any drops when enabling the TV. Try reducing the load added by SourceTV by using only a relay slot (and an external SourceTV server), and try the following SourceTV settings (I took them from my war server):

Code:
tv_enable 1
tv_port 27120
tv_autorecord 0
tv_debug 0
tv_delay 90
tv_dispatchmode 2
tv_maxclients 1
tv_maxrate 0
tv_name "SourceTV"
tv_overrideroot 0
tv_password "xxx"
tv_relaypassword "yyy"
tv_snapshotrate 20
tv_transmitall 1
tv_delaymapchange 1

I hope this helps...


bigtin Wrote: Is there any benefit to running the amd binaries rather than the i686 ones, or are they just as bad as each other?
Some say yes, but I never observed any difference. Simply try it out.
http://www.fpsmeter.org
http://wiki.fragaholics.de/index.php/EN:Linux_Optimization_Guide (Linux Kernel HOWTO!)
Do not ask technical questions via PM!
Reply
#29
Can you clarify how to set up a relay? I have not messed with SourceTV much, but it is becoming required.
Reply
#30
On your war server, limit the TV server to a single slot (e.g. using the config I posted in #28). That will be your relay slot, where your real TV server connects. Then you need a second server for the spectators to connect to. Start it like this:

./srcds_run -ip x.x.x.x -game cstrike -port 28015 +tv_port 28115 +tv_relay x.x.x.x:27015 +password somepassword

Assuming your war server runs on port 27015 and has tv_password set to somepassword, spectators will connect to port 28115, while rcon commands must go to port 28015.

I hope I am not missing something; it's been a long time since I configured my relay server ;-)
http://www.fpsmeter.org
http://wiki.fragaholics.de/index.php/EN:Linux_Optimization_Guide (Linux Kernel HOWTO!)
Do not ask technical questions via PM!
Reply

