How many more servers can we run on this box before we start to slow them down?
Some say "one server per core". You might disagree based on your current experience.
Try installing Munin and my "Source dedicated server statistics" plugin for it. Then follow how the FPS changes during the day. As long as the FPS fluctuates by only about 50, everything should be okay.
Also, does Hyperthreading really affect performance that much, and if so is there an easy tutorial for turning it off (I don't want to recompile the kernel unless I have to)?
Are you sure hyperthreading is even enabled on your system? Run "cat /proc/cpuinfo" and count whether you see four CPUs or more. If there are only four, then HT is not enabled.
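If you don't want to count by hand, here is a rough sketch of the same check. It assumes an x86 Linux box where /proc/cpuinfo has the "siblings" and "cpu cores" fields; the function names count_cpus and ht_status are just names I made up for this example.

```shell
# count_cpus [FILE]: number of logical CPUs listed in a cpuinfo-style file.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# ht_status [FILE]: print "enabled" if one physical package exposes more
# logical CPUs (siblings) than physical cores, "disabled" if not, and
# "unknown" if the fields are missing (assumption: x86-style cpuinfo).
ht_status() {
    awk -F: '
        /^siblings/  { gsub(/[ \t]/, "", $2); siblings = $2 }
        /^cpu cores/ { gsub(/[ \t]/, "", $2); cores = $2 }
        END {
            if (siblings == "" || cores == "") print "unknown"
            else if (siblings + 0 > cores + 0) print "enabled"
            else print "disabled"
        }' "${1:-/proc/cpuinfo}"
}
```

Paste the functions into a shell and run "count_cpus" and "ht_status" without arguments to check the live system.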
Hyperthreading will most likely decrease your server performance. Hyperthreading is meant for single-core CPUs to get better response times for GUI stuff. In a server environment there's too much overhead in the "multi-core emulation", a.k.a. hyperthreading, so it's better to disable it.
Finally, what other things can I do to get the most performance out of this box?
They say a custom-compiled 1000-tick kernel is the way to go. I've been there and done that with a zillion different configs, from dynticks to real-time kernels, in all combinations. The best so far is not the most highly tuned setup but a fairly basic kernel, although that server also does other things besides running the game server. I think some players might see a difference on a LAN server with a highly tuned 1000-tick kernel, but if you're going to run multiple servers on your box over "high-latency" internet connections, then it's probably better to go with a slightly less tuned but steady server.
Another tip which I've seen posted on many forums is to assign one core to each server. You can do it like "taskset -c 1 ./srcds_run -game cstrike ...." and the system will run the game server using only CPU #1 (which is actually the second core, because taskset starts counting from 0).
In your case I'd probably try something a little more complex (because you'll have multiple servers per core anyway), so do it like "taskset -c 0,1 ./srcds_run -game dods ..." (run the first 3x13 servers like this) and then "taskset -c 2,3 ./srcds_run -game ...." for the other 3 servers. Then the three 500 fps dods servers will have the first and second cores (CPUs 0 and 1 in taskset numbering) to themselves, and the third and fourth cores (CPUs 2 and 3) will run the other servers.
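A small wrapper can keep the core assignments in one place. This is only a sketch: launch_pinned and the DRY_RUN variable are names invented for this example, and the srcds_run arguments are whatever you normally use.

```shell
# launch_pinned CORES CMD [ARGS...]
# Start CMD in the background, restricted to the CPU list CORES
# (taskset numbering, starting from 0). With DRY_RUN=1 the command
# is only printed, so the core layout can be checked first.
launch_pinned() {
    cores="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "taskset -c $cores $*"
    else
        taskset -c "$cores" "$@" &
    fi
}
```

For example, "DRY_RUN=1 launch_pinned 0,1 ./srcds_run -game dods" prints the taskset line it would run. Once a server is up, "taskset -cp PID" shows which CPUs that process is allowed to use.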
The idea is that the cores don't need to do so-called "context switches" as often. Each process runs much longer on one core, so there's less overhead in the internal processing when the processes are divided between the cores. If you run all the servers without assigning them to certain cores, the processes will be run on all cores, and every time a process is moved to another core there will be a little overhead from reloading the process's data into the new core's cache etc. Most likely it's yet another myth that this slows things down, but you never know.
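If you'd rather measure than guess, Linux keeps per-process context switch counters in /proc/PID/status. Below is a rough sketch that adds them up; ctxt_total is a made-up helper name, and comparing the number before and after pinning is just my suggestion, not a proper benchmark.

```shell
# ctxt_total FILE: sum the voluntary and nonvoluntary context switch
# counters from a /proc/PID/status-style file.
ctxt_total() {
    awk '
        /^voluntary_ctxt_switches/    { v = $2 }
        /^nonvoluntary_ctxt_switches/ { n = $2 }
        END { print v + n }
    ' "$1"
}
```

Run "ctxt_total /proc/PID/status" twice, a minute apart, with and without taskset, and compare how fast the number grows. "ps -o psr= -p PID" also shows which CPU the process last ran on.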