Hardware: A Linux cluster with dedicated Intel Core 2 quad core, AMD Phenom quad core, dual quad and hexa core Xeons.

Software: Charmm, molecular dynamics, PB, molecular mechanics etc.
| Machine | CPU | Memory | Dock | Min | Dyn | mod | g98 |
|---|---|---|---|---|---|---|---|
| Linux g77 | Cyrix 6x86-200 | 32Mb | | 15,012 | 7,920 | 6,480 | |
| Linux Pgroup | Winchip2 -240 | 48Mb | | 7,632 | 3,840 | | |
| IBM 3CT | Power2 | 128mb | 46,786 | 6,768 | 3,340 | 3,136 | 6,082* |
| IBM 590 | Power2 | 64mb | 46,613 | 6,696 | 3,310 | 3,094 | |
| SGI Indigo2 | R8K-75Mhz | 96Mb | 38,540 | 6,156 | 3,196 | 3,322 | |
| SGI PwrChal | R8K-75Mhz | 192Mb | 38,523 | 6,084 | 3,068 | 3,381 | 5,288 |
| SGI indy | R5K-180 | 192Mb | 102,911 | 5,904 | 2,967 | 3,960 | |
| SGI O2 | R5K-180SC | 128Mb | 40,346 | 4,968 | 2,445 | 3,180 | 10,899 |
| Linux g77 | Pii-266 | 384Mb | | 4,428 | 2,304 | 2,266 | |
| SGI O2 | R5K-300 | 128Mb | | 3,672 | 1,845 | | |
| HP 9000 | S Class | 4Gb | | 3,372 | 1,689 | 2,460 | 2,743* |
| Linux Pgroup | Pii-266 | 384Mb | 53,040 | 3,171 | 1,582 | | 5,900 |
| Linux Pgroup | Cel 300 | 256Mb | | 2,944 | 1,505 | | 6,433 |
| SGI O200 | R10K-180 | 256Mb | 23,455 | 2,555 | 1,275 | 1,744 | 3,238 |
| SGI O2000 | R10K-195 | | | | | | 3,165 |
| SGI Octane | R10K-225 | 512Mb | | 2,257 | 1,084 | | 2,995 |
| Linux Pgroup | Cel 400 (66x6) | 128Mb | | 2,112 | 1,024 | 953 | 5,736 |
| Linux Pgroup | Cel 450 (100x4.5; VIA PLE133) | 128Mb | | 1,997 | 1,020 | | 5,189 |
| Linux Pgroup | Pii-450 | 512Mb | 30,438 | 1,971 | 1,024 | | 3,808 |
| Linux Pgroup | Cel 450 (100x4.5; Intel ZX) | 192Mb | 29,856 | 1,860 | 1,020 | 806 | 4,377 |
| Linux Compaq | AS 1200 21164/533Mhz | 512Mb | | 1,786 | 843 | | |
| Linux Pgroup | Piii-500 CuMine 100Mhz FSB (BX) | 128Mb | | 1,716 | 886 | | 3,268 |
| Linux Pgroup | Piii-500 | 512Mb | | 1,535 | 830 | 753 | 3,272 |
| Linux Absoft | Piii-500 | 512Mb | | 1,470 | 768 | | |
| Linux Pgroup | Dual Cel 300A | 192Mb | | 1,470 | 765 | | 5,137 |
| Linux Pgroup | Cel-633 CuMine (i815e) | 128Mb | | 1,344 | 636 | | 5,206 |
| Linux Pgroup | Piii-667 CuMine 133Mhz FSB (BX) | 128Mb | | 1,300 | 666 | | 2,608 |
| Linux Pgroup | Piii-773 CuMine 133Mhz FSB (via) | 256Mb | | 1,243 | 632 | | 3,130 |
| Linux Pgroup | AMD K7-600 | 256Mb | | 1,216 | 642 | | 2,546 |
| Linux Pgroup | AMD Duron-650 | 256Mb | | 1,150 | 546 | | 3,274 |
| Linux Pgroup | Cel-800 CuM (SiS) | 256Mb | | 1,145 | 624 | | 4,435 |
| Linux Pgroup | Piii-850 CuMine 100Mhz FSB BX | 512Mb | | 1,111 | 546 | | 3,232 |
| Linux Compaq | DS 20 21264/500Mhz | 2Gb | | 1,086 | 585 | | |
| Linux Pgroup | AMD-K7-700 | 256Mb | | 1,020 | 516 | | 2,166 |
| Linux Pgroup | Dual Piii-500 | 512Mb | | 972 | 462 | | 1,755 |
| Linux Pgroup | AMD-K7-800 | 256Mb | | 930 | 461 | | 2,114 |
| Linux Pgroup | AMD K7-800(T) | 256Mb | | 888 | 455 | 435 | 2,104 |
| Linux Pgroup | Piii-1GHz (SiS) | 512Mb | | 880 | 449 | | 2,566 |
| Linux Pgroup | Piii-1Ghz (via) | 1Gb | | 930 | 461 | | 2,114 |
| Linux Pgroup | AMD K7-1.2G(T) | 1.5Gb | | 586 | 297 | | 1,568 |
| Linux Pgroup | 2x P850 (100Mhz) | 1Gb | | 568 | 293 | | |
| SGI Octane | Dual R10K-225 | 512Mb | | | | | 1,557 |
| Linux Pgroup | P4 1.6Ghz Intel 845D | 512Mb DDR | | 577 | 282 | | |
| Linux Pgroup | AMD 1.3Ghz Duron (nvidia) | 256Mb | | 528 | 272 | | 1,666 |
| Linux Pgroup | AMD 1500+ (via) | 1Gb | | 504 | 259 | | 1,438 |
| Linux Pgroup | 2x Piii-1Ghz (via) | 1Gb | | 490 | 259 | | |
| Linux Pgroup | 2x AMD K7-800 | 256Mb | | 490 | 259 | | |
| Linux Pgroup | AMD 1800+ (Ali) | 1GB SDR | | 446 | 225 | | 768 (g03) |
| Linux Pgroup | P4 2.0 Ghz Intel 845D | 512Mb DDR | | 439 | 225 | | 1,095 776(a11) |
| Linux Pgroup | AMD 1800+ (Via) | 1.5GB DDR | | 437 | 222 | | 1,196 1,111(a11) |
| Linux Pgroup | AMD 2400+ Semptron (Via) | 768Mb DDR | | 430 | 219 | | 626 (g03) |
| Linux Pgroup | AMD 2000+ (Amd) | 1.0GB DDR | | 405 | 206 | | 625 (g03) |
| OS X Intel | Intel Core Duo 1.83 | 512mb | | 396 | 204 | | |
| Linux Pgroup | P4 2.8Ghz HT (800FSB) | 2.0GB DDR400 | | 330 | 173 | | 404 (g03) |
| Linux Pgroup | 2x AMD 2000+ (Amd) | 2.0GB DDR266 | | 222 | 118 | | 398 (g03) |
| Linux Pgroup (64bit) | Celeron D 2.8G (J) | 2.0 GB DDR667 | | 346 | 181 | | 385 (g03) |
| Linux Pgroup | P4 3.0Ghz HT (800FSB) | 1.0GB DDR400 | | 315 | 166 | | 374 (g03) |
| Linux Pgroup | 2x AMD 2400+ (Amd) | 2.0GB DDR266 | | 192 | 100 | | 363 (g03) |
| Linux Pgroup | 2x Xeon 2.4 | 1.0GB DDR400 | | 231 | 115 | | 297 (g03) |
| Linux Pgroup (64bit) | Pentium dual E2160 | 2.0Gb DDR667 | | 175 | 91 | | 173 (g03) |
| Linux Pgroup (64bit) | Pentium 915D | 2.0GB DDR266 | | 190 | 104 | | 153 (g03) |
| Linux Pgroup (64bit) | Core2 Duo E4400 | 2.0Gb DDR667 | | 181 | 95 | | 153 (g03) |
| Linux Pgroup (64bit) | Pentium 915D | 1.0GB DDR400 | | 189 | 101 | | 152 (g03) |
| Linux Pgroup (64bit) | Pentium 915D | 2.0GB DDR667 | | 187 | 100 | | 150 (g03) |
| Linux Pgroup (64bit) | Pentium 930D | 4.0GB DDR667 | | 176 | 93 | | 137 (g03) |
| Linux Pgroup (64bit) | 2x Opteron 246 | 4.0GB DDR400 | | 167 | 87 | | 136 (g03) |
| Linux Pgroup (64bit) | AthlonX2 4400+ | 4.0GB DDR400 | | 150 | 79 | | 122 (g03) |
| Linux Pgroup (64bit) | Core2 Duo 6550/6600 | 8Gb DDR667 | | 148 | 81 | | 123 (g03) |
| Linux Pgroup (64bit) | Xeon Quad E5405 | 16Gb DDR667 | | 69 | 39 | | 75 (g03) |
| Linux Pgroup (64bit) | Dual Xeon Quad E5405 | 8Gb DDR667 | | 43 | 31 | | 65 (g03) |
The g98 results marked with '*' are not comparable to those without: the '*' runs were completed with version a5, the others with a7. I am attempting to rerun these as soon as I can.
Things that stand out: with a performance compiler, Linux on AMD or P4 chips gives stunning results. The SGI O200 was once good but alas has had its day, and the IBMs were not very good at all. I also tried the Absoft compiler under Linux, but Gaussian only supports the Portland one, which worked so well that we bought it and are very happy with it indeed. For a mid-range cluster, Celeron chips on a cheap fully integrated mainboard (such as an Intel or SiS one) seem to be a really decent purchase.
Gaussian is g03d2, with the same binary used for all machines listed with the superscript a. The job is SVWN 6-31+G*, 31 atoms, 381 basis functions.
Dynamics is parallel Charmm34: a 680 residue dimer in a truncated octahedron with images, 60,000 atoms in total, 10000 steps with 18 Angstrom cutoffs. The same binary is used for the runs with superscript a.
| CPU | Memory | Dynamics /s | Gaussian03 /s |
|---|---|---|---|
| Intel Core 2 E2160 dual core<sup>a</sup> | 2Gb DDR667 | 21,796 | 16,269 |
| Intel Pentium D 915D dual core<sup>a</sup> | 2Gb DDR667 | 23,121 | 9,074 |
| Amd 2x Opteron 246<sup>a</sup> | 8Gb DDR400 | 19,233 | 13,181 |
| Intel Core2 Duo E4400<sup>b</sup> | 2Gb DDR667 | 19,858 | 11,895 |
| Intel Pentium Dual Core E2220<sup>a</sup> | 4Gb DDR 667 | 16,803 | 11,227 |
| Intel Core 2 Duo E6600<sup>a</sup> | 8Gb DDR667 | 15,144 | 6,655 |
| Amd Phenom X4 9550<sup>a</sup> | 4Gb DDR667 | 10,860 | 5,597 |
| Amd Phenom X4 9550<sup>b</sup> | 4Gb DDR667 | 8,681 | 4,108 |
| Amd Phenom X4 9650<sup>b</sup> | 4Gb DDR800 | 8,406 | 3,890 |
| Amd Phenom X4 9950<sup>b</sup> | 4Gb DDR667 | 7,490 | 3,671 |
| Intel Core 2 Quad Q6600<sup>a</sup> | 8Gb DDR667 | 8,199 | 3,466 |
| Intel Core 2 Xeon Quad E5405<sup>a</sup> | 8Gb DDR667 | 8,053 | 3,557 |
| AMD PhenomII X4 955<sup>b</sup> | 4Gb DDR667 | 5,220 | 3,334 |
| Intel Core 2 Quad Q9550<sup>b</sup> | 8Gb DDR667 | 7,861 | 2,900 |
| Intel Core 2 Dual Xeon Quad E5405<sup>b</sup> | 8Gb DDR800 | 4,847 | 2,268 |
| Intel Core 2 Xeon Quad X3440<sup>b</sup> | 4Gb DDR1333 | 5,195 | 2,137 |
| Intel Core i7 Quad 2600<sup>b</sup> | 8Gb DDR1333 | 3,407 | 1,574 |
| Intel Core 2 Dual Xeon 6-Core E5645<sup>b</sup> | 32Gb DDR1333 | 2,986 (8 cores) | 1,511 |
| Intel Core 2 Dual Xeon Quad X5450<sup>b</sup> | 16Gb DDR800 | 3,336 | 1,418 |
| Intel Core 2 Dual Xeon Quad X5550<sup>b</sup> | 16Gb DDR1333 | 2,998 | 1,113 |
| Intel Core 2 Dual Xeon 6-Core X5650<sup>b</sup> | 32Gb DDR1333 | 2,787 (8 cores) | 959 |
| Intel Dual Xeon 6-Core E5-2620<sup>b</sup> | 32Gb DDR1333 | 2,580 (8 cores) | 568 |

<sup>a</sup> These results use the pgroup compilers with target -tp x86.
<sup>b</sup> These results use the pgroup compilers with target -tp k8-64,p7-64,core2-64,barcelona-64, compiled for 512Mb cache system.
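To put two rows of the timing table side by side, the ratio of their run times can be worked out with a few lines of shell. This is only a sketch: the function name `speedup` is made up for illustration, and it assumes the table values are wall-clock seconds (so lower is better). Rows marked a and b use different binaries, so ratios across them mix hardware and compiler-target effects.

```shell
#!/bin/sh
# Sketch: ratio of two run times taken from the timing table.
# "speedup" is a hypothetical helper name, not part of the original pages.

# Usage: speedup SLOW_TIME FAST_TIME   (times may contain thousands commas)
speedup() {
    # Strip the thousands separators before doing arithmetic.
    slow=$(printf '%s' "$1" | tr -d ',')
    fast=$(printf '%s' "$2" | tr -d ',')
    # awk does the floating point division; one decimal place is plenty here.
    awk -v s="$slow" -v f="$fast" 'BEGIN { printf "%.1f\n", s / f }'
}

# Dynamics: Core 2 E2160 versus dual Xeon E5-2620
speedup 21,796 2,580
# Gaussian03: same two machines
speedup 16,269 568
```

With the figures from the table this prints 8.4 for the dynamics run and 28.6 for the Gaussian job.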
There are benchmarks for Gromacs listed at this site.
This site is a huge Beowulf dedicated to computational biology; it produces great speedups for simulations with Charmm.
procheck, protein structure verification.
bbdep, calculation of backbone and side chain angles.
solvate, creates solvent shells.
molmol, molecular graphics.
rasmol, molecular graphics.
pymol, molecular graphics.
vim, vi replacement editor.
molden, molecular graphics.
gnuplot, graphing package.
perl, a scripting language.
python, another scripting language.
mpich, message passing libraries.
Samba, exports unix filesystems for mounting under Windows.
Cups, Common UNIX Printing System.
portland compilers, high performance compiler suite (HPF, F90, F77, C++, C) for Linux.
Intel compilers, high performance compilers for Linux (F95, F77, C++, C).
firefox, web browser.
NIS was doing a great job for us with the single site in Taipei, and was also managing the accounts, hosts and automount maps both here and at the university over the VPN. Even so, this year's (2003) Chinese New Year project was to get LDAP up and running. After about five days of total frustration it all started to come together in one evening: five days of no progress, then suddenly it started to work, and within another 8 hours I had all the users added (for both Unix and Samba accounts), the hosts up and the automount maps in place. I had also managed to configure a slave LDAP server at the students' site, which keeps them synchronised with us. As a bonus we only use LDAP via the TLS encryption mechanism for additional security.
Overall I'm impressed. The package I used was from openldap. The documentation was pretty poor for a beginner, and I found more help from reading the Nov 2002 Linux Journal and then searching the deja and google archives for examples from other people. It also involved recompiling samba (for RedHat 7.3/8.0) from the .src.rpm files to enable the LDAP features. LDAP is very frustrating and confusing at first, but once you manage to do one thing with it (for me, adding my first user) it becomes much easier to do anything else, as you can just keep modifying the scripts. Samba comes with some excellent scripts for converting from smbpasswd to LDAP, and openldap itself comes with excellent scripts for populating the directories (user accounts etc.); it's just the documentation that is weak. Perhaps I should document what I did sometime rather than complain! Anyway, there is no going back for us now: LDAP is far superior to NIS for our use.
OpenLDAP servers can also provide data to the Apple Desktop machines
with very little needed in the way of configuration. All in all it
works and it works well.
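For anyone stuck at the "adding my first user" stage, a Unix account in LDAP is just a short LDIF entry. The sketch below is not the script we actually used: the base DN `dc=example,dc=org`, the `ou=People` layout, the function name `user_ldif` and every value are placeholders, and it only covers the Unix (posixAccount) side, not the extra Samba attributes.

```shell
#!/bin/sh
# Sketch: print an LDIF entry for one Unix user.
# BASE_DN, the ou=People layout and all values are placeholder assumptions.
BASE_DN=${BASE_DN:-dc=example,dc=org}

# Usage: user_ldif LOGIN FULLNAME UIDNUMBER GIDNUMBER
user_ldif() {
    # posixAccount needs cn, uid, uidNumber, gidNumber and homeDirectory.
    cat <<EOF
dn: uid=$1,ou=People,$BASE_DN
objectClass: account
objectClass: posixAccount
uid: $1
cn: $2
uidNumber: $3
gidNumber: $4
homeDirectory: /home/$1
loginShell: /bin/bash
EOF
}

# Example entry for a made-up user
user_ldif jsmith "J Smith" 1001 100
```

The output can be piped into `ldapadd -x -ZZ -D "cn=Manager,$BASE_DN" -W`, where `-ZZ` insists on a successful StartTLS before binding; the bind DN shown is again only an assumption.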
After far too many attacks from crackers, which resulted in three machines in our office being cracked (in 1999), and after hearing about cracked SGI O2000s in another building, we decided to go with a firewall. Instead of our previous 30 IP addresses we moved to just 3. One IP is the firewall ethernet card; the other two handle the communications for the user-mode-linux kernel that acts as the internet server for things such as smtp, imaps and opensshd. We were surprised to find that our entire internet connection could be reduced to a single 100Mbit ethernet connection. The firewall is a standard PC running Linux with 2 ethernet cards (high-quality 3Coms) and the Linux 2.4.21 kernel with iptables, and it just works. We converted to the private C-class domain in less than one morning and fixed the only problem that came up that afternoon. It is now much more secure and in fact easier to do things; for instance, we can now run a dhcpd server without worrying about other people. The firewall hardware is a Gigabyte board with a Piii-850 cpu and a 3ware 2-Channel 3W-7006-2 raid-1 card. It worked so well that we installed another one at our students' site and created a VPN linking the two domains using ipsec from openswan.
One thing we did find was that running NAT and ipsec together meant we had to filter the ipsec packets leaving both firewalls so that they were not NAT'd. The freeswan docs had a suggestion for that which did not work. After some digging I realised that I could mark the packets entering each firewall based on their destination and then use that mark to tell the netfilter POSTROUTING rules not to SNAT them. The lines below do that:
iptables -t mangle -I PREROUTING -s 192.168.100.0/24 -d 192.168.101.0/24 -i eth1 -j MARK --set-mark 0x1
iptables -t nat -I POSTROUTING -m mark --mark 0x1 -j ACCEPT
Since we liked the raid-1 cards from 3Ware so much we bought another two of them and fitted one to each machine to help us with reliability. With disks being so cheap now, going for raid makes a lot of sense for us. It's a shame that the 3Ware cards cost so much. We have investigated other cards, but the cheap ones all seem to use software raid, not hardware. With the excess of CPU available nowadays this can be seen as not a problem, but we preferred to use hardware on these machines. From talking with other people it seems that the `Software' raid cards are no faster than using the Linux MD driver for raid-1, so for the workstations we will use that instead of pci cards.
iptables -t mangle -A FORWARD -s 192.168.0.224/255.255.255.240 -j MARK --set-mark 0x3
iptables -t mangle -A FORWARD -d 192.168.0.224/255.255.255.240 -j MARK --set-mark 0x3
iptables -t mangle -A FORWARD -d 192.168.0.224/255.255.255.240 -p tcp -m tcp --sport 80 -j MARK --set-mark 0x2
iptables -t mangle -A FORWARD -s 192.168.0.224/255.255.255.240 -p tcp -m tcp --dport 22 -j MARK --set-mark 0x2
So all packets from the dynamic IP addresses are marked with 0x3; then, if they are from 'good' ports or going to 'good' ports, they are re-marked as 0x2. This allows us to put different restrictions on good use and not-so-good use. The Linux command tc then sets the actual throttle:
TC=/sbin/tc
$TC qdisc add dev eth0 root handle 11: htb
$TC class add dev eth0 parent 11:0 classid 11:1 htb rate 5Mbit
$TC class add dev eth0 parent 11:0 classid 11:2 htb rate 12Kbit
$TC filter add dev eth0 parent 11:0 protocol ip handle 2 fw flowid 11:1
$TC filter add dev eth0 parent 11:0 protocol ip handle 3 fw flowid 11:2
$TC qdisc add dev eth1 root handle 10: htb
$TC class add dev eth1 parent 10:0 classid 10:1 htb rate 5Mbit
$TC class add dev eth1 parent 10:0 classid 10:2 htb rate 12Kbit
$TC filter add dev eth1 parent 10:0 protocol ip handle 2 fw flowid 10:1
$TC filter add dev eth1 parent 10:0 protocol ip handle 3 fw flowid 10:2
These tc commands act on both the internal and external network cards (hence the eth0 and eth1 parts). The dynamic IP machines only get 5Mbit between them for 'good' internet use, and the 'not so good' use is very harshly treated with a total bandwidth allowance of 12Kbit during the working day. Most people use the qsc disc for TC but recently a much simpler
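The eth0 and eth1 blocks differ only in the device name and the handle number, so they could be generated from one small function. The following is a sketch rather than what runs on our firewall: `emit_throttle` and its arguments are made-up names, and it only prints the commands so they can be checked before being run as root.

```shell
#!/bin/sh
# Sketch only: prints the tc commands instead of running them.
# emit_throttle and its arguments are hypothetical names for illustration.
TC=/sbin/tc

# Usage: emit_throttle DEVICE HANDLE GOOD_RATE BAD_RATE
emit_throttle() {
    dev=$1; handle=$2; good=$3; bad=$4
    # Root htb qdisc with two classes: :1 for 'good' traffic, :2 for the rest.
    echo "$TC qdisc add dev $dev root handle $handle: htb"
    echo "$TC class add dev $dev parent $handle:0 classid $handle:1 htb rate $good"
    echo "$TC class add dev $dev parent $handle:0 classid $handle:2 htb rate $bad"
    # fw filters classify on the netfilter mark set by iptables (0x2 and 0x3).
    echo "$TC filter add dev $dev parent $handle:0 protocol ip handle 2 fw flowid $handle:1"
    echo "$TC filter add dev $dev parent $handle:0 protocol ip handle 3 fw flowid $handle:2"
}

emit_throttle eth0 11 5Mbit 12Kbit
emit_throttle eth1 10 5Mbit 12Kbit
```

Once the printed lines look right, piping the output into `sh` as root applies them, and changing a rate then means editing one argument instead of two blocks.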
Since the arrival of Apple notebooks we decided to provide a wireless access point; it uses WPA and also filters on the MAC address. The device is a D-Link DWL-7100AP access point and links right into the ethernet network. When we used WEP we firewalled it off and went via IPSEC to a Linux machine, but WPA is supposed to be enough to keep us secure and it is also easy for the users to manage.
We use OpenVPN: the Macs connect to the internal network over a VPN using the Tunnelblick software. This basically sets up a host-to-network transport, with authentication by SSL certificates, which were very easy to set up using the openvpn software. Of course the iptables rules needed a few tweaks to allow the tun connections and we had to push an internal network, but by and large this is working well at the moment. It even works if the notebook is behind a firewall on a NAT network, which is really useful.
So the lab-to-lab VPN runs over OpenSwan and is up and running all the time, but the roadwarriors connect via OpenVPN. Using both together on a single firewall is actually less problematic than trying to do both things with a single package. We can make changes to the roadwarriors without touching the lab-to-lab VPN and vice versa. With things being routed over different interfaces we can also limit the connections if needed.
A quick howto:
Quite a few people have asked me about our raid setups; this is essentially quite easy to explain.
The main and backup fileservers have dedicated raid-6 devices connected via U320 SCSI. These devices are stand-alone and have 8 hot-swap SATA disk bays and 750Gb disks.
The cluster, firewall and admin machines each have a 2 channel 3Ware card and 2 disks in a hot-swap cabinet to ensure that a disk failure will not take them off-line. The mail ssh gateway is now running on a QNAP TS-210, so it has internal software raid-1.
The file servers use Linux software raid and have 2 disks set up as raid-1, although we may switch this to 4x 1.5Tb software raid-5 in the future so that the internal disks can be used to back up an external system if need be.
Monitoring all these raids is quite easy. 3ware provide a tool (3dm) that monitors its raids and can be connected to via a web browser. We set this up on all the 3Ware raid machines on a high port. Each night one machine connects to all the 3Ware raid machines and downloads the first page; it then parses the downloaded file looking for a single report line, which is packaged up with all the other single report lines.
For the Linux machines things are similar. Every hour a perl script runs on each software raid machine, checks the /proc/mdstat file and counts the number of "[UU]" lines it finds. If it finds four, the script writes a small html file with the hostname, the date and an "OK"; if there are not four "[UU]" lines it says "Degraded" instead of "OK". Each software raid machine runs a tiny web server which serves only this file, on the same port number as the 3Ware machines. The machine that checks the 3Ware raids each night also downloads and parses these files, bundles all the results together and emails them to me. This way each morning I am greeted with an email saying whether any disks have failed in any of the raids.
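The hourly check is simple enough to sketch in a few lines. The real script is Perl; the version below is a shell rendering of the same logic under stated assumptions: the function names `raid_status` and `write_status_page`, the default count of four, and the exact html layout are all illustrative rather than taken from the production script.

```shell
#!/bin/sh
# Sketch of the hourly software raid check described above.
# The real script is Perl; the function names and html layout are assumptions.

# Usage: raid_status [MDSTAT_FILE] [EXPECTED_COUNT]
# Prints "OK" if the file has exactly the expected number of "[UU]" lines,
# otherwise "Degraded" (a missing file also counts as degraded).
raid_status() {
    mdstat_file=${1:-/proc/mdstat}
    expected=${2:-4}
    # grep -c counts matching lines; -F treats "[UU]" as a literal string.
    count=$(grep -cF '[UU]' "$mdstat_file" 2>/dev/null)
    if [ "${count:-0}" -eq "$expected" ]; then
        echo "OK"
    else
        echo "Degraded"
    fi
}

# Usage: write_status_page OUT_FILE [MDSTAT_FILE] [EXPECTED_COUNT]
# Writes the one-line html report that the tiny web server hands out.
write_status_page() {
    out_file=$1
    status=$(raid_status "$2" "$3")
    printf '<html><body>%s %s %s</body></html>\n' \
        "$(uname -n)" "$(date)" "$status" > "$out_file"
}
```

Run from cron every hour as something like `write_status_page /path/to/status.html`, this gives the nightly collector one short line per machine to pick up; the path here is a placeholder.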
In the main office at IBMS the two file servers are Linux servers with Xeon 3440 cpus. These use fully integrated mainboards from Supermicro based on Intel chipsets, with 4Gb of ram. Each machine has 2 gigabit network sockets and a third one for a dedicated ipmi controller card. To help with the stability of the machines each has a pair of 250Gb disks set up as a raid-1 system using Linux software raid. Each machine has an LSI U320 SCSI card linking it to its storage of 2x Janus 4.2Tb raid-6 devices from SA-3340S. Each machine handles this fine. For backups we have another pair of servers in a different place, but with different disks: the main set has WD disks and the backup set uses Seagates. To handle the backups from the students' site another machine has a Linux software raid-5 system based on 5x 750Gb disks.
To cover the case where one of the raids actually breaks and needs to go off for repair we also have a pair of QNAP TS-410s with 4x 1.5Tb disks. These can be attached to the main servers via iSCSI and take the place of one raid each if need be. When we upgraded the raids from 500Gb to 750Gb disks we used the QNAPs for a week as the file store for the main servers and they functioned fine, if a little on the slow side.
At the students' site we have gone with a smaller solution: a Proware 5 disk hardware raid box with 5x 500Gb disks in it. This connects to a single SATA port on the mainboard and appears as a single huge disk. All the disks are mounted in a 5 tray hot-swap disk box which takes up 3 cd-rom sized bays in the server. The capacity is 2Tb. The server itself is another Intel Core2 E6500 with a Supermicro mainboard, 4Gb of ram and another Intel Gigabit card. Their raid-5 is mirrored each week to one we have in the basement at IBMS.
Recently I picked up a new MacBook Pro 13", 2.4 core duo. It's a nice machine. The rest of the group has metal MacBooks while my boss has a MacBook Air. Everyone is very happy with the notebooks. I think in future we will probably go from the metal MacBooks/Pros to plastic MacBooks because they offer better value. For a Linux lab, OS X is ideal for our notebooks, and now that they are Intel-based and can use the Intel C/Fortran compiler it is even more useful for us.
The compute nodes are organised as a Beowulf cluster consisting of Quad Xeons, Core2s and Phenoms. In total we have ~100 compute CPUs at the moment, all running Scientific Linux 5.4. More details can be found here.
We have a small collection of quite nice printers available, all of which support duplex printing. In the main office we have a fast HP lj2430DTN with twin trays and a colour hp lj3800DN, and a lj2300DN is in another office. The students are doing okay as well, as their office has a fast networked hp lj2200DN and a colour lj2605DN. In case anyone is really interested, we use CUPS to manage all our printing. We even use the cups driver under WinXP to send postscript jobs to the cups server and from there to the printers. The integration between CUPS and SAMBA is really good nowadays. The OS X machines use CUPS as well and so are integrated into the printing network straight from being switched on.
Last update: Wed Jan 19 15:43:25 CST 2011
Comments to: jon _at_ sinica.edu.tw
These pages were created using vim
-a very much vi-improved.