====== Kerrighed Trunk 4977 Installation on Debian GNU/Linux ======
Basic research on cluster computing with Kerrighed involves five steps: installing the system and applications, building, testing, profiling the cluster, and comparing the results.
===== Requirements =====
==== Hardware ====
* At least two x86 machines connected to a network
* Storage media with enough space to be shared across the cluster
==== Software ====
* Debian GNU/Linux 6.0 (squeeze)
* Kerrighed from Subversion trunk revision 4977. The Linux 2.6.20 kernel and the Kerrighed sources are included.
* Access to Debian packages repository, local or network/Internet
* Root access or sudo on the system for package installation and system modification
* Additional and development packages: build-essential, bzip2, rsync, xmlto, automake, libtool, pkg-config, lsb-release, libncurses5-dev((if you want to configure the kernel with make menuconfig)), kernel-package((if you want to package the kernel as a Debian package))
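Before building, it can help to confirm that the build tools are actually on the PATH. The sketch below is only an illustration: the function name ''missing_tools'' is hypothetical, and the mapping from the packages listed above to command names (for example lsb-release to ''lsb_release'') is an assumption.

```shell
#!/bin/sh
# Print each command from the argument list that is not found in PATH.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call (command names are assumed, not taken from the package list):
# missing_tools gcc make bzip2 rsync xmlto automake libtool pkg-config lsb_release
```

Any name it prints points to a package that probably still needs to be installed with apt-get.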
===== Compilation and Installation Steps =====
- Verify the tarball checksum: sha256sum -c krg-20090203.tar.bz2.sha256
- Change directory to /usr/src/: cd /usr/src/
- tar jxf /home/stwn/krg-20090203.tar.bz2
- cd krg-20090203
- ./autogen.sh
- Configure the source for your system. Avoid --enable-tests because the tests are broken; the output below is from a run that enabled them:
./configure --enable-tests
********************************************************************
Kerrighed tests configuration is now complete
********************************************************************
- apps : yes
- proc : yes
- ktp : yes
- benchmark : yes
********************************************************************
Kerrighed configuration is now complete
********************************************************************
- Sources dir : /usr/src/krg-20090203
- Kernel source dir : /usr/src/krg-20090203/kernel
- Kernel version : 2.6.20-krg (already patched)
- Kernel configuration : none. Run manually 'make *config' in kernel source tree
- Kerrighed module : yes
- libkerrighed : yes
- Kerrighed tools : yes
- Kerrighed tests : yes
********************************************************************
- Save the default configuration for Kerrighed: make defconfig
- Configure the kernel if you need to add network drivers used in the cluster:
# cd kernel
# make menuconfig
I enabled many of the network cards available in 2.6.20 as modules [M] and built the drivers for the cards used in this cluster, such as sis900 and e1000e, into the kernel [*]. The options are under Device Drivers - Network device support - Ethernet (10 or 100Mbit) and Ethernet (1000 Mbit). FIXME
- Compile the kernel:
cd ..
make kernel
or, to build a Debian package from the kernel directory:
sudo vim /etc/kernel-pkg.conf
make-kpkg --initrd --revision=2.6.20-svn-022009-1 kernel_image kernel_headers
- Install the kernel:
make kernel-install
or, if you built a Debian package:
cd ..
sudo dpkg -i linux-image-2.6.20-krg_2.6.20-svn-022009-1_i386.deb
- Compile the Kerrighed modules, tools, and libkerrighed: make
NOTE: the Kerrighed modules are compiled for version 2.6.20-krg. If you append a version string with make-kpkg, change the version in kerrighed-2.2.1 to match.
- Install the Kerrighed tools and libkerrighed: make install
- Remove the default Debian kernel, if you want to :) dpkg -P linux-image-2.6.26-1-686 linux-image-2.6-686
- Reboot
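After the reboot it is worth checking that the machine really came up on the Kerrighed kernel. A minimal sketch, assuming the kernel release ends in ''-krg'' as in the configure output above (2.6.20-krg); the function name ''is_krg_kernel'' is hypothetical.

```shell
#!/bin/sh
# Succeed when a kernel release string ends in "-krg".
is_krg_kernel() {
    case "$1" in
        *-krg) return 0 ;;
        *)     return 1 ;;
    esac
}

# Report on the kernel that is currently running.
if is_krg_kernel "$(uname -r)"; then
    echo "running the Kerrighed kernel"
else
    echo "not running the Kerrighed kernel: $(uname -r)"
fi
```

If you appended a different version string with make-kpkg, adjust the pattern accordingly.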
===== Configuration =====
One computer stores the system and application images, and also hosts the storage media and other items such as the repository, wiki, demos, and results.
==== Head Node ====
=== Kerrighed ===
* Configure the network card, if you have not configured it yet:
vim /etc/network/interfaces
auto lo eth1
iface lo inet loopback
iface eth1 inet static
address 192.168.0.11
netmask 255.255.255.0
* vim /etc/default/kerrighed
ENABLE=true
# If kerrighed has this feature
LEGACY_SCHED=true
* vim /etc/fstab
configfs /config configfs defaults 0 0
* mkdir /config
* vim /etc/kerrighed_nodes
session=1
nbmin=2
192.168.0.11:11:eth1
192.168.0.12:12:eth0
192.168.0.13:13:eth0
192.168.0.14:14:eth0
192.168.0.15:15:eth0
* Set the boot parameters in /boot/grub/menu.lst by adding "session_id=1 node_id=1" to the kernel line:
title Debian GNU/Linux, kernel 2.6.20-krg
root (hd0,3)
kernel /boot/vmlinuz-2.6.20-krg root=/dev/sda4 ro session_id=1 node_id=1
initrd /boot/initrd.img-2.6.20-krg
* Set up /etc/hosts. NFS and MPI resolve host names to IP addresses, so make sure the entries are correct.
* Reboot
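The /etc/kerrighed_nodes file shown above follows a regular pattern, so it can be generated instead of typed. This is a sketch only: ''gen_krg_nodes'' is a hypothetical helper, and it assumes the node id is always the last octet of the IP address, as in the entries above.

```shell
#!/bin/sh
# Print a kerrighed_nodes file in the "ip:node_id:interface" format.
# Usage: gen_krg_nodes SESSION NBMIN IP:IFACE...
# Assumption: the node id equals the last octet of the IP address.
gen_krg_nodes() {
    session=$1
    nbmin=$2
    shift 2
    printf 'session=%s\nnbmin=%s\n' "$session" "$nbmin"
    for entry in "$@"; do
        ip=${entry%%:*}      # text before the first colon
        iface=${entry#*:}    # text after the first colon
        printf '%s:%s:%s\n' "$ip" "${ip##*.}" "$iface"
    done
}
```

For example, ''gen_krg_nodes 1 2 192.168.0.11:eth1 192.168.0.12:eth0'' prints the header lines followed by one entry per node; redirect the output to a file and review it before copying it to /etc/kerrighed_nodes.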
=== TFTP & PXE Server ===
* Install atftpd and syslinux((dosfstools mtools syslinux syslinux-common)): apt-get install atftpd syslinux
* mkdir /srv/tftpboot
* cp /usr/lib/syslinux/pxelinux.0 /srv/tftpboot/
* cd /srv/tftpboot/
* mkdir pxelinux.cfg
* vim pxelinux.cfg/default
timeout 5
# prompt 1
default kerrighed
label kerrighed
kernel vmlinuz
append initrd=initrd root=/dev/nfs nfsroot=192.168.0.11:/NFSROOT/kerrighed,rsize=4096,wsize=4096 ip=dhcp ro session_id=1
label local
localboot 0
Remove the initrd= option if the node hangs with the message "Waiting for root filesystem". The problem is in the initramfs; I do not know whether it is a bug.
* ln -s /boot/vmlinuz-2.6.20-krg /srv/tftpboot/vmlinuz
* sudo ln -s /boot/initrd.img-2.6.20-krg /srv/tftpboot/initrd
* vim /etc/default/atftpd
USE_INETD=true
OPTIONS="--tftpd-timeout 300 --retry-timeout 5 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --verbose=5 /srv/tftpboot"
* vim /etc/inetd.conf
tftp dgram udp4 wait nobody /usr/sbin/tcpd /usr/sbin/in.tftpd --tftpd-timeout 300 --retry-timeout 5 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --verbose=5 /srv/tftpboot
* /etc/init.d/openbsd-inetd restart
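If the NFS server address or session id changes, the PXELINUX configuration has to change with it. The following sketch generates a file modeled on the pxelinux.cfg/default shown above; ''gen_pxe_default'' is a hypothetical helper, and the NFS mount options (rsize, wsize) are left out for brevity.

```shell
#!/bin/sh
# Print a pxelinux.cfg/default with a network-boot entry and a local-boot entry.
# Usage: gen_pxe_default NFS_SERVER NFS_PATH SESSION_ID
gen_pxe_default() {
    cat <<EOF
timeout 5
default kerrighed
label kerrighed
kernel vmlinuz
append initrd=initrd root=/dev/nfs nfsroot=$1:$2 ip=dhcp ro session_id=$3
label local
localboot 0
EOF
}
```

A possible use is ''gen_pxe_default 192.168.0.11 /NFSROOT/kerrighed 1 > /srv/tftpboot/pxelinux.cfg/default''.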
=== DHCP Server ===
* apt-get install dhcp3-server
* vim /etc/dhcp3/dhcpd.conf
ddns-update-style none;
option domain-name "lskk.ee.itb.ac.id";
default-lease-time 600;
max-lease-time 7200;
log-facility local7;
option dhcp-max-message-size 2048;
# use-host-decl-names on;
# deny unknown-clients;
deny bootp;
next-server 192.168.0.11;
subnet 192.168.0.0 netmask 255.255.255.0 {
range 192.168.0.12 192.168.0.15;
filename "/srv/tftpboot/pxelinux.0";
option root-path "192.168.0.11:/NFSROOT/kerrighed";
}
* /etc/init.d/dhcp3-server start
=== NFS Server & NFSROOT ===
* Install one NFS server: apt-get install unfs3, or apt-get install nfs-kernel-server (run /etc/init.d/unfs3 stop first if unfs3 is running)
* apt-get install debootstrap
* mkdir /NFSROOT
* debootstrap lenny /NFSROOT/kerrighed http://192.168.0.10/stable
* chroot /NFSROOT/kerrighed/
* passwd
* mount -t proc none /proc/
* apt-get install dhcp3-common nfs-common nfsbooted((dhcp-client libevent1 libgssglue1 libkeyutils1 libkrb53 libldap-2.4-2 libnfsidmap2 librpcsecgss3 nfs-common nfsbooted portmap ucf))
* vim /etc/fstab
/dev/hda none swap sw 0 0
none /proc proc defaults 0 0
none /var/run tmpfs defaults 0 0
none /var/lock tmpfs defaults 0 0
none /var/log tmpfs defaults 0 0
none /tmp tmpfs defaults 0 0
none /dev/pts tmpfs defaults 0 0
configfs /config configfs defaults 0 0
192.168.0.11:/media/storage /media/storage nfs rw,hard,nolock 0 0
192.168.0.11:/NFSROOT/home /home nfs rw,hard,nolock 0 0
* mkdir /config
* mkdir /media/storage
* sudo chown -R stwn /media/storage/ (or change to group render?)
* vim /etc/hosts
127.0.0.1 localhost
192.168.0.11 krg-01
192.168.0.12 krg-02
192.168.0.13 krg-03
192.168.0.14 krg-04
192.168.0.15 krg-05
* sudo cp -r /usr/src/* /NFSROOT/kerrighed/usr/src/
* chroot /NFSROOT/kerrighed/
* **apt-get install busybox initramfs-tools klibc-utils libklibc libvolume-id0 udev**
* **apt-get install automake make build-essential**
* **cd /usr/src/krg-20090203/**
* **make install**
* dpkg -i linux-image-2.6.20-krg_2.6.20-1_i386.deb
* adduser stwn
* sudo cp /etc/kerrighed_nodes /NFSROOT/kerrighed/etc/
session=1
nbmin=2
192.168.0.11:11:eth1
192.168.0.12:12:eth0
192.168.0.13:13:eth0
192.168.0.14:14:eth0
192.168.0.15:15:eth0
* vim /etc/exports
/NFSROOT/kerrighed 192.168.0.0/24(ro,async,no_root_squash,no_subtree_check)
/media/storage 192.168.0.0/24(rw,async,wdelay,no_root_squash,no_subtree_check)
/NFSROOT/home 192.168.0.0/24(rw,async,wdelay,no_root_squash,no_subtree_check)
* mkdir /media/storage
* mkdir /NFSROOT/home
* sudo /etc/init.d/nfs-kernel-server restart
NFS will try to resolve IP addresses to domain names if /etc/hosts has entries for them:
/krg-system/ krg*(rw,no_root_squash,no_subtree_check,sync,fsid=1)
Whatever is done on the network boot server system must also be done in the NFSROOT.
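Because the same /etc/hosts entries are needed on the head node and inside the NFSROOT, generating them once can avoid typos. A minimal sketch, assuming consecutive addresses and the ''krg-NN'' naming used above; ''gen_hosts'' is a hypothetical helper.

```shell
#!/bin/sh
# Print /etc/hosts entries for a run of consecutive addresses.
# Usage: gen_hosts NETWORK_PREFIX FIRST_OCTET LAST_OCTET
# Host names are numbered from krg-01 upward.
gen_hosts() {
    i=$2
    n=1
    while [ "$i" -le "$3" ]; do
        printf '%s.%d krg-%02d\n' "$1" "$i" "$n"
        i=$((i + 1))
        n=$((n + 1))
    done
}
```

''gen_hosts 192.168.0 11 15'' prints the five node entries; append the output to /etc/hosts on the head node and to /NFSROOT/kerrighed/etc/hosts.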
==== Compute Node ====
Boot with PXE or gPXE, whichever suits your network card. Prebuilt gPXE/Etherboot images can be downloaded from rom-o-matic.net.
Network card drivers: tg3, sis900, rtl8139
===== Testing =====
==== Kerrighed ====
groupadd nobody
fork-test
==== Cpuburn ====
apt-get install cpuburn
chroot /NFSROOT/kerrighed/
apt-get install cpuburn
burnMMX & # 3-4 times :-)
==== Blender ====
* apt-get install blender
* chroot /NFSROOT/kerrighed
* apt-get install blender
* render script
* sample scene model
* open the model file and set the output directory for the render results
Blender+OpenMP
copy install/ blender to $HOME
apt-get install libjpeg62
mencoder -mf on:w=640:h=480:fps=12 -ovc copy -o output.avi \*.jpg
==== MPI ====
* Set up SSH on the head node and in the system inside the NFSROOT:
ssh-keygen
cp .ssh/id_rsa.pub /NFSROOT/kerrighed/home/stwn/.ssh/authorized_keys
chroot /NFSROOT/kerrighed/
su stwn
ssh-keygen
cp /NFSROOT/kerrighed/home/stwn/.ssh/id_rsa.pub /home/stwn/.ssh/authorized_keys
* Make sure /etc/hosts lists every host with its IP address, because mpirun resolves host names when it runs
* Set the variable P4_RSHCOMMAND to ssh: P4_RSHCOMMAND=ssh
* Create a machine list file
krg-node0
krg-node1
* apt-get install mpich-bin libmpich1.0-dev
* mpicc mm-mpi.c -o mm-mpi
* Run the MPI program with mpirun: mpirun -np 4 ./Pi
* cp id_rsa.pub authorized_keys
* sudo cp /home/stwn/.ssh/authorized_keys /NFSROOT/home/stwn/.ssh/id_rsa.pub
* sudo cp /home/stwn/.ssh/id_rsa.pub /NFSROOT/kerrighed/home/stwn/.ssh/authorized_keys
* cp /home/stwn/.ssh/id_rsa.pub /home/stwn/.ssh/authorized_keys
* chroot /NFSROOT/kerrighed
* ssh-keygen
* mkdir /home/stwn/.ssh
* sudo apt-get install mpich-bin libmpich1.0-dev
* mkdir /media/storage/demo
* vim mm-mpi.c
* vim machinefile
* mpicc mm-mpi.c -o mm-mpi
* krgcapset -d +CAN_MIGRATE
* export P4_RSHCOMMAND=ssh
* mpirun -machinefile machinefile -np 16 ./mm-mpi
vim cpi.c
gcc cpi.c
vim Pi.c
mpicc Pi.c
./a.out
mpirun -np 24 ./a.out
mpirun -np 2 ./a.out
mpirun -np 4 ./a.out
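Since mpirun has to resolve every host in the machine list, a quick check of the machinefile against /etc/hosts can catch naming mismatches before a run. This is a sketch under assumptions: ''unresolved_hosts'' is a hypothetical helper, it only looks at a hosts file (not DNS), and it assumes machinefile lines may carry an optional '':N'' processor count after the host name.

```shell
#!/bin/sh
# Print every host in MACHINEFILE that has no entry in HOSTSFILE.
# Usage: unresolved_hosts MACHINEFILE [HOSTSFILE]
unresolved_hosts() {
    hostsfile=${2:-/etc/hosts}
    while read -r host rest; do
        # Skip blank lines and comments.
        case "$host" in ''|'#'*) continue ;; esac
        # Strip an optional ":N" processor count (assumed machinefile syntax).
        host=${host%%:*}
        # Look for the name in any column after the IP address.
        awk -v h="$host" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }
        ' "$hostsfile" || printf '%s\n' "$host"
    done < "$1"
}
```

Running ''unresolved_hosts machinefile'' before mpirun should print nothing; any name it prints needs an entry in /etc/hosts.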
==== OpenMP ====
OpenMP support in Kerrighed is unclear: people from NCHC said that Kerrighed has no OpenMP support (based on an email reply from Renaud Lottiaux).
* apt-get install gcc-4.2
* vim mm-openmp.c
* Compile it: gcc-4.2 -fopenmp mm-openmp.c -o mm-openmp
* export OMP_NUM_THREADS=10
* Run mm-openmp: time ./mm-openmp
==== Loop ====
This simple loop program will test the process migration feature of Kerrighed with kernel 2.6.20.
* First, login to your console
* Set the Kerrighed capability to CAN_MIGRATE: krgcapset -d +CAN_MIGRATE
* Create a simple program containing an infinite loop and compile it
* Copy the same program to the other nodes at the same absolute path
* Run several instances of the loop program on one node:
loop &
loop &
loop &
loop &
* A system message will appear: send_kerrighed_signal: 8 (events/0) -> 820741 (loop)
* To migrate a process manually, use the command: migrate [process-id] [node]
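The loop program itself is not shown in these notes. One possible version is sketched below as a shell helper that writes a small C source file; the helper name ''write_loop_source'' and the file name loop.c are assumptions, and any CPU-bound infinite loop should serve the same purpose.

```shell
#!/bin/sh
# Write a minimal busy-loop C program to the given path.
write_loop_source() {
    cat > "$1" <<'EOF'
/* loop.c: spin forever so the process stays runnable and can be migrated. */
int main(void)
{
    volatile unsigned long counter = 0;

    for (;;)
        counter++;

    return 0; /* not reached */
}
EOF
}
```

After ''write_loop_source loop.c'', compile it with ''gcc loop.c -o loop'' and place the binary at the same absolute path on every node.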
===== Kerrighed Commands List =====
==== Status ====
krgadm nodes status
==== Start-Stop ====
krgadm cluster start
krgadm cluster reboot/poweroff
krgadm nodes poweroff -n 13
==== Miscellaneous ====
top ps free 'cat /proc/*'
kerrighed_nodes
kerrighed_session
===== Problems =====
* I ran 4 Blender processes on a node with the capability set to CAN_MIGRATE, but none of the 4 processes migrated. They appeared to hang, and messages like this appeared:
Null mapping count, non null mapping address : [mem-addr]
Blender uses relatively large data and processes. Is this why the Blender processes could not be migrated to other nodes? Are they too strongly connected?
* A program called cpuburn, which does FPU calculations and checks the results, ran well but still produced the message: Null mapping count, non null mapping address : [mem-addr]
* The message comes from shm_memory_linker.c. Is it related to Kerrighed's containers?
* Programs that run on the head node could not be migrated to another node
* One of these messages appears on a node during network boot: "Gave up waiting for root device. Common problems: Boot args" or "Waiting for root filesystem"
* Watch out for package conflicts in versions and dependencies
===== Tips =====
* Use cat /proc/cmdline to see the Linux boot parameters
* Run mkinitramfs -k `uname -r` -o initrd-2.6.20-krg if you want to generate the initramfs manually
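To confirm that a node booted with the expected session_id and node_id, the values can be pulled out of the kernel command line. A minimal sketch; ''cmdline_param'' is a hypothetical helper, and it does not handle quoted values that contain spaces.

```shell
#!/bin/sh
# Print the value of NAME from a kernel command line.
# Usage: cmdline_param NAME [CMDLINE]; CMDLINE defaults to /proc/cmdline.
# Returns 1 when NAME is not present.
cmdline_param() {
    name=$1
    if [ $# -ge 2 ]; then
        line=$2
    else
        line=$(cat /proc/cmdline)
    fi
    # Split the command line on whitespace and look for NAME=value.
    for word in $line; do
        case "$word" in
            "$name"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}
```

On a running node, ''cmdline_param session_id'' and ''cmdline_param node_id'' print the values that were set in menu.lst or in the PXELINUX append line.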
===== Reading List =====
* [[http://www.kerrighed.org/wiki/index.php/SchedConfig|Configurable scheduler framework]]
* [[http://source.ggy.bris.ac.uk/wiki/Configure_ssh_for_MPI|Configure SSH for MPI]]
* [[http://www.etherboot.org/wiki/usermanual#testing_etherboot|Etherboot User Manual]]
* [[http://kerrighed.org/wiki/index.php/Installing_Kerrighed_2.3.0|Installing Kerrighed 2.3.0]]
* [[http://kerrighed.org/wiki/index.php/V2.1.0_User_Manual|Kerrighed User Manual]]
* [[http://www.mcs.anl.gov/research/projects/mpi/mpich1/docs/faq.htm|MPICH Frequently Asked Questions]]
* [[http://bioinformatics.rri.sari.ac.uk/drupal/?q=wiki/tutorial_kerrighed|Tutorial: Kerrighed]]
* [[http://spot.river-styx.com/viewarticle.php?id=12|Using Blender with openMosix]]
* man debootstrap