====== Kerrighed ======
===== Kerrighed Installation on Debian Etch =====
==== Requirements ====
* Debian GNU/Linux 4.0r3
* Additional and development packages: bzip2 (a local package repository is needed here) FIXME
* Linux kernel 2.6.20
* Kerrighed 2.2.1 or 2.3.0
==== Steps ====
- Download linux-2.6.20, .sign file, and public key from kernel.org
- Download kerrighed 2.2.1 from kerrighed.org
- Put them in /usr/src/
- Import the kernel.org GPG key: gpg --import /home/stwn/kernel.pub
gpg: directory `/root/.gnupg' created
gpg: can't open `/gnupg/options.skel': No such file or directory
gpg: keyring `/root/.gnupg/secring.gpg' created
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created
gpg: key 517D0F0E: public key "Linux Kernel Archives Verification Key " imported
gpg: Total number processed: 1
gpg: imported: 1
gpg: no ultimately trusted keys found
- Verify the kernel archive signature: gpg --verify /home/stwn/linux-2.6.20.tar.bz2.sign /home/stwn/linux-2.6.20.tar.bz2
gpg: Signature made Mon 05 Feb 2007 02:08:30 AM WIT using DSA key ID 517D0F0E
gpg: Good signature from "Linux Kernel Archives Verification Key "
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: C75D C40A 11D7 AF88 9981 ED5B C86B A06A 517D 0F0E
- apt-get install bzip2
- Extract the Linux kernel 2.6.20 archive: tar jxvf linux-2.6.20.tar.bz2
- Extract the Kerrighed archive: tar zxvf kerrighed-2.2.1.tar.gz
- Change directory to kerrighed-2.2.1/: cd kerrighed-2.2.1/
- Install gcc and its dependencies: binutils cpp cpp-4.1 gcc gcc-4.1 libssp0
- Install build-essential and its dependencies: build-essential dpkg-dev g++ g++-4.1 libc6-dev libstdc++6-4.1-dev linux-kernel-headers make
- Packages upgraded in the process: libc6 libc6-i686
- apt-get install pkg-config rsync lsb-release xmlto((docbook-xml docbook-xsl libpaper-utils libpaper1 libxml2 libxml2-utils libxslt1.1 sgml-base sgml-data xml-core xmlto xsltproc))
- ./configure --with-kernel=/usr/src/linux-2.6.20/ --enable-tests (this requires the dependencies listed above)
- If the cluster needs additional network drivers, configure the kernel with make menuconfig (requires libncurses5-dev)
- make patch
- make defconfig
- make kernel
- make
- make kernel-install
- make install
- dpkg -P linux-image-2.6.18-6-686 linux-image-2.6-686
- Reboot
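The build sequence above can be collected into a small script. This is only a sketch: it assumes the sources are unpacked under /usr/src as in the steps, and the ''run'' helper, the ''kerrighed_build'' function, and the ''DRY_RUN'' variable are hypothetical names introduced here so the sequence can be previewed without executing anything.

```shell
#!/bin/sh
# Sketch of the build steps listed above.
# DRY_RUN, run() and kerrighed_build() are illustrative names,
# not part of Kerrighed.
DRY_RUN=${DRY_RUN:-1}    # default: only print the commands

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@" || exit 1   # stop at the first failing step
    fi
}

kerrighed_build() {
    run cd /usr/src/kerrighed-2.2.1
    run ./configure --with-kernel=/usr/src/linux-2.6.20/ --enable-tests
    run make patch
    run make defconfig
    run make kernel
    run make
    run make kernel-install
    run make install
}

kerrighed_build
```

With ''DRY_RUN=0'' the same commands would actually be executed, which should only be done as root on a machine prepared as described above.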
==== Configuration ====
* /etc/network/interfaces (if you haven't configured it yet)
* /etc/kerrighed_nodes:
session=1
nbmin=2
192.168.0.10:1:eth1
192.168.0.11:2:eth1
* Boot parameters in GRUB or LILO: session_id=1 and node_id=1 for the master node
* Check /etc/default/kerrighed: ENABLE=true
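The /etc/kerrighed_nodes format shown above (a session line, an nbmin line, then one ip:node_id:interface line per node) can be generated for a list of addresses. A minimal sketch, assuming node IDs are simply numbered from 1 in the order given; ''gen_kerrighed_nodes'' is a hypothetical helper name.

```shell
#!/bin/sh
# Print a kerrighed_nodes file in the format used on this page.
# gen_kerrighed_nodes is a hypothetical helper; node IDs are assumed
# to be assigned sequentially starting at 1.
gen_kerrighed_nodes() {
    session=$1
    nbmin=$2
    iface=$3
    shift 3
    echo "session=$session"
    echo "nbmin=$nbmin"
    id=1
    for ip in "$@"; do
        echo "$ip:$id:$iface"
        id=$((id + 1))
    done
}

# Example: the two-node cluster from this page.
gen_kerrighed_nodes 1 2 eth1 192.168.0.10 192.168.0.11
```

The output can be reviewed and then redirected to /etc/kerrighed_nodes on each node.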
==== Kerrighed Commands List ====
* krgadm:
krgadm cluster start
krgadm nodes
krgadm nodes poweroff -n 13
* kerrighed_nodes
* kerrighed_session
* krg_capset -e +CAN_MIGRATE
==== Notes ====
* Kerrighed compiles with GCC 4.2 or below; there are bug reports about GCC 4.3. Check the installed version with: gcc --version
* mv /usr/bin/gcc /usr/bin/gcc.orig
* ln -s /usr/bin/gcc-4.2 /usr/bin/gcc
* An error occurs during Kerrighed compilation:
CC init/version.o
LD init/built-in.o
LD .tmp_vmlinux1
kernel/built-in.o: In function `getnstimeofday':
(.text+0x10185): undefined reference to `__umoddi3'
kernel/built-in.o: In function `do_gettimeofday':
(.text+0x1023a): undefined reference to `__udivdi3'
kernel/built-in.o: In function `do_gettimeofday':
(.text+0x1025d): undefined reference to `__umoddi3'
kernel/built-in.o: In function `do_timer':
(.text+0x10c78): undefined reference to `__udivdi3'
kernel/built-in.o: In function `do_timer':
(.text+0x10c9b): undefined reference to `__umoddi3'
make[1]: *** [.tmp_vmlinux1] Error 1
make[1]: Leaving directory `/usr/src/linux-2.6.20'
make: *** [kernel] Error 2
* TIPC problem: TIPC: Blocking bearer eth0
I thought I had to reconfigure the kernel with make menuconfig and enable the TIPC-related features and the e1000 driver (even though this machine has no such device!)
* scp:
scp linux-2.6.20.tar.bz2.sign root@ip-address:/usr/src/
scp kernel.pub root@ip-address:/usr/src/
* libglib2.0-0 mc (recommended :D)
* An error message appears when compiling 2.3.0, related to libkerrighed. Solution: change the Makefile in libkerrighed from: -Iinclude
to:
-I/usr/src/linux-2.6.20/include
There seems to be an error when generating the Makefile from Makefile.am and Makefile.in(?)
* A common bug in 2.3.0 appears when running krgadm cluster status:
No cluster running
even though the cluster is running. The fix is in the INRIA SVN trunk, but how can it be accessed? I tried anonymous access without success :!:
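The GCC version constraint from the notes above (4.2 or below) can be checked before starting a build. A minimal sketch; ''gcc_ok_for_kerrighed'' is a hypothetical helper name, and the check only compares the major and minor numbers of the version string.

```shell
#!/bin/sh
# Succeed if the given GCC version string is 4.2 or below.
# gcc_ok_for_kerrighed is a hypothetical helper name.
gcc_ok_for_kerrighed() {
    major=${1%%.*}       # text before the first dot
    rest=${1#*.}         # text after the first dot
    minor=${rest%%.*}    # second component
    [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -le 2 ]; }
}

# Check the compiler that is currently first in PATH, if there is one.
if command -v gcc >/dev/null 2>&1; then
    version=$(gcc -dumpversion)
    if gcc_ok_for_kerrighed "$version"; then
        echo "gcc $version: OK"
    else
        echo "gcc $version: too new, switch to gcc-4.2 or older"
    fi
fi
```

If the check fails, the mv and ln -s commands in the notes above are one way to point /usr/bin/gcc at gcc-4.2.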
===== Install PXE Boot Server =====
* Install dhcp3-server, atftpd, syslinux, nfs-kernel-server:
apt-get install dhcp3-server
apt-get install atftpd
apt-get install syslinux
apt-get install nfs-kernel-server
* mkdir /tftpboot
* cp /usr/lib/syslinux/pxelinux.0 /tftpboot/
* ln -s /boot/vmlinuz-2.6.20-krg vmlinuz
* mkdir pxelinux.cfg
* cd pxelinux.cfg/
* Copy tftpboot/pxelinux.cfg/default from another setup, e.g. OpenSSI
* Modify default (see the Kerrighed live CD)
* vim /etc/exports
* sudo /etc/init.d/nfs-kernel-server restart
* /etc/dhcp3/dhcpd.conf
* /etc/init.d/dhcp3-server restart
* apt-get install debootstrap
* mkdir /krg-system
* debootstrap lenny /krg-system/ http://192.168.0.10/testing/
* echo "proc /krg-system/proc proc none 0 0" >> /etc/fstab
* mount proc /krg-system/proc/ -t proc
* chroot /krg-system/
* vim /etc/exports
* /etc/init.d/nfs-kernel-server restart
* vim default
* install openssh-server for use with MPI
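The contents of pxelinux.cfg/default are not recorded on this page. The sketch below shows what such an entry might look like, under these assumptions: the NFS server is 192.168.0.10, the exported root is /krg-system, the kernel is the vmlinuz symlink in /tftpboot, and session_id/node_id are passed as in the Configuration section. DEFAULT, LABEL, KERNEL and APPEND are standard PXELINUX keywords, and root=/dev/nfs, nfsroot= and ip=dhcp are standard Linux NFS-root parameters; ''gen_pxe_config'' is a hypothetical helper name.

```shell
#!/bin/sh
# Print a pxelinux.cfg entry for one node.
# gen_pxe_config is a hypothetical helper; the default server address and
# the /krg-system export path are assumptions taken from this page.
gen_pxe_config() {
    node_id=$1
    server=${2:-192.168.0.10}
    cat <<EOF
DEFAULT kerrighed
LABEL kerrighed
  KERNEL vmlinuz
  APPEND root=/dev/nfs nfsroot=$server:/krg-system rw ip=dhcp session_id=1 node_id=$node_id
EOF
}

# Example: entry for node 2.
gen_pxe_config 2
```

The rw flag is included because a read-only root leads to the "Read-only file system" error listed under ERROR messages.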
==== Tweaks ====
* cp -r /lib/modules/2.6.20-krg/ /krg-system/lib/modules/
* cp -r kerrighed-2.2.1 /krg-system/usr/src/
* cp -r linux-2.6.20 /krg-system/usr/src/
* make install
* vim /etc/kerrighed_nodes (adjust it to the ID and IP address of the second node, and so on)
* apt-get install gcc bzip2 make pkg-config rsync lsb-release xmlto (the root filesystem for the other nodes, created with debootstrap and exported over NFS, needs exactly the same installation as the master node)
==== ERROR messages ====
* PXE error "Violation" or something similar: check /tftpboot or switch from tftpd to atftpd. Most likely the PXE client reached the TFTP server, but the server could not serve the required file (pxelinux.0). Check the permissions of the file and directory, and the file name
* "Failure creating bla: failed" and "Read-only file system": set rw in the bootloader parameters (pxelinux configuration) and in /etc/exports
* The Kerrighed 2.3.0 libraries did not compile successfully:
the include path did not point to the actual location of the headers. Solution: change directory to libkerrighed and edit the Makefile, changing the line: -Iinclude
to: -I/usr/src/linux-2.6.20/include
I don't know how the Makefile is generated; is the problem in Makefile.am or Makefile.in? I found this by diffing the libkerrighed Makefile of the current version against that of the previous version, which compiled fine
==== Tips ====
* No longer needed: dd if=gpxe-git-tg3.usb of=/dev/sdb
147+1 records in
147+1 records out
75268 bytes (75 kB) copied, 0.0665268 s, 1.1 MB/s
* apt-cache search gethostip
gethostip
Usage: gethostip [-dxnf] hostname/ip...
Used to test whether pxelinux picks up the default configuration file or the configuration file named after the output of gethostip
* Try gethostip: gethostip 192.168.0.13
192.168.0.13 192.168.0.13 C0A8000D
* /bin/netstat -pln|less
* tail -f /var/log/daemon.log
* tail -f /var/log/messages
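The hex form printed by gethostip above is the name PXELINUX looks for in pxelinux.cfg/ before falling back to default. When gethostip is not installed, the same value can be computed with printf. A minimal sketch for dotted-quad IPv4 addresses only; ''ip_to_hex'' is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to the 8-digit uppercase hex form
# printed by gethostip. ip_to_hex is a hypothetical helper name.
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.0.13   # prints C0A8000D
```

A per-node configuration for 192.168.0.13 would then be the file pxelinux.cfg/C0A8000D.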
===== Global Scheduling =====
The latest Kerrighed, which targets the Linux 2.6.x kernel, has a new scheduler framework, unlike the earlier version for 2.4.x. To use this feature we need Kerrighed from trunk as of 2008-08-22, so version 2.3.0, released in April 2008, clearly does not have it.
What we need to do is download the 2008-08-22 quilt patchset and patch Kerrighed 2.3.0. The easiest way turns out to be fetching the source code from the INRIA SVN trunk.
* Checkout from [[https://gforge.inria.fr/plugins/scmsvn/viewcvs.php/trunk/?rev=4977&root=kerrighed#dirlist|SVN INRIA trunk kerrighed]]
* Try running this command: sudo mount /config/
mount: configfs already mounted or /config busy
mount: according to mtab, configfs is already mounted on /config
* The scheduler configuration is in krg_legacy_scheduler
* The scheduler pseudo-filesystem is located at /config/
* Check for the Kerrighed support required for Global Scheduling in /boot/config-2.6.20-krg
* Copy the fstab that already contains the configfs entry to the NFS root filesystem: sudo cp /etc/fstab /krg-system/etc/
* sudo vim /etc/exports; NFS will try to resolve IP addresses to host names if there are entries in /etc/hosts:
/krg-system/ krg*(rw,no_root_squash,no_subtree_check,sync,fsid=1)
* Compile kerrighed as usual, see instructions above
* cd /boot/
* Back up the previously compiled kernel and config in case the Kerrighed installation overwrites them:
mv config-2.6.20-krg config.old.2
mv vmlinuz-2.6.20-krg vmlinuz-2.6.20-krg.old
mv System.map-2.6.20-krg System.map-2.6.20-krg.old2
cd /lib/modules/
mv 2.6.20-krg/ 2.6.20-krg.old
* Uninstall kerrighed-2.3.0:
cd /usr/src/kerrighed-2.3.0.old/
make uninstall
* Install Kerrighed from SVN trunk:
cd ../krg-20080203-new3/
make kernel-install
make install
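Before touching the scheduler settings it helps to confirm that configfs really is mounted on /config, as the mount output above suggests. A minimal sketch that reads a mounts table in the usual "device mountpoint type ..." layout (/proc/mounts by default, or an fstab file); ''has_configfs'' is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if a mounts table lists a configfs filesystem on the given
# mount point. has_configfs is a hypothetical helper name.
has_configfs() {
    mountpoint=${1:-/config}
    table=${2:-/proc/mounts}
    awk -v mp="$mountpoint" \
        '$2 == mp && $3 == "configfs" { found = 1 } END { exit !found }' \
        "$table"
}

if has_configfs /config 2>/dev/null; then
    echo "configfs is mounted on /config"
else
    echo "configfs is not mounted on /config"
fi
```

The same function can be pointed at /krg-system/etc/fstab to check that the copied fstab carries the configfs entry.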
==== SVN Tips ====
* svn checkout -N svn://scm.gforge.inria.fr/svn/kerrighed/trunk krg-20080203-new
* cd krg-20080203-new/
* svn update -N kernel
* cd kernel/
* svn update Documentation
* svn update -N arch
* cd arch/
* svn update i386 x86_64
* cd ..
* svn update block
* svn update crypto
* svn update -N drivers
* cd drivers/
* svn update acorn
* svn update tc telephony usb video w1 zorro
* tar cf krg-20080203.tar krg-20080203/
* scp krg-20080203.tar stwn@192.168.0.11:
==== Running Applications ====
* Log in
* Type the command: krgcapset -d +CAN_MIGRATE
* Run loop, compiled from loop.c; remember that every node must have the same file at the same absolute path:
loop &
loop &
loop &
loop &
* A system message will appear: send_kerrighed_signal: 8 (events/0) -> 820741 (loop)
* To migrate a process manually, use the command: migrate [process-id] [node]
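When several processes have to be moved by hand, the migrate commands can be planned first. A minimal sketch that only prints one migrate line per process, cycling through a list of node IDs; it executes nothing. ''plan_migrations'' is a hypothetical helper name, and the PIDs in the example are made up.

```shell
#!/bin/sh
# Print one "migrate <pid> <node>" line per PID, cycling through the
# node list. plan_migrations is a hypothetical helper; nothing is run.
plan_migrations() {
    pids=$1
    nodes=$2
    set -- $nodes                    # node IDs become $1, $2, ...
    count=$#
    [ "$count" -gt 0 ] || return 1   # no nodes given
    i=0
    for pid in $pids; do
        target=$((i % count))
        j=0
        for node in "$@"; do
            if [ "$j" -eq "$target" ]; then
                echo "migrate $pid $node"
                break
            fi
            j=$((j + 1))
        done
        i=$((i + 1))
    done
}

# Example with made-up PIDs for four loop processes and nodes 1 and 2.
plan_migrations "101 102 103 104" "1 2"
```

The printed lines can be reviewed and then run on the cluster, where the migrate command is available.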
==== Problem ====
* I ran 4 Blender processes on a node with the CAN_MIGRATE capability set, but none of the 4 processes migrated. They seemed to hang on something, and some messages appeared: Null mapping count, non null mapping address : [mem-addr]
Blender uses relatively large data and processes; is this why the Blender processes could not be migrated to other nodes? Strongly connected?
* A program called cpuburn, which does FPU calculations and checks the results, ran fine but still produced the message: Null mapping count, non null mapping address : [mem-addr]
* The message comes from shm_memory_linker.c; does this deal with Kerrighed's containers?
* Programs that run on the head node could not be migrated to another node
===== Reading List =====
* [[http://www.kerrighed.org/wiki/index.php/SchedConfig|Configurable scheduler framework]]
* [[http://www.etherboot.org/wiki/usermanual#testing_etherboot|Etherboot User Manual]]
* [[http://kerrighed.org/wiki/index.php/Installing_Kerrighed_2.3.0|Installing Kerrighed 2.3.0]]
* [[http://bioinformatics.rri.sari.ac.uk/drupal/?q=wiki/tutorial_kerrighed|Tutorial: Kerrighed]]
* [[http://kerrighed.org/wiki/index.php/V2.1.0_User_Manual|Kerrighed User Manual]]
* man debootstrap