Kerrighed

Kerrighed Installation on Debian Etch

Requirements

  • Debian GNU/Linux 4.0r3
  • Additional and development packages: bzip2 (You need a local repo here) FIXME
  • Linux kernel 2.6.20
  • Kerrighed 2.2.1 or 2.3.0

Steps

  1. Download linux-2.6.20, .sign file, and public key from kernel.org
  2. Download kerrighed 2.2.1 from kerrighed.org
  3. Put them in /usr/src/
  4. Import the kernel.org GPG key:
    gpg --import /home/stwn/kernel.pub
    gpg: directory `/root/.gnupg' created
    gpg: can't open `/gnupg/options.skel': No such file or directory
    gpg: keyring `/root/.gnupg/secring.gpg' created
    gpg: keyring `/root/.gnupg/pubring.gpg' created
    gpg: /root/.gnupg/trustdb.gpg: trustdb created
    gpg: key 517D0F0E: public key "Linux Kernel Archives Verification Key <ftpadmin@kernel.org>" imported
    gpg: Total number processed: 1
    gpg:               imported: 1
    gpg: no ultimately trusted keys found
  5. Verify the kernel archive signature:
    gpg --verify /home/stwn/linux-2.6.20.tar.bz2.sign /home/stwn/linux-2.6.20.tar.bz2
    gpg: Signature made Mon 05 Feb 2007 02:08:30 AM WIT using DSA key ID 517D0F0E
    gpg: Good signature from "Linux Kernel Archives Verification Key <ftpadmin@kernel.org>"
    gpg: WARNING: This key is not certified with a trusted signature!
    gpg:          There is no indication that the signature belongs to the owner.
    Primary key fingerprint: C75D C40A 11D7 AF88 9981  ED5B C86B A06A 517D 0F0E
  6. apt-get install bzip2
  7. Extract linux kernel 2.6.20 archive:
    tar jxvf linux-2.6.20.tar.bz2
  8. Extract kerrighed archive:
    tar zxvf kerrighed-2.2.1.tar.gz
  9. Change directory to kerrighed-2.2.1/
    cd kerrighed-2.2.1/
  10. Install gcc and its dependencies: binutils cpp cpp-4.1 gcc gcc-4.1 libssp0
  11. Install the build toolchain: build-essential dpkg-dev g++ g++-4.1 libc6-dev libstdc++6-4.1-dev linux-kernel-headers make
  12. Packages upgraded in the process: libc6 libc6-i686
  13. apt-get install pkg-config rsync lsb-release xmlto (see footnote 1)
  14. ./configure --with-kernel=/usr/src/linux-2.6.20/ --enable-tests (requires the dependencies from steps 10 to 13)
  15. Optional: to add network drivers used in the cluster, install libncurses5-dev and configure the kernel with make menuconfig
  16. make patch
  17. make defconfig
  18. make kernel
  19. make
  20. make kernel-install
  21. make install
  22. dpkg -P linux-image-2.6.18-6-686 linux-image-2.6-686
  23. Reboot
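
Steps 14 to 21 can be collected into one script. This is only a sketch: run, build_kerrighed and DRY_RUN are names made up here, not part of Kerrighed, and the paths assume the 2.2.1 tarball was unpacked in /usr/src/ as in step 3. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: run, build_kerrighed and DRY_RUN are hypothetical names.
# DRY_RUN defaults to 1, so the commands are printed, not executed.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_kerrighed() {
    # Same order as steps 14 to 21; stops at the first failing command.
    run cd /usr/src/kerrighed-2.2.1/ &&
    run ./configure --with-kernel=/usr/src/linux-2.6.20/ --enable-tests &&
    run make patch &&
    run make defconfig &&
    run make kernel &&
    run make &&
    run make kernel-install &&
    run make install
}

build_kerrighed
```

To run the build for real, set DRY_RUN=0 in the environment before starting the script. The optional make menuconfig step is left out because it is interactive.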

Configuration

  • /etc/network/interfaces (if you haven't configured it yet)
  • /etc/kerrighed_nodes
    session=1
    nbmin=2
    192.168.0.10:1:eth1
    192.168.0.11:2:eth1
  • Boot parameters in GRUB or LILO: session_id=1 and node_id=1 for the master node
  • Check /etc/default/kerrighed
    ENABLE=true

Kerrighed Commands List

  • krgadm
    krgadm cluster start
    krgadm nodes
    krgadm nodes poweroff -n 13
  • kerrighed_nodes
  • kerrighed_session
  • krgcapset -e +CAN_MIGRATE

Notes

  • Kerrighed only compiles with GCC 4.2 or older; see the bug reports about GCC 4.3. Check the installed version with:
    gcc --version
  • If the default gcc is newer, point /usr/bin/gcc at gcc-4.2:
    mv /usr/bin/gcc /usr/bin/gcc.orig
    ln -s /usr/bin/gcc-4.2 /usr/bin/gcc
  • The following error occurred during Kerrighed compilation:
      CC      init/version.o
      LD      init/built-in.o
      LD      .tmp_vmlinux1
    kernel/built-in.o: In function `getnstimeofday':
    (.text+0x10185): undefined reference to `__umoddi3'
    kernel/built-in.o: In function `do_gettimeofday':
    (.text+0x1023a): undefined reference to `__udivdi3'
    kernel/built-in.o: In function `do_gettimeofday':
    (.text+0x1025d): undefined reference to `__umoddi3'
    kernel/built-in.o: In function `do_timer':
    (.text+0x10c78): undefined reference to `__udivdi3'
    kernel/built-in.o: In function `do_timer':
    (.text+0x10c9b): undefined reference to `__umoddi3'
    make[1]: *** [.tmp_vmlinux1] Error 1
    make[1]: Leaving directory `/usr/src/linux-2.6.20'
    make: *** [kernel] Error 2
  • TIPC problem
    TIPC: Blocking bearer eth0

    I thought I had to reconfigure the kernel with make menuconfig and enable the TIPC-related features and the e1000 driver, even though this machine has no e1000 device.

  • scp
    scp linux-2.6.20.tar.bz2.sign root@ip-address:/usr/src/
    scp kernel.pub root@ip-address:/usr/src/
  • libglib2.0-0 mc (recommended :D)
  • Error message when compiling 2.3.0, related to libkerrighed. Fix: in the libkerrighed Makefile, change
    -Iinclude

    to:

    -I/usr/src/linux-2.6.20/include

    The problem seems to come from generating the Makefile from Makefile.am and Makefile.in(?)

  • A common bug in 2.3.0 shows up when typing
    krgadm cluster status
    No cluster running

    even though the cluster is running. The fix is in the INRIA SVN trunk, but I have not worked out how to access it. Anonymous access has not worked so far.
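
The GCC requirement from the first note (4.2 or older) can be checked in a script before starting a long build. A minimal sketch; gcc_ok_for_kerrighed is a hypothetical helper, and it only compares the major and minor numbers of a version string such as the one printed by gcc -dumpversion.

```shell
#!/bin/sh
# Sketch only: gcc_ok_for_kerrighed is a hypothetical helper name.
# Succeeds when the given version string is 4.2 or older.
gcc_ok_for_kerrighed() {
    ver=$1
    major=${ver%%.*}      # text before the first dot
    rest=${ver#*.}        # text after the first dot
    minor=${rest%%.*}     # second component
    [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -le 2 ]; }
}

for v in 4.1.2 4.2.4 4.3.2; do
    if gcc_ok_for_kerrighed "$v"; then
        echo "$v ok"
    else
        echo "$v too new"
    fi
done
```

To test the installed compiler, pass "$(gcc -dumpversion)" as the argument.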

Install PXE Boot Server

  • Install dhcp3-server, atftpd, syslinux, nfs-kernel-server
    apt-get install dhcp3-server
    apt-get install atftpd
    apt-get install syslinux
    apt-get install nfs-kernel-server
  • mkdir /tftpboot
  • cp /usr/lib/syslinux/pxelinux.0 /tftpboot/
  • ln -s /boot/vmlinuz-2.6.20-krg vmlinuz
  • mkdir pxelinux.cfg
  • cd pxelinux.cfg/
  • copy tftpboot/pxelinux.cfg/default from another config e.g. openssi
  • modify default (see kerrighed live-cd)
  • vim /etc/exports
  • sudo /etc/init.d/nfs-kernel-server restart
  • /etc/dhcp3/dhcpd.conf
  • /etc/init.d/dhcp3-server restart
  • apt-get install debootstrap
  • mkdir /krg-system
  • debootstrap lenny /krg-system/ http://192.168.0.10/testing/
  • echo "proc /krg-system/proc proc none 0 0" >> /etc/fstab
  • mount proc /krg-system/proc/ -t proc
  • chroot /krg-system/
  • vim /etc/exports
  • /etc/init.d/nfs-kernel-server restart
  • vim default
  • install openssh-server for use with MPI
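
The contents of pxelinux.cfg/default are not recorded in these notes. The sketch below prints one plausible version for an NFS-root node, under these assumptions: the kernel is the vmlinuz symlink created in /tftpboot, the root file system is /krg-system exported by the given server, and the Kerrighed boot parameters are the session_id and node_id described in the Configuration section. gen_pxe_default and the label name krg are hypothetical; root=/dev/nfs, nfsroot= and ip=dhcp are the standard Linux NFS-root parameters.

```shell
#!/bin/sh
# Sketch only: gen_pxe_default is a hypothetical helper, and the label
# name "krg" is arbitrary.
# Usage: gen_pxe_default <nfs-server-ip> <node_id>
gen_pxe_default() {
    server=$1
    node_id=$2
    cat <<EOF
DEFAULT krg
LABEL krg
    KERNEL vmlinuz
    APPEND root=/dev/nfs nfsroot=$server:/krg-system ip=dhcp rw session_id=1 node_id=$node_id
EOF
}

gen_pxe_default 192.168.0.10 2
```

The rw flag matches the fix for the "Read-only file system" error noted further down. Compare the result with the Kerrighed live-cd configuration before relying on it.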

Tinkering

  • cp -r /lib/modules/2.6.20-krg/ /krg-system/lib/modules/
  • cp -r kerrighed-2.2.1 /krg-system/usr/src/
  • cp -r linux-2.6.20 /krg-system/usr/src/
  • make install
  • vim /etc/kerrighed_nodes (adjust to the second node's id and IP address, and so on)
  • apt-get install gcc bzip2 make pkg-config rsync lsb-release xmlto (the root file system for the other nodes, built with debootstrap and exported over NFS, needs exactly the same installation as the master node)

ERROR messages

  • PXE error "Violation" or something similar: check /tftpboot, or switch the TFTP server to atftpd. Most likely PXE reached the TFTP server but it could not serve the requested file (pxelinux.0). Check the permissions of the file and directory, and the file name
  • "Failure creating bla: failed" and "Read-only file system" → add rw to the boot loader line (pxelinux configuration) and to /etc/exports
  • The Kerrighed 2.3.0 libs did not compile successfully: the include path did not point at the real location of the headers. Solution: change directory to libkerrighed and edit the Makefile, changing the line

    -Iinclude

    to

    -I/usr/src/linux-2.6.20/include

    I do not know how the Makefile is generated. Is the problem in Makefile.am or Makefile.in? I found this by diffing the libkerrighed Makefile of the current version against the previous version, which compiled fine

Tips

  • it's not needed anymore
    dd if=gpxe-git-tg3.usb of=/dev/sdb
    147+1 records in
    147+1 records out
    75268 bytes (75 kB) copied, 0.0665268 s, 1.1 MB/s
  • apt-cache search gethostip
    gethostip
    Usage: gethostip [-dxnf] hostname/ip...

    Used to test whether pxelinux fetches the default configuration file or a configuration file named after the output of gethostip

  • try gethostip
    gethostip 192.168.0.13
    192.168.0.13 192.168.0.13 C0A8000D
  • /bin/netstat -pln|less
  • tail -f /var/log/daemon.log
  • tail -f /var/log/messages
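
Before falling back to default, PXELINUX looks in pxelinux.cfg/ for a file named after the client's IPv4 address in uppercase hexadecimal, which is the last field of the gethostip output in the tips above. The same name can be computed without gethostip. A minimal sketch; ip_to_hex is a hypothetical helper and expects a plain dotted quad without leading zeros.

```shell
#!/bin/sh
# Sketch only: ip_to_hex is a hypothetical helper name.
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into four positional parameters
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.0.13    # prints C0A8000D
```

A per-node file such as /tftpboot/pxelinux.cfg/C0A8000D can then carry that node's own boot parameters.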

Global Scheduling

The latest Kerrighed, which targets Linux 2.6.x kernels, has a new scheduler framework, unlike the earlier versions for 2.4.x. To use this feature we need Kerrighed from the trunk of 2008-08-22, so version 2.3.0, released in April 2008, clearly does not have it.

What we need to do is download the 2008-08-22 quilt patchset and patch Kerrighed 2.3.0. The easiest way turned out to be fetching the source code from the INRIA SVN trunk.

  • Try running this command:
    sudo mount /config/
    mount: configfs already mounted or /config busy
    mount: according to mtab, configfs is already mounted on /config
  • The scheduler configuration is in krg_legacy_scheduler
  • The scheduler pseudo file system is located at /config/
  • Check the Kerrighed support required for Global Scheduling in /boot/config-2.6.20-krg
  • Copy the fstab that already has the configfs entry to the NFS root file system
    sudo cp /etc/fstab /krg-system/etc/
  • sudo vim /etc/exports; NFS will try to resolve the IP address to a domain name if there is an entry in /etc/hosts
    /krg-system/ krg*(rw,no_root_squash,no_subtree_check,sync,fsid=1)
  • Compile kerrighed as usual, see instructions above
  • cd /boot/
  • Back up the previously compiled kernel and config, in case the Kerrighed installation overwrites them
    mv config-2.6.20-krg config.old.2
    mv vmlinuz-2.6.20-krg vmlinuz-2.6.20-krg.old
    mv System.map-2.6.20-krg System.map-2.6.20-krg.old2
    cd /lib/modules/
    mv 2.6.20-krg/ 2.6.20-krg.old
  • Uninstall kerrighed-2.3.0
    cd /usr/src/kerrighed-2.3.0.old/
    make uninstall
  • Install Kerrighed from the SVN trunk
    cd ../krg-20080203-new3/
    make kernel-install
    make install

SVN Tips

  • svn checkout -N svn://scm.gforge.inria.fr/svn/kerrighed/trunk krg-20080203-new
  • cd krg-20080203-new/
  • svn update -N kernel
  • cd kernel/
  • svn update Documentation
  • svn update -N arch
  • cd arch/
  • svn update i386 x86_64
  • cd ..
  • svn update block
  • svn update crypto
  • svn update -N drivers
  • cd drivers/
  • svn update acorn
  • svn update tc telephony usb video w1 zorro
  • tar cf krg-20080203.tar krg-20080203/
  • scp krg-20080203.tar stwn@192.168.0.11:

Running Applications

  • Log in
  • Type the command
    krgcapset -d +CAN_MIGRATE
  • Run the loop binary compiled from loop.c; remember that every node must have the same file at the same absolute path
    loop &
    loop &
    loop &
    loop &
  • A system message will appear
    send_kerrighed_signal: 8 (events/0) -> 820741 (loop)
  • To migrate a process manually, use the command
    migrate [process-id] [node]

Problem

  • I ran 4 Blender processes on a node with the CAN_MIGRATE capability set, but none of them migrated. They seemed to hang on something, and this message appeared:
    Null mapping count, non null mapping address : [mem-addr]

    Blender works with relatively large data and processes. Is that why the Blender processes could not be migrated to other nodes? Are they too tightly coupled?

  • A program called cpuburn, which does FPU calculations and checks the results, ran fine but still produced the same message
    Null mapping count, non null mapping address : [mem-addr]
  • The message comes from shm_memory_linker.c. Does this deal with Kerrighed's containers?
  • Programs started on the head node could not be migrated to another node

Reading List

1)
docbook-xml docbook-xsl libpaper-utils libpaper1 libxml2 libxml2-utils libxslt1.1 sgml-base sgml-data xml-core xmlto xsltproc
 
doc/kerrighed.txt · Last modified: 2013/02/14 01:36 by stwn