                     - Documentation for KMSGDUMP v0.4 -
               [Sun Sep 19 19:30:32 CEST 1999] - Willy Tarreau


1. What is KMSGDUMP ?
~~~~~~~~~~~~~~~~~~~~~

KMSGDUMP is an extension to the Linux kernel which allows the user at the
console to dump the last kernel messages onto a floppy diskette, so that
they need not be copied by hand with pen and paper when the system is stuck.
Only 3.5", 1.44 MB diskettes are supported by default. Other capacities
might work, provided you change the geometry in the file "kmsgdump.h".


2. How does it work ?
~~~~~~~~~~~~~~~~~~~~~

There are two ways of getting a dump :

   - by pressing SysRQ+D (RightAlt - PrintScrn - D together) ;
   - after a kernel panic has occurred, a dump may be automatically
     generated.

Before anything else, you MUST KNOW that, to maximize the chances of
completing the dump successfully, the CPU is rebooted in real mode and
disk accesses are made via the BIOS. This ensures that even if kernel
memory is really corrupted, the dump still has a chance to work, but it
also implies that after a dump has occurred, it is IMPOSSIBLE TO CONTINUE
TO WORK WITH THE CURRENT KERNEL. You will have to REBOOT. So while your
kernel still responds, you had better get a similar dump by entering one
of the following commands :

# dmesg > /dev/fd0                   ( for RAW mode )

or

# dmesg | mwrite a:messages.txt      ( for FAT mode )


Second, be aware that FLOPPY CONTENTS WILL BE LOST AFTER A DUMP.
Even though in some cases you can dump at the end of a diskette without
losing the beginning, assume that by default the beginning of the diskette
will be ERASED and you won't be able to recover what was on it. You have been
warned.


3. Modes of operation
~~~~~~~~~~~~~~~~~~~~~

There are two modes of operation : manual and automated.

Manual mode (or interactive mode) is always entered if you hit SysRQ+D.
It is also entered during a kernel panic if the current mode is set
to "manual". This mode is recommended for a developer's workstation,
or a kernel running under an emulator such as vmware. It's recommended
to disable interactive mode on servers which may crash when nobody is
nearby to reboot them.

Automatic mode can only be entered during a kernel panic, and only if
automatic mode was previously configured. Sometimes the system is in such a
bad state that even kmsgdump can cause recursive crashes (this has been
reported to me once). For this reason I've added a checkpoint mechanism to
the code : every small part of the code is checkpointed, and if a crash occurs
again, the same part is not executed again, which prevents loops. This improves
the chances of reaching the reset routine which will, in the worst case,
reboot the system rather than let it loop indefinitely.

3.1. Manual mode
~~~~~~~~~~~~~~~~

Under manual mode, the screen is initialized to color 80x25 mode (BIOS mode 3)
with a blue background.

 [Note: some people asked me to set other colors to avoid confusion
        with another OS' BSOD, but I couldn't find good associations.
        Even though I've received an interesting comment about the way
        to choose colors readable on any color or monochrome display,
        I'm waiting for suggestions, and for the moment we'll say that
        these are the colors of Midnight Commander and call this "BSOL"
        (blue screen of life) because this one is interactive.]

The screen is divided into two parts. The upper one displays the current
status (kernel version, drive unit, printer, format...), and the lower one
the messages captured before switching to real mode. The internal speaker
beeps whenever no key has been hit for 3 seconds. This is simply to get
someone's attention, mainly in cases where no monitor is connected to the
PC.

The interface is not case-sensitive about keys pressed. Keys used are :
  Up arrow   : scroll messages towards the beginning
  Down arrow : scroll messages towards the end
  B : immediately reBoot the system
  D : Dump messages onto the selected floppy with selected format. Warning:
      no check is done before, and the floppy will simply be overwritten by
      the messages.
  F : select Format, by switching between RAW and FAT12
  H : immediately Halt the system.
  I : display Information, little help about the keys.
  P : Print messages on the currently selected printer. If you press this key
      by accident, wait about one minute for the BIOS routine to time out, and
      you'll hear the beeps again, indicating that you can continue.
  T : select next available prinTer. The system tests if a printer is
      connected at the other end of the cable, and skips the empty ports.
  U : change drive Unit. Although dumping to hard disks is possible, they are
      never offered in the interface to avoid disastrous mistakes.

Other keys are simply ignored.

After a successful dump or print, 3 quick beeps are played. In case of an error,
only one beep is played. This is important if you are working blind, with a
keyboard and no monitor.

3.2. Automatic mode
~~~~~~~~~~~~~~~~~~~

Automated operation is performed by the system only when a kernel panic
occurs. In this case, the system waits for the "panic_timeout" delay
to give you a few seconds in case you want to try SysRQ first (sync,
unmount filesystems, ...). This delay is configurable by writing a
number of seconds to "/proc/sys/kernel/panic".
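
As a minimal sketch, the delay can be set from a shell as shown below. The
helper name set_panic_timeout is hypothetical, and its optional second
argument (the file to write to) exists only so that the function can be tried
on an ordinary file; writing to the real /proc entry requires root. Rejecting
anything but a non-negative integer is a choice made for this sketch.

```shell
# Hypothetical helper: write a panic delay (in seconds) to the sysctl file.
# The second argument defaults to the path named in the text above.
set_panic_timeout() {
    secs=$1
    file=${2:-/proc/sys/kernel/panic}
    # Accept only a non-negative integer (a restriction of this sketch).
    case $secs in
        ''|*[!0-9]*) echo "invalid delay: $secs" >&2; return 1 ;;
    esac
    echo "$secs" > "$file"
}
```

For example, "set_panic_timeout 10" run as root would leave ten seconds
before the switch to real mode.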

After that, the system is rebooted to real mode, and depending on the
mode of operation chosen, either the interactive mode (see above) or the
automatic mode described below is entered.

3.2.1. Start of operation
~~~~~~~~~~~~~~~~~~~~~~~~~
Some checks are performed. First, the system checks whether the dump feature
is enabled. If not, operation ends (see below). If dump is enabled, and if
the "safe" flag is set, the diskette is verified to be a real "KMSGDUMP"
diskette and not another one (read section 4 to know how to prepare a secure
diskette for KMSGDUMP). If the diskette isn't a valid one, operation ends.
If the diskette is valid, or if the check has been disabled, the dump is
performed with the current parameters (unit, format...).
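
The sequence of checks above can be summarized by the following shell sketch.
It only mirrors the prose; the function and argument names are illustrative
and are not taken from the kmsgdump source.

```shell
# Sketch of the start-of-operation checks described above.
#   $1 = 1 if dumping is enabled (flag E), 0 if disabled (flag D)
#   $2 = 1 if safe mode is on (flag S), 0 if overwrite is allowed (flag O)
#   $3 = 1 if the diskette carries the "KMSGDUMP" label, 0 otherwise
auto_dump_decision() {
    enabled=$1 safe=$2 label_ok=$3
    if [ "$enabled" -ne 1 ]; then
        echo "end"      # dumping disabled: go straight to end of operation
    elif [ "$safe" -eq 1 ] && [ "$label_ok" -ne 1 ]; then
        echo "end"      # safe mode and unprepared diskette: no dump
    else
        echo "dump"     # diskette is valid, or the check is disabled
    fi
}
```

With safe mode on and an unprepared diskette, "auto_dump_decision 1 1 0"
prints "end", so nothing is written to the diskette.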

3.2.2. End of operation
~~~~~~~~~~~~~~~~~~~~~~~
After completion of an automatic dump, or when a dump is aborted, the system
can either halt or reboot. With redundant servers, you may prefer to halt
a buggy system, because another one ensures the service continues to work.
But in other cases, you may prefer rebooting to quickly restart services.
This is also configurable (read section 4).


4. How a crash can be prepared
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4.1. Kernel options
~~~~~~~~~~~~~~~~~~~

First, choose the kernel compilation options which best match your
situation. This may seem obvious, but you can reduce the risk of a crash
by not enabling drivers designed for hardware you don't have. Especially
on servers, use only a reduced feature set, because you know exactly what
you need (eg: don't enable NTFS and QNX filesystems if you don't need them).

Configure KMSGDUMP options to match your needs. Don't ask for an auto-dump if
you don't have a floppy drive. In this case, you might prefer to enable
interactive mode to display messages on the screen and possibly print
them.

When you use SCSI hard disks, you can sometimes reduce the reset time to
help the system recover faster. Eg: on my system, I have an AHA2940UW which
waits 15 seconds by default. All peripherals still work well with 1 second,
so 14 seconds are saved.

If you have changed your message buffer size (which is 16 kB by default),
you should adjust the size accordingly in "include/asm/kmsgdump.h", parameter
LOG_BUG_LEN. Some people needed 32 kB. But you shouldn't exceed 60 kB since
the dump is done in real mode (16 bits).
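
As a quick sanity check before rebuilding, a candidate buffer size can be
compared with the 60 kB ceiling given above. This is only a sketch: the
function name is hypothetical, and it assumes the size is expressed in bytes.

```shell
# Hypothetical helper: check a message buffer size (in bytes) against the
# 60 kB ceiling mentioned above (60 * 1024 = 61440 bytes).
check_log_buf_size() {
    size=$1
    limit=$((60 * 1024))
    if [ "$size" -le "$limit" ]; then
        echo "ok"
    else
        echo "too large"
    fi
}
```

For instance, "check_log_buf_size $((32 * 1024))" prints "ok", while a
64 kB buffer (65536 bytes) is reported as "too large".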

4.2. Configure KMSGDUMP
~~~~~~~~~~~~~~~~~~~~~~~

If your kernel supports SYSCTL, you can adjust KMSGDUMP parameters by
writing a string to /proc/sys/kernel/kmsgdump. This string consists of
a concatenation of flags. Most of them are simple booleans. For each boolean,
a complementary flag exists to avoid any ambiguous interpretation.
For the moment, the flags are :

   Name Description         Default Complement
   F    FAT mode              Yes   R
   R    Raw mode                    F
   A    Automatic mode        Yes   I
   I    Interactive mode            A
   B    Boot after dump       Yes   H   only used in automatic mode
   H    Halt after dump             B   only used in automatic mode
   S    Safe mode             Yes   O   only used in automatic mode
   O    Overwrite disk              S   only used in automatic mode
   E    Enable dumping        Yes   D   only used in automatic mode
   D    Disable dumping             E   only used in automatic mode
   Txxx Track xxx             0   (N/A) first track is 0 by default
   Uxxx Unit xxx              0   (N/A) BIOS drive is 0 (A:) by default

Note: default means "default if none specified".

Example: if you enter the following command, a kernel panic will generate
         a dump in FAT mode after verifying that the disk has been prepared
         for a dump, and then it will reboot :

         # echo "FABSE" > /proc/sys/kernel/kmsgdump

This one will dump raw messages at the end of the diskette (track 79) in
drive B, without checking its label, and then halt :

         # echo "RAHOET79U1" > /proc/sys/kernel/kmsgdump

And this one will ask for a quick reboot :

         # echo "DB" > /proc/sys/kernel/kmsgdump
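
To double-check a flag string before writing it, it can be decoded letter by
letter. The decoder below is only an illustration built from the table above;
it is not part of kmsgdump, its name is hypothetical, and it assumes that
flags are given in upper case exactly as listed.

```shell
# Illustrative decoder for a kmsgdump flag string, based only on the table
# above. Prints one line per flag; returns 1 on an unknown letter.
decode_kmsgdump_flags() {
    s=$1
    while [ -n "$s" ]; do
        c=${s%"${s#?}"}      # first character of the remaining string
        s=${s#?}             # drop that character
        case $c in
            F) echo "format: FAT" ;;
            R) echo "format: raw" ;;
            A) echo "mode: automatic" ;;
            I) echo "mode: interactive" ;;
            B) echo "after dump: boot" ;;
            H) echo "after dump: halt" ;;
            S) echo "safe mode: on" ;;
            O) echo "safe mode: off (overwrite)" ;;
            E) echo "dumping: enabled" ;;
            D) echo "dumping: disabled" ;;
            T|U)
                # Collect the decimal digits that follow T or U.
                n=
                while :; do
                    d=${s%"${s#?}"}
                    case $d in
                        [0-9]) n=$n$d; s=${s#?} ;;
                        *) break ;;
                    esac
                done
                if [ "$c" = T ]; then echo "track: $n"; else echo "unit: $n"; fi
                ;;
            *) echo "unknown flag: $c" >&2; return 1 ;;
        esac
    done
}
```

Running "decode_kmsgdump_flags FABSE" prints one line for each of the five
flags, which makes it easy to spot a letter that does not mean what was
intended.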


4.3. Prepare a disk for kmsgdump
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If safe mode is required, before an automatic dump, the system will read
the beginning of the floppy in the drive and will look for the word "KMSGDUMP"
at offset 3 of the first sector. This is the label of the diskette. The dump
will only be performed if this word is found as-is. So if you enable safe mode,
don't forget to prepare your diskettes with the following command, provided
your diskette is in drive A :

         # echo "012KMSGDUMP" > /dev/fd0

Please note that when the dump is performed in FAT mode, this word is written
to the same place. This has two side effects :
    - a diskette on which a dump has been done in FAT mode is re-usable without
      intervention.
    - you can prepare a diskette by entering kmsgdump (SysRQ+D) and doing a
      FAT mode dump.

On the other hand, when a RAW dump is done at the beginning of the disk, it
cannot be used again as a "safe kmsgdump disk". Moreover, leaving it in the
drive when rebooting will cause the system to hang if the BIOS tries to boot
from the floppy first.
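
Whether a diskette is currently prepared can be checked from a running system
by reading the 8 bytes at offset 3, as in this sketch. The function name is
hypothetical; the device defaults to /dev/fd0, and an image file can be passed
instead for testing.

```shell
# Hypothetical helper: succeed if a diskette (or an image file) carries the
# "KMSGDUMP" label at offset 3 of its first sector.
has_kmsgdump_label() {
    dev=${1:-/dev/fd0}
    # Read 8 bytes starting at byte offset 3.
    label=$(dd if="$dev" bs=1 skip=3 count=8 2>/dev/null)
    [ "$label" = "KMSGDUMP" ]
}
```

A typical use would be "has_kmsgdump_label && echo prepared" after inserting
the diskette in drive A.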

4.4. Prepare the PC for a crash
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Because you'll have to leave a diskette in a drive, you may have to set up
your BIOS to boot from the hard disk, or anything but the floppy, first,
because the BIOS will not find a bootable system on this floppy. The problem
is with older systems on which the boot sequence cannot be changed. For this
reason, when a diskette is formatted in FAT mode, a small piece of code is
written to the boot sector which tries to redirect the boot to the first hard
disk seen by the BIOS. This is *generally* the bootable disk, but it may not be
the right one on specific systems, so you may have to do some tests before
considering this option to be the right one for you.

If your system is a server, you may reduce the time the BIOS spends testing
the PC to ensure a quick reboot. On some systems, you can turn on the option
"Quick power-on self test", and disable memory tests above 1 MB.


5. Reading the messages back
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5.1. FAT-formated disks
~~~~~~~~~~~~~~~~~~~~~~~

If the disk has been formatted as FAT12, you'll find on it a file
named "MESSAGES.TXT" which contains the whole message buffer. If the
buffer is not full, the end of the file is padded with zeroes, so
it's better to strip them using "tr" under Linux.

  - under Linux, either mount the disk :

       # mount -rt msdos /dev/fd0 /mnt
       # cat /mnt/messages.txt | tr -d '\000'
       # umount /mnt

    or read it using mtools :

       # mtype a:messages.txt | tr -d '\000'

  - under DOS, you can simply run EDIT :

       C:\> edit a:messages.txt

  - under Windows, you can open the file with Wordpad. Avoid using
    Notepad since it doesn't handle lines ending with a linefeed only.

5.2. RAW disks
~~~~~~~~~~~~~~

Raw disks can be read under Linux with the dd utility. By default,
the dump is written starting at the first sector of the disk.
Example with 16 kB messages :

       # dd if=/dev/fd0 bs=512 count=32 | tr -d '\000'

If you specified "T79" in the parameters to dump on track 79 of the disk,
you have to do some calculations :

A 1.44 MB disk has 18 sectors/track, 2 heads and 512 bytes/sector so
18*2*512 equals 18432 bytes/track. You'll have to skip 18432 bytes for
each unwanted track. But you can also count only with kilobytes : if you
consider that a track is exactly 18 kilobytes, then skip the number of
tracks times 18 kilobytes :

       # expr 79 \* 18
       1422
       # dd if=/dev/fd0 bs=1024 skip=1422 count=16 | tr -d '\000'

The default dd utility reads all data from the start of the disk so this can
be quite long. There are other implementations on the net which do an "lseek"
before the first read.
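
The track arithmetic above can be wrapped in a small shell function. This is
a sketch: the function name is hypothetical, and the default geometry is the
1.44 MB one given above (18 sectors/track, 2 heads, 512 bytes/sector).

```shell
# Hypothetical helper: offset of a track, in kilobytes, from the start of
# the diskette. Defaults match the 1.44 MB geometry described above.
track_offset_kb() {
    track=$1
    sectors=${2:-18}     # sectors per track
    heads=${3:-2}        # number of heads
    bytes=${4:-512}      # bytes per sector
    # With 2 heads and 512-byte sectors a track is a whole number of kB,
    # so the integer division below is exact.
    echo $(( track * sectors * heads * bytes / 1024 ))
}
```

The result can be passed directly to dd :

       # dd if=/dev/fd0 bs=1024 skip=$(track_offset_kb 79) count=16 | tr -d '\000'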


6. Other speed improvements
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here are some tips to make a system reboot faster, especially if
you don't use filesystem journalling.

When a file server crashes, fsck may run for a long time. There are good docs
about how to dramatically reduce fsck time, but at least consider these
methods :

    - in /etc/fstab, set the sixth field (fs_passno) to 1 for the root fs,
      and 2 for every other fs. fsck will then know what it can parallelize
      depending on hardware dependencies. In the best case, you can divide
      the total time by the number of physical disks.
      (man fstab and man fsck for more info).

    - when possible, mount filesystems read-only. On an anonymous FTP server,
      for example, it's not always necessary to mount everything RW. So before
      copying files onto an fs, remount it RW :

         # mount -wo remount /mount/point

      At the end, remount it RO :
 
         # mount -ro remount /mount/point

    - change the number of bytes per inode and the block size when formatting
      your FS. I personally use 16384 bytes/inode, a block size of 4096 bytes,
      and the sparse flag set (reduces the number of superblock copies). This
      makes me waste about 1% space, but total mount time is about 1 second for
      a total of 8 FS's, 11 gigs on 5 separate disks, and the total fsck time
      after an unclean power-off is less than 3 minutes.
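
For the first point of the list above, the current fs_passno values can be
reviewed with a one-line awk filter. This is only an illustrative sketch; the
function name is hypothetical and the file defaults to /etc/fstab.

```shell
# Illustrative check: print each mount point of an fstab-format file with its
# sixth field (fs_passno). Comment lines and entries with fewer than six
# fields are skipped.
list_fs_passno() {
    awk '!/^[[:space:]]*#/ && NF >= 6 { print $2, $6 }' "${1:-/etc/fstab}"
}
```

In the output, the root fs should show 1 and the other filesystems to be
checked should show 2.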

And of course, don't start services you don't need ! Sendmail itself can take
a long time to start if it cannot resolve the domain name.


7. For more information and/or suggestions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For more information, you can email me at :

       willy AT meta-x.org

( be patient, I read my mail when I can, and can't always reply. I'm
  used to "tail -1000 $MAIL|less" or "less +G $MAIL" )

For suggestions, you can either email them to me, or share them with
the Linux Kernel Mailing List :

       linux-kernel@vger.kernel.org

Enjoy using it,

Willy

