Command Line

GNAT 2012 for Debian Wheezy

For purposes known mostly within our company, I’ve created a magic makefile that takes care of most of the work involved in getting GNAT 2012 (and various up-to-date support tools) up and running on a Debian system.

The makefile requires you to download some files manually, but that is okay, as it gives you a great opportunity to visit AdaCore’s Libre website. The README file explains this in more detail.

At this point, the makefile will build (and/or install) the following projects.

  • GNAT 2012 – binary
  • GNATCOLL (Gnatlib) – SVN build
  • AWS (Ada Web Server) – Git build
  • Florist GPL 2012 release
  • XML-Ada 4.3 (release)
  • Yolk (AWS support library)
In time, the makefile will be expanded to support more libraries and projects.

The makefile can be found on GitHub; I hope you find it useful or want to contribute.

Persistent enumeration of network interfaces across hardware addresses

If you’ve ever created a Linux virtual machine image for distribution, you have probably encountered this problem. The virtualization platform is not really important; the problem exists on VirtualBox, VMware, KVM, Xen and others supporting Linux guests.

So, what is the issue exactly?

The problem is that when you recreate a virtual machine using an existing image, a new MAC address is generated. Udev sees this new address and creates a static name for it, e.g. eth1, while keeping the original, now removed, eth0 in the persistent naming file.

This interface typically does not exist in the /etc/network/interfaces file, and therefore is never “upped”. This leaves you with a machine that has no network connection out of the box.
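
To illustrate, a stale rules file typically ends up looking roughly like this (the MAC addresses below are made up, and I have trimmed some of the match keys udev normally writes):

# /etc/udev/rules.d/70-persistent-net.rules
# The NIC from the original image; its MAC no longer exists on this machine
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:aa:bb:cc", KERNEL=="eth*", NAME="eth0"
# The newly generated MAC gets the next free name
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:dd:ee:ff", KERNEL=="eth*", NAME="eth1"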

Well, how do I fix it then?

The simple solution is to remove the /etc/udev/rules.d/70-persistent-net.rules file and reboot.
This means you have to remove the file as the last thing you do before shipping the image. If you make changes to the image and reship it, you must do this every time. Plus, it prevents easy redistribution of your image.

So, I’ve placed a small bash script at /usr/local/bin/reenumerate_interfaces and called it from /etc/rc.local.

This script also has the advantage that you do not have to reboot the machine for the changes to take effect.

#! /bin/sh
 
echo Re-enumerating the network interfaces...
 
# Stop networking
/etc/init.d/networking stop
# Get the drivers and bounce the NICs
lspci -nnv | awk '{
if ($0 ~ /^$/) { getDriver = 0; }
if (getDriver == 0 && $0 ~ /.*Ethernet controller.*/) { getDriver=1; }
if (getDriver == 1 && $0 ~ /.*Kernel driver.*/) { printf ("%s\n", $5); }
}' | sort | uniq | while read a; do
 
echo Unloading network module $a from kernel space
rmmod $a
done
 
# Remove the static enumerations
rm /etc/udev/rules.d/70-persistent-net.rules
 
echo Reloading driver via udev
# Tickle udev
/sbin/udevadm trigger
 
# Start networking
/etc/init.d/networking start
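
To have it run on every boot, the call in /etc/rc.local is just the one line (placed before the final exit 0):

# /etc/rc.local (excerpt)
/usr/local/bin/reenumerate_interfaces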

Thinkfan on a Lenovo T420

Warning: tampering with the fan control on your laptop is potentially dangerous, and I will not take any responsibility if anything happens to your machine. With that being said, I have used Thinkfan on various Thinkpads (7, I think) with no problems at all. Just be warned.

A lovely Lenovo T420 has recently dropped into my possession – not the worst thing that can happen. Well, actually it is a company laptop – but enough with the chit-chat.

Linux support on this machine is far from perfect, but one of the most annoying things about it is the fan noise. The fan is constantly on at ~3500 rpm and very noticeable (read: annoying).

First, install Thinkfan. (I owe my sanity to these developers!)

$ sudo aptitude install thinkfan

To enable fan control, the thinkpad_acpi kernel module must be loaded with fan control explicitly enabled.

$ sudo su -c "echo \"options thinkpad_acpi fan_control=1\" > /etc/modprobe.d/thinkfan.conf"

And reload the module.

$ sudo modprobe -rv thinkpad_acpi && sudo modprobe -v thinkpad_acpi

We also need the coretemp module, which is not autoloaded.

$ sudo modprobe -v coretemp

To make it autoload, append it to /etc/modules.

sudo su -c "echo \"coretemp\" >> /etc/modules"

Now, edit /etc/default/thinkfan to enable the thinkfan service to start. There are a lot of precautions here 😐

START=yes

Then, add the following to /etc/thinkfan.conf, just after the comments and before the threshold declarations.

sensor /sys/devices/platform/coretemp.0/temp1_input
sensor /sys/devices/platform/coretemp.0/temp2_input
sensor /sys/devices/virtual/hwmon/hwmon0/temp1_input
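
For reference, the threshold declarations that follow are tuples of (fan level, lower temperature, upper temperature). Mine look roughly like this; the values are illustrative, so tune them to your own machine:

(0, 0,  55)
(1, 48, 60)
(2, 50, 61)
(3, 52, 63)
(4, 56, 65)
(5, 59, 66)
(7, 63, 32767)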

Finally, start Thinkfan.

sudo /etc/init.d/thinkfan start

And.. Enjoy the silence :-)

A hat tip to the people who led me to the solution:
[1] http://ubuntuforums.org/showthread.php?t=1749186#9
[2] http://thinkpad-wiki.org/Thinkfan

 

Yet another goodbye

For those who are about to halt -p, we salute you.

I’m moving out – of my first vserver that is.

I have had the server running for, well

lucretia:~# uptime 
 19:19:59 up 700 days, 22:51,  2 users,  load average: 0.02, 0.06, 0.02

Wow.. Time flies.

I have been migrating the services running on the machine to another server. Now it is time to do the inevitable.

Sorry old chap, better you than me.

lucretia:~# halt -p

Broadcast message from root@lucretia (pts/1) (Fri Sep 30 19:23:24 2011):

The system is going down for system halt NOW!
lucretia:~# Connection to lucretia.greenpc.dk closed by remote host.
Connection to lucretia.greenpc.dk closed.

Of course, prior to this, I made sure to make a complete copy of the file system like so. Isn’t rsync the best tool?

rsync --progress -poazuHK -e ssh --delete --exclude /proc --exclude /sys --exclude /dev / home.greenpc.dk:/mnt/primary/backup/lucretia.greenpc.dk

Yet another useless election forecast

As previously mentioned, at my former place of study we built a voting system with the purpose of making entry and exit polls available at a local election meeting.

The meeting took place in Nykøbing F. and was aimed primarily at young people under 18, so it was purely theoretical.

At the time we made the following observations:

Result before the meeting

You can clearly see a tendency towards a shift to the right in the forecast based on the votes collected at the entrance.
At the exit, however, the picture is markedly different. It should be noted that the voting terminals were only activated at the end of the meeting, and therefore only represent the distribution of opinion among the “voters” who attended the entire meeting.

Result after the meeting

Besides the fact that a large share (about 50%) of the participants had left the meeting, the numbers also show a strongly left-leaning distribution, with Enhedslisten in particular advancing markedly.

Both Venstre and Socialdemokratiet lost ground. Dansk Folkeparti is the party that suffered the biggest loss during the meeting; of the 114 votes they received at the entrance, 100 voters either walked out or changed their minds.

I find the observations interesting, but they have nothing to do with the upcoming election. Perhaps others can pull more interesting conclusions out of these numbers?

For those interested in the code, the server part and the command line client can be downloaded here:

valgsystem.tar.gz

Migrating Dovecot 1.2 Maildir to Dovecot 2.0 dbox

I am in the process of migrating to a new mail server and therefore need to move users as painlessly as possible. The details about the setup are another story for another day; promise.

This guide is targeted at Debian systems, but the concepts apply to other systems as well.

Dovecot 2.0 comes with a nice tool called dsync which eases migration by a great deal. Unfortunately, my current mail server runs Dovecot 1.2 and therefore does not have the tool.

What to do, then.

Basically I have thought up three options for migrating.

  1. Using dsync on both sides
  2. Using rsync, then dsync
  3. Using dsync over sshfs

This post will serve as documentation for my experiments with mailbox migration.

If you are in a hurry, you can skip to the conclusion.

Using dsync on both sides

Since I run Dovecot 1.2 and thus do not have dsync available, I would need to pull down the sources and compile them myself. (I do not want to use packages, as they may interfere with the existing installation.)

I got as far as getting the source compiled, but have not investigated further. Some paths were wrong, and I cowardly quit.

Later experiments with the two other approaches have shown that this, most likely, will not prove successful.

Using rsync then dsync

The next option was a two-step migration. First, I used rsync to copy my Maildir mailboxes to the new server.

rsync -poazuHK -e ssh \
     root@oldmailserver.tld:/var/spool/postfix/virtual/ \
     /var/vmail.migrate/

You can log in as root here, as -o (preserve owner) maps the user name to the uid on the target system. Clever :-)

Then, run dsync for each user in order to import the new emails.

dsync -R -u myaddress@mydomain.tld backup \
maildir:/var/vmail.migrate/mydomain.tld/myaddress/Maildir/

Mirroring does not really make sense here, as we have a local copy of the mailbox.
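
With many users it is easier to loop over the rsync’ed tree. A minimal sketch, assuming the /var/vmail.migrate/<domain>/<user>/Maildir/ layout used above:

#! /bin/sh
# Run a dsync backup for every mailbox found in the copied tree
for maildir in /var/vmail.migrate/*/*/Maildir; do
    user=$(basename "$(dirname "$maildir")")
    domain=$(basename "$(dirname "$(dirname "$maildir")")")
    echo "Importing $user@$domain"
    dsync -R -u "$user@$domain" backup "maildir:$maildir/"
done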

This approach is by far the fastest and easiest.

Using dsync over sshfs

Notice: This only works with backup and not mirror.

Why? Dovecot 2’s log format is incompatible with Dovecot 1’s; the latter will time out with a message about an unknown record type (0x8000) after a mirror operation.

# apt-get install sshfs
sshfs -o uid=`id -u vmail` -o allow_other \
vmail@oldmailserver:/var/spool/postfix/virtual/ \
/var/vmail.lucretia/

Remember -o allow_other, or dsync will fail because the vmail user will not have access to the mount point.

Then, run dsync for each user in order to import the new emails.

dsync -R -u myaddress@mydomain.tld backup \
maildir:/var/vmail.oldhost/mydomain.tld/myaddress/Maildir/

Ownership is of the essence here. Do not use root, as this user will take ownership of Dovecot metadata files, causing your source mail server to coredump or just stall.
vmail is not the best option either, but I was lazy. You should take advantage of the fact that the vmail folders are (usually) owned by the vmail group. Putting a migration user in this group and chmodding accordingly is probably preferable, security-wise.
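
Something along these lines should do; this is just a sketch, and the user name and path are assumptions based on my setup:

# On the old server: create a dedicated migration user in the vmail group
useradd -M -s /bin/false -G vmail migrate
# Make sure the group can actually read the maildirs
chmod -R g+rX /var/spool/postfix/virtual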

This approach works well when refined (e.g. using the right uid on both sides), but is pretty slow; I saw about 100 kB/s while syncing. That is not really acceptable for 1 GB+ mailboxes, but as always, your mileage may vary.

Your remote Dovecot will keep on running as if nothing has happened, provided you get the permissions correct. Unfortunately, there are problems with the Dovecot transaction log: the index id of the mailbox becomes inconsistent, resulting in something like this:

Error: Corrupted transaction log file /var/vmail/domain.tld/username/dbox/mailboxes/INBOX/dbox-Mails/dovecot.index.log seq 4: indexid changed 1313910265 -> 1313868319 (sync_offset=0)

Conclusion

My previous attempts have led me to one conclusion: I need to move the mailboxes only once.

I chose the rsync+dsync approach and then did the following:

  1. Migrated all users to the new server
  2. Updated DNS
  3. rsync’ed first time
  4. Stopped the Dovecot and Postfix service on the old server
  5. rsync’ed second time
  6. dsync’ed the mailboxes
  7. Turned virtual_mailbox_maps and domains into relay_recipient_maps and domains respectively

If you decrease the TTL for your domain up until the move, you can minimize downtime. If you maintain a local DNS, even better.
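
For example, if you run BIND yourself, lowering the TTL in the zone file a day or two before the move could look something like this (the values are purely illustrative):

$TTL 300	; 5 minutes instead of the usual 86400
@	IN	SOA	ns1.mydomain.tld. hostmaster.mydomain.tld. (
		2011093001	; serial, remember to bump it
		3600		; refresh
		900		; retry
		604800		; expire
		300 )		; negative caching TTL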

This is not the fancy minimal down-time approach I had hoped for, but it has been sufficient for my needs. Feel free to contribute feedback.

Troubleshooting

I got a:

dsync(root): Fatal: Mail locations must use the same virtual mailbox
hierarchy separator (specify separator for the default namespace)

Some googling revealed that I needed to set up a namespace separator. The technical explanation for this is left to the more Dovecot-savvy.

In short, add the following to /etc/dovecot/conf.d/10-mail.conf (or uncomment the relevant ones).

namespace {
  separator = /
  inbox = yes
}

And now it works. Migration is now just a matter of setting up a cron job, lowering the TTL on the domain and moving in a day or two.

I also got some of these when using the rsync->dsync approach:

Error: Can't rename mailbox INBOX to
INBOX_ff3e01082bcf4e4e352c00002b747e8a:
Renaming INBOX isn't supported.

I have not been able to solve this yet. Maybe shutting down the Dovecot service on the remote side would help; race conditions are likely to occur.

Proftpd and LDAP on Debian Squeeze

This is a short howto (hopefully) providing enough information to install Proftpd and use LDAP as user database.

Background

I have become obsessed with LDAP, at least for the time being. It seems to be the answer to my redundancy and distribution plans.

A production server is in the process of being converted (migrated actually) to have a single SSO LDAP structure.

A virtualization host crash (thank you Linode) forced me to move a couple of sites onto this new fancy LDAP server. Shortly after, a user prompted me about the lack of FTP on the new webhost.

Now the shoe needs to fit.

Installing the required packages

This is the easy part.

# apt-get install proftpd-mod-ldap

The LDAP module depends on the proftpd server package, so this is really the only thing you need to install.

Requirements for the LDAP server

The LDAP module for Proftpd is hard-coded to look up only users of objectClass: posixAccount, which in my opinion is less intuitive than having a dedicated schema for proftpd.

An example .ldif is shown below. I have added objectClass: domain, which is unnecessary.

The uidNumber and gidNumber map to the uid and gid on the system; 115 is the proftpd user and 65534 is the nogroup (nobody) group. From an FTP client, the owner will appear as domain.tld or whatever you specify as uid.

version: 1

dn: dc=domain.tld,ou=webhosting,dc=example,dc=com
objectClass: domain
objectClass: top
objectClass: posixAccount
cn: domain.tld
dc: domain.tld
gidNumber: 65534
homeDirectory: /var/www/domain.tld/www
uid: domain.tld
uidNumber: 115
loginShell: /bin/false
userPassword::

Configuring the authentication

First, you need to edit /etc/proftpd/ldap.conf to match your LDAP setup. Something like this is appropriate.

<IfModule mod_ldap.c>
  LDAPServer ldap://example.com/??sub
  LDAPDNInfo "cn=proftpd,dc=example,dc=com" "password"
  LDAPDoAuth on "ou=webhosting,dc=example,dc=com"
</IfModule>

Notice the ??sub at the end of the LDAP URL. This is very important, as it specifies the search scope; the configuration parameter LDAPSearchScope is apparently ignored.
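
You can sanity-check the base DN and scope with ldapsearch before pointing Proftpd at it; something along these lines (adjust the bind DN and base to your own tree):

ldapsearch -x -H ldap://example.com -D "cn=proftpd,dc=example,dc=com" -W \
    -b "ou=webhosting,dc=example,dc=com" -s sub "(objectClass=posixAccount)"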

Again, a sour comment: the bind should ideally be done as the user logging in, not as a fixed account. If you must use one, the LDAP admin is a bad choice; create a dedicated bind user instead. Besides, /etc/proftpd/ldap.conf is world readable!

Next you have to tell proftpd to load the module.
Uncomment the line

LoadModule mod_ldap.c

in /etc/proftpd/modules.conf.

Then, uncomment the line

Include /etc/proftpd/ldap.conf

in /etc/proftpd/proftpd.conf to load the LDAP configuration.

Finally:

While editing proftpd.conf you should also lift the RequireValidShell restriction (or give the user a valid loginShell attribute). If you do not do this, you will not be able to log in.
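
For reference, the relevant lines in my /etc/proftpd/proftpd.conf end up looking something like this:

# Do not require a shell listed in /etc/shells for LDAP users
RequireValidShell		off
# Pull in the LDAP authentication settings
Include /etc/proftpd/ldap.conf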

Now is a good time to take a look at the standard proftpd configuration and make sure that anonymous login is disabled, and likewise for /etc/passwd users.

HD time-lapse movies with Motion and Linux

[Photo: NSLU2 with webcam (the system)]

Background

I have previously experimented with time-lapse videos, but wanted a more dedicated platform which could be set up and run pretty much anywhere.

This is the first iteration, where the purpose is to get the system up and running in headless operation.

Components

The original idea was to use a PC Engines alix1d system board in a box1c enclosure, but unfortunately the board I had was very unstable, so I brought in an old friend of mine:

The Linksys NSLU2, a.k.a. the “slug”. The one I had ran Debian 5.0 Lenny, but had to be upgraded in order to get the webcam to work.

I recklessly tried doing a dist-upgrade, but ended up with a bricked slug. I guess a fresh installation was the right answer after all.

Debian Squeeze on a NSLU2

Due to a proprietary firmware requirement, the official Debian 6.0 installer does not ship with support for the on-board Ethernet controller, which is bad because this is the only way of communicating with the device. Well, technically you can use the serial pin header or a USB Ethernet device, but I think I burned the circuitry for the serial port in a previous modding attempt :-\

There are a few guides that give directions on how to add the proprietary firmware to the installer image, and after about 5 reflashes I finally had one that worked.

Before starting the installation, I checked around for known installation errors. The installation takes about 5 hours, so you really want to get it right the first time.

I learned that others had experienced out-of-memory errors during the installation. Tough luck.

To the rescue came Martin Michlmayr. He has the answer to all my woes: a comprehensive guide, with a complete Debian 6 userspace and kernel. This saved me a lot of time.

Install and configure Motion

You can install motion by

apt-get install motion

as root or via sudo.

On Debian (Squeeze in my case), Motion is disabled by default, like many other services. Enable it, as mentioned in the notice:

Not starting motion daemon, disabled via /etc/default/motion ... (warning).

Set the value start_motion_daemon to yes in /etc/default/motion, like so:

start_motion_daemon=yes

 

The trick to disabling motion detection in Motion is to set the threshold to 0 in the config file:

threshold 0

Enable time-lapse by setting the following in /etc/motion/motion.conf:

# Use ffmpeg to encode a timelapse movie
# Default value 0 = off - else save frame every Nth second
ffmpeg_timelapse 10

In this case, a frame is saved every ten seconds.

You should also adjust the width and height parameters, and the target_dir.
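
For reference, the corresponding lines in my motion.conf look roughly like this (the resolution is an example and depends on what your webcam can deliver):

width 1280
height 720
target_dir /home/motion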

You can also get a copy of my preconfigured motion.conf by running the following set of commands

/etc/init.d/motion stop
mv /etc/motion/motion.conf /etc/motion/motion.conf.orig
wget http://retrospekt.dk/files/motion.conf -O /etc/motion/motion.conf
mkdir /home/motion
chown motion:motion /home/motion
chown root:motion /etc/motion/motion.conf
chmod g+r /etc/motion/motion.conf
/etc/init.d/motion start

An example can be seen here: http://retrospekt.dk/files/timelapse.mpg

ZFS drive replacement

This is a post about the robustness of ZFS, and it can serve as a mini how-to for people who want to replace disks and do not have a hot spare in the system.

Background

Last Monday, our local area was hit by a tremendous rainfall which caused our basement to be flooded. You can see the pictures of the flood here. Sorry about the quality; the primary objective was to salvage various floating hardware :-\

Wet hardware is also the reason for this post. Upon entering the basement I remembered my file server standing on the floor and quickly (and heroically) dashed to its rescue.

Unfortunately the server had already taken in quite a lot of water and three of its four raid-z (raid5) disks were already ankle deep in water.

I did not manage to take any pictures at the time, but took some today in order to illustrate where the waterline was.

 

This is the inside of the case side panel. If you look carefully, you can see the traces left by the water.

My crude drawing skills were put to the test in order to create this.

An approximation of the water level

Needless to say, I was quite worried about the state of my data. I quickly removed the power plug and rushed the computer off to dry land (the living room) where a brave team consisting of my girlfriend and my son started drying the disk components after I had disassembled them; well, removed the circuit boards at least.

After each disk had been dried, I carefully put them back together and tried to power them on – one by one.
Surprisingly, they all spun up, meaning that the motors were okay – yay!

Next step was to put them back into the fileserver and hope for the best.

And, to my relief, it booted! And the zpool came online! That was amazing; apparently, nothing was lost. But just to be sure, I ran a scrub on the pool.

This is the result:

  pool: pool1p0
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 5h0m with 0 errors on Tue Aug  2 03:20:10 2011
config:

	NAME        STATE     READ WRITE CKSUM
	pool1p0     ONLINE       0     0     0
	  raidz1    ONLINE       0     0     0
	    ad4     ONLINE       0     0     0
	    ad6     ONLINE       0     0     0
	    ad10    ONLINE      51     0     0  1.50M repaired
	    ad12    ONLINE       0     0     0

errors: No known data errors

I consider myself a very lucky man. Only 1.5M of corruption? 3 of 4 disks partially submerged in water. Wow!

Anyway, I rushed out to buy three new disks, and as soon as they arrived I started replacing them, one by one, beginning with ad10.

I did, of course, do a full rsync of the data in the storage pool to another computer first.

Replacing the disks

Upon replacing the first disk (I chose ad10, as this was the one marked as bad), I got this error:

nas1:~# zpool status
state: DEGRADED
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 6h22m, 86.62% done, 0h59m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	pool1p0                    DEGRADED     0     0    10
	  raidz1                   DEGRADED     0     0    60
	    ad4                    ONLINE       0     0     0  194M resilvered
	    ad6                    ONLINE       0     0     0  194M resilvered
	    replacing              DEGRADED     0     0     0
	      6658299902220606505  REMOVED      0     0     0  was /dev/ad10/old
	      ad10                 ONLINE       0     0     0  353G resilvered
	    ad12                   ONLINE       0     0     0  161M resilvered

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x32>

The ZFS administrator’s guide explains that the corruption is located in the meta-object set (MOS), but does not give any hint on how to remove or repair the set. Admittedly, I have not looked thoroughly into what the MOS actually is.

I put the original (faulted) ad10 disk back in, and the error went away (after a reboot).

Then I decided to try again, this time with ad4. Physically replacing the disk on the SATA channel revealed this:

nas1:~# zpool status
  pool: pool1p0
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
	invalid.  Sufficient replicas exist for the pool to continue
	functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

	NAME                     STATE     READ WRITE CKSUM
	pool1p0                  DEGRADED     0     0     0
	  raidz1                 DEGRADED     0     0     0
	    2439714831674233987  UNAVAIL      0    32     0  was /dev/ad4
	    ad6                  ONLINE       0     0     0
	    ad10                 ONLINE       0     0     0
	    ad12                 ONLINE       0     0     0

errors: No known data errors

Okay, then the replacement.

nas1:~# zpool replace pool1p0 2439714831674233987 /dev/ad4

… And the resilvering started. The ETA eventually settled at ~5 hours, but it took about 7.5 hours, probably because the relatively slow Atom processor was the bottleneck.

nas1:~# zpool status
  pool: pool1p0
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.00% done, 708h0m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	pool1p0                    DEGRADED     0     0     0
	  raidz1                   DEGRADED     0     0     0
	    replacing              DEGRADED     0     0     0
	      2439714831674233987  REMOVED      0     0     0  was /dev/ad4/old
	      ad4                  ONLINE       0     0     0  2.30M resilvered
	    ad6                    ONLINE       0     0     0  1.53M resilvered
	    ad10                   ONLINE       0     0     0  1.52M resilvered
	    ad12                   ONLINE       0     0     0  1.38M resilvered

errors: No known data errors

The resilvering revealed a total of 4 corrupted files, which I could replace from backup.

However, this led me to the next challenge:

Clearing errors, and merging replacement disks

I could not get rid of the errors, which effectively left the zpool in a permanently degraded state. Every document I could dig up led me to the conclusion that I should remove the affected files, which I did, and then run zpool clear on the pool to clear the errors.

The solution was to reboot after I had removed the files and let it resilver again. This worked, and leads me to believe that I could have simply done a clear and then a scrub to verify the consistency of the data.
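
A sketch of the shorter sequence I suspect would have worked (I have not verified this myself):

zpool clear pool1p0
zpool scrub pool1p0
zpool status pool1p0     # wait for the scrub to finish and check for errors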

After this, I could repeat the somewhat lengthy process for the next disk.

Summary

In total I have had ~10 minutes of downtime caused by replacing the disks, plus of course a couple of hours of downtime while the server dried. This is, in my opinion, very impressive. Another vote for ZFS, or a +1 on Google+ :-)

I have actually found this ZFS recovery exercise very enlightening. It is something you usually do not get to do under such “relaxed” circumstances as I was privileged with here.

Update: The new disks do not support temperature polling; apparently Western Digital has removed the feature.

[Screenshot: only the remaining “old” disk now supports temperature monitoring]