Articles


Resize Nested LVM inside KVM Machines

This was supposed to be easy - extending logical volumes. But if you install a virtual machine, then it all becomes a mess. Search the web for how to extend a partition "nested" in an LV, and there are only questions and no answers.

KVM Disk Management Issues shows an alternative to the standard install, "just put a filesystem on it and you are done", which basically means choosing manual partitioning during an Ubuntu install. Resize KVM Image shows another alternative, which involves deleting the swap partition inside the KVM guest so that the root partition can be enlarged. That would be necessary when there is no nested LVM, that is, when the partitions were created directly in the host's logical volume. For a general introduction to LVM, see Logical Volume Management (IBM).

virt-manager makes it easy to install the virtual machine using an .iso image of the OS. It is not easy to resize storage on a KVM virtual machine if it was installed following the standard instructions: make a logical volume on the host machine, let the virtual machine installer use it like a raw disk to create its partitions, and install the OS.

Well, the trouble is now that the standard LVM resize procedures are not helpful. This is what the picture looks like, where VG is the host machine volume group:

Host Machine:

   vg:VG
    -- lv:kvm1
    -- lv:unused (lot of unused space on the volume group VG)

lv:kvm1 is the logical volume used to install the virtual machine. Let us say it is a Debian-based system, like Ubuntu. With the default "Guided Partitioning", the installer treats lv:kvm1 as a disk and creates its own volume group and logical volumes inside it:

    -- lv:kvm1
       -- vg:VIRT
          -- lv:vroot     --- this is the partition we want to extend
          -- lv:vswap
(click here for the entire post)

Drupal is a lot of trouble

This site uses Drupal. Drupal has turned into a nightmare. It was fine when there was a single 4.x version out there, but soon after 4.x, there was 5.x. Then 6.x. Upgrading from an older version is near impossible.

There always was the assumption that some amount of coding would be required by anyone running a Drupal site. But be prepared - you will be hacking modules left and right to get anything running. At this time, one has to question whether the amount of hacking required to get things to run is worth it. Maybe all CMSes have this problem, but Drupal certainly is a poster child for impossible-to-ever-upgrade software.

The problem occurs because Drupal changes the API every release and adds new incompatible features, so modules and themes become unusable. And since modules and themes are often merely someone's weekend project, it can be months or years before a module becomes compatible with the newer Drupal version.

Drupal core has no image handling or spam fighting capability, so even a basic site will need external modules. Add things like forums, automatic aliases, and FAQs, and it becomes a large collection of non-core modules.

The advantage of Drupal is that it is extensively customizable, and has a wide range of modules. This is exactly the same thing that makes a Drupal site near-impossible to upgrade. Once a site is up and starts to depend on a bunch of modules, rest assured that when a new Drupal version comes out quite a few required modules will not make it to that new version!

Drupal core does get upgraded without problems. But Drupal itself has become super-bloated. Web hosts that worked fine with Drupal 4.7 will not support Drupal 6.x because of heavily increased memory and CPU requirements.

(click here for the entire post)

Spam Email Counts

Is email on the way out? That is probably not yet an easy question, but the amount of spam seems to be holding steady, with periodic bursts of spam email storms.

Here are some graphs of spam at one of my mailboxes. This is for a very public email address. The spam detection is using spamassassin which runs under procmail with a customized whitelist and blacklist. Over the few years I've used this, there have been only 1-2 false positives for spam (of course, detection of false positives is not easy since this requires digging through 100s of spam messages, but I have no reason to believe that false positives are more prevalent). There have been quite a few false negatives - messages that are spam, but missed by spamassassin. These are usually around 1%-10% of the total detected spam messages, which is low enough that the graphs below are still useful to show the trend of spam message counts.

[Figure: 2010 Spam Counts]
The Spam Counts images are updated periodically, usually every day, to include data of the previous complete 24-hour period.

[Figures: Older Spam Counts for 2009, 2008, and 2007]

(click here for the entire post)

DD-WRT for Linksys wrt54g v8

dd-wrt is third-party firmware that can be loaded on many routers and it makes available many additional features such as advanced routing as well as a keep alive mechanism. It is maintained by BrainSlayer.

In the few days of using it, some advantages of dd-wrt are evident. It has been far easier to configure on my network of Linux and Windows computers, which use both static and DHCP IP addressing. The bundled Linksys software on the new WRT54G V8 device had long DNS lookup times on the Linux computers (probably needed to use the remote DNS resolvers instead of pointing to the Linksys box), for all lookups, at all times. But instead of re-configuring the Linux boxes, in the same amount of time, it was quite easy to install dd-wrt micro-edition using these instructions: How to Flash Linksys wrt54g

(click here for the entire post)

Link Filter Drupal Module

Here's yet another URL Link Filter for Drupal. The latest version can be downloaded from: linkfilter-4.x-5.x-1.1.zip

Drupal 6.x version is available here: linkfilter-6.x-1.2.zip.

The goal for this filter is to be somewhat like the URL filter included with Drupal, with the additional requirement to be independent of both the Drupal installation directory and the domain, so that the URLs in Drupal nodes don't have to be re-edited when a Drupal site is moved to a different sub-directory or a different domain. Additionally, it allows link text to be specified for the URL, and it preserves the input characters as much as possible, performing no or minimal HTML entity conversions of the input characters. Finally, it distinguishes various links with classes, which can be used to display link icons for specific links. If the link filter tag points to an internal Drupal node, then a class containing the type of the node is generated, for example, class="linkfilter-drupal-node-image", which can be used to show distinguishing icons based on Drupal node type. This site uses this filter, and the link icons are displayed based on the class generated by the filter: for external links (linkfilter-urlfull class), images (linkfilter-drupal-node-acidfree or linkfilter-drupal-node-image class), mailto links (linkfilter-mailto class).

Link filter tags [l:URL text] in the input text will be replaced with a link to the given URL, which can be a Drupal link, an external web link, or a local non-Drupal link. Prefixes representing the site url and the Drupal directory are added, as appropriate:
1) Site url is prefixed if URL begins with a / character
2) No prefix is added if the URL has a : in it, as in http: or ftp: etc
3) Site url with Drupal base directory is prefixed in all other cases, this is handled by calling the Drupal l() function.
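The three prefix rules above can be sketched in a few lines of Python. This is only an illustration, not the module's code: the real filter is written in PHP and calls the Drupal l() function. SITE_URL, DRUPAL_BASE, resolve_link and expand_tags are hypothetical names, and two behaviours are assumptions: the rules are applied in the order listed, and the URL itself is used as link text when the tag gives none.

```python
import re

SITE_URL = "http://example.com"   # hypothetical site URL
DRUPAL_BASE = "/drupal"           # hypothetical Drupal base directory

# Matches [l:URL] or [l:URL some link text]
TAG = re.compile(r"\[l:([^\s\]]+)(?:\s+([^\]]*))?\]")

def resolve_link(url):
    """Apply the three prefix rules, in the order they are listed."""
    if url.startswith("/"):
        # Rule 1: local non-Drupal link, prefix the site URL only
        return SITE_URL + url
    if ":" in url:
        # Rule 2: has a scheme such as http: or ftp:, leave untouched
        return url
    # Rule 3: Drupal path, prefix site URL plus Drupal base directory
    return SITE_URL + DRUPAL_BASE + "/" + url

def expand_tags(text):
    """Replace every [l:URL text] tag in text with an HTML link."""
    def repl(match):
        url = match.group(1)
        label = match.group(2) or url   # assumption: fall back to the URL
        return '<a href="%s">%s</a>' % (resolve_link(url), label)
    return TAG.sub(repl, text)

print(expand_tags("See [l:node/12 my page] and [l:/files/a.pdf]."))
```

Because the prefixes are added when the text is rendered, moving the site only means changing the two constants, not re-editing the stored node text.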

(click here for the entire post)

Fedora Core 7 Install Notes

Notes on things to look out for working with Fedora Core installs.

perl CGI scripts hanging
The current Fedora Core 7 perl package is perl-5.8.8-27 [Jan 2008]. This perl package contains version 3.15 of CGI.pm, which has a problem handling POST_MAX. Any HTTP post larger than the POST_MAX value will cause the CGI script to peg the CPU at the $q = new CGI statement and not terminate for a very long time. After the CGI script is automatically terminated, an HTTP 500 Internal Server Error status code is returned to the requester.

This problem exists in all distributions using Perl 5.8.8, and is not limited to Fedora Core 7.

This issue is very hard to track down. Most of the time, the CGI scripts will work fine. When they do fail, there will invariably be nothing more than a single line in the Web server log stating that the job took around 300 seconds and that an HTTP 500 Internal Server Error message was sent back. The failures occur all over the clock, with no pattern to their timing. If you do manage to see this rare failure happening, the CPU will show that all free CPU time is being used by the CGI script - but the script has no logging or database activity, which indicates that it locked up very early in the script. That will eventually lead to looking at the $q = new CGI statement, and to the fix.

The fix is simple - upgrade to CGI.pm version 3.21 or later. But making this stick requires some additional configuration because, while CGI.pm can be updated on its own, it is also bundled with perl. To keep a future yum update of perl 5.8.8 from restoring CGI.pm to version 3.15, either prevent perl from being updated, or keep two copies of CGI.pm around and change the scripts to load the right one.

(click here for the entire post)

Show blocked hosts on web

This script uses PHP and MySQL to create a web page that lists all the blocked hosts.

It uses an IP-to-country mapping table to show country flags.

To see this working visit tanchaz.hu/blockhosts/

That page also includes a link to download the software.

Drupal MySQL Performance Problem

A very small site started running into performance problems - some pages were taking too long to load, and certain MySQL queries were taking 5000 to 6000 milliseconds and being killed because of resource limits set on the hosting computer. The pages affected were the watchdog log display pages - one of which is the Menu -> administer page when logged in as the administrator, which displays data from the watchdog table.

This seemed odd - for a site with fewer than 200 nodes and very low traffic, there should be no performance issue, and no single database query should be taking as long as 6000 milliseconds.

So the options were to increase the time limit for queries, or to spend the time debugging the problem.

Drupal is very feature rich, and this may have negative impact on performance, but in this case, it turned out to be a database issue. The Drupal site has many performance related tips, including a subsection on Tuning MySQL for Drupal.

After looking around in the database for the site, it was discovered that the overhead for the watchdog table was over 40 times the size of its actual data! The table size was 176MiB, of which 172MiB was overhead, leaving only about 4MiB of live data. Running optimize on this table got the size down to under 4MiB and the overhead to 0, and made the queries much faster - way below the 6000 millisecond time limit. The administer and log display pages now render much faster than before.
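As a rough sketch of the arithmetic behind that check: MySQL's SHOW TABLE STATUS reports a Data_length and a Data_free value per table, and the "overhead" figure is assumed here to correspond to Data_free. The helper names and the threshold below are hypothetical, and the sketch only builds the SQL statement, it does not talk to a database.

```python
MIB = 1024 * 1024

def overhead_ratio(data_length, data_free):
    """Ratio of unused (overhead) bytes to live data bytes in a table."""
    live = data_length - data_free
    if live <= 0:
        return float("inf")
    return data_free / float(live)

def optimize_statement(table, data_length, data_free, threshold=1.0):
    """Return an OPTIMIZE TABLE statement if overhead exceeds the threshold.

    threshold is a hypothetical cut-off: 1.0 means "more overhead than data".
    """
    if overhead_ratio(data_length, data_free) > threshold:
        return "OPTIMIZE TABLE `%s`" % table
    return None

# Figures from the watchdog table described above: 176MiB total, 172MiB overhead
print(overhead_ratio(176 * MIB, 172 * MIB))
print(optimize_statement("watchdog", 176 * MIB, 172 * MIB))
```

With the numbers from this site the ratio comes out at 43, which matches the "over 40 times" observation.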

One question remains - why did removing overhead fix the query times?

(click here for the entire post)

strftime in Python

Time has never been easy to work with, and while it is best to use UTC to store time, it is necessary to use local time zones when displaying time to the user.

With regard to displaying local timezones using strftime and the format specifications %Z for timezone name and %z for the RFC-2822 conformant [+-]hhmm display, Perl and PHP work fine, at least on Fedora FC5. Python 2.4 does not yet have support for %z, and the best support in Python for timezones is in the basic time module; the enhanced datetime module has no built-in support for time zones.

That takes care of strftime, what about strptime? Unfortunately, that is a topic that is even more convoluted, so try all variations out before you use it on any system.

Here's a strftime support summary for Python date/time users:

  • Use the datetime module if you don't need any timezone handling. The standard library comes with no support for any timezones, not even local timezones, so the datetime module is useless for those who need time zone information. It can be done, of course, but requires coding up your own timezone routines.
  • Use the time module if you can live with just %Z and don't need %z. The Python 2.4 time module always prints the wrong value +0000 for %z, even while it gets %Z correct.
  • Need %z, the RFC-2822 conformant time display? This requires writing your own code, Python does not support this. Using the basic time module, here's example code on how to get these values:
    import time

    lt = time.localtime(t)  # t: seconds since the epoch, e.g. time.time()
    if lt.tm_isdst > 0 and time.daylight:
        tz = time.tzname[1]
        utc_offset_minutes = - int(time.altzone/60)
    else:
        tz = time.tzname[0]
        utc_offset_minutes = - int(time.timezone/60)
    # Format the sign separately from the absolute hours and minutes.
    sign = "+"
    if utc_offset_minutes < 0:
        sign = "-"
    hours, minutes = divmod(abs(utc_offset_minutes), 60)
    utc_offset_str = "%s%02d%02d" % (sign, hours, minutes)

    Note: in the utc_offset_str computation, the sign is handled separately from the absolute value because integer division of a negative number rounds toward negative infinity instead of 0; for example, a -90 minute offset should be -0130 and not -0230.
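The offset formatting from the last bullet can be tried out in isolation. The sketch below wraps it in a helper; format_utc_offset is a hypothetical name, and the conditional expression assumes Python 2.5 or later rather than the 2.4 discussed here.

```python
def format_utc_offset(offset_minutes):
    """Return an RFC 2822 style [+-]hhmm string for an offset in minutes."""
    # Take the sign first, then split the absolute value, so that negative
    # offsets are not rounded toward negative infinity.
    sign = "-" if offset_minutes < 0 else "+"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return "%s%02d%02d" % (sign, hours, minutes)

print(format_utc_offset(-90))   # -0130
print(format_utc_offset(330))   # +0530
```

The result can be appended to a time.strftime() string in place of %z.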

Here's the sample code and the output from Perl, PHP, and Python, for two example strftime calls:

(click here for the entire post)

Updating Drupal

Updating Drupal using the standard instructions is very time consuming - you have to turn off modules/themes, update settings, and reinstall modules/themes.

But many users find that for updates within the same major release, for example the 4.7.x series, simpler upgrades can work. The official Drupal install instructions do not allow this, and the Drupal folder structure is no help, since user-installed modules/themes are in the same folder as the Drupal files (it would be good to have these separate). There is also no easy way to try out a new release before switching to it on a running site.

Given all that, here's how an update can be done fast. It is very important to read the UPGRADE.txt Drupal document first, do all backups, and be ready to restore quickly if things don't work.
There are no guarantees with this method, but it has been known to work.

Assume that the old Drupal install is in current/ and the new one is in new/

  1. Extract the new Drupal release into the new/ directory.
  2. Copy all files and directories that exist only in your current install (added modules, themes, and so on) into the new/ directory. See script "update.sh" below which does this.
  3. Update the config files in sites/default.
  4. Log in to the current Drupal site as admin.
  5. Backup: rename current/ to current.backup/.
  6. Rename the new/ directory to current/.
  7. Run current/update.php from Drupal and, on success, log off the admin user and test the site.

And here's the update.sh script: edit the $OLD variable, run the script from inside the new/ directory with its output redirected to update.run, review the commands in update.run, and then run them:

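#!/bin/sh
# Run from inside new/. Prints (does not run) the commands needed to copy
# over anything that exists in the old install but not in the new release.
OLD="current/"    # edit: path to the old install as seen from new/

echo "# Current Directory: $(pwd)"

# Read one path per line so that names containing spaces survive.
(cd "$OLD" && find . -print) | while IFS= read -r i
do
    if [ ! -e "$i" ]
    then
        if [ -d "$OLD/$i" ]
        then
            echo "mkdir -p -v \"$i\""
        fi
        if [ -f "$OLD/$i" ]
        then
            # -n: never overwrite a file that already exists in new/
            echo "cp -p -v -n \"$OLD/$i\" \"$(dirname "$i")\""
        fi
    fi
done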