
Jim Carter's Bugfixes

James F. Carter, last updated 2009-06-29

This is a kind of blog-oid of minor (and not-so-minor) bugfixes that I've found, posted where search engines can find them.

Windows Vista Service Pack Won't Install

2009-06-29
Symptom:

Microsoft Windows Vista Service Pack 2 is downloaded. It goes through the whole installation procedure without error, reboots, gets to 100% of Stage 3 of 3, and then announces "Service Pack did not install. Reverting changes." On the next login it exudes a dialog box saying "Installation was not successful. Unspecified error. Details: Error: E_FAIL(0x80004005)."

A forum poster reports the same problem with SP1, and I wouldn't be too surprised to find the same problem with XP service packs.

What's going on:

In this forum posting look for Sekim's contribution. He refers to KB971204, which suggests: look in %windir%\Logs\CBS\CBS.log; you may see that C:\Windows\bfsvc.exe reports "Failed to get system partition! Last Error = 0x3bc3". If third party disk management tools are used to "clone" discs or partitions, the SP2 installer will be unable to uniquely identify the correct system boot files. (This is my situation.)
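The log check can be scripted. Here is a minimal sketch in shell, assuming that CBS.log has been copied somewhere readable or that the Windows partition is mounted from Linux. The function name and the example path are hypothetical; the search string is the one quoted above.

```shell
#!/bin/bash
# Hypothetical helper: report whether a CBS.log shows the symptom quoted above.
# Prints each matching line with its line number; exit status 0 means found.
cbs_partition_error() {
    local log="$1"
    grep -n -F 'Failed to get system partition' "$log"
}

# Example (the mount point is an assumption):
#   cbs_partition_error /mnt/vista/Windows/Logs/CBS/CBS.log
```

If nothing is printed, the failure probably has a different cause from the one described here.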

In other words, the service pack replaces the Windows kernel and the bootloader (equivalent of grub), and it will need to fill the block numbers into the boot sector of the partition containing the bootloader. If it cannot identify the partition it will be unable to do this last critical step of installation. It should check that it can identify the partition before it starts, but it doesn't.

How to fix:

Make the possibly confusing drives disappear (impossible for me) or do a clean installation. Various successful workarounds mentioned in forum postings include:

DNS Query Denied

2008-03-20
Symptom:

The DNS server for a domain is upgraded and starts logging errors similar to this:

Mar 20 13:06:11 dns1.example.com named[3357]: client 128.97.12.3#53: query (cache) 'other.example.net/A/IN' denied

Workstations which rely on this server as a forwarder, particularly Windows boxes, are not getting recursive DNS answers from it. The users, unable to view their favorite websites, are gathering outside your door with pitchforks.

What's going on:

In the transition from BIND 9.3.2 to 9.4.1.P1 (the version I upgraded to) a default changed. Formerly the default for all maps was to allow any site to query them. Now any map for which you are authoritative may still be queried by anyone, but recursive (forwarded) queries are accepted only from the subnets that the server is directly connected to. If the client is on another subnet its recursive query will be refused, while the same query from the server's own subnet will be executed.

How to fix:

Following instructions in the BIND administrator's manual, I added explicit allow-query statements to my named.conf file. Here is a shortened version from our master server; slaves are essentially identical except the map stanzas are for slaves. Be careful to include semicolons at the end of everything.

// This conveniently defines our local network (simplified).
// Localhost must be listed explicitly.
acl ournets { 
        128.97.4.0/24; 
        128.97.12.0/24; 
        127.0.0.1;
};
options {
        forwarders {
                164.67.128.1;
        };
        // This option governs recursive queries and 
        // zones lacking "allow-query any".
        allow-query { ournets; };
        allow-transfer {
                128.97.0.0/16;
                164.67.0.0/16;
        };
        // other options omitted
};
// DNS maps for which we are authoritative.  Reverse domain
// is not shown.
zone "math.ucla.edu" {
        type master;
        file "master/math.zone";
        //We want the outside world to be able to resolve our names.
        allow-query { any; };
};
// We don't want the outside world seeing our Windows Active 
// Directory, so this stanza lacks "allow-query any".  Numerous
// other A.D. zones are not shown, nor localhost and hints.
zone "_tcp.math.ucla.edu" {
        type master;
        file "master/math.tcp";
};
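
To decide which subnets belong in the ournets acl, it helps to see which clients are being refused. Here is a minimal sketch in shell that tallies them from log lines in the format shown in the symptom. The function name and the log path in the example are my assumptions.

```shell
#!/bin/bash
# Hypothetical helper: read syslog lines on stdin and print
# "count address" for each client whose query was denied,
# most frequent client first.
denied_clients() {
    sed -n 's/.* client \([0-9.]*\)#[0-9]*: query .* denied$/\1/p' |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example (the log location is an assumption):
#   denied_clients < /var/log/messages
```

Any address in the output that belongs to one of your own workstations is on a subnet missing from the acl.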

    

Wi-fi Won't Connect to New Network

2007-09-27
Symptom:

My laptop is set up with wpa_supplicant and dhclient to handle dynamic network association. /etc/wpa_supplicant.conf includes network blocks for the four nets I use, and ap_scan=1 (wpa_supplicant initiates scan and picks the access point). I start it up on one of the nets. Later I migrate to another, e.g. work to home. But I have no connectivity; the interface is fixated on the previous net and won't switch over to the new one.

What's going on:

Duh: being out of range of the previous net gives it no clue that it's in range of another one. Admittedly it should be a little smarter, but I'm just using the infrastructure; I didn't write the code.

How to fix:

Step 1: wpa_cli reassociate
This locates and associates with an access point on the new net, and loads the appropriate encryption key. /etc/wpa_supplicant.conf has a setting for ctrl_interface_group, and any user in that group can issue this command.

Step 2: You need to kick dhclient to make it obtain an IP address on the new network.
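
The two steps can be combined in one script. This is a sketch under assumptions, not a tested recipe: the default interface name wlan0 is hypothetical, and releasing the lease with dhclient -r and then starting dhclient again is just one way to kick it. The DRY_RUN variable is an invention of this script that prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch: move the laptop onto whichever configured network is now in range.
# The interface default and the dhclient invocation are assumptions.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

roam() {
    local iface="${1:-wlan0}"
    # Step 1: have wpa_supplicant rescan and associate with the new net.
    run wpa_cli -i "$iface" reassociate || return 1
    # Step 2: release the old lease, then request a new one (needs root).
    run dhclient -r "$iface"
    run dhclient "$iface"
}
```

Association is not instantaneous, so in practice a short pause between the two steps, or a look at the output of wpa_cli status, may be needed before dhclient can get an answer.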

The security on dhclient is a bit worrisome: according to the default /etc/dhclient.conf it listens for OMAPI commands on port 7911 on every network interface, with no authentication key. (Authentication may be planned for a future version.) Your firewall should block such connections except from localhost. That is sufficient in the likely case, on a personal laptop, that every user on the machine is authorized to affect dhclient.

Fly in ointment: sometimes but not always, wpa_supplicant gets into a loop in which it scans, seems to associate (packets can be sent), but fails to finish the association, and retries. I don't know whether this is a problem with wpa_supplicant or with the Intel ipw3945 wireless chip's driver or firmware -- I wouldn't be surprised if it were the latter.

Xine-Based Multimedia Players Crash on Initialization

2007-09-18
Symptom:

I tried Xine itself, Amarok, Kaffeine and Xfmedia. All of them crashed (with useless error messages, if any) either immediately on startup or when starting to play an audio or video file.

What's going on:

I have an ATI Radeon Mobility X1400 video card in my laptop, and I'm using the fglrx (FireGL) proprietary driver. It turns out that the crash occurs when the Xine library tries to initialize a graphics window -- in the audio case it's going to show sound-derived random artwork, or for Xfmedia the initialization takes place before it realizes that it is not going to show a video. According to the xine-bugreport script that comes with Xine, the XVideo extension on a number of X-Windows drivers is broken and causes this crash.

How to fix:

Xine-lib uses the xv video driver by default. Make it use a different one. Assuming that you have the xine-ui package installed (i.e. the player GUI), do xine --help | more and see the list of available drivers given with the help for the -V switch. The xine-bugreport script suggests using xshm, which is CPU-intensive but works in most contexts. I just tried all the drivers, and I found that opengl also works on my Radeon, so that's the one I use.

Use the players' --help feature to discover the command line switch to force a video driver -- it's -V for xine. For those players which crash on startup, that will get you a GUI. From there you need to use the settings or preferences dialog to make a permanent change in the video driver.
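
Until the permanent setting is changed, a small wrapper saves retyping the switch. This is a minimal sketch for xine itself. XINE_VIDEO_DRIVER and XINE_BIN are hypothetical environment variables that only this wrapper understands, and xshm is the default because it is the driver that xine-bugreport suggests.

```shell
#!/bin/bash
# Sketch: start xine with an explicit video driver instead of the default xv.
# XINE_VIDEO_DRIVER and XINE_BIN are inventions of this wrapper, not of xine.
xine_safe() {
    local driver="${XINE_VIDEO_DRIVER:-xshm}"
    # XINE_BIN lets the real command be swapped out, e.g. for testing.
    "${XINE_BIN:-xine}" -V "$driver" "$@"
}

# Example:
#   XINE_VIDEO_DRIVER=opengl xine_safe somefile.avi
```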

Since crashes are sometimes exploitable to run arbitrary code as the user executing xine-lib, I've filed a bug report with Xine on this issue.

GStreamer's xvimagesink does not seem to have any problems with the XVideo extension in fglrx.

Black Screen on Logout

2007-09-18
Symptom:

You log out from your desktop environment and the screen goes black. Ctrl-alt-F1 does not make a text console visible -- or if done from within the session, it may induce the black screen itself. But the machine is still on the net, and ctrl-alt-del triggers a normal shutdown, i.e. the keyboard is not frozen.

What's going on:

I did a search on Google for 'xfce4 ATI OR fglrx OR FireGL "black screen"', getting 10800 hits. The symptoms described are rather varied. In my specific situation the desktop environment is Xfce4 (Gnome also gave trouble for other people); the graphics card is an ATI Radeon Mobility X1400 with the fglrx (FireGL) driver (other people report similar symptoms with various nVidia or Intel 950 or SiS onboard video); and I am not using the Composite extension (others say that using it is beneficial for this symptom).

It's not clear what's going on, but it's clearly related to the black screen that bedevils many people when suspending their machines to RAM or disc. In my case, when I upgraded Xfce4 from version 4.2 to 4.4, the machine started doing the black screen thing and also could no longer suspend, i.e. it stopped tasks and tried to shut down drivers, but failed and emerged from suspend mode. The video driver is definitely at fault, but something in the window manager (etc.) makes it choke. A number of users speculate that 3D acceleration (DRI) is the culprit, since users who introduce the Beryl and Compiz window managers often report this failure.

How to fix:

Automatic Volume Mounting Without KDE or Gnome

2007-01-22
Symptom:

On Windows if you stick a disc in the CD drive, or plug in an external disc drive or flash memory stick to USB or Firewire, something useful will happen such as mounting the media or playing a music disc or video. Modern installations of Gnome or KDE do the same thing, if you're lucky. But suppose you're an old troglodyte who uses a non-bloatware desktop manager such as fvwm? Nothing happens.

What's going on:

There has been a lot of work in the past few years on co-opting what Windows does right. In particular, there's a new form of inter-process communication called the dbus. Each user session needs one of these (as well as a system-wide dbus). Gnome and KDE have a default configuration that starts it up, but a troglodyte needs to take care of this by hand, as well as starting gnome-volume-manager.

The system-wide haldaemon taps into the kernel feed of hotplug messages and rebroadcasts them on the dbus. gnome-volume-manager listens on the session dbus for these messages. If suitably configured (and the default configuration does this), it sends messages back to haldaemon asking it to mount newly appeared removable media, or it may launch a media player for music or video discs, or various other actions.

When you press the eject button on the CD drive, an inverse set of messages is sent for unmounting the media, after which the tray physically pops out (if unmounting succeeded).

How to fix:

By now there are quite a number of pages in various places on the web describing how to start up gnome-volume-manager. Here's my contribution.

If not already done, you need to install the packages hal, dbus-1 and gnome-volume-manager. These are the names on my distro, SuSE 10.1; it's possible that the names differ slightly on other distros, e.g. dbus1 versus dbus-1.

I'm assuming that your boot scripts invoke haldaemon successfully. I've been successful invoking the other two daemons like this. I've put this code in /etc/X11/xdm/Xsession conditional on the window manager not being Gnome or KDE, but most people will probably put this invocation in ~/.xsession.

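# Start a session bus; the array gets its address, then its process ID.
dbus=($(dbus-daemon --session --fork --print-pid=1 --print-address=1))
export DBUS_SESSION_BUS_ADDRESS="${dbus[0]}"
export DBUS_SESSION_BUS_PID="${dbus[1]}"
# Start the volume manager as a daemon, without session management.
gnome-volume-manager --daemon=yes --sm-disable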
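
The condition on the window manager might be sketched like this. The WINDOWMANAGER variable and the name patterns are assumptions about how the session script learns which window manager was chosen, so adjust them to your distro.

```shell
#!/bin/bash
# Succeeds when dbus and gnome-volume-manager must be started by hand,
# i.e. when the chosen window manager is neither Gnome nor KDE.
# The name patterns are guesses, not taken from any distro's scripts.
needs_manual_dbus() {
    local wm="${1##*/}"          # strip any leading directory
    case "$wm" in
        gnome*|kde*|startkde) return 1 ;;
        *) return 0 ;;
    esac
}

# In the session script (WINDOWMANAGER is an assumption):
#   if needs_manual_dbus "$WINDOWMANAGER"; then
#       ...start dbus-daemon and gnome-volume-manager here...
#   fi
```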

You can configure your instance of gnome-volume-manager by running gnome-volume-properties.

To eject a CD, press the tray open button on the drive. But if some process has the drive open or (for a data CD) has its current directory within the media, it will be busy and the unmount will fail. Stop a music player or cd off the disc, and then it can be unmounted.

However, I haven't yet figured out how to correctly unmount a USB storage device, except by doing it as root. The ordinary user does not have permission to do it. I'll bet that Gnome or KDE generate a fancy dbus message that Hal will honor. If I figure it out, I'll post the method.

Fails to Mount Media with VFAT

2007-01-22
Symptom:

You've gotten gnome-volume-manager set up and it automatically mounts a data CD when inserted. However, it doesn't mount a USB flash memory stick or a digital camera that behaves as USB mass storage.

What's going on:

Haldaemon, which runs as root, actually mounts the media upon a request from gnome-volume-manager which it receives over dbus. It is very paranoid, to avoid being tricked into doing evil things as root. In particular, its storage mounting script (/usr/lib/hal/scripts/hal-system-storage-mount on my distro, SuSE 10.1) allows only alphanumeric characters, underbar, equal sign and whitespace in the mount options. When a VFAT filesystem is mounted, Hal provides an option of iocharset=iso8859-1 (which appears to be the default, but Hal specifies it anyway). This is sanitized into iso8859_1, and of course the corresponding kernel module does not exist, causing the mount to fail.

How to fix:

Edit the mounting script, or extract and apply the following patch, to allow hyphens in mount options. Note the hyphen added to the character class as the first byte after '^'.

--- /usr/lib/hal/scripts/hal-system-storage-mount.orig	2006-07-19 18:23:41.000000000 -0700
+++ /usr/lib/hal/scripts/hal-system-storage-mount	2007-01-22 22:00:51.000000000 -0800
@@ -82,7 +82,7 @@
 read GIVEN_MOUNTTYPE
 GIVEN_MOUNTTYPE=${GIVEN_MOUNTTYPE//[^a-zA-Z0-9_=]/_}
 read GIVEN_MOUNTOPTIONS
-GIVEN_MOUNTOPTIONS=${GIVEN_MOUNTOPTIONS//[^a-zA-Z0-9_=[:space:]]/_}
+GIVEN_MOUNTOPTIONS=${GIVEN_MOUNTOPTIONS//[^-a-zA-Z0-9_=[:space:]]/_}
 
 # deny to handle devices listed in fstab, unless the "user" option is given
 # allow only use of specified mountpoint and fail for a different one

This bug has been reported to SuSE (https://bugzilla.novell.com) and has been assigned bug number 237670.
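
The effect of the one-character change can be seen in isolation. This small bash demonstration applies the two character classes from the patch to the option that Hal supplies. The function names are mine; the substitutions are copied from the script.

```shell
#!/bin/bash
# Original filter: the hyphen is not in the allowed set, so it becomes "_".
sanitize_orig() {
    local opts="$1"
    printf '%s\n' "${opts//[^a-zA-Z0-9_=[:space:]]/_}"
}

# Patched filter: a "-" right after "^" is taken literally, so it survives.
sanitize_patched() {
    local opts="$1"
    printf '%s\n' "${opts//[^-a-zA-Z0-9_=[:space:]]/_}"
}

sanitize_orig    'iocharset=iso8859-1'   # prints iocharset=iso8859_1
sanitize_patched 'iocharset=iso8859-1'   # prints iocharset=iso8859-1
```

Other punctuation, such as a semicolon, is still replaced by an underbar in both versions, so the patch does not loosen the filter beyond the hyphen.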

Music Player Won't Play Audio CDs

2007-01-22
Symptom:

Launch xmms and ask it to play an audio CD. It sits around for 30 seconds or so, exudes "message: alsa mixer timed out", and things go downhill from there depending on how many buttons you've clicked in the meantime. In any case, no sound comes out. In help requests in quite a number of forums I've seen similar behavior reported with other music player frameworks.

What's going on:

Either the default setting or the setting left over from previous configuration attempts specifies analog playback from the CD drive; in other words, there's supposed to be a wire that goes from the drive to the sound card, either from the drive's own headphone jack or an internal cable. The ALSA library is also supposed to be able to find a mixer that can control the playback volume. While older CD drives have all this, it's vaporware on the drive in my laptop. The ALSA library seems to be not too bright about recognizing the lack of a mixer.

"message: device default" just means that you've configured the ALSA output module to use its default device; it would say "message: device hw0,0" if you selected that device in the configuration dialog. As long as the device exists, the message is of no consequence.

How to fix:

To Auto-Play Audio CD on Insertion

2007-01-22
Symptom:

You've set up dbus and gnome-volume-manager, it mounts data CDs OK, but you insert an audio CD, and nothing happens.

What's going on:

Did you configure gnome-volume-manager with a useable action upon finding an audio CD?

You can sometimes elicit useful information by starting a separate xterm, killing the normal gnome-volume-manager (if any), and executing gnome-volume-manager -n. The -n switch tells it not to fork as a daemon. It will echo the dbus chatter it receives and the commands it tries to execute. If a command is not working you may notice a useful error message.

How to fix: