Talk:How to hotswap Ultrabay devices

From ThinkWiki

I recently tried the libata-tj patch tarball for 2.6.16.16, applying it to the newly released 2.6.16.18 kernel (released today). The patch applied cleanly. Upon boot, I immediately got a multitude of "weird" errors: strange lockups, programs segfaulting (running "top" resulted in a seg fault), and ultimately a hard lockup.

I booted back to my vanilla 2.6.16.16, ran fsck (appeared to just replay a few transactions, no major damage), and am back to normal. However, it successfully scared me off - unfortunately can't risk too much downtime (or worse, subtle fs corruption) right now on my main system. Anybody have experiences with this on a T43p using piix driver?

--gsmenden 00:00, 23 May 2006 (EST)


The 2.6.16.16 patch works fine on my T43. There's a git tree (mentioned on the patch's webpage) which is closer to 2.6.18, but AFAIK no simple unified patch was prepared.

--Thinker 08:37, 23 May 2006 (CEST)


Cool. If I get brave I'll try it again on the 43p against 2.6.16.16 proper and report back.

--gsmenden 15:29, 23 May 2006 (EST)


Works fine here on 2.6.16. I got only one crash with Suspend to RAM, which I have been unable to reproduce so far. I renamed the acpi event files because at least my acpid doesn't read files that end with .conf

--Defiant 21:09, 28 May 2006 (CEST)
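
For anyone hitting the same acpid behaviour, the rename could be scripted roughly as below. This is only a sketch: the function name strip_conf_suffix is made up for illustration, and the directory is passed in rather than hard-coded because the location of the event files may differ between distributions.

```shell
# Strip a trailing ".conf" from every file in an acpid events directory.
# Hypothetical helper; usage: strip_conf_suffix /etc/acpi/events
strip_conf_suffix() {
    dir=$1
    for f in "$dir"/*.conf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "${f%.conf}"      # drop the suffix, keep the rest of the name
    done
}
```

acpid presumably has to be restarted afterwards so that it re-reads the event files.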


Update - patched against 2.6.16.19, works fine. It appears my previous problems were due to a disk error unrelated to the patch. Excellent!

--gsmenden 00:57, 31 May 2006 (EST)

Anybody have time to make a patch of the libata(-tj) .git tree against the recently released 2.6.17? I hope to make one in the future if not...

--gsmenden 22:08, 19 Jun 2006 (EST)

one nit about ultrabay_close script / patch against 2.6.17 available

Howdy,

In ultrabay_close, there is 'sleep 3' for disk spinup, which isn't necessary. libata itself waits for disk spinup and if something breaks (e.g. first reset fails w/ timeout or something), it's libata's fault. Please remove that line and see if anything breaks.

Also, I've uploaded patch against 2.6.17/2.6.17.1 today.

http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.17-20060625-1.tar.bz2

Hmmm... My post looks different from others. This wasn't intentional. Just don't know how to add normal discussion entry. Sorry.

--tj


Right, it works fine without "sleep 3" using the new patches. Sleep removed.

--Thinker 12:35, 1 July 2006 (CEST)


Is it correct that the ata_piix driver in kernel 2.6.18 RC4 now supports hot swapping as described in the howto and announced here: http://lwn.net/Articles/183734/ ?

--cob 15:53, 23 August 2006

T42 freezing up when trying to hot swap ultrabay.

Hi,

Please bear with me. I am totally new at this and I am making my best effort to understand and learn.

My problem is that when typing "# echo eject > /proc/acpi/ibm/bay" to eject my ultrabay and put another in, I see the power going off in the ultrabay LED, but then my PC freezes completely.

I am running Fedora 6 Test 3, kernel 2.6.17-1.2647 and my notebook is a ThinkPad T42.

Please help! I have to constantly be changing my bay to use information in other hard drives, and I have to shutdown the system completely to not have any problems.

Thanks,

--Barny 09/21/2006@7:46PM EST


I have the same problem on a T40p running SuSE 10.1. The lt_hotswap module is of no help either. Keep me informed in case you have a solution! Thanks, --Ays 19:49, 5 October 2006 (CEST)


I have no problems with kernel 2.6.17-1.2187_1.fc5.cu from suspend2 on my T42p running Fedora Core 5. I have compiled the lt_hotswap module and everything works fine. Since kernel 2.6.18-1.2200.fc5 my system freezes on loading the module or on calling "echo eject > /proc/acpi/ibm/bay". Any ideas what has changed in the kernel?

--CoolMischa 2006-11-06@13:24 CET

Second disk not seen correctly on reinsert (T43p) [solved]

(update: see below for solution)

I have followed the instructions on my T43p running Gentoo using 2.6.18. I have a second hard disk in the UltraBay, using ata_piix, so it is seen as /dev/sdb (as described in Problems with SATA and Linux). The eject works fine. When I reinsert it and issue the rescan command, only the main /dev/sdb device reappears, but not the ones corresponding to the partitions (/dev/sdb1, etc.), so I cannot mount them, and fdisk /dev/sdb says that it cannot open the device.

In dmesg, I see a bunch of errors like these, repeated multiple times:

sd 1:0:0:0: SCSI error: return code = 0x08000002
sdb: Current: sense key=0xb
    ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sdb, sector 0
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)

And at the end:

sdb: Current: sense key=0xb
    ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sdb, sector 0
ata2: EH complete
SCSI device sdb: 117210240 512-byte hdwr sectors (60012 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 117210240 512-byte hdwr sectors (60012 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back

The situation is not cured by a reboot (I still see only /dev/sdb), I have to power cycle to get the devices back.

Thanks for any ideas.


(2006-10-10) As a followup to my note above, I have noticed that the DVD-RW drive works perfectly after hot-swapping it; it's just the second hard disk that does not get recognized properly. I can "scsiping" the /dev/sdb device and it seems to respond OK. I have tried restarting udevd without success, and I'm at a loss as to what to try next.

---

It turned out to be an obvious problem - I had a disk password set on my second disk, so on reinsert it could not be accessed. I turned off the disk password, and now it works perfectly.
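
A quick way to check for this situation might be to look at the Security section that "hdparm -I" prints. The sketch below assumes that section lists "locked" on a line of its own when the drive is locked and "not locked" otherwise; the function name drive_is_locked is made up, and it reads the hdparm output on stdin so it can be tried on saved output as well.

```shell
# Read `hdparm -I` output on stdin and succeed if the drive reports "locked".
# Assumption: inside the "Security:" section the state is printed either as
# "locked" or as "not locked" (last word on the line).
drive_is_locked() {
    awk '
        /^Security:/              { in_sec = 1; next }
        in_sec && $NF == "locked" { locked = ($1 != "not") }
        END                       { exit !locked }
    '
}
```

Usage would be something like: hdparm -I /dev/sdb | drive_is_locked && echo "disk is locked by a password"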

ultrabay_open: Problem when using /proc/mounts

I am just working on a perl-free version of the ultrabay_open script. When the script reads the currently mounted devices from /proc/mounts, it may not find all the relevant device files. A file system mounted with a relative device path given to the mount command doesn't show up with the absolute device path in /proc/mounts. Example:

# cd /dev/; mount sdb5 /mnt results in the following line in /proc/mounts:

sdb5 /mnt ext3 rw 0 0

/etc/mtab contains the needed information:

/dev/sdb5 /mnt ext3 rw 0 0

However, /proc/mounts is the more reliable source of information IMHO. The absolute device path is needed to find out its major and minor numbers.

Any suggestions?

--MinioN 01:03, 28 December 2006 (CET)
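
One possible perl-free approach, on kernels that provide /proc/self/mountinfo (2.6.26 and later, so newer than the kernels discussed here): that file lists the major:minor pair of every mount directly, so the device path given to mount no longer matters. A minimal sketch, with a made-up function name and the file passed as an argument so it can be tested against a copy:

```shell
# Print the mount points backed by the given major:minor pairs, longest path
# first, so that nested mounts get unmounted before their parents.
# Hypothetical helper; usage: mounts_for_rdev /proc/self/mountinfo 8:21 8:22
mounts_for_rdev() {
    mountinfo=$1
    shift
    for mm in "$@"; do
        # mountinfo fields: 3 = major:minor, 5 = mount point
        awk -v mm="$mm" '$3 == mm { print length($5), $5 }' "$mountinfo"
    done | sort -rn | cut -d' ' -f2-
}
```

The output could then be fed to umount one line at a time. Mount points containing spaces are escaped in mountinfo (as \040) and would need extra handling that this sketch leaves out.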

Can't mount CD after reinsert on X41

I did all the steps described on this page, and the drive ejects fine. When I reinsert it, the /dev/scd0 entry reappears, but when I insert a CD, Gnome won't mount it automatically, and when I try manually I get this message:

   mount: wrong fs type, bad option, bad superblock on /dev/scd0,
   missing codepage or other error

and dmesg says:

   isofs_fill_super: bread failed, dev=sr0, iso_blknum=16, block=16

I have to reboot to use the drive again.

P.S. I discovered the following in dmesg when I boot:

   ata2.01: qc timeout (cmd 0xa1)
   ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
   ata2: failed to recover some devices, retrying in 5 secs
   ata2.01: qc timeout (cmd 0xa1)
   ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
   ata2: failed to recover some devices, retrying in 5 secs
   ata2.01: qc timeout (cmd 0xa1)
   ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
   ata2: failed to recover some devices, retrying in 5 secs
   ata2.00: configured for UDMA/33

Is that relevant?

hdparm -Y /dev/<devnode>

is the hdparm part in the ultrabay_eject script really necessary?

It does not work with my dvdram drive (R60):

hdparm -Y /dev/sdb

/dev/sdb:
issuing sleep command
HDIO_DRIVE_CMD(sleep) failed: Input/output error

thanks,
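
One way to avoid the failing call might be to issue the sleep command only when the bay actually holds a hard disk. The sketch below assumes the SCSI device's sysfs directory has a "type" attribute holding the SCSI peripheral type (0 for a disk, 5 for a CD/DVD drive); the function name bay_holds_disk is made up.

```shell
# Succeed if the SCSI device behind the given sysfs directory is a disk
# (peripheral type 0). CD/DVD drives report type 5.
# Hypothetical helper; usage: bay_holds_disk /sys/class/scsi_device/1:0:0:0/device
bay_holds_disk() {
    [ -r "$1/type" ] && [ "$(cat "$1/type")" = "0" ]
}

# In the eject script the spin-down step could then be guarded like this:
#   bay_holds_disk "$ULTRABAY_SYSDIR" && hdparm -Y "$DEVNODE"
```

With an optical drive in the bay the hdparm step is simply skipped and the script carries on with unregistering the device.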

problem with unmount_rdev

I tried out the ultrabay_eject script and get this error (debian lenny with kernel 2.6.22.3)

cat: /sys/class/scsi_device/1:0:0:0/device/block:*/*/dev: No such file or directory

What is wrong? Why do I need the output of $ULTRABAY_SYSDIR/block\:*/*/dev in

unmount_rdev `cat $ULTRABAY_SYSDIR/block\:*/dev    \
$ULTRABAY_SYSDIR/block\:*/*/dev`  \

?
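
As far as I can tell, the first pattern picks up the dev file (major:minor) of the whole disk and the second one the dev files of its partitions, so that every mounted partition gets unmounted. When a pattern matches nothing (no partitions, or a kernel that uses the block/ directory layout seen in the modified script at the bottom of this page instead of block:*), cat is handed the literal pattern and complains. A sketch of a helper that tolerates both layouts and skips patterns without a match; the name bay_dev_numbers is made up:

```shell
# Print the major:minor numbers of the bay device and its partitions,
# trying both the older "block:NAME" and the newer "block/NAME" sysfs layouts.
# Hypothetical helper; usage: bay_dev_numbers /sys/class/scsi_device/1:0:0:0/device
bay_dev_numbers() {
    sysdir=$1
    for f in "$sysdir"/block:*/dev "$sysdir"/block:*/*/dev \
             "$sysdir"/block/*/dev "$sysdir"/block/*/*/dev; do
        # An unmatched pattern stays a literal string; skip it.
        [ -f "$f" ] && cat "$f"
    done
    return 0
}
```

The script's call would then become something like: unmount_rdev `bay_dev_numbers $ULTRABAY_SYSDIR`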

hotswap with kernel 2.6.24 and archlinux

Since I changed to kernel 2.6.24 on my archlinux, hotswap isn't working anymore. When I try to swap between the hard disk and the DVD drive, the system freezes and the only way to get it back to work is to turn the power off. Has anyone else had the same problem? And is there a way to fix it?

---

Yes, I stumbled over this annoying regression today (6 Apr.): on my T42p 2373-KYG with Kubuntu 8.04 Hardy Heron, Kubuntu freezes totally after merely removing the DVD drive from the UltraBay slot. This happens on a T42p upgraded from Gutsy and on another T42p with a clean and fresh Hardy install. (/Lophiomys)

---

I think I may know what is causing the issue: Disable ACPI support on libata, or switch to thinkpad-acpi [deprecated] bay support. The bay module is not able to load on 2.6.24+ if libata is compiled with ACPI support.

--hmh 06:30, 7 May 2008 (CEST)

Could you help out with instructions for an interim workaround, i.e. how to disable ACPI support in libata or switch to thinkpad-acpi? Thanks.

--lophiomys 19:13, 7 May 2008 (CEST)
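
As a first step it may help to check whether the running kernel was built with libata ACPI support at all. The sketch below assumes that the option hmh refers to is CONFIG_ATA_ACPI and that the distribution installs the kernel configuration as a plain-text file; both are assumptions, and the function name ata_acpi_enabled is made up.

```shell
# Succeed if the given kernel config file has CONFIG_ATA_ACPI built in.
# Assumption: CONFIG_ATA_ACPI is the option meant by "ACPI support on libata".
# Hypothetical helper; usage: ata_acpi_enabled "/boot/config-$(uname -r)"
ata_acpi_enabled() {
    grep -q '^CONFIG_ATA_ACPI=y' "$1"
}
```

If the check succeeds, disabling the support would presumably mean rebuilding the kernel with that option turned off.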

Ultrabay Hotswap on ThinkPad X41

After several hours of debugging and trial and error, I managed to get hotswapping of ultrabay devices working in Fedora 9. The kernels I tested were 2.6.25.11 and 2.6.25.14 from fedora updates. The bay.ko driver was not present in the kernel, so I just used the ibm-acpi driver (works fine). You need to create only 4 files (/usr/sbin/ultrabay_insert; /usr/sbin/ultrabay_eject; /etc/acpi/events/ultrabay-insert and /etc/acpi/events/ultrabay-eject), available from the link here: [1]. NOTE!! - DO NOT use the ultrabay_eject script from the link, as it won't work. I have a modified version that works below. I have tested swapping a dvdrw with an hdd and it works just fine. Make sure you restart acpid after adding the scripts. Hope this works for others.

/usr/sbin/ultrabay_eject:

#!/bin/bash
ULTRABAY_SYSDIR='/sys/class/scsi_device/1:0:0:0/device'
shopt -s nullglob

# Umount the filesystem(s) backed by the given major:minor device(s)
unmount_rdev() { perl - "$@" <<'EOPERL'  # let's do it in Perl
	for $major_minor (@ARGV) {
		$major_minor =~ m/^(\d+):(\d+)$/ or die;
		push(@tgt_rdevs, ($1<<8)|$2);
	}
	# Sort by reverse length of mount point, to unmount sub-directories first
	open MOUNTS,"</proc/mounts" or die "$!";
	@mounts=sort { length($b->[1]) <=> length($a->[1]) } map { [ split ] } <MOUNTS>;
	close MOUNTS;
	foreach $m (@mounts) {
		($dev,$dir)=@$m;
		next unless -b $dev;  $rdev=(stat($dev))[6];
		next unless grep($_==$rdev, @tgt_rdevs);
		system("umount","-v","$dir")==0  or  $bad=1;
	}
	exit 1 if $bad;
EOPERL
}

# Get the UltraBay's /dev/foo block device node
ultrabay_dev_node() {
	UDEV_PATH="`readlink -e "$ULTRABAY_SYSDIR/block/"*`" || return 1
	UDEV_NAME="`udevinfo -q name -p $UDEV_PATH`" || return 1
	echo /dev/$UDEV_NAME
}

if [ -d $ULTRABAY_SYSDIR ]; then
	sync
	# Unmount filesystems backed by this device
	unmount_rdev `cat $ULTRABAY_SYSDIR/block/*/dev     \
	                  $ULTRABAY_SYSDIR/block/*/*/dev`  \
	|| {
		echo 10 > /proc/acpi/ibm/beep;  # error tone
		exit 1;
	}
	sync
	# Nicely power off the device
	DEVNODE=`ultrabay_dev_node` && hdparm -Y $DEVNODE
	# Let HAL+KDE notice the unmount and let the disk spin down
	sleep 0.5
	# Unregister this SCSI device:
	sync
	echo 1 > $ULTRABAY_SYSDIR/delete
fi

# We really need a 3 sec pause here otherwise the system will freeze..
sleep 3

# Turn off power to the UltraBay:
if [ -d /sys/devices/platform/bay.0 ]; then
	echo 1 > /sys/devices/platform/bay.0/eject
elif [ -e /proc/acpi/ibm/bay ]; then
	echo eject > /proc/acpi/ibm/bay
fi
# Tell the user we're OK
echo 12 > /proc/acpi/ibm/beep

zin 26 August 2008