6 Posts

August 6th, 2007 14:00

SCSI issue on a Dell 1500SC (Solved)

Hi all, I am experiencing a strange problem with my 1500SC. It seems as if my HD goes offline after a couple of minutes and does not come back online. It all started a couple of days ago: I installed Linux (SME Server) in a RAID1 configuration with 2 HDs. After 3 days one of the HDs gave me error messages. I took out the HD and re-installed the OS in a non-RAID configuration, but now that HD goes offline after just a couple of minutes. Also, sometimes the HD and the backplane are not found during boot. I do not have a RAID controller in this server. At this point I'm not sure if the problem is the HD or the backplane. Here are the last entries from /var/log/messages:

Aug 5 18:01:16 office kernel: (scsi0:A:1:0): No or incomplete CDB sent to device.
Aug 5 18:01:16 office kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 5 18:01:16 office kernel: (scsi0:A:1:0): No or incomplete CDB sent to device.
Aug 5 18:01:16 office kernel: scsi0: Issued Channel A Bus Reset. 2 SCBs aborted
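
If the log is long, it can help to tally these messages per device to see whether the resets cluster on one target. The following is only a minimal Python sketch, assuming the message wording shown above; the function name `count_scsi_errors` and the sample text are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Device tag in messages such as
# "(scsi0:A:1:0): No or incomplete CDB sent to device."
CDB_ERROR = re.compile(
    r"\((scsi\d+:[A-Z]:\d+:\d+)\): No or incomplete CDB sent to device"
)
# Host, channel and abort count in messages such as
# "scsi0: Issued Channel A Bus Reset. 2 SCBs aborted"
BUS_RESET = re.compile(
    r"(scsi\d+): Issued Channel ([A-Z]) Bus Reset\. (\d+) SCBs aborted"
)


def count_scsi_errors(log_text):
    """Return (CDB errors per device, bus resets per channel, total SCBs aborted)."""
    cdb_errors = Counter(CDB_ERROR.findall(log_text))
    resets = Counter()
    aborted = 0
    for host, channel, scbs in BUS_RESET.findall(log_text):
        resets[f"{host}:{channel}"] += 1
        aborted += int(scbs)
    return cdb_errors, resets, aborted


# Sample taken from the /var/log/messages excerpt above.
sample = """\
Aug 5 18:01:16 office kernel: (scsi0:A:1:0): No or incomplete CDB sent to device.
Aug 5 18:01:16 office kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 5 18:01:16 office kernel: (scsi0:A:1:0): No or incomplete CDB sent to device.
Aug 5 18:01:16 office kernel: scsi0: Issued Channel A Bus Reset. 2 SCBs aborted
"""

errors, resets, aborted = count_scsi_errors(sample)
print(dict(errors), dict(resets), aborted)
```

The patterns are searched across the whole text rather than line by line, so the sketch also copes with logs where several messages ended up on one line.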

And here is the last dmesg:

[root@office ~]# dmesg
Linux version 2.6.9-55.0.2.ELsmp (mockbuild@builder4.centos.org) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-8)) #1 SMP Tue Jun 26 14:30:58 EDT 2007
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
BIOS-e820: 0000000000100000 - 000000004fff0000 (usable)
BIOS-e820: 000000004fff0000 - 000000004fffec00 (ACPI data)
BIOS-e820: 000000004fffec00 - 000000004ffff000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
383MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000fe710
Using x86 segment limits to approximate NX protection
On node 0 totalpages: 327664
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:16
HighMem zone: 98288 pages, LIFO batch:16
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP (v000 DELL ) @ 0x000fdc70
ACPI: RSDT (v001 DELL PE1500SC 0x00000002 MSFT 0x0100000a) @ 0x000fdc84
ACPI: FADT (v001 DELL PE1500SC 0x00000002 MSFT 0x0100000a) @ 0x000fdcb4
ACPI: MADT (v001 DELL PE1500SC 0x00000002 MSFT 0x0100000a) @ 0x000fdd28
ACPI: SPCR (v001 DELL PE1500SC 0x00000002 MSFT 0x0100000a) @ 0x000fdd82
ACPI: DSDT (v001 DELL PE1500SC 0x00000002 MSFT 0x0100000a) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 6:11 APIC version 17
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x00] enabled)
Processor #0 6:11 APIC version 17
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
Enabling APIC mode: Flat. Using 0 I/O APICs
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-15
ACPI: IOAPIC (id[0x03] address[0xfec01000] gsi_base[16])
IOAPIC[1]: apic_id 3, version 17, address 0xfec01000, GSI 16-31
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 4ffff000:aec01000)
Built 1 zonelists
Kernel command line: ro root=/dev/main/root
mapped APIC to ffffd000 (fee00000)
Initializing CPU#0
CPU 0 irqstacks, hard=c03f1000 soft=c03d1000
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 1130.719 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1292420k/1310656k available (1883k kernel code, 17140k reserved, 761k data, 188k init, 393152k highmem)
Calibrating delay using timer specific routine.. 2261.86 BogoMIPS (lpj=1130933)
Security Scaffold v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
There is already a security framework initialized, register_security failed.
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383f3ff 00000000 00000000 00000040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01
per-CPU timeslice cutoff: 1462.55 usecs.
task migration cache decay timeout: 1 msecs.
Booting processor 1/0 eip 3000
CPU 1 irqstacks, hard=c03f2000 soft=c03d2000
Initializing CPU#1
Calibrating delay using timer specific routine.. 2260.57 BogoMIPS (lpj=1130285)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383f3ff 00000000 00000000 00000040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01
Total of 2 processors activated (4522.43 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=0 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
zapping low mappings.
checking if image is initramfs... it is
Freeing initrd memory: 1346k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfc7fe, last bus=3
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Root Bridge [PCI1] (00:02)
PCI: Probing PCI hardware (bus 02)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.I960._PRT]
ACPI: PCI Root Bridge [PCI2] (00:03)
PCI: Probing PCI hardware (bus 03)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI2._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 *10 11 12 14)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKI] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKJ] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKK] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKL] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKM] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LNKN] (IRQs 3 4 *5 6 7 9 10 11 12 14)
ACPI: PCI Interrupt Link [LNKO] (IRQs *3 4 5 6 7 9 10 11 12 14)
ACPI: PCI Interrupt Link [LNKP] (IRQs 3 4 5 6 7 9 10 11 12 14) *0, disabled.
ACPI: PCI Interrupt Link [LUSB] (IRQs 3 4 5 6 7 10 *11 12 14)
Linux Plug and Play Support v0.97 (c) Adam Belay
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
ACPI: PCI Interrupt Link [LUSB] enabled at IRQ 11
ACPI: PCI Interrupt 0000:00:0f.2 -> GSI 11 (level, low) -> IRQ 11
ACPI: PCI Interrupt 0000:00:0f.3 -> GSI 11 (level, low) -> IRQ 11
ACPI: PCI Interrupt 0000:01:00.0 -> GSI 17 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:02:02.0 -> GSI 29 (level, low) -> IRQ 185
ACPI: PCI Interrupt 0000:02:02.1 -> GSI 30 (level, low) -> IRQ 193
ACPI: PCI Interrupt 0000:03:02.0 -> GSI 16 (level, low) -> IRQ 201
apm: BIOS not found.
audit: initializing netlink socket (disabled)
audit(1186377077.105:1): initialized
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key F6D125003A6A5D77
- User ID: CentOS (Kernel Module GPG key)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: Processor [CPU0] (supports C1)
ACPI: Processor [CPU1] (supports C1)
Real Time Clock Driver v1.12
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Maximum main memory to use for agp memory: 1185M
agpgart: unable to determine aperture size.
agpgart: agp_backend_initialize() failed.
agpgart-serverworks: probe of 0000:00:00.0 failed with error -22
agpgart: Maximum main memory to use for agp memory: 1185M
agpgart: unable to determine aperture size.
agpgart: agp_backend_initialize() failed.
agpgart-serverworks: probe of 0000:00:00.1 failed with error -22
agpgart: Detected ServerWorks CNB20HE chipset: No AGP present.
agpgart: Detected ServerWorks CNB20HE chipset: No AGP present.
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 68 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
divert: not allocating divert_blk for non-ethernet device lo
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SvrWks CSB5: IDE controller at PCI slot 0000:00:0f.1
SvrWks CSB5: chipset revision 146
SvrWks CSB5: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x08b0-0x08b7, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x08b8-0x08bf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
hda: CRD-8482B, ATAPI CD/DVD-ROM drive
hda: Disabling (U)DMA for CRD-8482B (blacklisted)
Using cfq io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
Probing IDE interface ide1...
Probing IDE interface ide2...
Probing IDE interface ide3...
Probing IDE interface ide4...
Probing IDE interface ide5...
hda: ATAPI 48X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
mice: PS/2 mouse device common for all mice
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 6, 262144 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 262144 (order: 9, 3145728 bytes)
TCP: Hash tables configured (established 262144 bind 262144)
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI wakeup devices: PCI0 PCI1 PCI2
ACPI: (supports S0 S4 S5)
Freeing unused kernel memory: 188k freed
SCSI subsystem initialized
ACPI: PCI Interrupt 0000:02:02.0 -> GSI 29 (level, low) -> IRQ 185
ACPI: PCI Interrupt 0000:02:02.1 -> GSI 30 (level, low) -> IRQ 193
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
(scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Vendor: SEAGATE Model: ST318406LC Rev: 8A03
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:1:0: Tagged Queuing enabled. Depth 4
SCSI device sda: 35566478 512-byte hdwr sectors (18210 MB)
SCSI device sda: drive cache: write through
SCSI device sda: 35566478 512-byte hdwr sectors (18210 MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 1, lun 0
Vendor: DELL Model: 1x6 U2W SCSI BP Rev: 1.28
Type: Processor ANSI SCSI revision: 02
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
device-mapper: 4.5.5-ioctl (2006-12-01) initialised: dm-devel@redhat.com
cdrom: open failed.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: dm-0: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 1721984
EXT3-fs: dm-0: 1 orphan inode deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Attached scsi generic sg0 at scsi0, channel 0, id 1, lun 0, type 0
Attached scsi generic sg1 at scsi0, channel 0, id 6, lun 0, type 3
inserting floppy driver for 2.6.9-55.0.2.ELsmp
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
Ethernet Channel Bonding Driver: v2.6.3-rh (June 8, 2005)
bonding: MII link monitoring set to 200 ms
divert: allocating divert_blk for bond0
Intel(R) PRO/1000 Network Driver - version 7.2.7-k2-NAPI
Copyright (c) 1999-2006 Intel Corporation.
ACPI: PCI Interrupt 0000:03:02.0 -> GSI 16 (level, low) -> IRQ 201
e1000: 0000:03:02.0: e1000_probe: (PCI:66MHz:64-bit) 00:c0:9f:06:ad:0f
divert: allocating divert_blk for eth0
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:00:0f.2 -> GSI 11 (level, low) -> IRQ 11
ohci_hcd 0000:00:0f.2: OHCI Host Controller
ohci_hcd 0000:00:0f.2: irq 11, pci mem f882a000
ohci_hcd 0000:00:0f.2: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
ACPI: Power Button (FF) [PWRF]
EXT3 FS on dm-0, internal journal
cdrom: open failed.
cdrom: open failed.
loop: loaded (max 8 devices)
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 2031608k swap on /dev/main/swap. Priority:-1 extents:1
divert: freeing divert_blk for eth0
IA-32 Microcode Update Driver: v1.14
microcode: CPU0 already at revision 0x1c (current=0x1c)
microcode: CPU1 already at revision 0x1c (current=0x1c)
microcode: No new microdata for cpu 0
microcode: No new microdata for cpu 1
IA-32 Microcode Update Driver v1.14 unregistered
ip_tables: (C) 2000-2002 Netfilter core team
ip_conntrack version 2.1 (8192 buckets, 65536 max) - 340 bytes per conntrack
Intel(R) PRO/1000 Network Driver - version 7.2.7-k2-NAPI
Copyright (c) 1999-2006 Intel Corporation.
ACPI: PCI Interrupt 0000:03:02.0 -> GSI 16 (level, low) -> IRQ 201
e1000: 0000:03:02.0: e1000_probe: (PCI:66MHz:64-bit) 00:c0:9f:06:ad:0f
divert: allocating divert_blk for eth0
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
NET: Registered protocol family 5
[root@office ~]#

I would be thankful for any tips, hints or ideas.

Message Edited by Micropitt on 08-16-2007 11:41 AM

720 Posts

August 14th, 2007 19:00

Looks like you have your network card and the SCSI controller both on IRQ 11. Try disabling ports you are not using to free up some IRQs. If you must share IRQs, at least pair a human interface device like a keyboard or a mouse with a high-throughput device like a NIC or a hard drive controller, rather than putting two high-throughput devices on the same IRQ.
 
warwizard
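
For anyone wanting to check IRQ sharing against a dmesg dump rather than by eye, the "ACPI: PCI Interrupt" lines can be grouped by IRQ number. This is a rough Python sketch that assumes the 2.6-era message format seen in the dmesg above; `devices_by_irq` and `shared_irqs` are made-up helper names, and the sample is copied from that dmesg.

```python
import re
from collections import defaultdict

# Matches lines such as
# "ACPI: PCI Interrupt 0000:00:0f.2 -> GSI 11 (level, low) -> IRQ 11"
PCI_IRQ = re.compile(
    r"ACPI: PCI Interrupt (\S+) -> GSI \d+ \([^)]*\) -> IRQ (\d+)"
)


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the PCI addresses routed to it (no duplicates)."""
    groups = defaultdict(list)
    for device, irq in PCI_IRQ.findall(dmesg_text):
        if device not in groups[int(irq)]:
            groups[int(irq)].append(device)
    return dict(groups)


def shared_irqs(dmesg_text):
    """Keep only the IRQs that more than one PCI device is routed to."""
    return {
        irq: devices
        for irq, devices in devices_by_irq(dmesg_text).items()
        if len(devices) > 1
    }


# Sample copied from the dmesg posted earlier in this thread.
sample = """\
ACPI: PCI Interrupt 0000:00:0f.2 -> GSI 11 (level, low) -> IRQ 11
ACPI: PCI Interrupt 0000:00:0f.3 -> GSI 11 (level, low) -> IRQ 11
ACPI: PCI Interrupt 0000:01:00.0 -> GSI 17 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:02:02.0 -> GSI 29 (level, low) -> IRQ 185
ACPI: PCI Interrupt 0000:02:02.1 -> GSI 30 (level, low) -> IRQ 193
ACPI: PCI Interrupt 0000:03:02.0 -> GSI 16 (level, low) -> IRQ 201
"""

for irq, devices in sorted(shared_irqs(sample).items()):
    print(f"IRQ {irq}: {', '.join(devices)}")
```

On the posted dmesg this groups 0000:00:0f.2 and 0000:00:0f.3 under IRQ 11, while 0000:02:02.0, 0000:02:02.1 and 0000:03:02.0 each get their own line, so it is worth checking the result against what the BIOS setup screen reports. The sketch only sees PCI devices; legacy devices such as serial ports do not show up in these messages.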

6 Posts

August 15th, 2007 01:00

That is something I didn't look at, wow. Thank you for pointing that out. I can turn off USB support and the serial port, which would free up IRQs. I will give it a try in the morning. By the way, I did run the Dell Diagnostic software from a CD and that didn't report any errors.

6 Posts

August 16th, 2007 14:00

The problem was indeed an IRQ mismatch. After I disabled the serial and parallel ports to free up some IRQs, everything was fine.