
Need help with server crashes (/var/crash files included)
Hey everyone. I have OpenBSD 3.1 (GENERIC), FTP install, on a P133.
It acts as my home network's firewall and router, with 3-5 clients
using it. After random amounts of time (days/weeks), the box hangs
and I am forced to reboot. I cannot access it over the network and
it does not respond to any commands on its keyboard. It is headless
right now, but I guess I could plug in a monitor if needed.
Other services running on it are OpenSSH, DHCP, SMTP (Postfix), and
HTTP (Apache). I know running these services on your firewall is not
a good idea, but that's a separate issue. More information about
these services is listed below my log. But FYI, pf blocks any
outside WAN traffic to them.
Now my question is how to troubleshoot my hang/crash problem. I
have 3 new files in /var/crash. The first is bounds, which just has a
1 in it. I also have bsd.0 and bsd.0.core. I read up on some man
pages and ran some basic commands to try and figure it out. I have no
idea how to interpret this output, except the very end. (More
comments below the log.)
--Beginning of log----------------------
# gdb
GNU gdb 4.16.1
Copyright 1996 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-openbsd3.1".
(gdb) file bsd.0.core
"/var/crash/bsd.0.core": not in executable format: File format not recognized
(gdb) file bsd.0
Reading symbols from bsd.0...(no debugging symbols found)...done.
(gdb) target kcore bsd.0.core
panic: vrele: ref cnt
#0 0x1000 in ?? ()
(gdb) where
#0 0x1000 in ?? ()
#1 0xe02e9e70 in boot ()
#2 0xe01c1062 in panic ()
#3 0xe01de57a in vrele ()
#4 0xe01e4e71 in vn_close ()
#5 0xe01e540c in vn_closefile ()
#6 0xe01ac300 in closef ()
#7 0xe01ab669 in fdrelease ()
#8 0xe01ab6a1 in sys_close ()
#9 0xe02f34ce in syscall ()
#10 0xe0100e71 in Xsyscall ()
can not access 0xdfbfd6fc, invalid address (dfbfd6fc)
can not access 0xdfbfd6fc, invalid address (dfbfd6fc)
Cannot access memory at address 0xdfbfd6fc.
(gdb) quit
# ps ax -M bsd.0.core -N bsd.0
PID TT STAT TIME COMMAND
1 ?? Is 0:00.04 (init)
9733 ?? Is 0:01.24 (dhclient)
19004 ?? Rs 0:11.55 (syslogd)
12210 ?? Is 0:00.76 (dhcpd)
30945 ?? Is 0:00.46 (sshd)
17469 ?? Is 0:33.85 (master)
28568 ?? Is 0:01.42 (cron)
12238 ?? R 4:02.29 (qmgr)
24493 ?? Rs 0:24.89 (httpd)
10329 ?? I 0:04.32 (httpd)
13412 ?? I 0:00.09 (httpd)
16055 ?? I 0:00.09 (httpd)
32172 ?? I 0:00.11 (httpd)
20127 ?? I 0:00.12 (httpd)
9985 ?? I 0:00.07 (httpd)
21837 ?? I 0:00.09 (pickup)
24703 ?? R 0:00.07 (flush)
6142 ?? I 0:00.06 (local)
24905 ?? R 0:00.05 (local)
26779 C0- I 0:00.17 (ez-ipupdate)
12313 C0 Is+ 0:00.03 (getty)
# netstat -M bsd.0.core
Active Internet connections
Proto Recv-Q Send-Q  Local Address          Foreign Address  (state)
tcp        0      0  pcp778838pcs.dal.7861  *.*              LISTEN
Out of memory (file table).
--End of log----------------------------
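For anyone who finds the backtrace hard to read in gdb's innermost-first order, here is a minimal Python sketch that pulls the function names out of the `where` output above and prints them as a call chain, outermost call first. The helper name `call_chain` and the regular expression are illustrative assumptions about the frame format shown in this log, not part of gdb or OpenBSD.

```python
import re

# Frames copied verbatim from the `where` output in the log above.
BACKTRACE = """\
#0 0x1000 in ?? ()
#1 0xe02e9e70 in boot ()
#2 0xe01c1062 in panic ()
#3 0xe01de57a in vrele ()
#4 0xe01e4e71 in vn_close ()
#5 0xe01e540c in vn_closefile ()
#6 0xe01ac300 in closef ()
#7 0xe01ab669 in fdrelease ()
#8 0xe01ab6a1 in sys_close ()
#9 0xe02f34ce in syscall ()
#10 0xe0100e71 in Xsyscall ()
"""

# Assumed frame format: "#<n> 0x<address> in <function> ()"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+\(\)")


def call_chain(text):
    """Return function names ordered from the outermost frame to the innermost."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    # gdb numbers the innermost frame 0, so sort by descending frame number.
    frames.sort(reverse=True)
    return [name for _, name in frames]


if __name__ == "__main__":
    print(" -> ".join(call_chain(BACKTRACE)))
```

Read this way, the chain runs from the system call entry (Xsyscall, syscall, sys_close) down through the file-close path to vrele, which is where panic was called.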
I'm assuming the "Out of memory (file table)" message is not good and that
my network interfaces or pf's state table have something to do with it.
Here is more info on the processes. dhclient is for my cable modem's
dynamic IP. dhcpd is for my LAN. sshd is OpenSSH_3.4. master, qmgr,
pickup, flush, and local are for Postfix. httpds are Apache 2.0.40.
ez-ipupdate is a client for dyndns.org.
As you can see, I am generally up to date with all of my software.
After this first started happening, I updated all of my third party
software to the latest stable release to make sure any known issues
had been resolved.
Here is my dmesg for good measure. Thanks in advance for any help you
guys can offer.
--Beginning of dmesg--------------------
# dmesg
OpenBSD 3.1 (GENERIC) #59: Sat Apr 13 15:28:52 MDT 2002
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: F00F bug workaround installed
cpu0: Intel Pentium (P54C) ("GenuineIntel" 586-class) 133 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8
real mem = 49917952 (48748K)
avail mem = 40706048 (39752K)
using 634 buffers containing 2596864 bytes (2536K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 12/20/95, BIOS32 rev. 0 @ 0xfd9b0
pcibios0 at bios0: rev. 2.1 @ 0xf0000/0x10000
pcibios0: PCI BIOS has 4 Interrupt Routing table entries
pcibios0: PCI Interrupt Router at 000:07:0 ("Intel 82371FB PCI-ISA" rev 0x00)
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xea000/0x2000
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82437FX" rev 0x02
pcib0 at pci0 dev 7 function 0 "Intel 82371FB PCI-ISA" rev 0x02
vga1 at pci0 dev 8 function 0 "S3 Trio32/64" rev 0x00
wsdisplay0 at vga1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ep0 at pci0 dev 17 function 0 "3Com 3c595 100Base-TX" rev 0x00:
address 00:a0:24:50:3e:34, utp/100-TX default utp/autoselect irq 11
dc0 at pci0 dev 19 function 0 "Davicom Technologies DM9102" rev 0x31: irq 11 address 00:80:ad:79:8c:26
ukphy0 at dc0 phy 1: Generic IEEE 802.3u media interface
ukphy0: OUI 0x00606e, model 0x0004, rev. 0
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
wdc0 at isa0 port 0x1f0/8 irq 14
wd0 at wdc0 channel 0 drive 0: <QUANTUM FIREBALL1280A>
wd0: 8-sector PIO, LBA, 1222MB, 2484 cyl, 16 head, 63 sec, 2503872 sectors
wd0(wdc0:0:0): using BIOS timings
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 4040 netmask 4840 ttymask 4842
pctr: 586-class performance counters and user-level cycle counter enabled
dkcsum: wd0 matched BIOS disk 80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
--End of dmesg--------------------------