bit-teufel
Eroberer
Hello,
I have mounted a GFS2 volume (iSCSI) on SLES 11 SP3 with a custom 3.10.9 kernel.
This works so far, but after some time of data access on the LUN the following error appears:
[ 6240.651154] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6240.651159] httpd D ffff88007f619c58 0 19957 19893 0x00000000
[ 6240.651170] ffff88007653f9d8 0000000000000086 0000000000000000 ffff880036c27bc0
[ 6240.651180] 0000000000000000 0000000000000000 ffff88007653f9a8 ffff88007653e010
[ 6240.651189] ffff88007653e000 ffff88007653e010 ffff88007653e000 ffff88007653e000
[ 6240.651208] Call Trace:
[ 6240.651222] [<ffffffff81184f8d>] ? find_get_page+0x4d/0xc0
[ 6240.651229] [<ffffffff811852a5>] ? find_lock_page+0x25/0x80
[ 6240.651235] [<ffffffff811864ba>] ? find_or_create_page+0x3a/0xa0
[ 6240.651242] [<ffffffff81098163>] ? default_spin_lock_flags+0x13/0x30
[ 6240.651249] [<ffffffff818cf4f4>] schedule+0x24/0x70
[ 6240.651268] [<ffffffffa05c9dc9>] gfs2_glock_holder_wait+0x9/0x10 [gfs2]
[ 6240.651274] [<ffffffff818cd41a>] __wait_on_bit+0x5a/0x90
[ 6240.651287] [<ffffffffa05c9dc0>] ? gfs2_glock_demote_wait+0x10/0x10 [gfs2]
[ 6240.651301] [<ffffffffa05c9dc0>] ? gfs2_glock_demote_wait+0x10/0x10 [gfs2]
[ 6240.651306] [<ffffffff818cd4c4>] out_of_line_wait_on_bit+0x74/0x90
[ 6240.651314] [<ffffffff810d6430>] ? autoremove_wake_function+0x40/0x40
[ 6240.651327] [<ffffffffa05cbcac>] ? gfs2_glock_put+0x4c/0x260 [gfs2]
[ 6240.651341] [<ffffffffa05ca7de>] gfs2_glock_wait+0x3e/0x80 [gfs2]
[ 6240.651355] [<ffffffffa05cdc70>] gfs2_glock_nq+0x2f0/0x3d0 [gfs2]
[ 6240.651372] [<ffffffffa05d9851>] gfs2_glock_nq_init+0x21/0x40 [gfs2]
[ 6240.651417] [<ffffffffa05da371>] gfs2_permission+0xf1/0x100 [gfs2]
[ 6240.651434] [<ffffffffa05d9849>] ? gfs2_glock_nq_init+0x19/0x40 [gfs2]
[ 6240.651441] [<ffffffff811fbad6>] __inode_permission+0x46/0xf0
[ 6240.651447] [<ffffffff811fbbbd>] inode_permission+0x3d/0x60
[ 6240.651453] [<ffffffff811fed4d>] link_path_walk+0x46d/0x9b0
[ 6240.651459] [<ffffffff811fb525>] ? lock_rcu_walk+0x15/0x20
[ 6240.651465] [<ffffffff811ff3d3>] path_lookupat+0x53/0x8a0
[ 6240.651471] [<ffffffff811fbf03>] ? getname_flags+0x53/0x1b0
[ 6240.651477] [<ffffffff811ffc53>] filename_lookup+0x33/0xd0
[ 6240.651483] [<ffffffff81200dab>] user_path_at_empty+0x7b/0xb0
[ 6240.651490] [<ffffffff8109d91e>] ? bad_area_nosemaphore+0xe/0x10
[ 6240.651496] [<ffffffff818d4fe8>] ? __do_page_fault+0x2d8/0x540
[ 6240.651502] [<ffffffff81200dec>] user_path_at+0xc/0x10
[ 6240.651507] [<ffffffff811f5da1>] vfs_fstatat+0x51/0xb0
[ 6240.651512] [<ffffffff811f5e69>] vfs_lstat+0x19/0x20
[ 6240.651517] [<ffffffff811f5e8f>] SyS_newlstat+0x1f/0x50
[ 6240.651522] [<ffffffff818d5259>] ? do_page_fault+0x9/0x10
[ 6240.651529] [<ffffffff818d1b88>] ? page_fault+0x28/0x30
[ 6240.651535] [<ffffffff818d8f6d>] system_call_fastpath+0x1a/0x1f
or this one:
[ 2880.650146] INFO: task httpd:3295 blocked for more than 480 seconds.
[ 2880.650152] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2880.650154] httpd D ffff88007f616c58 0 3295 3253 0x00000000
[ 2880.650161] ffff880062735c78 0000000000000082 ffff880062734000 ffff88003737a040
[ 2880.650167] 0000000000000000 00000000000080d0 ffff8800448dd3b0 ffff880062734010
[ 2880.650172] ffff880062734000 ffff880062734010 ffff880062734000 ffff880062734000
[ 2880.650177] Call Trace:
[ 2880.650207] [<ffffffffa05d218e>] ? gfs2_holder_uninit+0x1e/0x40 [gfs2]
[ 2880.650218] [<ffffffffa05d2539>] ? gfs2_glock_dq_uninit+0x19/0x20 [gfs2]
[ 2880.650231] [<ffffffffa05dbed5>] ? gfs2_open+0xe5/0x160 [gfs2]
[ 2880.650240] [<ffffffff81098163>] ? default_spin_lock_flags+0x13/0x30
[ 2880.650251] [<ffffffff818cf4f4>] schedule+0x24/0x70
[ 2880.650261] [<ffffffffa05cfdc9>] gfs2_glock_holder_wait+0x9/0x10 [gfs2]
[ 2880.650265] [<ffffffff818cd41a>] __wait_on_bit+0x5a/0x90
[ 2880.650275] [<ffffffffa05cfdc0>] ? gfs2_glock_demote_wait+0x10/0x10 [gfs2]
....
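To make traces like the two above easier to compare, here is a minimal Python sketch that keeps only the frames the kernel did not mark with `?` (those marked frames are addresses found on the stack that the unwinder could not confirm as part of the call chain). The helper name `reliable_frames` and the regular expression are assumptions made for illustration, not part of any existing tool; the sample lines are copied from the first trace.

```python
import re

# Matches one call-trace line as printed by the kernel, for example:
# "[ 6240.651249] [<ffffffff818cf4f4>] schedule+0x24/0x70"
# Group 1: optional "?" marker, group 2: symbol, group 3: optional module.
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def reliable_frames(log):
    """Return (symbol, module) for every frame not marked with '?'.

    Hypothetical helper: frames without a module suffix are reported
    as belonging to "kernel".
    """
    frames = []
    for line in log.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:
            frames.append((match.group(2), match.group(3) or "kernel"))
    return frames


# Excerpt copied from the first trace in this post.
SAMPLE = """\
[ 6240.651242] [<ffffffff81098163>] ? default_spin_lock_flags+0x13/0x30
[ 6240.651249] [<ffffffff818cf4f4>] schedule+0x24/0x70
[ 6240.651268] [<ffffffffa05c9dc9>] gfs2_glock_holder_wait+0x9/0x10 [gfs2]
[ 6240.651274] [<ffffffff818cd41a>] __wait_on_bit+0x5a/0x90
[ 6240.651287] [<ffffffffa05c9dc0>] ? gfs2_glock_demote_wait+0x10/0x10 [gfs2]
"""

for symbol, module in reliable_frames(SAMPLE):
    print(f"{module:8s} {symbol}")
```

Run against the full output, this should show in both traces that httpd is sleeping in `gfs2_glock_holder_wait`, i.e. waiting for a GFS2 glock.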
I have an Apache server running on the LUN, which then crashes or stops working correctly.
Are there any tips and tricks for solving this problem?
Thanks in advance,
B.-D.