赞
踩
某项目Oracle 11g rac环境下,修改用户密码时会话被hang住,并且客户端用户在登录时也会被hang住。
现象如下:
SQL> alter user XXX identified by xxxxx;----->在修改时就会hang住,会话一直在等待
SQL> conn XXX/xxxx;----->无法进入用户会话
此时抓取hang Analyze以及做Systemstate Analysis获取trace分析
sqlplus -prelim / as sysdba
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
--等一分钟后再次执行hanganalyze分析
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
oradebug -g all dump systemstate 266
分析dump文件发现有上百个”library cache lock“等待事件,并且源头都是来自于同一个会话。
附trace log:
*** 2019-02-21 17:33:14.660 =============================================================================== HANG ANALYSIS: instances (db_name.oracle_sid): zjedb.zjedb1, zjedb.zjedb2 oradebug_node_dump_level: 3 analysis initiated by oradebug os thread scheduling delay history: (sampling every 1.000000 secs) 0.000000 secs at [ 17:33:14 ] NOTE: scheduling delay has not been sampled for 0.340580 secs 0.000000 secs from [ 17:33:10 - 17:33:15 ], 5 sec avg 0.000000 secs from [ 17:32:15 - 17:33:15 ], 1 min avg 0.000000 secs from [ 17:28:14 - 17:33:15 ], 5 min avg vktm time drift history =============================================================================== Chains most likely to have caused the hang: [a] Chain 1 Signature: <not in a wait><='library cache lock' Chain 1 Signature Hash: 0x24734cf [b] Chain 2 Signature: <not in a wait><='library cache lock' Chain 2 Signature Hash: 0x24734cf [c] Chain 3 Signature: <not in a wait><='library cache lock' Chain 3 Signature Hash: 0x24734cf =============================================================================== Non-intersecting chains: ------------------------------------------------------------------------------- Chain 1: ------------------------------------------------------------------------------- Oracle session identified by: { instance: 1 (zjedb.zjedb1) os id: 8118 process id: 96, oracle@zjerac1 session id: 6 session serial #: 36237 } is waiting for 'library cache lock' with wait info: { p1: 'handle address'=0x89d9bbab0 p2: 'lock address'=0x89e0f7560 p3: '100*mode+namespace'=0x4f0003 time in wait: 6 min 47 sec timeout after: 8 min 12 sec---->此会话已被阻塞6min以上 wait id: 7 blocking: 0 sessions current sql: <none> short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+2022<-kslwaitctx()+163<-kjusuc()+3400<-ksipgetctxi()+1759<-kqlmLock()+2679<-kqlmClusterLock()+191<-kgllkal()+2183<-kglLock()+1308<-kglget()+343<-kziasfc()+3126<-kpolnb()+6976<-kpoauth()+872<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253 wait history: * time between current wait and wait #1: 0.000102 sec 1. event: 'library cache: mutex X' time waited: 0.010957 sec wait id: 6 p1: 'idn'=0x76e11533 p2: 'value'=0xc400000000 p3: 'where'=0x50 * time between wait #1 and #2: 0.000045 sec 2. event: 'latch: ges resource hash list' time waited: 0.003486 sec wait id: 5 p1: 'address'=0x8723bf798 p2: 'number'=0x70 p3: 'tries'=0x0 * time between wait #2 and #3: 0.000034 sec 3. event: 'library cache lock' time waited: 5 min 13 sec wait id: 4 p1: 'handle address'=0x89d9bbab0 p2: 'lock address'=0x89e0f7460 p3: '100*mode+namespace'=0x4f0002 } and is blocked by---->阻塞的源头 => Oracle session identified by: { instance: 1 (zjedb.zjedb1) os id: 7964 process id: 140, oracle@zjerac1 session id: 2273 ----->此为阻塞源头的会话sid session serial #: 52199 } which is not in a wait: { last wait: 0.997701 sec ago blocking: 388 sessions----->阻塞了300多个会话,等待很严重 current sql: <none> short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-__nanosleep()+16<-kpolnb()+6976<-kpoauth()+872<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253 wait history: 1. event: 'library cache lock' time waited: 6 min 46 sec wait id: 7 p1: 'handle address'=0x89d9bbab0 p2: 'lock address'=0x88edfd2a8 p3: '100*mode+namespace'=0x4f0003 * time between wait #1 and #2: 0.000314 sec 2. event: 'library cache load lock' time waited: 0.001645 sec wait id: 6 p1: 'object address'=0x89d9bbab0 p2: 'lock address'=0x88edfd0a8 p3: '100*mask+namespace'=0x4f0003 * time between wait #2 and #3: 0.000079 sec 3. event: 'latch: ges resource hash list' time waited: 0.002859 sec wait id: 5 p1: 'address'=0x8723bf798 p2: 'number'=0x70 p3: 'tries'=0x0 } Chain 1 Signature: <not in a wait><='library cache lock' Chain 1 Signature Hash: 0x24734cf ------------------------------------------------------------------------------- =============================================================================== Intersecting chains: ------------------------------------------------------------------------------- Chain 2: ------------------------------------------------------------------------------- Oracle session identified by: { instance: 1 (zjedb.zjedb1) os id: 8149 process id: 160, oracle@zjerac1 session id: 7 session serial #: 24737 } is waiting for 'library cache lock' with wait info: { p1: 'handle address'=0x89d9bbab0 p2: 'lock address'=0x88f8d4ed0 p3: '100*mode+namespace'=0x4f0003 time in wait: 6 min 47 sec timeout after: 8 min 12 sec wait id: 6 blocking: 0 sessions current sql: <none> short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+2022<-kslwaitctx()+163<-kjusuc()+3400<-ksipgetctxi()+1759<-kqlmLock()+2679<-kqlmClusterLock()+191<-kgllkal()+2183<-kglLock()+1308<-kglget()+343<-kziasfc()+3126<-kpolnb()+6976<-kpoauth()+872<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253 wait history: * time between current wait and wait #1: 0.000130 sec 1. event: 'latch: ges resource hash list' time waited: 0.005830 sec wait id: 5 p1: 'address'=0x8723bf798 p2: 'number'=0x70 p3: 'tries'=0x0 * time between wait #1 and #2: 0.000044 sec 2. event: 'library cache lock' time waited: 5 min 13 sec wait id: 4 p1: 'handle address'=0x89d9bbab0 p2: 'lock address'=0x88f8d4dd0 p3: '100*mode+namespace'=0x4f0002 * time between wait #2 and #3: 0.000216 sec 3. event: 'latch: ges resource hash list' time waited: 0.005371 sec wait id: 3 p1: 'address'=0x8723bf798 p2: 'number'=0x70 p3: 'tries'=0x0 } and is blocked by 'instance: 1, os id: 7964, session id: 2273',-->其余会话也是被这个sid阻塞 which is a member of 'Chain 1'. Chain 2 Signature: <not in a wait><='library cache lock' Chain 2 Signature Hash: 0x24734cf ---------------------------------------- SO: 0x87998d798, type: 4, owner: 0x869081560, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0x869081560, name=session, file=ksu.h LINE:12729, pg=0 (session) sid: 2273 ser: 52201 trans: (nil), creator: 0x869081560 flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/- flags2: (0x9) -/-/INC DID: , short-term DID: txn branch: (nil) edition#: 0 oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS ksuxds FALSE at location: 0 service name: zjedb client details:---->被阻塞的客户端信息 O/S info: user: WIN-0ELFKQG4A9V$, term: unknown, ospid: 1234 machine: WIN-0ELFKQG4A9V program: JDBC Thin Client application name: JDBC Thin Client, hash value=2546894660 Current Wait Stack: 0: waiting for 'library cache lock' handle address=0x89d9bbab0, lock address=0x886dac9b0, 100*mode+namespace=0x4f0002 wait_id=2 seq_num=3 snap_id=1 wait times: snap=1 min 54 sec, exc=1 min 54 sec, total=1 min 54 sec wait times: max=15 min 0 sec, heur=1 min 54 sec wait counts: calls=231 os=231 in_wait=1 iflags=0x15a2 There is at least one session blocking this session. Dumping 1 direct blocker(s): inst: 1, sid: 1145, ser: 52551 Dumping final blocker: inst: 1, sid: 1145, ser: 52551 Wait State: fixed_waits=0 flags=0x22 boundary=(nil)/-1 Session Wait History: elapsed time of 0.000459 sec since current wait 0: waited for 'SQL*Net message from client'
通过分析之后已经知道了’library cache lock’都是被同一个会话锁阻塞,那么将其手动kill方可解决故障。
alter system kill session 'sid,serial#' immediate;
在11g中存在”密码验证延迟“功能,此特性主要用来防止数据库密码被暴力破解,在极少数场景的生产系统业务修改密码,会导致大量会话在等待登录验证,造成大量’library cache lock’。
附MOS文档
文档 ID 1309738.1
In Oracle 11g Release 2 and higher, in order to disable the wait between login failures the event 28401 needs to be explicitly enabled:
The event can be set as follows:
alter system set event =“28401 TRACE NAME CONTEXT FOREVER, LEVEL 1” scope=spfile;
To unset the event, set as follows:
SQL> Alter system set event= '28401 trace name context off' scope=spfile ;
---->密码验证延迟关闭
For more information see:
Document:7715339.8 Logon failures causes “row cache lock” waits - Allow disable of logon delay
The following functions match per above bug:
nanosleep nanosleep kziasfc kpolnb kpoauth
Note: Care should be taken when setting this event, as this is disabling the sleep time which can leave the system more vulnerable.
赞
踩
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。