oops-tracing.txt

Quick Summary
-------------

Install ksymoops from
ftp://ftp.<country>.kernel.org/pub/linux/utils/kernel/ksymoops
Read the ksymoops man page.
ksymoops < the_oops.txt

and send the output to the maintainer of the kernel area that seems to be
involved with the problem, not to the ksymoops maintainer. Don't worry
too much about getting the wrong person. If you are unsure send it to
the person responsible for the code relevant to what you were doing.
If it occurs repeatably try and describe how to recreate it. That's
worth even more than the oops.

If you are totally stumped as to whom to send the report, send it to
linux-kernel@vger.kernel.org. Thanks for your help in making Linux as
stable as humanly possible.

Where is the_oops.txt?
----------------------

Normally the Oops text is read from the kernel buffers by klogd and
handed to syslogd which writes it to a syslog file, typically
/var/log/messages (depends on /etc/syslog.conf).  Sometimes klogd dies,
in which case you can run dmesg > file to read the data from the kernel
buffers and save it.  Or you can cat /proc/kmsg > file, however you
have to break in to stop the transfer, kmsg is a "never ending file".
If the machine has crashed so badly that you cannot enter commands or
the disk is not available then you have three options :-

(1) Hand copy the text from the screen and type it in after the machine
    has restarted.  Messy but it is the only option if you have not
    planned for a crash.

(2) Boot with a serial console (see Documentation/serial-console.txt),
    run a null modem to a second machine and capture the output there
    using your favourite communication program.  Minicom works well.

(3) Patch the kernel with one of the crash dump patches.  These save
    data to a floppy disk or video rom or a swap partition.  None of
    these are standard kernel patches so you have to find and apply
    them yourself.  Search kernel archives for kmsgdump, lkcd and
    oops+smram.

No matter how you capture the log output, feed the resulting file to
ksymoops along with /proc/ksyms and /proc/modules that applied at the
time of the crash.  /var/log/ksymoops can be useful to capture the
latter, man ksymoops for details.


Full Information
----------------

From: Linus Torvalds <torvalds@transmeta.com>

How to track down an Oops.. [originally a mail to linux-kernel]

The main trick is having 5 years of experience with those pesky oops
messages ;-)

Actually, there are things you can do that make this easier. I have two
separate approaches:

	gdb /usr/src/linux/vmlinux
	gdb> disassemble <offending_function>

That's the easy way to find the problem, at least if the bug-report is
well made (like this one was - run through ksymoops to get the
information of which function and the offset in the function that it
happened in).

Oh, it helps if the report happens on a kernel that is compiled with the
same compiler and similar setups.

The other thing to do is disassemble the "Code:" part of the bug report:
ksymoops will do this too with the correct tools, but if you don't have
the tools you can just do a silly program:

	char str[] = "\xXX\xXX\xXX...";
	main(){}

and compile it with gcc -g and then do "disassemble str" (where the "XX"
stuff are the values reported by the Oops - you can just cut-and-paste
and do a replace of spaces to "\x" - that's what I do, as I'm too lazy
to write a program to automate this all).

Finally, if you want to see where the code comes from, you can do

	cd /usr/src/linux
	make fs/buffer.s 	# or whatever file the bug happened in

and then you get a better idea of what happens than with the gdb
disassembly.

Now, the trick is just then to combine all the data you have: the C
sources (and general knowledge of what it _should_ do), the assembly
listing and the code disassembly (and additionally the register dump you
also get from the "oops" message - that can be useful to see _what_ the
corrupted pointers were, and when you have the assembler listing you can
also match the other registers to whatever C expressions they were used
for).

Essentially, you just look at what doesn't match (in this case it was
the "Code" disassembly that didn't match with what the compiler
generated). Then you need to find out _why_ they don't match. Often it's
simple - you see that the code uses a NULL pointer and then you look at
the code and wonder how the NULL pointer got there, and if it's a valid
thing to do you just check against it..

Now, if somebody gets the idea that this is time-consuming and requires
some small amount of concentration, you're right. Which is why I will
mostly just ignore any panic reports that don't have the symbol table
info etc looked up: it simply gets too hard to look it up (I have some
programs to search for specific patterns in the kernel code segment, and
sometimes I have been able to look up those kinds of panics too, but
that really requires pretty good knowledge of the kernel just to be able
to pick out the right sequences etc..)

_Sometimes_ it happens that I just see the disassembled code sequence
from the panic, and I know immediately where it's coming from. That's
when I get worried that I've been doing this for too long ;-)

		Linus


---------------------------------------------------------------------------
Notes on Oops tracing with klogd:

In order to help Linus and the other kernel developers there has been
substantial support incorporated into klogd for processing protection
faults.  In order to have full support for address resolution at least
version 1.3-pl3 of the sysklogd package should be used.

When a protection fault occurs the klogd daemon automatically
translates important addresses in the kernel log messages to their
symbolic equivalents.  This translated kernel message is then
forwarded through whatever reporting mechanism klogd is using.  The
protection fault message can be simply cut out of the message files
and forwarded to the kernel developers.

Two types of address resolution are performed by klogd.  The first is
static translation and the second is dynamic translation.  Static
translation uses the System.map file in much the same manner that
ksymoops does.  In order to do static translation the klogd daemon
must be able to find a system map file at daemon initialization time.
See the klogd man page for information on how klogd searches for map
files.

Dynamic address translation is important when kernel loadable modules
are being used.  Since memory for kernel modules is allocated from the
kernel's dynamic memory pools there are no fixed locations for either
the start of the module or for functions and symbols in the module.

The kernel supports system calls which allow a program to determine
which modules are loaded and their location in memory.  Using these
system calls the klogd daemon builds a symbol table which can be used
to debug a protection fault which occurs in a loadable kernel module.

At the very minimum klogd will provide the name of the module which
generated the protection fault.  There may be additional symbolic
information available if the developer of the loadable module chose to
export symbol information from the module.

Since the kernel module environment can be dynamic there must be a
mechanism for notifying the klogd daemon when a change in module
environment occurs.  There are command line options available which
allow klogd to signal the currently executing daemon that symbol
information should be refreshed.  See the klogd manual page for more
information.

A patch is included with the sysklogd distribution which modifies the
modules-2.0.0 package to automatically signal klogd whenever a module
is loaded or unloaded.  Applying this patch provides essentially
seamless support for debugging protection faults which occur with
kernel loadable modules.

The following is an example of a protection fault in a loadable module
processed by klogd:
---------------------------------------------------------------------------
Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc
Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000
Aug 29 09:51:01 blizard kernel: *pde = 00000000
Aug 29 09:51:01 blizard kernel: Oops: 0002
Aug 29 09:51:01 blizard kernel: CPU:    0
Aug 29 09:51:01 blizard kernel: EIP:    0010:[oops:_oops+16/3868]
Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212
Aug 29 09:51:01 blizard kernel: eax: 315e97cc   ebx: 003a6f80   ecx: 001be77b   edx: 00237c0c
Aug 29 09:51:01 blizard kernel: esi: 00000000   edi: bffffdb3   ebp: 00589f90   esp: 00589f8c
Aug 29 09:51:01 blizard kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 0018
Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000)
Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001
Aug 29 09:51:01 blizard kernel:        00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00
Aug 29 09:51:01 blizard kernel:        bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036
Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128]
Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3
---------------------------------------------------------------------------

Dr. G.W. Wettstein           Oncology Research Div. Computing Facility
Roger Maris Cancer Center    INTERNET: greg@wind.rmcc.com
820 4th St. N.
Fargo, ND  58122
Phone: 701-234-7556


---------------------------------------------------------------------------
Tainted kernels:

Some oops reports contain the string 'Tainted: ' after the program
counter; this indicates that the kernel has been tainted by some
mechanism.  The string is followed by a series of position sensitive
characters, each representing a particular tainted value.

  1: 'G' if all modules loaded have a GPL or compatible license, 'P' if
     any proprietary module has been loaded.  Modules without a
     MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
     insmod as GPL compatible are assumed to be proprietary.

  2: 'F' if any module was force loaded by insmod -f, ' ' if all
     modules were loaded normally.

The primary reason for the 'Tainted: ' string is to tell kernel
debuggers if this is a clean kernel or if anything unusual has
occurred.  Tainting is permanent: even if an offending module is
unloaded, the tainted value remains to indicate that the kernel is not
trustworthy.