$ cc -O -o sget sget.c
$ sget
The program is waiting for input. Type in the following:
src/testfile
testfile: 0 characters, 0 words, 2 lines
average characters per line: 0
average words per line: 0
How can the file have two lines but zero characters?
$ wc testfile
2 16 58 testfile
$ cat testfile
this is a test case - line 1
this is a test case - line 2
$ debug sget
The debugger displays the following:
Warning: No debugging information in sget
New program sget (process p1) created
HALTED p1 [main]
        0x804895d (main+5:)     pushl  $0x8048920
Without debugging information, you can't set breakpoints on statements (debug doesn't have the information telling it where the code making up a particular line starts and ends), but you can still set breakpoints on functions.
debug> stop copy
The debugger displays the following:
Error: No entry "copy" exists
debug> stop sget.c@copy
The debugger displays the following:
EVENT [1] assigned
debug> stop sget.c@scan_line
EVENT [2] assigned
debug> run src/testfile
When it reaches the breakpoint, debug prints the assembly language instruction that is being executed instead of the source line:
STOP EVENT TRIGGERED: sget.c@copy in p1 [copy in sget.c]
        0x80487e9 (copy+5:)     pushl  $0x8048adc
debug> stop fgets
EVENT [3] assigned
debug> run
The debugger displays the following:
STOP EVENT TRIGGERED: fgets in p1 [fgets]
        0xbffb7190 (fgets+0:)   subl   $0x20,%esp
debug> stack
The debugger displays the following:
Stack Trace for p1, Program sget
*[0] fgets(0x80473d4, 0x400, 0x804a574)    [0xbffb7190]
 [1] copy(0x8049c84, 0x8049c84)    [0x804887c]
 [2] main(0x1, 0x804780c, 0x8047814)    [0x8048a47]
 [3] _start()    [0x8048690]
debug> syscall -x read {print -f"bytes read = %d\n" %eax}
The debugger displays the following:
EVENT [4] assigned
The event you just defined will make debug print the return value right after the system call. The return value from read(2) is the number of bytes read:
debug> run
The debugger displays the following:
SYSTEM CALL EXIT 3 (read) in p1 [_read]
        0xbffc9509 (_read+12:)  jae    +0x9 <bffc9514> [ _read ]
bytes read = 58
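The value debug prints from %eax is the same value a C caller receives as the result of read(2). The following is a minimal sketch of that relationship, not code taken from sget.c; the helper name read_once is hypothetical, and retrying on EINTR is an illustrative choice rather than something the program is known to do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of sget.c): issue one read(2) and
 * return its result, retrying only if a signal interrupted the call.
 * The result is the number of bytes read, 0 at end of file, or -1
 * on error. This is the value the syscall -x event prints.
 */
ssize_t read_once(int fd, void *buf, size_t cap)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, cap);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

With a buffer of at least 58 bytes and the 58-byte testfile, a single call would typically return 58, which agrees with the character count reported by wc.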
debug> run
The debugger displays the following:
STOP EVENT TRIGGERED: sget.c@scan_line in p1 [scan_line in sget.c]
        0x80486b2 (scan_line+2:)        xorl   %edi, %edi
debug> print -f"%s" 0x8047a5c
this is a test case - line 1
The buffer is correct. The program is now at the beginning of scan_line, where it should be counting the number of characters and words in each line.
debug> dis
The debugger displays the following:
Disassembly for p1, Program sget
0x80486b2 (scan_line+2:)        xorl   %edi, %edi
0x80486b4 (scan_line+4:)        movl   $0x804a094, %esi
0x80486b9 (scan_line+9:)        jmp    +0x23 <80486de> [ scan_line ]
0x80486bb (scan_line+11:)       incl   0x804a090 [ char_cnt ]
0x80486c1 (scan_line+17:)       cmpb   $0x20, %bl
0x80486c4 (scan_line+20:)       je     +0x5 <80486cb> [ scan_line ]
0x80486c6 (scan_line+22:)       cmpb   $0x9, %bl
0x80486c9 (scan_line+25:)       jne    +0x4 <80486cf> [ scan_line ]
0x80486cb (scan_line+27:)       xorl   %edi, %edi
0x80486cd (scan_line+29:)       jmp    +0xf <80486de> [ scan_line ]
debug> set %verbose = source
debug> step -i -c 10
The debugger displays the following:
0x80486b4 (scan_line+4:)        movl   $0x804a094, %esi
0x80486b9 (scan_line+9:)        jmp    +0x23 <80486de> [ scan_line ]
0x80486de (scan_line+46:)       movb   (%esi), %bl
0x80486e0 (scan_line+48:)       incl   %esi
0x80486e1 (scan_line+49:)       testb  %bl, %bl
0x80486e3 (scan_line+51:)       jne    +0xffffffd6 <80486bb> [ scan_line ]
0x80486e5 (scan_line+53:)       incl   0x804a088 [ line_cnt ]
0x80486eb (scan_line+59:)       popl   %ebx
0x80486ec (scan_line+60:)       popl   %esi
0x80486ed (scan_line+61:)       popl   %edi
debug> dump -c40 0x804a094
The debugger displays the following:
Raw Dump for p1, Program sget
0x804a094: ........   0x00000000 0x00000000 0x00000000  ................
0x804a0a0: 0x00000000 0x00000000 0x00000000 0x00000000  ................
0x804a0b0: 0x00000000 0x00000000 0x00000000             ............
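One way to read the scan_line listing is to translate it back into C. The sketch below is a reconstruction from the disassembly and instruction trace, not the actual source of sget.c. The names buffer, char_cnt, word_cnt, line_cnt and scan_line come from the program's symbols; the types, the buffer size, the word-counting branch (which falls outside the ten instructions disassembled) and the helper load_line are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Types and sizes are assumptions: without -g the debugger
 * cannot report them. */
static char buffer[1024];
static int char_cnt, word_cnt, line_cnt;

/* C reading of the disassembled loop in scan_line. */
static void scan_line(void)
{
    int in_word = 0;            /* %edi, cleared by xorl %edi, %edi */
    const char *p = buffer;     /* %esi, loaded with $0x804a094 */
    char c;                     /* %bl */

    while ((c = *p++) != '\0') {        /* movb (%esi), %bl; incl %esi; testb */
        char_cnt++;                     /* incl char_cnt */
        if (c == ' ' || c == '\t') {    /* cmpb $0x20 and cmpb $0x9 */
            in_word = 0;                /* xorl %edi, %edi */
        } else if (!in_word) {
            /* Assumption: the code at scan_line+31 is not in the listing. */
            word_cnt++;
            in_word = 1;
        }
    }
    line_cnt++;                         /* incl line_cnt */
}

/* Hypothetical helper for experimenting: place a line in buffer. */
static void load_line(const char *s)
{
    assert(strlen(s) < sizeof buffer);
    strcpy(buffer, s);
}
```

Under this reading, a buffer whose first byte is zero makes the loop exit at once, so each call only increments line_cnt. That is consistent with both the instruction trace, which goes from the first testb straight to incl line_cnt, and the program's report of 0 characters, 0 words and 2 lines.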
If you compile without -g, your object file must still carry a minimal level of symbolic information for the link editor's use. (Link editing is the step that combines object files to produce an executable, for example, cc -o macros main.o macro.o.) This basic information is limited to the names and addresses of global and static functions and variables. It doesn't include the types of the symbols, but it does provide enough information to determine whether a symbol is a data symbol (variable) or a text symbol (function). Compiling with -g doesn't change any of the basic information, but it adds:
If you debug a file with just the minimal level of symbol information, most of the debugger's commands will still work, but they may not be as helpful as you expected. The differences are:
You can combine object files compiled with and without -g. For example, none of the library functions (malloc, for instance) are compiled with -g. What you will be able to do or see at any time depends on which compilation unit the program is in when it stops. You will be able to tell immediately whether the compilation unit was compiled with -g from the event notification: the source line is displayed with -g; you will see the assembly instruction otherwise. The warning that debug printed (No debugging information) is only displayed if none of the files in the executable were compiled with -g.
When you stop the process at a function, you may notice that the process stopped several instructions into the function. The skipped instructions constitute the function prologue. The nature and existence of the function prologue are processor specific, but it is usually needed to set up the stack frame and registers for the new function.
The important point to note for debugging purposes is that if the program is stopped in the function prologue, the values in the registers and the arguments in the stack trace may not be consistent. To minimize confusion, debug normally skips the function prologue (with or without -g).
If you don't want to skip the function prologue, set the breakpoint on the address (a hex number) rather than the name of the function:
debug> print func
0x8048924
debug> stop 0x8048924
EVENT [1] assigned
debug> run
STOP EVENT TRIGGERED: 0x8048924 in p2 [func]
        0x8048924 (func+0:)     jmp    +0x0000012d <8048a56> [ func ]
There is a subtle difference in the scoping information available with and without -g. In both cases, static functions and variables are associated with the name of the compilation unit where they are defined, but the name of the compilation unit is only available for global variables if the file was compiled with -g. If the program is stopped in a static function, whether or not you used -g, or if it is stopped in a global function that was compiled with -g, the current file (%file) will be set to the name of the file, and you can access static variables and other static functions defined in the same compilation unit without having to use qualified names (for example, file@func). On the other hand, if the program is stopped in a global function compiled without -g, %file will be null, and you will have to use a qualified name to access any static function or variable, even though they were defined in the same compilation unit.
For example, even though main and copy are both defined in sget.c, when the program is stopped in main you have to type stop sget.c@copy to set the breakpoint. symbols -f can't show you any of the static variables, either:
debug> print %func, %file
"main" 0
debug> symbols -f
Symbols for p2, Program sget
Name            Location        Line
Warning: No current source file
When the program is stopped in copy, you can see the local variables, and you can set a breakpoint on another static function without qualification:
debug> print %func, %file
"copy" "sget.c"
debug> symbols -f
Symbols for p2, Program sget
Name            Location        Line
base_name       sget.c
buffer          sget.c
char_cnt        sget.c
copy            sget.c
exit_code       sget.c
get             sget.c
getstats        sget.c
handler         sget.c
line_cnt        sget.c
path            sget.c
scan_line       sget.c
word_cnt        sget.c
debug> stop scan_line
EVENT [3] assigned
The syscall (or sys) command creates an event that triggers when the program uses the specified system call. If no options were specified, debug will stop the program when it gets to the system call. The -e option (for enter) has the same effect. The -x option (for exit) will stop the program when it leaves the system call. Specifying both options (-ex) will make it stop at both places. Like stop, you can give syscall a count (-c count) and an optional command list. You can also specify more than one system call in one syscall command:
debug> syscall -x chown mknod chroot {stack}
The help command will show you the set of system call names that syscall recognizes:
debug> help sysnames
Valid system call names:
1 exit    2 fork    3 read    4 write
. . .
System call events work even on completely stripped object files.
The return value of a function returning a scalar object is stored in a special register whose designation depends on the machine you are using.
On any machine, you will be able to print the individual registers with print %rn, or you can see all of them at once with the regs command:
debug> regs
Register Contents for p1, program sget
%r1 r1_value    %r2 r2_value    %r3 r3_value
. . .
where %rn is the register name and rn_value is its value.
You may also look at the individual machine instructions with the dis command. Like the shell-level command dis(1), dis interprets the machine instructions in the program and prints them in a human-readable format similar to the input to the assembler; therefore, this is called a "dis-assembly". The disassembly is static like a source listing, and shows the instructions as they are laid out in the object file, not the order in which they are executed.
Instruction stepping, on the other hand, shows you the actual path of execution as each instruction is executed. You may use step -i (-i for instructions) anywhere -- it works the same with or without debugging information. Note that you must use step -i to step at the instruction level; step (without the -i) does not automatically instruction step instead of statement step if you do not have line number information. Without any line number information, typing step is equivalent to typing run.
Without -g, debug doesn't know the type of any object in the program, so it assumes variables are all ints unless you tell it otherwise. You can print a value as any of the basic types (char, long, double, for example) with a cast or the appropriate format string:
debug> print (char *)0x8047114
"this is a test case - line 1\n"
You cannot, however, cast a value to any user-defined type such as a structure or union, because the debugger does not have the information describing those types. However, you may dump the contents of a structure (or anything else) as a series of bytes and interpret the bytes yourself (assuming you know the layout of the structure). The dump command displays an area of memory as both hex bytes and as characters, if printable. dump prints 16 bytes (4 words) at a time. If the location given isn't on a 16-byte boundary, it will print dots (..) up to the first byte that you asked for:
debug> dump 0x80473cc
Raw Dump for p1, Program sget
0x80473cc: ........   ........   ........   0x73696874  ............this
0x80473d0: 0x20736920 0x65742061 0x63207473 0x20657361   is a test case 
0x80473e0: 0x696c202d 0x3120656e 0x0000000a 0x00000000  - line 1........
. . .
dump normally tries to organize its output into words, of a size appropriate to the target architecture. For little-endian architectures, this means that the hexadecimal values for each byte will appear in a different order than the values actually appear in memory. You can suppress this with the -b option, which will make dump put out each byte individually, as it is laid out in memory.
debug> dump -b 0x80473cc
Raw Dump for p1, Program sget
0x80473cc: .. .. .. .. .. .. .. .. .. .. .. .. 74 68 69 73  ............this
0x80473d0: 20 69 73 20 61 20 74 65 73 74 20 63 61 73 65 20   is a test case 
0x80473e0: 2d 20 6c 69 6e 65 20 31 0a 00 00 00 00 00 00 00  - line 1........
0x80473f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
. . .
Normally dump displays 256 bytes, but you can control the number of bytes dumped with the -c option or by setting the debugger variable %num_bytes.
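The relationship between the word-oriented and byte-oriented dumps can be checked by hand. The sketch below is illustrative only and is not part of debug or sget; the function names le_word and dump_padding are hypothetical, and 32-bit words and 16-byte rows are assumed, as in the examples shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine four bytes, given in memory order, into the 32-bit word
 * that a little-endian machine holds at that address. */
uint32_t le_word(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Number of leading bytes shown as dots when a dump starts at addr:
 * the distance from the preceding 16-byte boundary. */
unsigned dump_padding(uintptr_t addr)
{
    return (unsigned)(addr & 0xF);
}
```

For the bytes 74 68 69 73 ("this"), le_word yields 0x73696874, the first word in the word-oriented dump, and dump_padding(0x80473cc) is 12, matching the twelve dotted bytes at the start of the first row.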