Difference between revisions of "Development/GDB"

From bwHPC Wiki
Line 76: Line 76:
 
|}
 
|}
   
'''Example:'''
 
We debug the following program called <kbd>bug.c</kbd> which crashes on execution.
 
   
<source lang="c">
#include <stdio.h>

int global = 0;

void begin() {
    global = 1;
}

void loop() {
    int v[2];
    int i, k;

    for(i = 0; i < 8; i++) {
        k = i/2*2; /* should have been k = i/(2*2); */
        v[k] = i;
    }
}

void end() {
    global = 2;
}

int main() {
    begin();
    loop();
    end();

    return 0;
}
</source>
 
 
'''Example GDB session:'''
 
<pre>
 
$ gcc -Wall -O0 -g bug.c -o bug
 
$ gdb ./bug
 
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-11.el8
 
[...]
 
Reading symbols from /pfs/data2/home/xx/xxx/xxxx/bug...done.
 
(gdb) break main
 
Breakpoint 1 at 0x4005b2: file bug.c, line 26.
 
(gdb) run
 
Starting program: /pfs/data2/home/xx/xxx/xxxx/bug
 
 
Breakpoint 1, main () at bug.c:26
 
26 begin();
 
(gdb) next
 
27 loop();
 
(gdb) next
 
 
Program received signal SIGSEGV, Segmentation fault.
 
0x0000000000000005 in ?? ()
 
(gdb) # now we know that the bug is in loop(). start again.
 
(gdb) run
 
The program being debugged has been started already.
 
Start it from the beginning? (y or n) y
 
Starting program: /pfs/data2/home/xx/xxx/xxxx/bug
 
 
Breakpoint 1, main () at bug.c:26
 
26 begin();
 
(gdb) next
 
27 loop();
 
(gdb) step
 
loop () at bug.c:13
 
13 for(i = 0; i < 8; i++)
 
(gdb) next
 
15 k = i/2*2;
 
(gdb) next
 
16 v[k] = i;
 
(gdb) # maybe k gets too big?
 
(gdb) watch (k >= 2)
 
Hardware watchpoint 2: (k >= 2)
 
(gdb) continue
 
Continuing.
 
Hardware watchpoint 2: (k >= 2)
 
 
Old value = 0
 
New value = 1
 
loop () at bug.c:16
 
16 v[k] = i;
 
(gdb) # k is too big
 
(gdb) print k
 
$1 = 2
 
(gdb) print i
 
$2 = 2
 
(gdb) quit
 
</pre>
 
<br>
 
   
 
= Branch record tracing =
 
 
Starting with GDB 10.1, the debugger has been installed with Intel Processor Trace support ([https://github.com/intel/libipt libipt]), allowing recording and replaying of process state.
 
 
This allows disassembling previously executed instructions, checking for previously called functions and branch tracing.
 
 
For example, with the previous executable <kbd>bug</kbd> we can shorten the debug cycle by turning on recording.
 
 
<pre>
 
$ gdb bug
 
GNU gdb (GDB) 10.1
 
[...]
 
Reading symbols from bug...
 
(gdb) break main
 
Breakpoint 1 at 0x400593: file bug.c, line 24.
 
(gdb) # Before we may turn on recording state, we have to have a running context
 
(gdb) run
 
Starting program: /pfs/data5/home/es/es_es/es_rakeller/C/bug
 
 
Breakpoint 1, main () at bug.c:24
 
(gdb) # Now we may turn on branch tracing in the Intel Processor Trace format.
 
(gdb) record btrace pt
 
(gdb) cont
 
Continuing.
 
 
Program received signal SIGSEGV, Segmentation fault.
 
0x0000000000000007 in ?? ()
 
(gdb) # Just for information check how many instructions and functions have executed.
 
(gdb) info record
 
Active record target: record-btrace
 
Recording format: Intel Processor Trace.
 
Buffer size: 16kB.
 
Recorded 131 instructions in 4 functions (0 gaps) for thread 1 (process 1153077).
 
(gdb) # The last function call history shows the control flow on function level
 
(gdb) record function-call-history
 
1 main
 
2 begin
 
3 main
 
4 loop
 
(gdb) # Even more detail with the instruction counts and line numbers included:
 
(gdb) record function-call-history /ilc
 
1 main inst 1,2 at bug.c:24
 
2 begin inst 3,8 at bug.c:5,7
 
3 main inst 9,10 at bug.c:25
 
4 loop inst 11,131 at bug.c:9,17
 
(gdb) # Disassembles the last 10 instructions (mixed with source code) executed leading to the crash.
 
(gdb) record instruction-history /m
 
bug.c:15 v[k] = i;
 
122 0x0000000000400565 <loop+30>: mov -0x8(%rbp),%eax
 
123 0x0000000000400568 <loop+33>: cltq
 
124 0x000000000040056a <loop+35>: mov -0x4(%rbp),%edx
 
125 0x000000000040056d <loop+38>: mov %edx,-0x10(%rbp,%rax,4)
 
bug.c:13 for(i = 0; i < 8; i++) {
 
126 0x0000000000400571 <loop+42>: addl $0x1,-0x4(%rbp)
 
127 0x0000000000400575 <loop+46>: cmpl $0x7,-0x4(%rbp)
 
128 0x0000000000400579 <loop+50>: jle 0x400554 <loop+13>
 
bug.c:17 }
 
129 0x000000000040057b <loop+52>: nop
 
130 0x000000000040057c <loop+53>: pop %rbp
 
131 0x000000000040057d <loop+54>: ret
 
(gdb) # And another 10 disassembled instructions, nicely showing the control flow
 
(gdb) record instruction-history -
 
112 0x0000000000400571 <loop+42>: addl $0x1,-0x4(%rbp)
 
113 0x0000000000400575 <loop+46>: cmpl $0x7,-0x4(%rbp)
 
114 0x0000000000400579 <loop+50>: jle 0x400554 <loop+13>
 
115 0x0000000000400554 <loop+13>: mov -0x4(%rbp),%eax
 
116 0x0000000000400557 <loop+16>: mov %eax,%edx
 
117 0x0000000000400559 <loop+18>: shr $0x1f,%edx
 
118 0x000000000040055c <loop+21>: add %edx,%eax
 
119 0x000000000040055e <loop+23>: sar %eax
 
120 0x0000000000400560 <loop+25>: add %eax,%eax
 
121 0x0000000000400562 <loop+27>: mov %eax,-0x8(%rbp)
 
</pre>
 
   
 
Honestly, Segmentation Violations are better caught using [[Valgrind]]. However in this case,
 
Line 261: Line 103:
 
The following commands are useful for multithreaded debugging:
 
   
{| width=600px class="wikitable"
 
|-
 
! Command
 
! Description
 
|-
 
| info threads
 
| Shows the status of all existing threads.
 
|-
 
| thread ''num''
 
| Switches to the thread with the number ''num''
 
|}
 
 
'''Example:'''
 
We debug the following program called <kbd>thread_bug.c</kbd> which crashes on execution.
 
<source lang="c">
#include <stdio.h>
#include <pthread.h>
#include <unistd.h> /* sleep() */
pthread_t thread;

void* thread3 (void* d)
{
    int w[2];
    int c, l;

    for(c = 0; c < 8; c++) {
        l = c/2*2; /* should have been l = c/(2*2); */
        w[l] = c;
    }

    return NULL;
}

void* thread2 (void* d)
{
    int v[2];
    int i, k;

    for(i = 0; i < 8; i++) {
        sleep(4);
        k = i/(2*2); /* correct here: k is always 0 or 1 */
        v[k] = i;
    }

    return NULL;
}

int main (){

    pthread_create (&thread, NULL, thread2, NULL);
    pthread_create (&thread, NULL, thread3, NULL);

    /* Thread 1 (the main thread) */
    int count1 = 0;

    while(count1 < 4000) {
        printf("Thread 1: %d\n", count1++);
    }

    pthread_join(thread, NULL); /* joins only the thread created last (thread3) */
    return 0;
}
</source>
 
 
'''Sample GDB thread session:'''
 
<pre>
 
$ gcc -g thread_bug.c -o thread_bug -lpthread
 
$ gdb ./thread_bug
 
[...]
 
Reading symbols from /pfs/data2/home/xx/xxx/xxxx/thread_bug...done.
 
(gdb) break thread3
 
Breakpoint 1 at 0x40060c: file thread_bug.c, line 11.
 
(gdb) break thread2
 
Breakpoint 2 at 0x400650: file thread_bug.c, line 24.
 
(gdb) break main
 
Breakpoint 3 at 0x40069e: file thread_bug.c, line 35.
 
(gdb) run
 
Starting program: /tank/home/doros/.t/thread_bug
 
[Thread debugging using libthread_db enabled]
 
 
Breakpoint 3, main () at thread_bug.c:35
 
35 pthread_create (&thread, NULL, thread2, NULL);
 
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.x86_64
 
(gdb) info threads
 
* 1 Thread 0x7ffff7fe5700 (LWP 28260) main () at thread_bug.c:35
 
(gdb) next
 
[New Thread 0x7ffff7fe3700 (LWP 28303)]
 
36 pthread_create (&thread, NULL, thread3, NULL);
 
(gdb) info threads
 
2 Thread 0x7ffff7fe3700 (LWP 28303) thread2 (d=0x0) at thread_bug.c:24
 
* 1 Thread 0x7ffff7fe5700 (LWP 28260) main () at thread_bug.c:36
 
(gdb) next
 
[Switching to Thread 0x7ffff7fe3700 (LWP 28303)]
 
 
Breakpoint 2, thread2 (d=0x0) at thread_bug.c:24
 
24 for(i = 0; i < 8; i++) {
 
(gdb) next
 
25 sleep(4);
 
(gdb) next
 
[New Thread 0x7ffff77e2700 (LWP 28344)]
 
[Switching to Thread 0x7ffff77e2700 (LWP 28344)]
 
 
Breakpoint 1, thread3 (d=0x0) at thread_bug.c:11
 
11 for(c = 0; c < 8; c++) {
 
(gdb) info threads
 
* 3 Thread 0x7ffff77e2700 (LWP 28344) thread3 (d=0x0) at thread_bug.c:11
 
2 Thread 0x7ffff7fe3700 (LWP 28303) 0x000000362f8accdd in nanosleep () from /lib64/libc.so.6
 
1 Thread 0x7ffff7fe5700 (LWP 28260) 0x000000362f8725db in _IO_new_file_overflow () from /lib64/libc.so.6
 
(gdb) thread 2
 
[Switching to thread 2 (Thread 0x7ffff7fe3700 (LWP 28303))]#0 0x000000362f8accdd in nanosleep () from /lib64/libc.so.6
 
(gdb) next
 
Single stepping until exit from function nanosleep,
 
which has no line number information.
 
[Switching to Thread 0x7ffff77e2700 (LWP 28344)]
 
 
Breakpoint 1, thread3 (d=0x0) at thread_bug.c:11
 
11 for(c = 0; c < 8; c++) {
 
(gdb) thread 2
 
[Switching to thread 2 (Thread 0x7ffff7fe3700 (LWP 28303))]#0 0x000000362f8acce9 in nanosleep () from /lib64/libc.so.6
 
(gdb) next
 
Single stepping until exit from function nanosleep,
 
which has no line number information.
 
0x000000362f8acb50 in sleep () from /lib64/libc.so.6
 
(gdb) info threads
 
3 Thread 0x7ffff77e2700 (LWP 28344) thread3 (d=0x0) at thread_bug.c:11
 
* 2 Thread 0x7ffff7fe3700 (LWP 28303) 0x000000362f8acb50 in sleep () from /lib64/libc.so.6
 
1 Thread 0x7ffff7fe5700 (LWP 28260) 0x000000362f8476f0 in vfprintf () from /lib64/libc.so.6
 
(gdb) thread 3
 
[Switching to thread 3 (Thread 0x7ffff77e2700 (LWP 28344))]#0 thread3 (d=0x0) at thread_bug.c:11
 
11 for(c = 0; c < 8; c++) {
 
(gdb) next
 
12 l = c/2*2; /* should have been l = c/(2*2); */
 
(gdb) watch (k >= 2)
 
No symbol "k" in current context.
 
(gdb) watch (l >= 2)
 
Hardware watchpoint 4: (l >= 2)
 
(gdb) continue
 
Continuing.
 
Thread 1: 0
 
Thread 1: 1
 
Thread 1: 2
 
Thread 1: 3
 
Thread 1: 4
 
[...]
 
Hardware watchpoint 4: (l >= 2)
 
 
Old value = 0
 
New value = 1
 
thread3 (d=0x0) at thread_bug.c:13
 
13 w[l] = c;
 
(gdb) print l
 
$1 = 2
 
(gdb) print c
 
$2 = 2
 
(gdb) quit
 
</pre>
 
= Disassembling =
 
{| width=600px class="wikitable"
 
|-
 
! Command
 
! Description
 
|-
 
| info functions
 
| Shows names and data types of all defined functions.
 
|-
 
| info line ''function''

| Maps source lines to memory addresses (and back).

|-

| disassemble ''function''

| Disassembles ''function'' (or a function fragment).
 
|}
 
 
 
'''Sample GDB disassembling session:'''
 
<pre>
 
$ gcc -Wall -O0 -g bug.c -o bug
 
$ gdb ./bug
 
[...]
 
(gdb) info functions
 
All defined functions:
 
 
File bug.c:
 
void begin();
 
void end();
 
void loop();
 
int main();
 
 
Non-debugging symbols:
 
0x0000000000400370 _init
 
0x00000000004003a0 __libc_start_main@plt
 
0x00000000004003b0 __gmon_start__@plt
 
0x00000000004003c0 _start
 
0x00000000004003f0 deregister_tm_clones
 
0x0000000000400430 register_tm_clones
 
0x0000000000400470 __do_global_dtors_aux
 
0x0000000000400490 frame_dummy
 
0x0000000000400540 __libc_csu_init
 
0x00000000004005b0 __libc_csu_fini
 
0x00000000004005b4 _fini
 
</pre>
 
'''Sample GDB disassembling session:'''
 
<pre>
 
(gdb) disassemble main
 
Dump of assembler code for function main:
 
0x000000000040050f <+0>: push %rbp
 
0x0000000000400510 <+1>: mov %rsp,%rbp
 
0x0000000000400513 <+4>: mov $0x0,%eax
 
0x0000000000400518 <+9>: callq 0x4004b6 <begin>
 
0x000000000040051d <+14>: mov $0x0,%eax
 
0x0000000000400522 <+19>: callq 0x4004c7 <loop>
 
0x0000000000400527 <+24>: mov $0x0,%eax
 
0x000000000040052c <+29>: callq 0x4004fe <end>
 
0x0000000000400531 <+34>: mov $0x0,%eax
 
0x0000000000400536 <+39>: pop %rbp
 
0x0000000000400537 <+40>: retq
 
End of assembler dump.
 
</pre>
 
'''Sample GDB disassembling session:'''
 
<pre>
 
(gdb) disassemble /m main
 
Dump of assembler code for function main:
 
23 int main() {
 
0x000000000040050f <+0>: push %rbp
 
0x0000000000400510 <+1>: mov %rsp,%rbp
 
 
24 begin();
 
0x0000000000400513 <+4>: mov $0x0,%eax
 
0x0000000000400518 <+9>: callq 0x4004b6 <begin>
 
 
25 loop();
 
0x000000000040051d <+14>: mov $0x0,%eax
 
0x0000000000400522 <+19>: callq 0x4004c7 <loop>
 
 
26 end();
 
0x0000000000400527 <+24>: mov $0x0,%eax
 
0x000000000040052c <+29>: callq 0x4004fe <end>
 
 
27
 
28 return 0;
 
0x0000000000400531 <+34>: mov $0x0,%eax
 
 
29 }
 
0x0000000000400536 <+39>: pop %rbp
 
0x0000000000400537 <+40>: retq
 
 
End of assembler dump.
 
</pre>
 
'''Sample GDB disassembling session:'''
 
<pre>
 
(gdb) disassemble /m loop
 
Dump of assembler code for function loop:
 
9 void loop() {
 
0x00000000004004c7 <+0>: push %rbp
 
0x00000000004004c8 <+1>: mov %rsp,%rbp
 
 
10 int v[2];
 
11 int i, k;
 
12
 
13 for(i = 0; i < 8; i++) {
 
0x00000000004004cb <+4>: movl $0x0,-0x4(%rbp)
 
0x00000000004004d2 <+11>: jmp 0x4004f5 <loop+46>
 
0x00000000004004f1 <+42>: addl $0x1,-0x4(%rbp)
 
0x00000000004004f5 <+46>: cmpl $0x7,-0x4(%rbp)
 
0x00000000004004f9 <+50>: jle 0x4004d4 <loop+13>
 
 
14 k = i/2*2; /* should have been k = i/(2*2); */
 
0x00000000004004d4 <+13>: mov -0x4(%rbp),%eax
 
0x00000000004004d7 <+16>: mov %eax,%edx
 
0x00000000004004d9 <+18>: shr $0x1f,%edx
 
0x00000000004004dc <+21>: add %edx,%eax
 
0x00000000004004de <+23>: sar %eax
 
0x00000000004004e0 <+25>: add %eax,%eax
 
0x00000000004004e2 <+27>: mov %eax,-0x8(%rbp)
 
 
15 v[k] = i;
 
0x00000000004004e5 <+30>: mov -0x8(%rbp),%eax
 
0x00000000004004e8 <+33>: cltq
 
0x00000000004004ea <+35>: mov -0x4(%rbp),%edx
 
0x00000000004004ed <+38>: mov %edx,-0x10(%rbp,%rax,4)
 
 
16 }
 
17 }
 
0x00000000004004fb <+52>: nop
 
0x00000000004004fc <+53>: pop %rbp
 
0x00000000004004fd <+54>: retq
 
 
End of assembler dump.
 
</pre>
 
'''Sample objdump disassembling session:'''
 
<pre>
 
$ objdump -S -D bug
 
[...]
 
00000000004004c7 <loop>:
 
 
void loop() {
 
4004c7: 55 push %rbp
 
4004c8: 48 89 e5 mov %rsp,%rbp
 
int v[2];
 
int i, k;
 
 
for(i = 0; i < 8; i++) {
 
4004cb: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
 
4004d2: eb 21 jmp 4004f5 <loop+0x2e>
 
k = i/2*2; /* should have been k = i/(2*2); */
 
4004d4: 8b 45 fc mov -0x4(%rbp),%eax
 
4004d7: 89 c2 mov %eax,%edx
 
4004d9: c1 ea 1f shr $0x1f,%edx
 
4004dc: 01 d0 add %edx,%eax
 
4004de: d1 f8 sar %eax
 
4004e0: 01 c0 add %eax,%eax
 
4004e2: 89 45 f8 mov %eax,-0x8(%rbp)
 
v[k] = i;
 
4004e5: 8b 45 f8 mov -0x8(%rbp),%eax
 
4004e8: 48 98 cltq
 
4004ea: 8b 55 fc mov -0x4(%rbp),%edx
 
4004ed: 89 54 85 f0 mov %edx,-0x10(%rbp,%rax,4)
 
 
void loop() {
 
int v[2];
 
int i, k;
 
 
for(i = 0; i < 8; i++) {
 
4004f1: 83 45 fc 01 addl $0x1,-0x4(%rbp)
 
4004f5: 83 7d fc 07 cmpl $0x7,-0x4(%rbp)
 
4004f9: 7e d9 jle 4004d4 <loop+0xd>
 
k = i/2*2; /* should have been k = i/(2*2); */
 
v[k] = i;
 
}
 
}
 
4004fb: 90 nop
 
4004fc: 5d pop %rbp
 
4004fd: c3 retq
 
[...]
 
</pre>
 
   
 
[[Category:debugger software]][[Category:bwUniCluster]]
 

Revision as of 16:57, 15 December 2021

The main documentation is available via module help <category>/<softwarename> on the cluster. Most software modules for applications provide working example batch scripts.


{| width=600px class="wikitable"
|-
! Description
! Content
|-
| module load
| devel/gdb
|-
| License
| GPL
|-
| Citing
| n/a
|-
| Links
| Homepage, Documentation, Wiki, Mailing lists
|-
| Graphical Interface
| No
|-
| Included modules
| icc, icpc, ifort, idb
|}


1 Description

The GNU Debugger (GDB) is the standard debugger for serial programs, although it can also be used for parallel and even distributed programs with a small number of processes. Intel used to ship its own debugger, idb; it has been deprecated in favor of gdb-ia, Intel's own port of GDB.

2 Basic commands

The code you want to debug should be compiled with the -g option. GCC's default optimization level is already -O0, but makefiles and build scripts frequently add -O2 or similar, and optimized code is reordered and partly eliminated (for example by dead-code elimination), which obscures the order of execution while debugging. It is therefore recommended to turn off optimization explicitly with the -O0 parameter. To start a debug session, run GDB with the path of the program as its argument:

$ gdb ./example

GDB presents a prompt at which you can enter commands. The most important commands are listed below.

{| width=600px class="wikitable"
|-
! Command
! Description
|-
| help ''cmd''
| Show help for command ''cmd''.
|-
| break ''func''
| Set a breakpoint at function ''func''.
|-
| run
| Start program.
|-
| next
| Go to next program line. Do not enter functions.
|-
| step
| Go to next program line. Enter functions.
|-
| list
| Show the surrounding source code of the currently processed line.
|-
| print ''expr''
| Print the value of the expression ''expr''.
|-
| display ''expr''
| Display the value of the expression ''expr'' every time the program stops.
|-
| watch ''expr''
| Stop when the value of the expression ''expr'' changes.
|-
| continue
| Continue execution until a breakpoint or a watchpoint is hit.
|-
| backtrace
| Print a list of functions that are currently active.
|-
| quit
| Exit GDB.
|}


3 Branch record tracing

Starting with GDB 10.1, the debugger has been installed with Intel Processor Trace support (libipt), which allows recording and replaying of process state. This makes it possible to disassemble previously executed instructions, check which functions were called, and trace branches.

Honestly, segmentation violations are usually better caught using Valgrind. In this case, however, Valgrind would not have helped: the loop writes past the end of v, an array of 2 ints on the stack, and overwrites the return address, so that execution continues at instruction pointer 0x07.

More information is available in GDB's feature documentation.


4 Core dumps

When a program crashes, a file called a core dump can be created that contains the state of the program at the time of the crash. This is turned off by default because core dumps can become quite large. To turn it on, raise the core file size limit with ulimit, for example:

$ ulimit -c unlimited

Every time your program crashes a new file called core.xxx (where xxx is a number) will be created in the directory from which you started the executable. You can call gdb to examine your core dump using the following command (assuming your program is called ex):

$ gdb ./ex core.xxx

Now you can print a backtrace to check in which function the error happened and what values the parameters had. Additionally you can examine the values of your variables to reproduce the error.


5 Multithreaded debugging

GDB can also be useful for multithreaded applications, for example programs parallelized with OpenMP. By going through each thread separately you can see more clearly what is going on and check the computation step by step. The following commands are useful for multithreaded debugging: