Allinea DDT Debugger

Dan Mazur, McGill HPC
[email protected]
March 26, 2014
2014-03-26
Outline
● Introduction and motivation
● Guillimin login and DDT configuration
● Compiling for a debugger
● Launching debugging jobs
● Controlling program execution
– Stepping, breakpoints, watchpoints, tracepoints
● Viewing variable contents
● Debugging parallel programs
● Effective debugging strategies
● Practice
Why use a debugger?
We want to debug efficiently to reduce the total development time.
[Figure: two timelines, "Inefficient" and "More efficient", each split into "Coding" and "Debugging" segments along a "Time" axis]
Why a debugger?
● Greatly reduces the amount of time required for debugging
– For some projects, the time spent debugging can be greater than the time spent coding!
– If you program, you should learn how to debug effectively
● Why not just use printf() statements?
– May create much more data than is useful (parallel applications)
– Debugging output is fixed at compile time
– Frequent task switching (thinking, editing, compiling, running, repeat)
– Might introduce bugs (constantly changing and recompiling code)
– Might forget to remove debugging output
– Slower (typing, compiling, runtime)
– Misleading (buffered output; changes timing and memory)
What does it do?
● All symbolic debuggers have these basic features:
– Step through code line-by-line
– Inspect variables
– Run and pause the code according to specified conditions (e.g. line numbers, function calls, variable changes)
– Catch signals such as SIGSEGV (segmentation fault)
What does it not do?
● Debuggers do not:
– Turn defective code into working code
– Find or fix bugs
– Understand the programmer's intentions
● Debugger
– Better name: 'Program Inspector'
Exercise 1: Login and configuration
● Log in to Guillimin with X11 forwarding
– $ ssh -X class##@guillimin-p2.hpc.mcgill.ca
– ## is the same number as in your csuser## account
– The password is written on the slips of paper handed out by the instructor
● Copy the workshop materials to your home directory
– $ cp -R /software/workshop/ddt/* ~/.
● Load the environment modules
– $ module load DDT ifort_icc openmpi
● Launch DDT
– $ ddt &
Compiling for debuggers
● Most programs
– Are stored in binary format
– Are optimized by the compiler
– Are no longer described by the original source code
● Programs for debugging
– Are compiled with a special compiler flag (-g)
– Keep a link to their source code
– Debugger functions are limited without this link
Launching a Job
Exercise 2: Compile and launch
● Compile the serial program serial.c for debugging
● Launch the program in DDT with the integer argument 0 (zero)
● Please follow along as I discuss the various functions
Template File
● DDT uses a template file for job submission
– default: /sb/software/tools/ddt/templates/guillimin.qtf
● To have more control over the msub script
– Copy this file to your ~/.allinea directory
– Modify the file
– Point DDT to your own copy in Options > Job Submission
[Figure: screenshot with labelled areas: Source Code, Variable Monitoring, File Information, Execution Controls, Process Focus, Process Information]
Breakpoints
● Breakpoints indicate where the program should pause execution for inspection
– At a certain line of code or function call
● Select a line of code, right click, select 'add breakpoint'
– Or, click on the 'add breakpoint' icon
● Right click on a breakpoint to select 'edit breakpoint' or 'delete breakpoint'
– Conditional breakpoints (break 'if' some condition)
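
A conditional breakpoint is easiest to appreciate on a long loop. The sketch below is a hypothetical function (it is not taken from serial.c or any other workshop file). Assuming it is compiled with -g and called with n = 1000, a breakpoint on the marked line with the condition i == 500 should pause once, rather than on every one of the 1000 iterations.

```c
/* Hypothetical example: sum of i*i for i = 0 .. n-1. */
long sum_of_squares(long n)
{
    long total = 0;
    long i;

    for (i = 0; i < n; i++) {
        total += i * i;   /* set the breakpoint here, with condition: i == 500 */
    }
    return total;
}
```

Without the condition, reaching the same iteration would mean pressing Play hundreds of times.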
[Figure: Breakpoints screenshot with callouts "Where to break?", "Which processes?", "Useful for inspecting large loops", "Conditional breakpoints"]
Watchpoints
● Watchpoints pause execution when a certain memory location is changed
● Right click on a variable in the variable monitoring area to set a watchpoint
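
As an illustration of when a watchpoint helps, the hypothetical code below changes a value through a pointer, so the line that modifies it is not obvious from the caller. The struct and function names are invented for this sketch. Assuming a build with -g, a watchpoint on the balance field should pause execution each time apply() writes to it.

```c
/* Hypothetical example: an account whose balance is modified through a pointer. */
struct account {
    long balance;
    int  n_updates;
};

/* Apply a signed amount to the account. */
static void apply(struct account *acct, long amount)
{
    acct->balance += amount;   /* a watchpoint on the balance triggers on this write */
    acct->n_updates++;
}

/* Apply every entry of 'amounts' in order and return the final balance. */
long process(struct account *acct, const long *amounts, int n)
{
    int i;

    for (i = 0; i < n; i++)
        apply(acct, amounts[i]);
    return acct->balance;
}
```

A breakpoint would need to be placed on every line that might touch the value; a watchpoint follows the memory location itself.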
Tracepoints
● Log variables without pausing execution
● Like printf(), but more flexible
● Right-click on a line of code to set
● Right-click again to edit or delete
[Figure: Tracepoint screenshot with callouts "When to log?", "Which processes?", "Which variables?", "Useful for inspecting large loops", "Conditional tracepoints"]
Stack Trace
● Displays the sequence of function calls that led to the current line
● How did we get here?
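
A recursive function gives a simple picture of what a stack trace contains. The functions below are a hypothetical sketch, not part of the workshop files. Assuming an unoptimized build with -g, pausing on the marked line while choose(5, 2) is being evaluated should show one factorial() frame per pending call, with choose() and its caller underneath.

```c
/* Hypothetical example: recursive factorial; each call adds one stack frame. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
        return 1;              /* pause here to see one frame per pending call */
    return n * factorial(n - 1);
}

/* Number of ways to choose k items from n (assumes k <= n and small n). */
unsigned long choose(unsigned int n, unsigned int k)
{
    return factorial(n) / (factorial(k) * factorial(n - k));
}
```

Selecting a frame in the stack trace shows the value of n that belongs to that particular call.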
Execution Controls
[Figure: toolbar buttons labelled A to G]
A - Play (F9), Continue until next breakpoint or pause
B - Pause (F10), Interrupt program execution
C - Add breakpoint
D - Step into (F5), Enter next function and pause inside
E - Step over (F8), Execute the next function and pause outside
F - Step out (F6), Exit the current function and pause outside
G - Run to line, Play but pause on a certain line
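
The difference between the three stepping controls shows up on any line that calls a function. The two functions below are a hypothetical sketch (names invented for illustration); the comments mark where each control would typically be used, assuming a build with -g.

```c
/* Hypothetical example: a small function to step into, over, or out of. */
static double square(double x)
{
    double result = x * x;     /* Step out (F6) from here returns to norm_squared() */
    return result;
}

/* Squared length of a 2D vector. */
double norm_squared(double x, double y)
{
    double xx = square(x);     /* Step into (F5) enters square(); Step over (F8) runs it whole */
    double yy = square(y);
    return xx + yy;
}
```

Step over is the usual choice for functions that are already trusted; Step into is for the one under suspicion.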
Exercise 3: Basic Functions
● Try each basic function at least once in serial.c and understand how it works
– Breakpoints
– Watchpoints
– Tracepoints
– Play
– Step into
– Step over
– Step out
Exercise 4: Inspecting Variables
● Compile variables.c and launch with DDT
– $ icc -g -o variables variables.c
● Inspect the contents of each variable
● Change the contents of one of the variables during execution (array, for example)
[Figure: Exercise 4 screenshot with callouts "Evaluate combinations of variables" and "Change a variable"]
Exercise 5: Multithreaded Debugging
● Compile omp_mm.c/.f90 with OpenMP for debugging
– $ icc -g -openmp -o mm omp_mm.c
● Launch with DDT
– Make sure to select OpenMP with 4 threads
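
The contents of omp_mm.c are not reproduced here, so the function below is only a hypothetical stand-in for a multithreaded loop. It assumes a compiler with OpenMP support (for example icc -g -openmp as above, or gcc -g -fopenmp). Inside the loop, every thread works on its own private copy of total and i, so the values a debugger shows depend on which thread is selected.

```c
/* Hypothetical example: sum of 1..n using an OpenMP parallel loop with a reduction. */
long parallel_sum(long n)
{
    long total = 0;
    long i;

    #pragma omp parallel for reduction(+:total)
    for (i = 1; i <= n; i++) {
        total += i;            /* each thread sees its own private 'total' and 'i' here */
    }
    return total;              /* partial sums have been combined by this point */
}
```

If the OpenMP flag is left out, the pragma is ignored and the loop simply runs on one thread with the same result.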
Multi-threaded Debugging
● Select which thread you are debugging (focus)
– Variable output will be thread specific
– Program execution can be per thread or all together
● Stack trace shows you where each thread is in the program execution
Multi-process Debugging
● Compile pi.c
– $ mpicc -g -o pi pi.c
● Launch in DDT with 4 MPI processes
Process Groups
● Create groups of threads
● Step by group
Passwordless Login
● DDT can attach to running processes
– Requires communication between nodes
– Asks for a password each time
● We will set up ssh keys to avoid the passwords
Setting up ssh keys
● Generate a pair (public, private) of keys; use the default location
[class59@lg-1r17-n01 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/class59/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/class59/.ssh/id_rsa.
Your public key has been saved in /home/class59/.ssh/id_rsa.pub.
The key fingerprint is:
6a:c1:fe:42:dd:43:56:78:4c:c3:67:45:6d:69:e6:97 class59@lg-1r17-n01
The key's randomart image is:
...
● Copy your public key into the authorized_keys file
[class59@lg-1r17-n01 .ssh]$ cat id_rsa.pub >> ~/.ssh/authorized_keys
● Save your passphrase for this session
[class59@lg-1r17-n01 .ssh]$ eval $(ssh-agent -s)
[class59@lg-1r17-n01 .ssh]$ ssh-add ~/.ssh/id_rsa
Enter passphrase for /home/class59/.ssh/id_rsa:
Identity added: /home/class59/.ssh/id_rsa (/home/class59/.ssh/id_rsa)
[class59@lg-1r17-n01 .ssh]$ ddt &
Exercise 6: Attaching
● Compile and launch the job mpi_hang
– $ mpicc -g -o mpi_hang mpi_hang.c
– $ msub -q class mpi_hang.sh
● Bug! This program hangs until it runs out of walltime
● When the job is running, put the list of nodes into ~/.allinea/nodes (e.g. JobID=123456)
– $ ddt_gen_nodelist 123456
● (Re-)launch DDT
– $ ddt &
● Select “Attach to a running program”
● The goal is to inspect the bug; actually fixing it is a bonus
Exercise 7: Signals
● Compile and run cstart.c on the command line
– $ icc -g -o cstart cstart.c
– $ ./cstart one two three
● Will crash (Segmentation Fault) if any of the command line arguments are:
– crash
– memcrash
– overflow
● Inspect the code for each case
Effective Debugging
● 0: Planning and proper development strategies (future workshop?)
● 1: Know exactly i) what it is supposed to do, ii) what it actually does, and iii) how you know that it is broken
● 2: Look for simple answers:
– Is your computer plugged in? turned on?
– Correct version?
– Correct input data?
● 3: Focus on the problem
– Isolate the faulty subsystem precisely
– Reduce the problem to the simplest possible test case
● 4: Scientific Method
– Invent a hypothesis to explain the fault
– Design an experiment to test the hypothesis
– Repeat
● 5: Change one variable at a time (experimental controls)
● 6: Keep a good lab book (what results did each experiment produce?)
● 7: Avoid:
– Guesswork (do the experiments)
– Random trial and error
– Laying blame (libraries, compilers, OS, collaborators)
– Band-aids and workarounds
– Quitting as soon as it's 'working' (If you didn't fix it, it isn't fixed)
MPI Debugging
● Common MPI Deadlocks
– Only one process calls a collective communication function
– All processes call a blocking receive before any process sends
– A message never gets sent
– A process tries to receive data from itself
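
The second deadlock in the list (every process blocks in a receive before anyone sends) can be sketched as follows. This is a hypothetical program, exchange.c, not the workshop's mpi_hang.c; it assumes exactly two MPI processes. Compiling with mpicc -g -DDEADLOCK selects the hanging version, which is a convenient target for attaching a debugger; without -DDEADLOCK the exchange uses MPI_Sendrecv and completes.

```c
/* exchange.c: hypothetical two-process exchange, with an optional deadlock. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, partner;
    int send_value, recv_value = -1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {
        if (rank == 0)
            fprintf(stderr, "run with exactly 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    partner = 1 - rank;          /* rank 0 talks to rank 1 and vice versa */
    send_value = 100 + rank;

#ifdef DEADLOCK
    /* Both ranks block in MPI_Recv, so neither ever reaches MPI_Send. */
    MPI_Recv(&recv_value, 1, MPI_INT, partner, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    MPI_Send(&send_value, 1, MPI_INT, partner, 0, MPI_COMM_WORLD);
#else
    /* MPI_Sendrecv pairs the send with the receive, so the exchange completes. */
    MPI_Sendrecv(&send_value, 1, MPI_INT, partner, 0,
                 &recv_value, 1, MPI_INT, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
#endif

    printf("rank %d received %d\n", rank, recv_value);
    MPI_Finalize();
    return 0;
}
```

In the hanging version, the stack trace of each process should end inside MPI_Recv, which is the signature of this kind of deadlock.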
● Other common MPI bugs
– Send/Recv type mismatch (e.g. MPI_INT matched with MPI_FLOAT)
– Mixed-up parameters to MPI functions
● Collective communications are less error-prone than point-to-point
● Start with the smallest possible number of processes (i.e. 1) and scale up slowly
● Start with the smallest possible problem size and scale up slowly
● printf output is chronological for a process, but not between processes
● Compare sent messages and received messages
Getting Help
● http://www.allinea.com/resources/
● [email protected]
Exercise 8: Debugging Practice
● Please debug as many of the bugged practice problems as time allows
● If you have your own code that you would like to debug or inspect, you may do that now
● Practice problems:
– InsLotsOfErrors.c (http://heather.cs.ucdavis.edu/~matloff/debug.html)
– Lnk.c (http://heather.cs.ucdavis.edu/~matloff/debug.html)
– programs in mpi_bug folder (Lawrence Livermore MPI Tutorial)
– programs in omp_bug folder (Lawrence Livermore OpenMP Tutorial)