Allinea DDT Debugger
Dan Mazur, McGill HPC
[email protected] [email protected]
March 26, 2014

Outline
● Introduction and motivation
● Guillimin login and DDT configuration
● Compiling for a debugger
● Launching debugging jobs
● Controlling program execution
  – Stepping, breakpoints, watchpoints, tracepoints
● Viewing variable contents
● Debugging parallel programs
● Effective debugging strategies
● Practice

Why use a debugger?
We want to debug efficiently to reduce the total development time:
[Figure: two bars along a Time axis, labelled 'Inefficient' and 'More efficient', each divided into Coding and Debugging]

Why a debugger?
● Greatly reduces the amount of time required for debugging
  – For some projects, the time spent debugging can be greater than the time spent coding!
  – If you program, you should learn how to debug effectively
● Why not just use printf() statements?
  – May create much more data than is useful (parallel applications)
  – debugging is fixed at compile time
  – frequent task switching (thinking, editing, compiling, running, repeat)
  – might introduce bugs (constantly changing/recompiling code)
  – might forget to remove debugging output
  – slower (typing, compiling, runtime)
  – misleading (buffered output, changes timing and memory)

What does it do?
● All symbolic debuggers have these basic features:
  – Step through code line-by-line
  – Inspect variables
  – Run and pause the code according to specified conditions (e.g. line numbers, function calls, variable changes)
  – Catches signals such as SIGSEGV (Segmentation Fault)

What does it not do?
● Debuggers do not:
  – Turn defective code into working code
  – Find or fix bugs
  – Understand the programmers' intentions
● Debugger
  – Better name: 'Program Inspector'

Exercise 1: Login and configuration
● Login to Guillimin with X11 forwarding
  – $ ssh -X class##@guillimin-p2.hpc.mcgill.ca
  – ## same as csuser## numbers
  – password is written on slips of paper handed out by instructor
● Copy the workshop materials to your home directory
  – $ cp -R /software/workshop/ddt/* ~/.
● Load the environment modules
  – $ module load DDT ifort_icc openmpi
● Launch DDT
  – $ ddt &

Compiling for debuggers
● Most programs
  – Binary format
  – Optimized by compiler
  – No longer described by original code
● Programs for debugging
  – Special compiler flag (-g)
  – Programs keep a link to their source code
  – Debugger functions are limited without this link

Launching a Job
[Image-only slides]

Exercise 2: Compile and launch
● Compile the serial program serial.c for debugging
● Launch the program in DDT with the integer argument 0 (zero)
● Please follow along as I discuss the various functions

Template File
● DDT uses a template file for job submission
  – default: /sb/software/tools/ddt/templates/guillimin.qtf
● To have more control over msub script
  – Copy this file to your ~/.allinea directory
  – Modify the file
  – Point DDT to your own copy in options>job submission

[Figure: annotated DDT window. Labelled areas: Source Code, Variable Monitoring, File Information, Execution Controls, Process Focus, Process Information]

Breakpoints
● Breakpoints indicate where the program should pause execution for inspection
  – At a certain line of code or function call
● Select a line of code, right click, select 'add breakpoint'
  – Or, click on the 'add breakpoint' icon
● Right click on a breakpoint to select 'edit breakpoint' or 'delete breakpoint'
  – Conditional breakpoints (break 'if' some condition)

Breakpoints
● Where to break?
● Which processes?
● Conditional breakpoints: useful for inspecting large loops

Watchpoints
● Watchpoints pause execution when a certain memory location is changed
● Right click on a variable in the variable monitoring area to set a watchpoint

Tracepoints
● Log variables without pausing execution
● Like printf(), but more flexible
● Right-click on a line of code to set
● Right-click again to edit or delete

Tracepoint
● When to log?
● Which processes?
● Which variables?
● Conditional tracepoints: useful for inspecting large loops

Stack Trace
● Displays sequence of function calls
● How did we get here?

Execution Controls
[Figure: toolbar with buttons labelled A to G]
A - Play (F9): continue until next breakpoint or pause
B - Pause (F10): interrupt program execution
C - Add breakpoint
D - Step into (F5): enter next function and pause inside
E - Step over (F8): execute the next function and pause outside
F - Step out (F6): exit the current function and pause outside
G - Run to line: play but pause on a certain line

Exercise 3: Basic Functions
● Try each basic function at least once in serial.c and understand how it works
  – Breakpoints
  – Watchpoints
  – Tracepoints
  – Play
  – Step into
  – Step over
  – Step out

Exercise 4: Inspecting Variables
● Compile variables.c and launch with ddt
  – $ icc -g -o variables variables.c
● Inspect the contents of each variable
● Change the contents of one of the variables during execution (array, for example)
[Image-only slides; the last one is annotated:]
● Evaluate combinations of variables
● Change a variable

Exercise 5: Multithreaded Debugging
● Compile omp_mm.c/.f90 with openmp for debugging
  – $ icc -g -openmp -o mm omp_mm.c
● Launch with DDT
  – Make sure to select OpenMP with 4 threads
Multi-threaded Debugging
● Select which thread you are debugging (focus)
  – Variable output will be thread specific
  – Program execution can be per thread or all together
● Stack trace shows you where each thread is in the program execution

Multi-process Debugging
● Compile pi.c
  – $ mpicc -g -o pi pi.c
● Launch in DDT with 4 mpi processes

Process Groups
● Create groups of threads
● Step by group

Passwordless Login
● DDT can attach to running processes
  – requires communication between nodes
  – asks for password each time
● We will set up ssh keys to avoid the passwords

Setting up ssh keys
● Generate a pair (public, private) of keys
● Use default location
● Copy your public key into authorized_keys file
● Save your passphrase for this session

    [class59@lg-1r17-n01 ~]$ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/class59/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/class59/.ssh/id_rsa.
    Your public key has been saved in /home/class59/.ssh/id_rsa.pub.
    The key fingerprint is:
    6a:c1:fe:42:dd:43:56:78:4c:c3:67:45:6d:69:e6:97 class59@lg-1r17-n01
    The key's randomart image is:
    ...
    [class59@lg-1r17-n01 .ssh]$ cat id_rsa.pub > ~/.ssh/authorized_keys
    [class59@lg-1r17-n01 .ssh]$ eval $(ssh-agent -s)
    [class59@lg-1r17-n01 .ssh]$ ssh-add ~/.ssh/id_rsa
    Enter passphrase for /home/class59/.ssh/id_rsa:
    Identity added: /home/class59/.ssh/id_rsa (/home/class59/.ssh/id_rsa)
    [class59@lg-1r17-n01 .ssh]$ ddt &

Exercise 6: Attaching
● Compile and launch the job mpi_hang
  – $ mpicc -g -o mpi_hang mpi_hang.c
  – $ msub -q class mpi_hang.sh
● Bug! This program hangs until it runs out of walltime
● When the job is running, put the list of nodes into ~/.allinea/nodes (e.g.
  JobID=123456)
  – $ ddt_gen_nodelist 123456
● (Re-)launch DDT
  – $ ddt &
● Select “Attach to a running program”
● The goal is to inspect the bug; actually fixing it is a bonus

Exercise 7: Signals
● Compile and run cstart.c on the command line
  – $ icc -g -o cstart cstart.c
  – $ ./cstart one two three
● Will crash (Segmentation Fault) if any of the command line arguments are:
  – crash
  – memcrash
  – overflow
● Inspect the code for each case

Effective Debugging
● 0: Planning and proper development strategies (future workshop?)
● 1: Know exactly i) what it is supposed to do, ii) what it actually does, and iii) how you know that it is broken
● 2: Look for simple answers:
  – Is your computer plugged in? turned on?
  – Correct version?
  – Correct input data?
● 3: Focus on the problem
  – Isolate the faulty subsystem precisely
  – Reduce the problem to the simplest possible test case
● 4: Scientific Method
  – Invent a hypothesis to explain fault
  – Design an experiment to test the hypothesis
  – Repeat
● 5: Change one variable at a time (experimental controls)
● 6: Keep a good lab book (what results did each experiment produce?)
● 7: Avoid:
  – Guesswork (do the experiments)
  – Random trial and error
  – Laying blame (libraries, compilers, OS, collaborators)
  – Band-aids and workarounds
  – Quitting as soon as it's 'working' (If you didn't fix it, it isn't fixed)

MPI Debugging
● Common MPI Deadlocks
  – Only one process calls a collective communication function
  – All processes block in a receive before any process sends
  – A message never gets sent
  – Process tries to receive data from itself
● Other common MPI bugs
  – Send/Recv type mismatch (e.g.
    MPI_INT matched with MPI_FLOAT)
  – Mixed-up parameters to MPI functions

MPI Debugging
● Collective communications are less error-prone than point-to-point
● Start with smallest possible number of processes (i.e. 1) and scale up slowly
● Start with smallest possible problem size and scale up slowly
● printf output is chronological for a process, but not between processes
● Compare sent messages and received messages

Getting Help
● http://www.allinea.com/resources/
● [email protected]

Exercise 8: Debugging Practice
● Please debug as many of the bugged practice problems as time allows
● If you have your own code you would like to debug or inspect, you may do that now
● Practice problems:
  – InsLotsOfErrors.c (http://heather.cs.ucdavis.edu/~matloff/debug.html)
  – Lnk.c (http://heather.cs.ucdavis.edu/~matloff/debug.html)
  – programs in mpi_bug folder (Lawrence Livermore MPI Tutorial)
  – programs in omp_bug folder (Lawrence Livermore OpenMP Tutorial)
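
A warm-up target for practice
For trying breakpoints, watchpoints and tracepoints on something smaller than the practice problems, a short loop like the one below may be enough. This is only a sketch: the function index_of_max, the variable names and the suggested condition i == 500 are illustrative assumptions and are not part of the workshop materials.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical practice routine (not from the workshop files).
 * Returns the index of the first largest element of values[0..n-1].
 * The caller must pass a non-NULL array with n >= 1. */
size_t index_of_max(const double *values, size_t n)
{
    size_t best = 0;   /* candidate for a watchpoint: changes only when a new maximum is found */
    size_t i;

    assert(values != NULL && n >= 1);

    for (i = 1; i < n; i++) {
        /* candidate line for a conditional breakpoint, e.g. break if i == 500 */
        if (values[i] > values[best]) {
            /* candidate line for a tracepoint logging i and values[i] */
            best = i;
        }
    }
    return best;
}
```

Called from a small main of your own with a large array and compiled with -g as in the earlier exercises, the loop gives the conditional breakpoint something to skip over, while the watchpoint and tracepoint fire only on the iterations where the maximum changes.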
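
A sketch of send/receive ordering
Returning to the deadlock listed under Common MPI Deadlocks where every process blocks in a receive before any process sends: one common way to avoid it in a pairwise exchange is to let rank parity decide who sends first. The sketch below models only that ordering decision in plain C and does not call MPI. The names exchange_partner, order_for_rank and pair_orders_are_complementary are hypothetical, and pairing the ranks as (0,1), (2,3), ... is an assumption made for the example.

```c
#include <assert.h>

/* Which blocking call a rank issues first in a pairwise exchange. */
typedef enum { SEND_FIRST, RECV_FIRST } exchange_order;

/* Hypothetical helper: partner of a rank when ranks are paired (0,1), (2,3), ... */
int exchange_partner(int rank)
{
    assert(rank >= 0);
    return rank ^ 1;   /* flips the lowest bit: 0<->1, 2<->3, ... */
}

/* Even ranks send first, odd ranks receive first. */
exchange_order order_for_rank(int rank)
{
    assert(rank >= 0);
    return (rank % 2 == 0) ? SEND_FIRST : RECV_FIRST;
}

/* Returns 1 if, for every rank in 0..size-1, the rank and its partner use
 * opposite orders, and 0 otherwise. If both receive first, neither ever sends.
 * If both send first, they can also block when MPI_Send does not buffer
 * the message. size is assumed to be positive and even. */
int pair_orders_are_complementary(int size)
{
    int rank;

    assert(size > 0 && size % 2 == 0);

    for (rank = 0; rank < size; rank++) {
        if (order_for_rank(rank) == order_for_rank(exchange_partner(rank))) {
            return 0;
        }
    }
    return 1;
}
```

In a real MPI program each rank would call MPI_Send and then MPI_Recv, or the reverse, according to order_for_rank(rank). MPI_Sendrecv is an alternative that removes the need for manual ordering.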