Debugging Python using pdb and pdbrc

`pdb` is Python’s built-in debugger. Together with the **REPL** (Read-Evaluate-Print-Loop) Python interpreter, the `pdb` debugger can be extremely helpful in the initial development and, of course, the debugging stages of your project.

Typical usage of the pdb debugger

The two most common usages of pdb are:

  1. Set break points in the code and enter into the interactive debugging prompt.
  2. Execute the script with the pdb module and enter into post-mortem debugging prompt if any error occurred.

We will start with the 1st usage, using a simple toy script as an illustration. Then I will introduce some commonly used debugging commands, which also apply to the 2nd usage, covered after that. Finally, I’d like to share some debugging tips using the .pdbrc configuration file.

Below is the example script:

def printCell(row, col, ncol):
    '''Form the string for a given cell'''
    end='\n' if col==ncol-1 else ' '
    print('(%d, %d)' %(row, col), end=end)

def matrixIndices(nrow, ncol):
    '''Print element indices for a 2d matrix'''
    __import__('pdb').set_trace()
    for i in range(nrow):
        for j in range(ncol):
            printCell(i, j, ncol)

matrixIndices(5,4)

It prints the row and column indices of a 5×4 matrix, like so:

(0, 0) (0, 1) (0, 2) (0, 3)
(1, 0) (1, 1) (1, 2) (1, 3)
(2, 0) (2, 1) (2, 2) (2, 3)
(3, 0) (3, 1) (3, 2) (3, 3)
(4, 0) (4, 1) (4, 2) (4, 3)

Usage 1: Set break points in the code

You can use this line of code to insert a break point anywhere in your Python code:

import pdb; pdb.set_trace()

This is what Line 8 of the toy example does (written there as __import__('pdb').set_trace(), which is equivalent). When you execute the script, it stops at the break point and puts you at a debugger prompt:

-> for i in range(nrow):
(Pdb) 

The line after -> is the line of code that will be executed in the next step, and (Pdb) tells you that you are at the debugger prompt.
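On Python 3.7 and later, the built-in breakpoint() function is a shorter way to reach the same prompt; by default it calls pdb.set_trace(). A minimal sketch, reusing the toy function but collecting the cells in a list instead of printing them (that change is mine, to keep the example short):

```python
def matrixIndices(nrow, ncol):
    '''Collect element indices for a 2d matrix'''
    cells = []
    breakpoint()  # Python 3.7+; by default the same as: import pdb; pdb.set_trace()
    for i in range(nrow):
        for j in range(ncol):
            cells.append((i, j))
    return cells
```

Calling matrixIndices(5, 4) stops at the for i line as before. Setting the environment variable PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op, which is handy when you want to run the script without removing the break points.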

Now you have a number of debugging commands at hand that you can use to help debug the code.

Debugging commands

Here are the debugging commands I use most often at the (Pdb) prompt.

l or list: print the context lines around the break point.

Continuing from the above example, we type the l command at the prompt and hit Enter:

-> for i in range(nrow):
(Pdb) l

The output looks like:

-> for i in range(nrow):
(Pdb) l
  4  	    print('(%d, %d)' %(row, col), end=end)
  5  	
  6  	def matrixIndices(nrow, ncol):
  7  	    '''Print element indices for a 2d matrix'''
  8  	    __import__('pdb').set_trace()
  9  ->	    for i in range(nrow):
 10  	        for j in range(ncol):
 11  	            printCell(i, j, ncol)
 12  	
 13  	matrixIndices(5,4)
 14  	

Again, -> indicates the next line.

Also note that you can keep running the l (or list) command to show the next few lines, but doing so too many times will probably get you lost. Later I will show you a way to get back to the current line when you have moved too far away from it.

n or next: execute the next line

Remember that the next line is the one the -> is pointing at. So if we type n and hit Enter, it executes the outer for line (which starts iterating over i) and stops again:

(Pdb) n
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(10)matrixIndices()
-> for j in range(ncol):
(Pdb) 

Note that it also prints out that we are currently at Line 10 of pdb_example.py, which is the Python file being run.

Also note that the next command moves literally one line at a time. This means that, for example, if your function call spans several lines like this:

result = myComplexFunction(arg1, arg2, arg_list1,
    arg_tuple1, keyword1=anotherFunc('Key'), keyword2='Big Key',
    keyword3='Huge key')

The next command will stop at each line of this 3-line function call, and will execute the function only after you run n at the 3rd line (keyword3='Huge key').

I find this feature most useful in combination with the step-in command, which is covered next.

s or step: step in a function call.

Take the above myComplexFunction example again. Like n, s will also stop at each line of the 3-line function call. Note that the 2nd line contains another function call, anotherFunc('Key'), so if you run s at that line, you will step into the anotherFunc function.

Sometimes this is what you want: you may want to examine the inner workings of all these "nested" functions and eventually step into the "outer" function. But sometimes this can lead you down a rabbit hole. To avoid stepping into a "nested" function, what I usually do is combine n with s. For instance, when the debugger puts me at this 1st line:

-> result = myComplexFunction(arg1, arg2, arg_list1,
      arg_tuple1, keyword1=anotherFunc('Key'), keyword2='Big Key', 
      keyword3='Huge key')

Running n puts me on the 2nd line, where the "inner" anotherFunc appears. Running n again moves on to the 3rd line, where a final s steps me into myComplexFunction, the function I wanted to step into.

Let’s come back to the matrixIndices() example. If you have been following the execution, we should currently be at Line 10 (for j in range(ncol):). Run n again to enter the inner column loop:

-> for j in range(ncol):
(Pdb) n
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(11)matrixIndices()
-> printCell(i, j, ncol)
(Pdb) 

Note that -> tells us that the next line calls the printCell() function. Now use the s command to step in, and l to remind us of where we are:

-> printCell(i, j, ncol)
(Pdb) s
--Call--
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(1)printCell()
-> def printCell(row, col, ncol):
(Pdb) l
  1  ->	def printCell(row, col, ncol):
  2  	    '''Form the string for a given cell'''
  3  	    end='\n' if col==ncol-1 else ' '
  4  	    print('(%d, %d)' %(row, col), end=end)
  5  	
  6  	def matrixIndices(nrow, ncol):
  7  	    '''Print element indices for a 2d matrix'''
  8  	    __import__('pdb').set_trace()
  9  	    for i in range(nrow):
 10  	        for j in range(ncol):
 11  	            printCell(i, j, ncol)

These 2 lines

> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(1)printCell()
-> def printCell(row, col, ncol):

tell us we have stepped into the function printCell(), defined at Line 1 of the file pdb_example.py.

The l command shows the context lines, which in this case are enough to cover the entire printCell() function.

What we could do now is use the n command to move step by step through the function, or use another command to get to the end of the function directly. This is the return command.

r or return: run to the end of the current function

Following on from the previous example, use r inside the printCell() function:

(Pdb) l
  1  ->	def printCell(row, col, ncol):
  2  	    '''Form the string for a given cell'''
  3  	    end='\n' if col==ncol-1 else ' '
  4  	    print('(%d, %d)' %(row, col), end=end)
  5  	
  6  	def matrixIndices(nrow, ncol):
  7  	    '''Print element indices for a 2d matrix'''
  8  	    __import__('pdb').set_trace()
  9  	    for i in range(nrow):
 10  	        for j in range(ncol):
 11  	            printCell(i, j, ncol)
(Pdb) r
(0, 0) --Return--
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(4)printCell()->None
-> print('(%d, %d)' %(row, col), end=end)
(Pdb) 

The last line, print('(%d, %d)' %(row, col), end=end), is the statement at which the function returns. Typically this would be a return statement that hands values back from the function, but in this case it is a call to the print() function.

From here, another n or r command takes us out of the function call and one level up in the stack trace:

-> print('(%d, %d)' %(row, col), end=end)
(Pdb) r
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(10)matrixIndices()
-> for j in range(ncol):
(Pdb) 

See that it stops at for j in range(ncol): again. Let’s move one step further and let it update the j variable:

-> for j in range(ncol):
(Pdb) n
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(11)matrixIndices()
-> printCell(i, j, ncol)
(Pdb) print(j)
1
(Pdb) 

Note that we execute print(j) to query the current value of j. At the debugger prompt you can run arbitrary code to examine the current state of your program, define new variables and so on, almost the same as in the REPL Python interpreter. This can be very helpful in finding bugs in the code or trying out new methods.

Note that I said "almost" above, because some operations are a bit different at the debugging prompt.

Use ! to escape the debugger commands

For instance, running help(list) in the pdb prompt will show you

*** No help for '(list)'

This is because help is interpreted as the debugger’s own help command, not Python’s built-in help() function. To call the latter, put a ! at the front:

!help(list)

This also applies to other debugging commands. For instance, suppose that somewhere in your code you have a variable named a and you would like to print its value. Typing a followed by Enter runs the debugger command a (args), which prints the arguments of the current function. To examine the variable a, you need to use

!a

Note that previously I used print(j) to examine the current value of j, rather than simply typing j and Enter. This is because j is shorthand for the jump command: running j alone makes the debugger execute jump (and fail, because we did not tell it where to jump to).
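The p command is another way around such name clashes: p a evaluates a as an expression and prints the result. The sketch below scripts a short session instead of typing it, relying on the stdin and stdout arguments of pdb.Pdb. The function demo and the helper runSession are made up for this illustration.

```python
import io
import pdb

def demo():
    a = 6 * 7    # typing 'a' alone would run the a(rgs) command
    j = 3 + 4    # typing 'j' alone would run the j(ump) command
    return a + j

def runSession(func, commands):
    '''Run func under pdb, feed it the commands, return what pdb printed'''
    script = io.StringIO('\n'.join(commands) + '\n')
    output = io.StringIO()
    # readrc=False keeps a personal .pdbrc from changing the session
    debugger = pdb.Pdb(stdin=script, stdout=output, nosigint=True, readrc=False)
    debugger.runcall(func)
    return output.getvalue()

# n, n: move past both assignments; p a and !j: print the two variables; c: finish
transcript = runSession(demo, ['n', 'n', 'p a', '!j', 'c'])
print(transcript)
```

Because the typed commands are not echoed in a scripted session, the transcript should show 42 right after the prompt that received p a, and 7 after the one that received !j.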

c or continue: continue execution till the end or the next break point

Previously we saw that r resumes execution until the end of the current function and then gives control back to you. In contrast, c or continue keeps running until the end of the program, or until the next break point if there is one.

In our matrixIndices example, running c after we hit the break point for the first time will finish the script and give us the result. Suppose instead that we put the break point inside a loop, for instance

def matrixIndices(nrow, ncol):
    '''Print element indices for a 2d matrix'''
    for i in range(nrow):
        __import__('pdb').set_trace()
        for j in range(ncol):
            printCell(i, j, ncol)

The break point would then stop the execution in every iteration of the loop (even if you use continue). This means that in some cases you may not want to put break points inside a loop. A workaround is to set a break point outside the loop in the script, and add further break points once you are at the debugger prompt. This leads us to the b or break debugging command.
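If you do want the break point inside the loop, another option is to guard it with an ordinary if, so that it only triggers in the iteration you care about. A minimal sketch; the stopRow parameter is my own addition and not part of the original script:

```python
def matrixIndices(nrow, ncol, stopRow=None):
    '''Collect element indices, pausing only when i equals stopRow'''
    cells = []
    for i in range(nrow):
        if i == stopRow:
            breakpoint()  # or: __import__('pdb').set_trace()
        for j in range(ncol):
            cells.append((i, j))
    return cells
```

matrixIndices(5, 4, stopRow=3) enters the debugger once, when i is 3. A similar effect is available from the prompt without touching the code, because the break command accepts an optional condition after the line number, as in b 11, i == 3.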

b or break: set break point

In the pdb prompt, set a new break point using the following command:

(Pdb) break line_number

E.g. in our matrixIndices() example, suppose we have stopped at the break point set in the script file:

> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(9)matrixIndices()
-> for i in range(nrow):
(Pdb)

Then we would like to stop at the printCell() line so we can step into it to debug its inner functioning. What we could do is

(Pdb) break 11

This sets a new break point at line 11, where printCell() is located.

We could examine the existing break points by using the break command without arguments. See the code listing below:

(Pdb) b 11
Breakpoint 1 at /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py:11
(Pdb) break
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py:11
(Pdb) 

Note that the only break point is labelled Num 1, at line 11.

Now we can keep running c repeatedly to stop at every call of the printCell() function. Once we are done debugging this function, we can delete this break point and use c to resume execution to the end. The next part talks about how to clear break points.

cl or clear: delete break points

Continuing the previous section, after setting a break point at line 11, this is how to delete it:

(Pdb) clear 1

1 is the numerical id given to the break point, which you obtained using the break command without arguments.

Now show the existing break points again:

(Pdb) clear 1
Deleted breakpoint 1 at /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py:11
(Pdb) break
(Pdb) 

Nothing is printed: the break point has been deleted.

Other debugging commands

There are some other debugging commands: for instance, q (or quit) quits the debugger, u (or up) goes one level up in the stack trace (e.g. from inside the function back to the calling line), d (or down) moves one level down, and j line_number lets you jump to a specified line, backward or forward. For more details on the debugging commands, refer to the official documentation.

Usage 2: run script with the pdb module

In the 1st usage case, you set break points in the script, and debug the code in the pdb prompt. Sometimes you don’t necessarily know where to set the break point and stop the program, but you know an error is hidden somewhere. In such cases you can run a Python script with the pdb module:

python -m pdb myscript.py

Once you execute that command, the script won’t start running immediately, but will wait at the very 1st line of the script. Running c (or continue) executes the entire script. If it runs successfully, the debugger goes back to the 1st line again. But if an exception is encountered, it drops you into a post-mortem prompt at the place where the error occurred, so you can start working out the cause of the problem.

Use the matrixIndices() example again, but this time we "plant" an error into the code to make it fail, by changing the print('(%d, %d)' %(row, col), end=end) line in the printCell() function to the following:

print('(%d, %d)' %(row), end=end)

i.e. we deleted the 2nd argument to the format string, so this print() call will fail. Now the entire script looks like this:

def printCell(row, col, ncol):
    '''Form the string for a given cell'''
    end='\n' if col==ncol-1 else ' '
    print('(%d, %d)' %(row), end=end)

def matrixIndices(nrow, ncol):
    '''Print element indices for a 2d matrix'''
    for i in range(nrow):
        for j in range(ncol):
            printCell(i, j, ncol)

matrixIndices(5,4)

Now run the script using

python -m pdb pdb_example.py

It waits at the very 1st line of the code:

$ python -m pdb pdb_example.py 
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(1)<module>()
-> def printCell(row, col, ncol):
(Pdb)

Now run c to resume execution, here is the output:

(Pdb) c
Traceback (most recent call last):
  File "/usr/lib/python3.8/pdb.py", line 1703, in main
    pdb._runscript(mainpyfile)
  File "/usr/lib/python3.8/pdb.py", line 1572, in _runscript
    self.run(statement)
  File "/usr/lib/python3.8/bdb.py", line 580, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/home/guangzhi/scripts/tools/numbersmithy/pdb_example.py", line 1, in <module>
    def printCell(row, col, ncol):
  File "/home/guangzhi/scripts/tools/numbersmithy/pdb_example.py", line 10, in matrixIndices
    printCell(i, j, ncol)
  File "/home/guangzhi/scripts/tools/numbersmithy/pdb_example.py", line 4, in printCell
    print('(%d, %d)' %(row), end=end)
TypeError: not enough arguments for format string
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(4)printCell()
-> print('(%d, %d)' %(row), end=end)
(Pdb) 

A long list of information is printed out. When your code gets more complicated, such output will get even longer. The way to read this debugging information is from the bottom up, starting from the last line of the traceback.

The last line tells you which line of code is causing the trouble: in this case the print('(%d, %d)' %(row), end=end) line, precisely where we "planted" the error. You can find this incorrect print() call at line 15 of the output listing above. This is the bottom level of the traceback.

Going one line up from line 15, it tells you that the troublesome line is inside the printCell() function, at line 4 of the pdb_example.py script. Also correct. We are moving upward along the traceback.

Going one step upward along the traceback again, it shows you that the failed printCell() call is inside a matrixIndices() function call, which itself is at line 10 of the pdb_example.py file. This is as far up the traceback as our own functions go.
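The same bottom-up reading can be reproduced in code with the standard traceback module, which lists the frames of an exception from the outermost call to the innermost one. A small sketch using the broken script; the helper failingFrames is hypothetical:

```python
import traceback

def printCell(row, col, ncol):
    '''Form the string for a given cell (with the planted error)'''
    end = '\n' if col == ncol - 1 else ' '
    print('(%d, %d)' % (row), end=end)

def matrixIndices(nrow, ncol):
    '''Print element indices for a 2d matrix'''
    for i in range(nrow):
        for j in range(ncol):
            printCell(i, j, ncol)

def failingFrames(func, *args):
    '''Return the function names in the traceback, outermost first'''
    try:
        func(*args)
    except Exception as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []

names = failingFrames(matrixIndices, 5, 4)
print(names)
```

The last name in the list is printCell, the function where the error was raised, and the one before it is matrixIndices, the caller. This is the order you follow when reading the printed traceback from the bottom up.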

Also note that it has prompted you with the following

TypeError: not enough arguments for format string
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /home/guangzhi/scripts/tools/numbersmithy/pdb_example.py(4)printCell()
-> print('(%d, %d)' %(row), end=end)
(Pdb) 

The (Pdb) signals that you are now in the pdb debugger prompt, and you can use the debugging commands introduced above to help fix the error.
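You can also enter the post-mortem prompt from inside a program, without python -m pdb, by calling pdb.post_mortem() with the traceback of a caught exception. A minimal sketch; debugOnError and its debugger parameter are names I made up, and the parameter only exists so that the debugger can be swapped out:

```python
import pdb

def printCell(row, col, ncol):
    '''Form the string for a given cell (with the planted error)'''
    end = '\n' if col == ncol - 1 else ' '
    print('(%d, %d)' % (row), end=end)

def debugOnError(func, *args, debugger=pdb.post_mortem):
    '''Call func; on an exception, open the debugger where it was raised'''
    try:
        return func(*args)
    except Exception as exc:
        debugger(exc.__traceback__)  # post-mortem prompt at the failing line
        raise                        # re-raise once the debugger is closed
```

debugOnError(printCell, 0, 0, 4) drops you at the failing print() line with the (Pdb) prompt, and the exception is re-raised after you quit the debugger. In an interactive Python session, pdb.pm() does the same for the last uncaught exception.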

We have introduced 2 common usages of the pdb debugger: one that manually sets break points in the code, and the post-mortem one that stops execution at the (1st) erroneous line. Of course you can combine the two, stopping at places that you know are critical to the code, and letting the post-mortem debugger take over when you don’t know exactly where the problem might be. In all cases, the debugging commands are your best friends in narrowing down the problematic code.

In the last section, I’ll share some tips on configuring the debugger with the .pdbrc config file, which gives you a more customized debugger.

.pdbrc config

The .pdbrc file (note the leading dot, which makes it a hidden file on Linux or Mac OS) is the default configuration file for customizing the Python pdb debugger. You can create such a file in your HOME directory (if there isn’t one already), and use it to customize or extend pdb.

Get back to the current line and print context

At the pdb prompt, the l (or list) command prints the context around the current line (by default 11 lines: 5 above + 1 current + 5 below). Repeating l scrolls your view of the script further down, and you might get lost after a few pages. To get back to the current line, you can create an alias in the .pdbrc file:

alias ll u;;d;;l

alias is another pdb debugger command not covered in this article. It is used to create command aliases. In this case, the aliased command sequence is the triplet

u;;d;;l

u takes you one level up in the stack trace, and d takes you back down, putting you at the current line again. Then the final l prints the context. This triplet is aliased to ll. So the next time you get lost after running too many l commands, running ll will get you back to where you were.

(Note that a built-in command ll (short for longlist) was added in Python 3.2. If you want to keep it, choose a different alias name to avoid the conflict.)
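Since .pdbrc is a plain text file with one debugger command per line, adding an alias can itself be scripted. The sketch below appends the alias under a different name to avoid the clash; the alias name here and the helper addAlias are my own choices, not part of pdb:

```python
from pathlib import Path

def addAlias(directory, name, commands):
    '''Append "alias name commands" to the .pdbrc in directory, only once'''
    rcFile = Path(directory) / '.pdbrc'
    line = 'alias %s %s' % (name, commands)
    lines = rcFile.read_text().splitlines() if rcFile.exists() else []
    if line not in lines:
        lines.append(line)
        rcFile.write_text('\n'.join(lines) + '\n')
    return rcFile
```

addAlias(Path.home(), 'here', 'u;;d;;l') makes here available in every new pdb session. In Python 3, the list command also accepts a dot: l . lists the lines around the current line again, without needing any alias.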

Run custom function in debugger

In your daily workflow there might be some data types that you work with regularly. For me the most important data type is a derivative of numpy.ndarray, and I often need to visualize a 2D slab of ndarray to examine the program. To make this process as easy and quick as possible, I have this alias in my .pdbrc:

alias pl import pdbrc; pdbrc.pdbPlot(%1)

What is bound to the pl alias is import pdbrc; pdbrc.pdbPlot(%1). pdbrc is a module I created whose path is included in the PYTHONPATH environment variable, so it is importable in any Python session. In the module I have a pdbPlot(var) function which handles the plotting of the ndarray variable var. Note that the input argument in the alias definition is represented by %1. If you have more than one input, use %1, %2, etc.

To use this, suppose I have some 2d data:

data = np.random.rand(10,10)

I could quickly plot it in the debugger:

(Pdb) pl data

Similarly, there is another alias I use regularly:

alias pt import pdbrc; pdbrc.pdbShow(%1)

and the pdbrc.pdbShow(var) function prints out some information about the input argument. To use this in the debugger:

(Pdb) pt data

You could define your own custom helper functions and aliases to extend the functionality of pdb and make your debugging easier.
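As a starting point, here is a small helper of that kind which needs only the standard library. It is just a sketch and is unrelated to the pdbrc module mentioned above; the function name describe and the module name myhelpers used below are hypothetical.

```python
def describe(var):
    '''Return a one-line summary of var for use at the (Pdb) prompt'''
    parts = ['type=%s' % type(var).__name__]
    if hasattr(var, '__len__'):
        parts.append('len=%d' % len(var))
    # Only look at min/max for plain containers, so that generators
    # and other one-shot iterables are never consumed by accident.
    if isinstance(var, (list, tuple, set, frozenset, range)) and len(var) > 0:
        try:
            parts.append('min=%r max=%r' % (min(var), max(var)))
        except TypeError:
            pass  # items cannot be compared with each other
    return ', '.join(parts)
```

Saved as myhelpers.py somewhere on PYTHONPATH, it can be bound to an alias in the same way as above, for example alias ds import myhelpers; print(myhelpers.describe(%1)), after which ds data prints the summary at the prompt.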

2 Comments

  1. Could you share the ‘import pdbrc’ module?

    • Sorry for the late reply.
      Sure, but my `pdbrc` may not make much sense to others, so I didn’t add it to the main text. The main point is that it is just another regular Python module and you can put whatever is useful in it.

      Here is my pdbrc.py:

      import numpy
      import print_out

      def pdbShow(var):
          print('### Var info:')
          print('## Var type: %s' %type(var))

          #-----Display attributes----------------
          attr_list=['id','name','shape','units']
          for attr in attr_list:
              if hasattr(var,attr):
                  print('## Var {0}: {1}'.format(attr,var.__getattribute__(attr)))

          #-----Print var maxmin and mean----------
          if isinstance(var,str):
              print('## String var: %s' %var)
          else:
              try:
                  print('## min = {0}, mean = {1}, max = {2}'.format(
                      numpy.nanmin(numpy.array(var)),
                      numpy.nanmean(numpy.array(var)),
                      numpy.nanmax(numpy.array(var))
                      ))
                  if numpy.any(numpy.isnan(var)):
                      print('## var contains nans')
              except:
                  pass

      def pdbPlot(var):
          import matplotlib.pyplot as plt
          import plot
          print('### Plot var:')

          try:
              plot.pl(var)
          except:
              print('\n# : Printing failed.')

      def openTmuxPane():
          print_out.getTmuxPane(method='socket')

      def pdbPrintSocket(var):
          print_out.ps(var)

      def pdbPrintFile(var):
          print_out.pf(var)

      def changePort(port):
          print_out.changePort(port)
