Profiling your Python script using cProfile

Profiling refers to performing dynamic analysis on a script to measure its execution time, the execution time of its subcomponents, and how many times each subcomponent is called. This produces data on where the script is spending the most time and can help with optimizing it to minimize its execution time. This blog has had two past posts on profiling, one on C++ using Callgrind and one on Python using PyCharm. PyCharm is a very useful Python IDE, but unfortunately it is not free, so if you're looking for free profiling functionality in Python, this post is for you.

Python comes with a built-in module called cProfile. Here is a simple example of timing the multiplication of two matrices with cProfile:

import cProfile
import numpy as np

# Two small matrices to multiply
mat1 = np.array([[1, 6, 3], [3, 6, 3], [2, 8, 3]])
mat2 = np.array([[2, 7, 6], [5, 4, 7], [6, 1, 9]])

# Profile the matrix multiplication
cProfile.run('np.dot(mat1, mat2)')

This should print out something like the following:

         4 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method numpy.core.multiarray.dot}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

I recently used cProfile on one of my own scripts, which I'll use here to demonstrate how it can be applied in your own work. I have a function called fish_game, which contains my model and takes vars as input. This function also calls a function named hrvSTR, which represents my action policy (what these functions do exactly is beside the point of this post: one represents the system and the other represents a policy we use to act on the system; you can see the full model here). The fish_game function is called repeatedly by my optimization algorithm during optimization. Running cProfile on it produces this:

cProfile.run('fish_game(vars)',sort='cumulative')
         282814 function calls in 0.698 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.698    0.698 {built-in method builtins.exec}
        1    0.000    0.000    0.698    0.698 <string>:1(<module>)
        1    0.166    0.166    0.698    0.698 <ipython-input-4-df258d5a749f>:55(fish_game)
    20200    0.385    0.000    0.531    0.000 <ipython-input-4-df258d5a749f>:1(hrvSTR)
    20200    0.016    0.000    0.089    0.000 fromnumeric.py:1821(sum)
    20200    0.021    0.000    0.069    0.000 fromnumeric.py:64(_wrapreduction)
   121208    0.055    0.000    0.055    0.000 {built-in method numpy.core.multiarray.zeros}
    20200    0.046    0.000    0.046    0.000 {method 'reduce' of 'numpy.ufunc' objects}
    20200    0.003    0.000    0.003    0.000 {built-in method builtins.isinstance}
    40400    0.003    0.000    0.003    0.000 {built-in method builtins.len}
    20200    0.002    0.000    0.002    0.000 {method 'items' of 'dict' objects}
        2    0.000    0.000    0.000    0.000 {method 'normal' of 'mtrand.RandomState' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This tells me that one run of my function takes 0.698 seconds in total (this might vary slightly every time, depending on your processor usage and other factors at the time), and that most of that time, 0.531 seconds, is spent in the hrvSTR function. Even though 0.7 seconds might not seem long, in the context of optimization, where this function needs to be evaluated tens of thousands of times, an additional 0.1 seconds per evaluation can add hours of processing time to your workflow. Trying to bring that down is probably a worthwhile investment that will result in time savings later on. As a result, I figured there might be something I could do to reduce the time hrvSTR took. I was particularly intrigued by the fact that the numpy function numpy.core.multiarray.zeros was called 121208 times, far more than any other method in my script, which prompted me to think that I might be unnecessarily repeating a process.

Looking at my code more closely (this is a script I had been using for more than a year), I realized that I was ordering, normalizing, and creating arrays for my action policy every single time the function was called, which was unnecessary, as the same policy is used for all time steps. I could instead perform all those steps once, save the outputs, and reuse them at every time step instead of recalculating them each time. I spent some time adjusting my script to do that, and running cProfile again produced this:

 cProfile.run('fish_game(vars)',sort='cumulative')
         79282 function calls in 0.379 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.379    0.379 {built-in method builtins.exec}
        1    0.000    0.000    0.379    0.379 <string>:1(<module>)
        1    0.150    0.150    0.379    0.379 <ipython-input-16-3c528334eb55>:62(fish_game)
    19800    0.193    0.000    0.229    0.000 <ipython-input-16-3c528334eb55>:35(hrvSTR)
    59414    0.036    0.000    0.036    0.000 {built-in method numpy.core.multiarray.zeros}
        4    0.000    0.000    0.000    0.000 fromnumeric.py:2817(mean)
        4    0.000    0.000    0.000    0.000 _methods.py:58(_mean)
        2    0.000    0.000    0.000    0.000 <ipython-input-16-3c528334eb55>:4(generate_policy)
        6    0.000    0.000    0.000    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        2    0.000    0.000    0.000    0.000 {method 'normal' of 'mtrand.RandomState' objects}
        2    0.000    0.000    0.000    0.000 fromnumeric.py:1821(sum)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:64(_wrapreduction)
        4    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
        4    0.000    0.000    0.000    0.000 _methods.py:48(_count_reduce_items)
        2    0.000    0.000    0.000    0.000 {method 'clip' of 'numpy.generic' objects}
        4    0.000    0.000    0.000    0.000 numeric.py:504(asanyarray)
       10    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        8    0.000    0.000    0.000    0.000 {built-in method builtins.issubclass}
        4    0.000    0.000    0.000    0.000 {built-in method numpy.core.multiarray.array}
        2    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

I basically shaved 0.3 seconds off my function evaluation time by investing some time in looking more closely at my script. For an optimization with 30000 function evaluations, this translates to roughly 2.5 hours of processing time saved (and I spent far less time than that figuring this out for the first time).
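The general shape of that change, as a minimal self-contained sketch (build_policy and apply_policy here are hypothetical stand-ins, not the actual model code; in my script the one-time set-up ended up in the generate_policy function that appears in the second profile above):

import numpy as np

def build_policy(policy_vars):
    # Hypothetical stand-in for the ordering, normalizing, and array set-up
    # that hrvSTR was originally repeating at every time step
    centers = np.sort(np.asarray(policy_vars, dtype=float))
    return centers / centers.sum()

def apply_policy(policy, state):
    # Hypothetical stand-in for evaluating the policy at a single time step
    return float(np.dot(policy, state * np.ones_like(policy)))

policy_vars = [0.3, 0.1, 0.6]
states = np.linspace(0.0, 1.0, 100)

# Slow pattern: the set-up work is redone inside the loop at every time step
harvest_slow = [apply_policy(build_policy(policy_vars), s) for s in states]

# Faster pattern: do the set-up once per function evaluation and reuse it
policy = build_policy(policy_vars)
harvest_fast = [apply_policy(policy, s) for s in states]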

You can also run cProfile directly from the command line like so:

python -m cProfile -s cumtime fish_game.py
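If you want to dig into the results later, you can also save them to a file (with the -o flag on the command line, or by passing a filename to cProfile.run) and explore them with the standard pstats module. A minimal sketch, where the filename is an arbitrary choice:

import pstats

# Load results previously saved with, e.g.:
#   python -m cProfile -o fish_game.prof fish_game.py
# or cProfile.run('fish_game(vars)', 'fish_game.prof')
stats = pstats.Stats('fish_game.prof')

# Print the 10 entries with the largest cumulative time
stats.sort_stats('cumulative').print_stats(10)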

Types of Errors in Numerical Methods

What are numerical methods?

Many engineering problems are too time consuming to solve analytically, or cannot be solved analytically at all. In these situations, numerical methods are usually employed. Numerical methods are techniques designed to solve a problem using numerical approximations. An example of an application of numerical methods is trying to determine the velocity of a falling object. If you know the exact function that determines the position of your object, then you could differentiate it to obtain an expression for the velocity. More often, you will use a machine to record readings of times and positions that you can then use to numerically solve for velocity:

$v(t) \approx \frac{f(t+h) - f(t)}{h}$

where f is your function, t is the time of the reading, and h is the distance to the next time step.
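As a quick illustration, here is a hedged sketch of that forward-difference calculation in Python; the "recorded" positions are made up, generated from the free-fall relation x(t) = 0.5*g*t^2 so the approximation can be checked against the known velocity g*t:

import numpy as np

g = 9.81                            # gravitational acceleration (m/s^2)
h = 0.1                             # time between readings (step size, s)
t = np.arange(0.0, 2.0 + h, h)      # times of the readings
x = 0.5 * g * t**2                  # "recorded" positions of the falling object

# Forward-difference approximation of velocity: v(t) ~ (f(t + h) - f(t)) / h
v_approx = (x[1:] - x[:-1]) / h
v_true = g * t[:-1]                 # analytical velocity for comparison

print(v_approx[:3])                 # e.g. [0.4905 1.4715 2.4525]
print(v_true[:3])                   # e.g. [0.     0.981  1.962 ]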

Because your answer is an approximation of the analytical solution, there is an inherent error between the approximated answer and the exact solution. Errors can result prior to computation in the form of measurement errors or assumptions in modeling. The focus of this blog post will be on understanding two types of errors that can occur during computation: roundoff errors and truncation errors.

Roundoff Error

Roundoff errors occur because computers have a limited ability to represent numbers. For example, π has infinitely many digits, but due to precision limitations only about 16 significant digits can be stored in MATLAB's default double precision. While this roundoff error may seem insignificant, if your process involves multiple iterations that depend on one another, these small errors can accumulate and result in a significant deviation from the expected value. Furthermore, if an operation adds a large and a small number, the effect of the smaller number may be lost to rounding. Thus, it is advisable to sum numbers of similar magnitudes first so that smaller numbers are not "lost" in the calculation.
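Both effects are easy to see in Python, which (like MATLAB) uses double-precision floating point by default. A quick sketch:

# A small number added to a much larger one can vanish entirely
print(1.0e16 + 1.0 == 1.0e16)       # True: the 1.0 is lost to rounding

# Small representation errors accumulate over many dependent operations:
# 0.1 has no exact binary representation, so adding it a million times drifts
total = 0.0
for _ in range(1_000_000):
    total += 0.1
print(total)                        # roughly 100000.0000013, not exactly 100000

# Summing the small numbers first keeps them from being swamped by the big one
small = [1.0e-16] * 10_000
print(sum(small) + 1.0)             # the small terms survive: about 1.000000000001
print(sum([1.0] + small))           # exactly 1.0: each tiny term was rounded away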

One interesting example that we covered in my Engineering Computation class, and which can be used to illustrate this point, involves the quadratic formula:

$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$

Using a = 0.2, b = -47.91, and c = 6, and carrying out rounding to two decimal places at every intermediate step, we get:

$x_{1,2} = \frac{47.91 \pm \sqrt{2295.37 - 4.80}}{0.40} = \frac{47.91 \pm 47.86}{0.40}$, which gives $x_1 \approx 239.43$ and $x_2 \approx 0.125$.

The error between our approximations and true values can be found as follows:

The true roots, to four decimal places, are $x_1 = 239.4247$ and $x_2 = 0.1253$, so the relative error of the rounded calculation is roughly 0.0001% for the large root but about 0.2% for the small root.

As can be seen, the smaller root has a larger relative error associated with it, because a deviation of a given size is far more significant relative to a small number than to a large one.

If you have the insight to see that your computation will involve operations on numbers of differing magnitudes, the equations can sometimes be cleverly manipulated to reduce roundoff error. In our example, if the quadratic formula is rationalized, the resulting error is much smaller because the subtraction of two nearly equal numbers is avoided and numbers of similar magnitude are multiplied and added together instead:

$x = \frac{2c}{-b \mp \sqrt{b^2 - 4ac}}$
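A short Python sketch of the same comparison, rounding every intermediate result to two decimal places to mimic the hand calculation:

import math

# Coefficients from the example: 0.2x^2 - 47.91x + 6 = 0
a, b, c = 0.2, -47.91, 6.0

def r2(x):
    """Mimic limited precision by rounding an intermediate result to two decimals."""
    return round(x, 2)

# Standard formula, rounding every intermediate step; the small root suffers
# from the subtraction of two nearly equal numbers (47.91 - 47.86)
disc = r2(r2(b * b) - r2(4 * a * c))
root = r2(math.sqrt(disc))
x_small_std = r2(-b - root) / r2(2 * a)

# Rationalized form of the small root: 2c / (-b + sqrt(b^2 - 4ac)),
# which avoids the subtraction of nearly equal numbers
x_small_rat = r2(2 * c) / r2(-b + root)

# Full-precision reference value
x_small_true = (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a)

print(f"true value        : {x_small_true:.5f}")
print(f"standard formula  : {x_small_std:.5f}")
print(f"rationalized form : {x_small_rat:.5f}")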

Truncation Error

Truncation errors are introduced when exact mathematical formulas are represented by approximations. An effective way to understand truncation error is through a Taylor series approximation. Say we want to approximate some function $f(x)$ at the point $x_{i+1}$, which is some distance $h$ away from the base point $x_i$; the true value of the function at that point is shown in black in Figure 1. The Taylor series approximation starts with a single zero-order term, and as additional terms are added to the series, the approximation begins to approach the true value. However, an infinite number of terms would be needed to reach this true value exactly.

Figure 1: Graphical representation of a Taylor Series approximation (Chapra, 2017)

The Taylor Series can be written as follows:

$f(x_{i+1}) = f(x_i) + f'(x_i)\,h + \frac{f''(x_i)}{2!}h^2 + \cdots + \frac{f^{(n)}(x_i)}{n!}h^n + R_n$

where $R_n$ is a remainder term that accounts for all of the terms that were not included in the series and is therefore a representation of the truncation error. The remainder term is generally expressed as $R_n = O(h^{n+1})$, which shows that the truncation error is proportional to the step size $h$ raised to the power $n+1$, where $n$ is the number of terms included in the expansion. It is clear that as the step size decreases, so does the truncation error.
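A brief sketch of this in Python, approximating $e^h$ around the base point $x_i = 0$ with an increasing number of Taylor terms (any smooth function would do; the exponential just keeps the code short):

import math

def taylor_exp(h, n_terms):
    """Taylor approximation of e^h around 0, keeping n_terms terms."""
    return sum(h**k / math.factorial(k) for k in range(n_terms))

h = 0.5
true_value = math.exp(h)
for n in range(1, 6):
    approx = taylor_exp(h, n)
    # The truncation error shrinks as more terms are kept
    print(f"{n} term(s): approx = {approx:.6f}, error = {abs(true_value - approx):.6f}")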

The Tradeoff in Errors

The total error of an approximation is the sum of the roundoff error and the truncation error. As seen in the previous sections, truncation error decreases as the step size decreases. However, a smaller step size means more computations and, in differences like the one above, the subtraction of increasingly similar numbers, so the roundoff error grows. The two errors are therefore in direct conflict with one another: as we decrease one, the other increases.

However, an optimal step size that minimizes the total error can be determined. By iteratively trying different step sizes and recording the error between the approximation and the true value, a graph like the one shown in Figure 2 results. The minimum of the curve corresponds to the smallest achievable error and thus to the optimal step size. The error to the right of this point (larger step sizes) is dominated by truncation error, while the increase in error to the left of this point is where roundoff error begins to dominate. While this graph is specific to a certain function and type of approximation, the general rule and shape will hold for other cases.

Figure 2: Plot of Error vs. Step Size (Chapra, 2017)
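A minimal sketch that reproduces the general shape of this tradeoff, using a forward-difference derivative of sin(x) at x = 1 (the true derivative is cos(1)):

import math

x = 1.0
true_deriv = math.cos(x)

# Forward-difference derivative for a wide range of step sizes
for exponent in range(1, 17):
    h = 10.0 ** (-exponent)
    approx = (math.sin(x + h) - math.sin(x)) / h
    error = abs(approx - true_deriv)
    # The error first drops as h shrinks (less truncation error), then rises again
    # once roundoff in the subtraction starts to dominate (around h ~ 1e-8)
    print(f"h = 1e-{exponent:02d}: error = {error:.2e}")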

Hopefully this blog post has helped increase your awareness of the types of errors that you may come across when using numerical methods! Internalize these golden rules to help avoid loss of significance:

  • Avoid subtracting two nearly equal numbers
  • If your equation has large and small numbers, work with smaller numbers first
  • Consider rearranging your equation so that numbers of a similar magnitude are being used in an operation

Sources:

Chapra, Steven C. Applied Numerical Methods with MATLAB for Engineers and Scientists. McGraw-Hill, 2017.

Class Notes from ENGRD 3200: Engineering Computation taught by Professor Peter Diamessis at Cornell University