Basic Valgrind Usage

Intro

Valgrind is an instrumentation framework for building dynamic analysis tools, originally developed by Julian Seward. This post focuses on Memcheck, its memory error detector.

Most of the content in this post comes from the official Valgrind website.

Basic

Compiling

Before using Valgrind, compile the program with the -g option so that it includes debugging information; Memcheck's error messages can then report exact line numbers. Compiling with -O0 works best if the slowdown is tolerable. -O1 generally works too, although line numbers in error messages can be inaccurate. -O2 and above are not recommended, because Memcheck occasionally reports uninitialised-value errors that do not really exist.

Running under Memcheck

Normally we run the program like this:

myprog arg1 arg2

To run it under Valgrind's Memcheck tool with leak checking enabled, use:

valgrind --leak-check=yes myprog arg1 arg2

Note that the program will run much slower under Valgrind than it does natively, and use more memory, because Memcheck has to check memory accesses as the program runs.

An Example

  #include <stdlib.h>

  void f(void)
  {
     int* x = malloc(10 * sizeof(int));
     x[10] = 0;        // problem 1: heap block overrun
  }                    // problem 2: memory leak -- x not freed

  int main(void)
  {
     f();
     return 0;
  }

For the heap block overrun (problem 1), Memcheck reports something like the following:

==19182== Invalid write of size 4
==19182==    at 0x804838F: f (example.c:6)
==19182==    by 0x80483AB: main (example.c:11)
==19182==  Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd
==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182==    by 0x8048385: f (example.c:5)
==19182==    by 0x80483AB: main (example.c:11)

Memory Leak

Here is the message reported for the memory leak (problem 2):

  ==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
  ==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
  ==19182==    by 0x8048385: f (a.c:5)
  ==19182==    by 0x80483AB: main (a.c:11)

The stack trace tells us where the leaked memory was allocated, but Memcheck cannot tell us why it leaked.

There are several kinds of leaks; the two most important are:

  1. “definitely lost”: the program is leaking memory, and the leak should be fixed.
  2. “probably lost” (shown as “possibly lost” in Memcheck's output): the program is leaking memory, unless it is doing unusual things with pointers, such as keeping only a pointer to the middle of a heap block.

“Conditional jump or move depends on uninitialised value(s)”

This message is very common. It means that the program's behaviour depends on a value that was never initialised. The root cause can be hard to find, but running with --track-origins=yes makes Memcheck report where the uninitialised value came from, which helps locate it.


The following sections cover some typical error types reported by Valgrind.

Illegal read / Illegal write errors

Here is an example:

  Invalid read of size 4
     at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
     by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
     by 0x40B07FF4: read_png_image__FP8QImageIO (kernel/qpngio.cpp:326)
     by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
     Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd

This happens when the program reads or writes memory at a place which Memcheck reckons it shouldn’t. A typical cause is reading or writing through a pointer to a block that has already been freed.

In some cases the program appears to run fine despite the invalid access, because the read or write happens to land in a garbage area and is not fatal. So when the output of a program looks wrong, it is worth checking the Valgrind output for invalid read/write errors.

Use of uninitialised values

Here is an example:

  Conditional jump or move depends on uninitialised value(s)
     at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
     by 0x402E8476: _IO_printf (printf.c:36)
     by 0x8048472: main (tests/manuel1.c:8)
     by 0x402A6E5E: __libc_start_main (libc-start.c:129)

Use of an uninitialised value is another common error that is not easy to find. The memory access itself is valid, but the variable is read with a garbage value.

One cause of this error is easy to introduce and hard to spot. When several conditions are combined in a single if statement, the conditions that guard the others should come first. In other words, a condition that relies on an earlier check having passed must be placed after that check, so that short-circuit evaluation skips it when the check fails.

Illegal frees

Here is an example:

  Invalid free()
     at 0x4004FFDF: free (vg_clientmalloc.c:577)
     by 0x80484C7: main (tests/doublefree.c:10)
     by 0x402A6E5E: __libc_start_main (libc-start.c:129)
     by 0x80483B1: (within tests/doublefree)
     Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
     at 0x4004FFDF: free (vg_clientmalloc.c:577)
     by 0x80484C7: main (tests/doublefree.c:10)
     by 0x402A6E5E: __libc_start_main (libc-start.c:129)
     by 0x80483B1: (within tests/doublefree)

Memcheck keeps track of the blocks allocated by your program with malloc/new, so it knows exactly whether the argument to free/delete is legitimate. Here, the test program has freed the same block twice. As with the illegal read/write errors, Memcheck attempts to make sense of the address passed to free. If, as here, the address is one which has previously been freed, you will be told so, which makes duplicate frees of the same block easy to spot.

Overlapping source and destination blocks

Here is an example:

==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
==27492==    by 0x40246335: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==27492==    by 0x8048470: (within /auto/homes/njn25/grind/head6/memcheck/tests/overlap)
==27492== 

The following C library functions copy some data from one memory block to another (or something similar): memcpy(), strcpy(), strncpy(), strcat(), strncat(). The blocks pointed to by their src and dst pointers aren’t allowed to overlap. Memcheck checks for this.

Some Tricks

How to Stop at the First Error

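To pause at the console each time an error is reported, so that we can see where it occurred, use the --gen-suppressions=yes option. Valgrind then stops after every error and asks whether to print a suppression for it before continuing.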

Some tutorials:

https://valgrind.org/docs/manual/manual.html
http://cs.ecs.baylor.edu/~donahoo/tools/valgrind/