Basic Valgrind Usage
Intro
Valgrind is an instrumentation framework for building dynamic analysis tools, originally developed by Julian Seward. Its best-known tool is Memcheck, which detects memory errors.
Most of the content in this post comes from the official Valgrind website.
Basic
Compiling
Before using Valgrind, compile the program with the -g option so that it includes debugging information; Memcheck's error messages can then show exact line numbers. -O0 and -O1 also work (at -O1 the reported line numbers can occasionally be inaccurate), but -O2 and above are not recommended because Memcheck may then report uninitialised-value errors that do not really exist.
Running under Memcheck
Normally we run the program like this:
myprog arg1 arg2
To run it under Valgrind instead, prefix the command with valgrind (Memcheck is the default tool, and --leak-check=yes turns on the detailed leak detector):
valgrind --leak-check=yes myprog arg1 arg2
Note that the program will run much slower than it does natively (roughly 20 to 30 times) and use more memory, because Memcheck checks every memory access for errors.
An Example
#include <stdlib.h>

void f(void)
{
    int* x = malloc(10 * sizeof(int));
    x[10] = 0;        // problem 1: heap block overrun
}                     // problem 2: memory leak -- x not freed

int main(void)
{
    f();
    return 0;
}
Memcheck reports the first problem, the heap block overrun, like this:
==19182== Invalid write of size 4
==19182== at 0x804838F: f (example.c:6)
==19182== by 0x80483AB: main (example.c:11)
==19182== Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd
==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182== by 0x8048385: f (example.c:5)
==19182== by 0x80483AB: main (example.c:11)
- Each message contains a lot of information, so read the messages one at a time to find the exact error and the bug behind it.
- The number in ==19182== is the process ID; it is usually unimportant.
- The first line of each message (“Invalid write of size 4”) states the type of the error.
- The lines below it are a stack trace showing where the error occurred.
- Some messages have a second part describing the memory address involved. Here it shows that the write landed just past the end of a 40-byte block allocated by malloc() at line 5 of example.c.
- A memory error like this may not cause a compiler error or even a crash, but it is still a bug, so treat every report carefully.
Memory Leak
Here is an example of a memory leak report:
==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182== by 0x8048385: f (a.c:5)
==19182== by 0x80483AB: main (a.c:11)
The stack trace shows where the leaked memory was allocated, but Memcheck cannot tell us why it leaked.
There are several kinds of leak; the two most important are:
- “definitely lost”: the program is leaking memory, so it needs to be fixed.
- “probably lost”: the program is leaking memory, unless it is doing unusual things with pointers, such as moving them to point into the middle of a heap block. (Memcheck's leak summary labels this category “possibly lost”.)
“Conditional jump or move depends on uninitialised value(s)”
This is a very common message. It means the program's behaviour depends on a value that was never initialised. The root cause can be hard to find, but running with --track-origins=yes gives extra information that helps locate it. This makes Memcheck run slower, but the extra information often saves a lot of time.
The rest of this post covers some typical error types reported by Memcheck.
Illegal read / Illegal write errors
Here is an example:
Invalid read of size 4
at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
by 0x40B07FF4: read_png_image__FP8QImageIO (kernel/qpngio.cpp:326)
by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
This happens when the program reads or writes memory at a place where Memcheck reckons it shouldn’t. If the address points into a block that has already been freed, Memcheck says so and also reports where the block was freed; in the example above it could not identify the address at all.
In some cases the program appears to work despite the invalid access, because reading or writing a garbage area is not always fatal. So when the output looks wrong, check the Valgrind report for invalid read/write errors.
Use of uninitialised values
Here is an example:
Conditional jump or move depends on uninitialised value(s)
at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
by 0x402E8476: _IO_printf (printf.c:36)
by 0x8048472: main (tests/manuel1.c:8)
by 0x402A6E5E: __libc_start_main (libc-start.c:129)
Use of an uninitialised value is another common error that is not easy to find: the memory access itself is valid, but the value that is read is garbage.
One mistake that is easy to make and hard to spot: when a single if statement combines several conditions, the guard conditions must come first, i.e. a condition that relies on an earlier one must be placed after it.
Illegal frees
Here is an example:
Invalid free()
at 0x4004FFDF: free (vg_clientmalloc.c:577)
by 0x80484C7: main (tests/doublefree.c:10)
by 0x402A6E5E: __libc_start_main (libc-start.c:129)
by 0x80483B1: (within tests/doublefree)
Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
at 0x4004FFDF: free (vg_clientmalloc.c:577)
by 0x80484C7: main (tests/doublefree.c:10)
by 0x402A6E5E: __libc_start_main (libc-start.c:129)
by 0x80483B1: (within tests/doublefree)
Memcheck keeps track of the blocks your program allocates with malloc/new, so it knows exactly whether the argument to free/delete is legitimate. Here, the test program has freed the same block twice. As with the illegal read/write errors, Memcheck attempts to make sense of the freed address. If, as here, the address is one that has previously been freed, you will be told so, which makes duplicate frees of the same block easy to spot.
Overlapping source and destination blocks
Here is an example:
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492== at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492== by 0x804865A: main (overlap.c:40)
==27492== by 0x40246335: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==27492== by 0x8048470: (within /auto/homes/njn25/grind/head6/memcheck/tests/overlap)
==27492==
The following C library functions copy data from one memory block to another (or do something similar): memcpy(), strcpy(), strncpy(), strcat(), strncat(). The blocks pointed to by their src and dst pointers aren’t allowed to overlap, and Memcheck checks for this.
Some Tricks
How to Stop at the First Error
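If we want Valgrind to pause at the console after each error so that we can see where it occurred, use the option --gen-suppressions=yes. Valgrind then stops after every error it reports and asks whether to print a suppression for it before continuing.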
Some tutorials:
- https://valgrind.org/docs/manual/manual.html
- http://cs.ecs.baylor.edu/~donahoo/tools/valgrind/