diff options
Diffstat (limited to 'lib/libmalloc/malloc.doc')
-rw-r--r-- | lib/libmalloc/malloc.doc | 653 |
1 files changed, 653 insertions, 0 deletions
diff --git a/lib/libmalloc/malloc.doc b/lib/libmalloc/malloc.doc new file mode 100644 index 000000000000..f209baf5a7bf --- /dev/null +++ b/lib/libmalloc/malloc.doc @@ -0,0 +1,653 @@ +$Header: /home/cvs/386BSD/src/lib/libmalloc/malloc.doc,v 1.1 1994/03/06 22:59:48 nate Exp $ + Yet another malloc() + -------------------- + Mark Moraes + <moraes@csri.toronto.edu> + + Standard calls + +It provides the standard calls we expect from any self-respecting +malloc, viz. + + char * + malloc(nbytes) + + unsigned int nbytes; + +which returns pointer to a contiguous memory block, at least nbytes +bytes long, + + char * + calloc(nelements, element_size) + + unsigned int nelements, element_size; + +which returns a pointer to a contiguous memory block, at least +nelements * element_size bytes long, with all locations set to zero, + + char * + realloc(ptr, nbytes) + + char *ptr; + unsigned int nbytes; + +attempts to change the size of the previously malloc'ed block pointed +to by ptr to nbytes, and failing that, returns a pointer to a new block +nbytes long, after copying the contents of the old block to it. +Realloc returns NULL if it fails, and DOES NOTHING WITH THE OLD BLOCK. +(This is not defined; some reallocs may free the old block they were +passed) + + void + free(ptr) + + char *ptr; + +returns the previously malloc'ed block pointed to by ptr to the +storage pool. It will do its best to keep storage as unfragmented as +possible. + + void + cfree(ptr) + + char *ptr; + +The same as free(), used to free calloc()ed blocks. + +There are a couple of additional functions that Sun malloc provides, +and that some programs need (notably the X Windows server on Suns) + + char * + memalign(alignment, nbytes) + + unsigned int alignment; + unsigned int nbytes; + +This returns a pointer to a memory block at least nbytes long, +guaranteeing that the block starting address is an even multiple of +alignment. Alignment must be a power of 2. + + char * + valloc(nbytes) + + unsigned int nbytes; + +This is the same as memalign(getpagesize(), nbytes). + +A frequently used function is to save a copy of a NULL-terminated +character array (a C string) in storage malloc'ed exactly to fit it, +for example when reading or parsing some input line from a large +buffer. Newer systems provide the strdup() function to do just this. +(This may not appear a complex funtion - using it eliminates the +all-too-frequent error where you forget to add 1 to the length of the +string to allow for the NULL terminator) + + char * + strdup(s) + + char *s; + +strdup returns NULL if the malloc fails. + + Additional functions + +In addition to the usual functions, this malloc provides some +debugging and profiling functions, which we discuss in more detail +below. (NOTE: To use these special features, you have to include the +header file "malloc.h" provided with this malloc. This header file +also defines the C preprocessor symbol CSRIMALLOC. This allows anyone +using any of the special features of this malloc to enclose any calls +to the new features within #ifdef CSRIMALLOC, thus preserving code +portability) + +Since this malloc library is usually installed as libmalloc.a, to use +it instead of a regular system malloc, you need to specify something +like -lmalloc on your link/load command line before -lc, if any. +Make sure your program has at least one call to any one of +malloc/free/calloc/realloc or the Unix loader may not link in this +malloc, and will instead use the system one from libc, since it makes +only one pass. + +Most of the debugging features will be available in a version of the +malloc called malloc_d, which is what you should use for development +and testing by specifying -lmalloc_d. For production programs, link +with -lmalloc to get the fast version. + +Frequently, people forget to check the return value on a malloc() +assuming that modern systems never run out of memory. Inevitably, +Murphy ensures that some system will run out of memory, and a user +will be faced with the illuminating error message "Segmentation +violation - core dumped". Alas, when memory runs out, the core dump is +likely to be large. This malloc provides an emalloc() function which +exits if NULL is returned, with an out of memory message. Using this +in place of malloc is advised if running out of memory is a fatal +condition. (If not, either write your own emalloc() to recover +gracefully, or check the value returned by malloc) + + char * + emalloc(nbytes) + + unsigned nbytes; + +Similarly, a realloc which will die instead of returning NULL is +erealloc(). + + char * + erealloc(ptr, nbytes) + + unsigned nbytes; + +A similar function, like strdup() but not returning NULL is strsave(). + + char * + strsave(s) + + char *s; + + + Debugging support + +Alas, one of the more common, and unpleasant errors a C programmer can +make is to accidentally exceed the bounds of an array, and write over +the data that immediately follows (or less frequently, precedes) it in +memory. Such "corruption" errors usually show up at a completely +different place in the program from the place the actual overwriting +occurs (A corollary to Murphy's Law suggests the error will appear in +the last related module of the program), thus making it at least as +hard to track down as the proverbial needle in the haystack. While +there is little that we can do to help detect errors in static, +automatic or global data, we can at least try to ensure the validity +of the malloc()ed data. + +To get this support, the malloc() must be compiled with -DDEBUG - +since this reduces performance somewhat, it is usually done in a +separate object module from the one usually used in production code. +This debugging malloc() makes sure all internal heap pointers relevant +to an allocation (or free) are correct on every call to one of the +allocation routines, and aborts if it detects any trouble, with a +message starting with "assertion botched". This is +useful because it causes the program to die nearer the trouble spot +than would otherwise be the case, helping to pinpoint the error. +This narrows down the location of the error to something less than the +aforementioned haystack, but the problem is still somewhat large. + + int + mal_verify(fullcheck) + + int fullcheck; + +lends a helping hand in such distressing situations. It runs through +all allocated and free blocks, checking all heap pointers for +validity. Since the heap pointers (and block size values) are at the +beginning and end of allocated (and free) blocks, any overwriting off +the end of a memory block usually results in the corruption of the +heap size values, or the pointer values of the next block. If +'fullcheck' is non-zero, then additional checking of the contents of +free blocks is done, to ensure that they haven't been written into +after being freed. Calling mal_verify() with fullcheck as 1 is +recommended. (More on this later) + +On detecting an error, mal_verify() aborts the program, on the not +unreasonable grounds that such an error is A BAD THING and should be +fixed immediately. + +A careful programmer will probably want to put calls to mal_verify() +at frequent points in any code, as checkpoints to trap any stray +overwriting or memory corruption that may take place. Since +mal_verify() does nothing in a malloc compiled without -DDEBUG, it +has no overhead in production code other than the procedure call. +("But I can't be overwriting any data between X and Y in my code" are +famous last words) + +Instead of putting calls to mal_verify(), the programmer can set +the debugging level of the allocator. + + void + mal_debug(level) + + int level; + +Most debugging is done at level 1, which is the default. This only +checks the internal pointers that are encountered during a malloc() or +free(). Setting level to 2 with the mal_debug() call results in a +complete check of the heap (i.e. a call to mal_verify(0) ) every time +malloc(), realloc() or free() are called. (The other allocation +routines call these three) This can be very slow if your program does +a lot of debugging, but is worth turning on for a specific segment of +the program where you suspect trouble. (The author uses this mode when +writing and debugging a program initially, switches to normal +debugging for alpha and beta test) Level 3 asks for mal_verify(1) on +every malloc(), realloc() or free() which checks all free blocks' +insides for any signs of someone writing to them. + +We recommend the use of a binary search technique for pinpointing the +needle, er, the overwriting. This presupposes that malloc(), free(), +or a previously (wisely) inserted mal_verify() has caused the +program to die at some point with an "assertion botched" error. (If +the programmer has not been using this malloc so far, or a variant +compiled without -DDEBUG, then the only indication of a memory +corruption error is utterly strange things happening to the program, +possibly variable changing values when they're not supposed to, or +mysterious deaths of the program in malloc(), free() or some such +call) + +Insert a mal_verify() call (Referred to from now on as Checkpoint +1) at some point well before the error occurs, if necessary, at the +first executable statement in the program. Insert another +mal_verify call on the statement just before you suspect the error +manifesting itself. (Checkpoint 2) + +(Note that when we say insert a mal_verify() call at some point +"before" an error occurs, we are referring to a temporal location, +i.e some piece of code that is executed by the program in the time +before the error occurs. Physically, this may be in a different +procedure, or a different file altogether.) + +Run the program. + +If Checkpoint 1 causes the program to die, then the error is trapped +in between it and the start of the program. (case A) If Checkpoint 2 +causes the program to die, then the error is trapped between it and +Checkpoint 1, (case B) and if neither dies, the error is after +Checkpoint 2, (case C) or is not an overwriting error at all, or is an +error subtle enough to avoid the heap pointer checking. In the last +case, we wish you luck... + +Case A: (The bug is before checkpoint 1) + +Move Checkpoint 2 to where checkpoint 1 presently is, and move +checkpoint 1 further back. Run the program again. + +Case B: (The bug is between checkpoint 1 and checkpoint 2 - narrow the +search area down further) + +This is the case we attempt to maintain. Having attained it, we +promptly attempt to lose it again. (Why is this starting to sound like +Zen...) To do so, we move either Checkpoint 1 or Checkpoint 2 closer +to the other, and run the program again. + +Case C: (The bug is after checkpoint 2) + +Move Checkpoint 1 to where checkpoint 2 presently is, and move +checkpoint 2 further ahead. Run the program again. + +The objective is to bring the two checkpoints as close to each other +as is necessary to spot the bug. (Recognizing the bug is up to the +programmer - loops exceeding the bound by one are a common problem +causing this error - confusion with C's starting arrays at 0 unlike +other languages which start them at 1 is another problem). + +Those familiar with numerical methods will see the similarity between +this method and the binary search method for finding a root of an +equation. (It also bears a resemblance to the binary search method on +sorted data, but the resemblance is less striking) + +In a modular program, which has been well structured, placing such +checkpoints is easy to do; simply start at the top level, narrow it +down to some procedure call at that level, insert them at the entry and +exit points of that procedure, narrow it down to some procedure call +at this level, and recurse downward till the fault is detected. + +We noted earlier that some corruption bugs manifest themselves (Why is +this starting to read like a ghostbusters' script...) as data values +that change when they shouldn't. In this case, a simpler method to +trace the bug is often to put a trace on that data location by setting +a global pointer variable to point to it, and printing the value at +the two checkpoints. The same search strategy can be employed. This +has been found useful in at least one bug the author has encountered, +which sneakily refused to corrupt the heap pointers, jumping over them +straight into another block to do its dirty work. + +A vicious form of heap corruption is when someone frees a block of +memory, but forgets to NULL or reset all pointers pointing to that +block. At some point in the future, if that block is accessed, strange +things can happen. If that block is still free, and in the heap, there +is a chance of corrupting malloc's internal pointers and structures +used to maintain the free list, in which case mal_verify() will detect +the corruption and abort. Or the corruption may go into the middle of +the block and go undetected. Even worse, the block or part of the +block may have been allocated to the program for some other purpose, +and the corruption may now be smashing data in another part of the +program. This sort of corruption is insidious, and very hard to +reproduce, let alone trace down. To help trace this down, when a block +is freed, in debugging mode, the allocator scribbles a magic pattern +all over it, thus making sure any data in there is likely to be wrong. +Invoking mal_verify(1) will check every free block to make sure its +contents are that magic pattern and abort if it detects corruption. +Setting debug level to 3 with mal_debug() will force this sort of +verification on every call to malloc(), realloc() or free(). +Obviously, if the block gets allocated, and then corrupted, the malloc +verification cannot detect it, since it has no idea what goes on +inside allocated blocks. + + + Advanced debugging tools. + + void + mal_heapdump(fd) + + FILE *fd; + +If all else fails, the programmer may obtain a printout of the entire +heap by calling mal_heapdump() with a file descriptor to an already +open file (fopen(3)). This will print the start address of +all blocks, allocated or free, and their sizes. These can be matched +with pointer addresses in the program and used to trace heap +corruption. If you use this call, you probably know how to do this. +Large doses of intuition (and strong beverages) are recommended. +mal_heapdump() performs the same checking that mal_verify(1) does, but +does not abort unless the error makes it impossible to dump the heap +further. A knowledge of the malloc internals is necessary to fully +exploit this call's output. + + Profiling support + +This malloc() is faster than most other mallocs that I've seen - the +search strategy is a first-free, which is not excitingly fast in +theory, but, with small modifications turns out to be quite fast in +practice; See Knuth for more details. (Theoretically, the time is of +the same order as the size of the free list, so the free list is kept +as small as possible by coalescing a block with adjacent freed blocks +if possible, when it is freed) The free() is based on a boundary tags +scheme, as described in Knuth. It is linear-time, for those who care +about such things, and is a few statements of C code - 4 comparisons +and a little pointer arithmetic. It also accesses memory locations +next to the block only, so it has good virtual memory performance. (The +malloc turns out to have good VM performance most of the time, but +will occasionally scan through a few pages, and in worst case, through +a lot of pages. If however, those pages are all being used, then the +VM performance is likely to be influenced more by the program using +the malloc than the malloc itself) + +Nonetheless, a program which calls malloc and free *very* frequently +might be slow. In order to track down malloc usage, if compiled with +-DPROFILEDSIZES, this malloc keeps count of the number of block sizes +being used by the program. To print out these collected statistics, +call + + void + mal_statsdump(fd) + + FILE *fd; + +where fd is an already opened file descriptor, and it will print a +list of block sizes, and the number of times a block of that size was +allocated. This information provides some indication of the allocation +profile of the calling program. + +When mal_statsdump() is called, it zeroes all the counters, so that +you can check the allocation profile of specific segments of a program +by calling mal_statsdump() repeatedly. + +If more detailed tracing information is needed, the malloc must be +compiled with -DTRACE. This prints out the size of every allocation +or free. This can be turned on or off using + + void + mal_trace(flag) + + int flag; + +If flag is 0, tracing is turned off, (this is the default state), if +it is non-zero, then tracing commences. + +The trace records are of the form: +For a malloc, + + <nbytes> <realsize> <address> + + where <nbytes> is the number of +bytes requested, <realsize> is the number of bytes allocated (usually +nbytes rounded up to a multiple of the word size, plus any slop if +that is requested with mal_slopset), and <address> is the block +returned. + + If the malloc results in the system being asked for more memory, +via sbrk(), it prints + sbrk <bytesobtained> +where <bytesobtained> is the number of bytes the system was asked for. +After the first such call, it will then print + heapstart <address> +where <address> is the + +For a free, + - <realsize> <address> +where <address> is the address of the block being freed +and <realsize> is the malloc'ed size of the block. (the size requested +by the user, rounded up to the word size with slop, if any) + +For a realloc, it may print out + the same as as a free if we're shrinking and the part of the +block was freed + + no-op + if we're shrinking and the remaining part of the block +was too small to be freed, it prints + + ++ <nbytes> <realsize> <address> <realsize_free> <address_free> +where <nbytes> is the number of bytes requested, <realsize> is the actual +number of bytes in the block returned, and <address> is the address of +the block returned, <realsize_free> is the actual size of the free +block after the one we're returning, which we grew into, <address_free> +is the address of the free block after the one we're returning. The +free block information is internal to the malloc and is of little use +to the user. + + the same as for a malloc and a free if the block ends up being +copied to a new block. + + +The trace records may be prefixed by the filename and linenumber of +the call is the source if the file malloc.h was included and the C +preprocessor symbol MALLOC_TRACE defined when compiling the program, +and the malloc library was compield with the tracing option enabled. +(Ask your system adminstrator, or whoever installed the malloc) +(typically with the flag -DMALLOC_TRACE on the cc command line) This +is advised - it makes debugging and leak tracing much easier. + +The file to which this information is printed can be set with + + void + mal_setstatsfile(fd) + + FILE *fd; + +The default is stderr. + + +There are two variables in this malloc that can be set to tune +performance somewhat, and keep memory fragmentation down. One is +'slop', the other is the sbrk() size. + + void + mal_slopset(slop) + + int slop; + +The minimum size block allocated is slop. (The default for this is the +minimum possible to maintain the heap pointers required for a free +block, denoted by a slop of 0) If you notice however, that a lot of +blocks are being used in a specific small size, or small range of +small sizes, then you might want to increase slop so that slop is big +enough to cover all those sizes - while this may waste some memory, it +will speed up allocation for those sizes by guaranteeing that all +blocks in the free list are at least that size, so the first fit +becomes as fast as possible, and the memory fragmentation is reduced +because of the more uniform block size. + + void + mal_sbrkset(nbytes) + + int nbytes; + +If there isn't a block large enough to supply a malloc request, then +the allocator asks for the system to increase the data space of the +process using the sbrk call. By default, sbrk() is called with 1K * +sizeof(Word). (unless the malloc request is for a larger size) If your +program uses less memory than this, you may want to reduce the size. +On the other hand, in a program that allocates a lot of memory, to +reduce the number of system calls, you may want to increase this size, +using the mal_sbrkset() call. + + Performance + +The 4.3 malloc, (variants of which are distributed with perl, tcsh, +GNU and so on) is slightly faster than this malloc but it wastes more +space because it allocates blocks in powers of 2. It does not coalesce +free space, which can lead to it wasting lots of space. (There are some +pathological allocation sequences where it will ask for more space +from the system even though it has enough free space) + +The Sun malloc wastes somewhat less than this malloc, but is +twice as slow, and causes more paging because it stores free blocks +and their headers separately. It has debugging support similar to +mal_verify() and mal_debug(), but not quite as thorough. + +The 4.1 malloc is much slower than this malloc, and wastes about the +same space. + + Incompatibilities + +There is only one serious incompatibility that I know of with some +other malloc()s. In the old 4.1 malloc(), free was kept fast by not +having it merge adjacent free blocks. This resulted in seriously +fragmented arenas for programs that did a lot of allocation and +freeing. + +The realloc() kludge provided a hook to force the merging of such +blocks - the user called realloc() with a a block which had been freed. + +I think this practice is bad (Also, both ANSI and SVID do not support +it) - since this malloc does a fast free that also merges the freed +block to maintain the largest possible contiguous free blocks, there +is no need for storage compaction. If compiled with -DDEBUG, this +realloc() will die with an error if a freed block is passed to +realloc() which would enable fixing programs that adhere to the old +convention. If not compiled with -DDEBUG, it sets errno to EINVAL and +returns NULL. + + Memory leaks + +Some memory allocated by a program is meant for the entire lifetime of +the program. (For many simple programs, this constitutes all the +allocation done by the program, and this section does not apply to +such programs) Some memory however is allocated for some time, and is +later freed. + +Keeping track of memory to be freed is often a nuisance, and +sometimes, programs may change the only pointer to an allocated block +without freeing the block first. such unreferenced, but allocated +memory is called "garbage" or a "memory leak", and is wasted, since +the program has no way of finding it again. (Other languages, like +Lisp, perform garbage collection frequently, finding all blocks that +are unreferenced, and freeing them. In the Unix/C environment, this is +difficult to do, even though garbage collecting mallocs exist. Even +worse, it is often inefficient) + +Memory leaks are serious for programs that run for a long time - +window managers, daemons, and suchlike, since the total memory wasted +over time may be large. Meticulously freeing everything allocated +by a program is the best solution - alas, for object-oriented +programming styles, it becomes very hard to keep track of every object +ever created so that open can free it. + +One method or providing temporary storage is the alloca() function. +This is used to allocate space off the run-time stack so that it is +automatically reclaimed upon procedure exit. It can therefore be used +to provide temporary storage needed by a procedure and the procedures +called by the procedure. Whether or not to use the alloca() call is a +somewhat controversial matter. The manual page warns that + "alloca() is both machine- and compiler-dependent; its use is + discouraged." +On the other hand, a fairly portable implementation using malloc() and +free() does exist, and some compilers (eg) GNU cc provide a +__builtin_alloca function, which they translate into a couple of +machine instructions to extend the frame pointer in the appropriate +direction. With these compilers, alloca() can be very fast - much +faster than any other memory allocation technique short of statically +allocated buffers. + +Alloca() still does not address the problem of storage which is needed +temporarily, but which may be passed to a routine's parent. + +Another way for a programmer to trace what is going on is to define +the preprocessor symbol MALLOC_TRACE (for example, with -DMALLOC_TRACE +on the cc command line when compiling the program) and then include +the header file "malloc.h" in the program. When MALLOC_TRACE is +defined, this header redefines malloc() and friends to macros, which +invoke _malloc() etc; the latter are routines which take the filename +and linenumber at which they are called. Calling + + void + mal_leaktrace(value) + + int value; + +with value > 0 will start the leaktracing process, recording the +address of each block returned by malloc, calloc, etc, along with the +filename:linenumber at which it was called. If that block is freed, it +deletes the record for that block. (Calling mal_leaktrace() with +value == 0 turns off tracing) + +At any time, the programmer can call + + void + mal_dumpleaktrace(fd) + + FILE *fd; + +where fd is a file pointer onto a file openeed for writing with +fopen() or freopen(), into which a list of unfreed blocks is dumped. +This list is in the form + + filename:linenumber: sequence-no. address-in-decimal (address-in-hex) + +This permits the programmer to examine the places in the program where +blocks were allocated and not freed. These represent potential memory +leaks. (Several text-editors will understand this sort of output as +output from grep -n, and will be able to load and step through these +files, eg. Jove) + +Typically, memory leaks take place within the main loop of a program, +so the general structure of the program may look something like + + initialization allocation + while (something) { + things to do + } + +The initial allocation is at worst a one-time memory wastage and does +not concern us that much. Memory that is allocated inside the loop, +and is not freed does concern us since it is a continuous loss, so +after the initialization allocation, we insert a call to +mal_leaktrace() to start tracing. When the second iteration starts, +we call mal_dumpleaktrace() to dump the blocks that were presumably +allocated during the first iteration and have not yet been freed, and +do the same at the start of the third iteration for unfreed blocks +from the second iteration and so on. The code now looks something like + + initialization allocation + mal_leaktrace(1); + while(something) { + mal_dumpleaktrace(stderr); + things to do; + } + +The above is a simple example - more complex control-flow may require +turning leak tracing on and off repeatedly, so as not to get deluged +with information. + +If you use allocation functions within your code that layer on top of +malloc, this leak tracing as it is will not be too useful since it +will only report the location of your allocation functions. In that +case, you have to define subsidiary allocation functions like +_malloc() and #defines like those in the malloc.h file so that you can +record the real address of the call. See the _malloc.c file and the +malloc.h header for examples on how to do this. (You have to call the +real allocation and then use the RECORD_FILE_AND_LINE() macro from +trace.h to store the address of the allocated block. When you free the +block you have to call the DELETE_RECORD() macro to remove that +address. Do not include malloc.c in these files, and make sure you do +call the real malloc from your allocation function - otherwise the +allocation package will attempt to record the same address twice and +fail) You may also want to include defs.h and then use PRTRACE() to +print the line number and file name in the trace file. |