kmalloc and vmalloc : Linux kernel memory allocation API Limits

The Intent

To determine how large a memory allocation can be made from within the kernel, via the “usual suspects” – the kmalloc and vmalloc kernel memory allocation APIs, in a single call.

Let’s answer this question using two approaches: one, reading the source, and two, trying it out empirically on the system.
(Kernel source from kernel ver 3.0.2; tried out on kernel ver 2.6.35 on an x86 PC and 2.6.33.2 on the (ARM) BeagleBoard).

Quick Summary

For the impatient:

The upper limit (the number of bytes that can be allocated in a single kmalloc request) is a function of:

  • the processor – really, the page size – and
  • the number of buddy system freelists (MAX_ORDER).

On both x86 and ARM, with a standard page size of 4 KB and MAX_ORDER of 11, the kmalloc upper limit is 4 MB!

The vmalloc upper limit is, in theory, the amount of physical RAM on the system.

[EDIT/UPDATE]
In practice, it’s usually a lot less. A useful comment by ugoren points out that:
“in 32bit systems, vmalloc is severely limited by its virtual memory area. For a 32bit x86 machine, with 1GB RAM or more, vmalloc is limited to 128MB (for all allocations together, not just for one).”

The kernel module used for the tests (the source code can be downloaded; see the link at the end of this article) also serves as a decent example of writing pure kernel code.


I kmalloc Limit Tests

First, let’s check out the limits for kmalloc :

I Via the Source

From the source: http://lxr.linux.no/linux+v3.0.2/include/linux/slab.h

/*
 * The largest kmalloc size supported by the slab allocators is
 * 32 megabyte (2^25) or the maximum allocatable page order if that is
 * less than 32 MB.
 *
 * WARNING: Its not easy to increase this value since the allocators have
 * to do various tricks to work around compiler limitations in order to
 * ensure proper constant folding.
 */
#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
                           (MAX_ORDER + PAGE_SHIFT - 1) : 25)

#define KMALLOC_MAX_SIZE (1UL << KMALLOC_SHIFT_HIGH)

With MAX_ORDER = 11 and PAGE_SHIFT = 12 (the typical case on x86 or ARM with 4K pages), MAX_ORDER + PAGE_SHIFT - 1 = 22, which is below the cap of 25, so this works out to:
1UL << 22 = 2^22 = 4194304 bytes = 4096 KB = 4 MB.
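
To make the arithmetic concrete, here is a small userspace sketch of the same calculation. It is not kernel code: the function names are my own, and MAX_ORDER and PAGE_SHIFT are passed in as ordinary parameters (in the kernel they are compile-time constants), purely so that different configurations can be tried out.

```c
#include <stdio.h>

/* Userspace mirror of the KMALLOC_SHIFT_HIGH macro quoted above.
 * Hypothetical helper: max_order and page_shift are parameters here,
 * whereas the kernel uses compile-time constants. */
unsigned int kmalloc_shift_high(unsigned int max_order, unsigned int page_shift)
{
    unsigned int shift = max_order + page_shift - 1;

    return shift <= 25 ? shift : 25;    /* capped at 2^25 = 32 MB */
}

/* Userspace mirror of KMALLOC_MAX_SIZE. */
unsigned long kmalloc_max_size(unsigned int max_order, unsigned int page_shift)
{
    return 1UL << kmalloc_shift_high(max_order, page_shift);
}

/* Print the computed limit for one configuration. */
void print_kmalloc_limit(unsigned int max_order, unsigned int page_shift)
{
    unsigned long max = kmalloc_max_size(max_order, page_shift);

    printf("MAX_ORDER=%u PAGE_SHIFT=%u -> %lu bytes (%lu KB)\n",
           max_order, page_shift, max, max / 1024);
}
```

Calling print_kmalloc_limit(11, 12) reports 4194304 bytes (4096 KB). With a larger page size, say a PAGE_SHIFT of 16, the sum exceeds 25 and the 32 MB cap applies instead.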

II Trying it out

I wanted to verify this experimentally – well, at least the kmalloc() limit (the vmalloc() limit is harder to verify, as it can use the whole of physical RAM!).

Toward this end, I wrote a small kernel module. It basically does the following:

  • Creates and sets up two proc-based “files”
    • /proc/driver/kmalloc_test
    • /proc/driver/vmalloc_test
  • their ‘write’ callbacks are invoked when the user writes a value to the files; this value is expected to be the number of bytes to attempt to allocate using that particular API (kmalloc or vmalloc, depending on the file).
    Also, note that the module just does some arbitrary writes into the allocated region and then (more or less) immediately frees it.
  • their ‘read’ callbacks are invoked when userspace reads the entry; they merely display the last attempted number of bytes to allocate.
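
The flow of the ‘write’ callback can be sketched in plain userspace C. To be clear, this is not the module’s code: malloc()/free() stand in for kmalloc()/kfree() (or vmalloc()/vfree()), the function and variable names are hypothetical, and the real callback would first have to copy the buffer in from userspace. The -ENOMEM return on failure is an assumption, though it is consistent with the “Cannot allocate memory” error the shell reports further below.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the module's proc callbacks. */
static size_t last_attempt;     /* what the 'read' callback would display */

/* 'write' callback sketch: parse a byte count, allocate, touch, free. */
int alloc_test_write(const char *buf)
{
    char *end;
    unsigned long numbytes;
    unsigned char *p;

    errno = 0;
    numbytes = strtoul(buf, &end, 10);
    if (errno || end == buf || numbytes == 0)
        return -EINVAL;             /* not a usable byte count */

    last_attempt = numbytes;
    p = malloc(numbytes);           /* kmalloc()/vmalloc() in the module */
    if (!p)
        return -ENOMEM;             /* assumed error code on failure */

    memset(p, 0xde, numbytes);      /* some arbitrary writes into the region */
    free(p);                        /* freed (more or less) immediately */
    return 0;
}

/* 'read' callback sketch: report the last attempted size. */
size_t alloc_test_read(void)
{
    return last_attempt;
}
```

Writing the string "200000\n" succeeds and a subsequent read reports 200000; writing something that is not a number is rejected without touching the recorded value.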
So, once the driver is loaded, one just has to write to the proc file to attempt an allocation. For example:
# dmesg -c
...
#
# insmod ./kvalloc.ko
#
# ls -l /proc/driver/*test*
-rw-r--r-- 1 root root 0 2011-08-16 17:03 /proc/driver/kmalloc_test
-rw-r--r-- 1 root root 0 2011-08-16 17:03 /proc/driver/vmalloc_test
# echo 200000 > /proc/driver/kmalloc_test
# dmesg
[25125.896862] vmall_init_module:260 : Loaded ok.
[25136.659209] kmalloc_procwrite:199 : Successfully allocated via kmalloc 200000 bytes (195 Kb, 0 MB) now to location 0xd5240000 (will kfree..)
#
# echo 2000008 > /proc/driver/vmalloc_test
# dmesg
[25125.896862] vmall_init_module:260 : Loaded ok.
[25136.659209] kmalloc_procwrite:199 : Successfully allocated via kmalloc 200000 bytes (195 Kb, 0 MB) now to location 0xd5240000 (will kfree..)
[25162.082120] vmalloc_procwrite:129 : Successfully allocated via vmalloc 2000008 bytes (1953 Kb, 1 MB) now to location 0xe0ac5000 (will vfree..)
#

As can be seen, the driver was loaded, the proc files were created, and two successful allocations were performed.
Of course, one can’t be expected to keep doing this manually till we hit a limit, so we’ll use a simple shell script to automate the task.
The script loops: it attempts to allocate a given amount of RAM (x bytes), adds a step factor to it (x+step) and tries again, until an allocation fails.

We pass ‘k’ or ‘v’ to the script as a parameter, telling it whether to test the kmalloc or vmalloc API limit. We can also pass optional parameters, starting size and ‘step factor’.
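
Given a start size, a step and a known limit, one can predict where such a loop should stop. The sketch below is only a model of the script’s loop in C (the script itself is a shell script, and the function name is my own); it assumes a request fails exactly when it is larger than the limit.

```c
#include <stdio.h>

/* Model of the test loop: try start, start+step, start+2*step, ...
 * and return the first size that exceeds `limit`.
 * Assumption: a request fails if and only if size > limit. */
unsigned long first_failing_size(unsigned long start, unsigned long step,
                                 unsigned long limit)
{
    unsigned long size;

    if (step == 0)
        return 0;       /* a zero step would never reach the limit */

    for (size = start; ; size += step) {
        printf("Attempting to alloc %lu bytes (%lu KB, %lu MB)\n",
               size, size / 1024, size / (1024 * 1024));
        if (size > limit)
            return size;
    }
}
```

With the 4 MB kmalloc limit (4194304 bytes), a start of 307200 and a step of 512000, this predicts the first failure at 4403200 bytes; with a start of 50000 and a step of 102400, at 4248400 bytes.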

# ./test_kvalloc.sh
Usage: test_kvalloc.sh {k|v} [start_numbytes] [step_factor]
 k : test the KMALLOC limit
 v : test the VMALLOC limit
# 

Testing kmalloc limit

So, let’s do a test run for kmalloc, starting at 300 KB in steps of 500 KB.

# dmesg -c

# ./test_kvalloc.sh k 307200 512000
KVALLOC_PROCFILE = /proc/driver/kmalloc_test

Running:
KMALLOC TEST
test_kvalloc.sh 307200 512000

Attempting to alloc 307200 bytes (300 KB, 0 MB)
Attempting to alloc 819200 bytes (800 KB, 0 MB)
Attempting to alloc 1331200 bytes (1300 KB, 1 MB)
Attempting to alloc 1843200 bytes (1800 KB, 1 MB)
Attempting to alloc 2355200 bytes (2300 KB, 2 MB)
Attempting to alloc 2867200 bytes (2800 KB, 2 MB)
Attempting to alloc 3379200 bytes (3300 KB, 3 MB)
Attempting to alloc 3891200 bytes (3800 KB, 3 MB)
Attempting to alloc 4403200 bytes (4300 KB, 4 MB)
./test_kvalloc.sh: line 46: echo: write error: Cannot allocate memory
FAILURE! AT 4403200 bytes = 4300 KB = 4 MB. Aborting…
#

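Yes, it fails at ~ 4 MB: the 3891200-byte request succeeds, while the 4403200-byte request, the first one above the 4194304-byte (4 MB) limit, fails.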


# dmesg
[25647.216149] kmalloc_procwrite:199 : Successfully allocated via kmalloc 307200 bytes (300 Kb, 0 MB) now to location 0xc0f80000 (will kfree..)
[25647.223584] kmalloc_procwrite:199 : Successfully allocated via kmalloc 819200 bytes (800 Kb, 0 MB) now to location 0xdce00000 (will kfree..)
[25647.321891] kmalloc_procwrite:199 : Successfully allocated via kmalloc 1331200 bytes (1300 Kb, 1 MB) now to location 0xdcc00000 (will kfree..)
[25647.329913] kmalloc_procwrite:199 : Successfully allocated via kmalloc 1843200 bytes (1800 Kb, 1 MB) now to location 0xdcc00000 (will kfree..)
[25647.336471] kmalloc_procwrite:199 : Successfully allocated via kmalloc 2355200 bytes (2300 Kb, 2 MB) now to location 0xdc000000 (will kfree..)
[25647.397678] kmalloc_procwrite:199 : Successfully allocated via kmalloc 2867200 bytes (2800 Kb, 2 MB) now to location 0xdc000000 (will kfree..)
[25647.398845] kmalloc_procwrite:199 : Successfully allocated via kmalloc 3379200 bytes (3300 Kb, 3 MB) now to location 0xdc000000 (will kfree..)
[25647.400119] kmalloc_procwrite:199 : Successfully allocated via kmalloc 3891200 bytes (3800 Kb, 3 MB) now to location 0xdc000000 (will kfree..)
[25647.401443] ------------[ cut here ]------------
[25647.401598] WARNING: at /build/buildd/linux-2.6.35/mm/page_alloc.c:2005 __alloc_pages_slowpath+0x36d/0x4b0()
[25647.401624] Hardware name: VMware Virtual Platform
[25647.401625] Modules linked in: kvalloc nls_iso8859_1 nls_cp437 vfat fat usb_storage cdc_acm vmblock vsock vmhgfs acpiphp binfmt_misc snd_ens1371 gameport snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi ppdev snd_seq_midi_event snd_seq vmware_balloon snd_timer snd_seq_device psmouse serio_raw parport_pc snd soundcore snd_page_alloc intel_agp lp agpgart vmci i2c_piix4 shpchp parport mptspi mptscsih floppy mptbase scsi_transport_spi vmxnet [last unloaded: kvalloc]
[25647.401674] Pid: 6395, comm: test_kvalloc.sh Not tainted 2.6.35-30-generic #56-Ubuntu
[25647.401676] Call Trace:
[25647.401723] [<c014b602>] warn_slowpath_common+0x72/0xa0
[25647.401729] [<c01e0a5d>] ? __alloc_pages_slowpath+0x36d/0x4b0
[25647.401732] [<c01e0a5d>] ? __alloc_pages_slowpath+0x36d/0x4b0
[25647.401735] [<c014b652>] warn_slowpath_null+0x22/0x30
[25647.401750] [<c01e0a5d>] __alloc_pages_slowpath+0x36d/0x4b0
[25647.401753] [<c01e03c7>] ? get_page_from_freelist+0x247/0x330
[25647.401756] [<c01e0d0f>] __alloc_pages_nodemask+0x16f/0x1c0
[25647.401759] [<c01e0d7c>] __get_free_pages+0x1c/0x30
[25647.401785] [<c020e196>] __kmalloc+0x146/0x170
[25647.401799] [<c0231c6f>] ? mntput_no_expire+0x1f/0xd0
[25647.401803] [<e0a493ca>] kmalloc_procwrite+0x7a/0x1d8 [kvalloc]
[25647.401848] [<c02644d3>] proc_file_write+0x63/0xa0
[25647.401851] [<c0264470>] ? proc_file_write+0x0/0xa0
[25647.401854] [<c025f4f7>] proc_reg_write+0x67/0xa0
[25647.401868] [<c023069d>] ? alloc_fd+0xbd/0xf0
[25647.401872] [<c0219ab2>] vfs_write+0xa2/0x190
[25647.401875] [<c025f490>] ? proc_reg_write+0x0/0xa0
[25647.401878] [<c021a372>] sys_write+0x42/0x70
[25647.401923] [<c05cc284>] syscall_call+0x7/0xb
[25647.401925] ---[ end trace b6b188c8bd2758df ]---
[25647.401927] kvalloc: kmalloc of 4403200 bytes FAILED!
#

Testing the same on the BeagleBoard (running 2.6.33.2 Linux on a TI OMAP3 ARM Cortex-A8 processor, 256 MB RAM):

On the board, after inserting the kernel module, we run the shell script:

root@beagleboard:/media/mmcblk0p1# ./test_kvalloc.arm.sh k 50000 102400
KVALLOC_PROCFILE = /proc/driver/kmalloc_test

Running:
KMALLOC TEST
test_kvalloc.arm.sh 50000 102400

Attempting to alloc 50000 bytes (48 KB, 0 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 50000 bytes (48 Kb, 0 MB) now to location 0xc2c40000 (will kfree..)
Attempting to alloc 152400 bytes (148 KB, 0 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 152400 bytes (148 Kb, 0 MB) now to location 0xcd400000 (will kfree..)
Attempting to alloc 254800 bytes (248 KB, 0 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 254800 bytes (248 Kb, 0 MB) now to location 0xcd400000 (will kfree..)
Attempting to alloc 357200 bytes (348 KB, 0 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 357200 bytes (348 Kb, 0 MB) now to location 0xcf980000 (will kfree..)
...
...
Attempting to alloc 3838800 bytes (3748 KB, 3 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 3838800 bytes (3748 Kb, 3 MB) now to location 0xcd800000 (will kfree..)
Attempting to alloc 3941200 bytes (3848 KB, 3 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 3941200 bytes (3848 Kb, 3 MB) now to location 0xcd800000 (will kfree..)
Attempting to alloc 4043600 bytes (3948 KB, 3 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 4043600 bytes (3948 Kb, 3 MB) now to location 0xcd800000 (will kfree..)
Attempting to alloc 4146000 bytes (4048 KB, 3 MB)
kmalloc_procwrite:199 : Successfully allocated via kmalloc 4146000 bytes (4048 Kb, 3 MB) now to location 0xcd800000 (will kfree..)
Attempting to alloc 4248400 bytes (4148 KB, 4 MB)
kvalloc: kmalloc of 4248400 bytes FAILED!
root@beagleboard:/media/mmcblk0p1#

Clearly, and as expected, kmalloc() fails at the first allocation attempt above 4 MB (just as on the x86)!

———————————————————————————————————————————————–

II vmalloc Limit Tests

Now, let’s check out the limits for vmalloc :

I Via the Source

A quick perusal of the source (start at mm/vmalloc.c) shows that the call graph for vmalloc is (approximately, at least):

vmalloc --> __vmalloc_node_flags --> __vmalloc_node --> __vmalloc_node_range

[ a --> b above implies function a() calls function b() ].

In the function __vmalloc_node_range()  (code pasted below from the lxr.linux.no website):

...
1606 void *__vmalloc_node_range(unsigned long size, unsigned long align,
1607                        unsigned long start, unsigned long end, gfp_t gfp_mask,
1608                        pgprot_t prot, int node, void *caller)
1609 {
1610        struct vm_struct *area;
1611        void *addr;
1612        unsigned long real_size = size;
1613
1614        size = PAGE_ALIGN(size);
1615        if (!size || (size >> PAGE_SHIFT) > totalram_pages)
1616                return NULL;
1617

Line 1615 above tells us:

if the size is zero OR if the size expressed in pages is greater than totalram_pages, the request is invalid and the function returns NULL.
So there’s our answer: in theory, the vmalloc API limit is the amount of physical RAM on the system!
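
The same sanity check can be tried out in userspace. This is only a sketch of the test on line 1615, not the kernel function: the SKETCH_* macros and function names are my own, a 4 KB page size is assumed, and totalram_pages is passed in as a parameter (here derived from the MemTotal value in /proc/meminfo).

```c
/* Assumed 4 KB pages; hypothetical names to avoid clashing with system headers. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_ALIGN(x) \
    (((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Convert a MemTotal figure in kB into a count of 4 KB pages. */
unsigned long kb_to_pages(unsigned long kb)
{
    return kb >> (SKETCH_PAGE_SHIFT - 10);
}

/* Returns 1 if a request of `size` bytes would pass the size check,
 * 0 if it would be rejected (zero size, or more pages than total RAM). */
int vmalloc_size_ok(unsigned long size, unsigned long totalram_pages)
{
    size = SKETCH_PAGE_ALIGN(size);
    if (!size || (size >> SKETCH_PAGE_SHIFT) > totalram_pages)
        return 0;
    return 1;
}
```

For a machine reporting MemTotal: 508396 kB, kb_to_pages() gives 127099 pages, so a request of 520597504 bytes (exactly 127099 pages) passes the check while one byte more is rejected. Passing this check says nothing about whether the pages can actually be found, of course.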

[EDIT/UPDATE]
In practice, it’s usually a lot less. A useful comment by ugoren points out that:
“in 32bit systems, vmalloc is severely limited by its virtual memory area. For a 32bit x86 machine, with 1GB RAM or more, vmalloc is limited to 128MB (for all allocations together, not just for one).”

Of course, budding kernel developers, please (read, for God’s sake!!!) note:
Do NOT abuse this API by allocating huge amounts of RAM just because you can; you will hurt the kernel’s performance like heck and no one will use your driver! :-) It’s true…

II Trying it out

Running on Ubuntu 10.04 with a 2.6.35 kernel, on the VMware VM, we have this much RAM:

# grep -i memtotal /proc/meminfo
MemTotal:         508396 kB
#

i.e. ~ 496 MB of RAM.

We run the same shell script as above, this time with the ‘v’ parameter, a larger initial size (let’s try 500 KB) and a larger ‘step factor’ (let’s go with 5 MB – that means we’ll attempt to alloc in steps of 5 MB!):

[Screenshot: vmalloc limit test 1]

[Screenshot: vmalloc limit test 2]


I find that vmalloc successfully allocates (and of course subsequently deallocates) up to ~ 455 MB in one shot (remember, on this VM total RAM is 496 MB) before the virtual machine itself seems to hang. In reality, the kernel is doing its very best to allocate RAM as requested. In fact, it does succeed…until RAM runs out!

The Linux OS design is such that, when RAM is falling short, the kernel will do its damnedest to reclaim RAM, aggressively swapping as necessary. What if this does not help? What if some malicious or greedy app(s) keep allocating memory!? – like we do here: we just keep eating RAM with vmalloc()! Well, when the kernel hits a dead-end, i.e., when both RAM and swap space are (almost) exhausted, the kernel takes a decision: it can either die or kill the bad guy; guess which it does :-)

The OOM (Out Of Memory) Killer runs, killing off the “bad” tasks!
That’s exactly what happens here: we don’t give the kernel much choice, do we!

Notice how this is very different from how kmalloc() fails: it fails on hitting an artificially set limit, with no fuss or fanfare. The vmalloc(), on the other hand, hogs so much RAM that it inadvertently causes the OOM Killer to run, killing the memory hogs as a side effect, rather than failing directly due to the size limit being hit.

The screenshots above show the situation before the OOM-killer hits.

[ These tests have been run on Ubuntu 10.04, running a 2.6.35 kernel on the VMware Workstation VM product ].

Testing the same on the BeagleBoard (running 2.6.33.2 Linux on a TI OMAP3 ARM Cortex-A8 processor, 256 MB RAM):

The BeagleBoard I’m using (rev C3) has 256 MB RAM. On the board, after inserting the kernel module, we run the shell script, this time with the ‘v’ parameter, a larger initial size (let’s try 500 KB) and a larger ‘step factor’ (let’s go with 5 MB – that means we’ll attempt to alloc in steps of 5 MB!):

root@beagleboard:/media/mmcblk0p1# ./test_kvalloc.arm.sh
Usage: test_kvalloc.arm.sh {k|v} [start_numbytes] [step_factor]
 k : test the KMALLOC limit
 v : test the VMALLOC limit
root@beagleboard:/media/mmcblk0p1# ./test_kvalloc.arm.sh v 500000 5242880
KVALLOC_PROCFILE = /proc/driver/vmalloc_test

Running:
VMALLOC TEST
test_kvalloc.arm.sh 500000 5242880

Attempting to alloc 500000 bytes (488 KB, 0 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 500000 bytes (488 Kb, 0 MB) now to location 0xd0871000 (will vfree..)
Attempting to alloc 5742880 bytes (5608 KB, 5 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 5742880 bytes (5608 Kb, 5 MB) now to location 0xd08ee000 (will vfree..)
Attempting to alloc 10985760 bytes (10728 KB, 10 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 10985760 bytes (10728 Kb, 10 MB) now to location 0xd0e6f000 (will vfree..)
Attempting to alloc 16228640 bytes (15848 KB, 15 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 16228640 bytes (15848 Kb, 15 MB) now to location 0xd18f1000 (will vfree..)
...
...
Attempting to alloc 220700960 bytes (215528 KB, 210 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 220700960 bytes (215528 Kb, 210 MB) now to location 0xd088e000 (will vfree..)
Attempting to alloc 225943840 bytes (220648 KB, 215 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 225943840 bytes (220648 Kb, 215 MB) now to location 0xddb42000 (will vfree..)
Attempting to alloc 231186720 bytes (225768 KB, 220 MB)
vmalloc_procwrite:129 : Successfully allocated via vmalloc 231186720 bytes (225768 Kb, 220 MB) now to location 0xd0890000 (will vfree..)
Attempting to alloc 236429600 bytes (230888 KB, 225 MB)
echo invoked oom-killer: gfp_mask=0xd2, order=0, oom_adj=0
[<c002e3a0>] (unwind_backtrace+0x0/0xd4) from [<c007c700>] (T.280+0x3c/0x108)
[<c007c700>] (T.280+0x3c/0x108) from [<c007c804>] (T.277+0x38/0xd8)
[<c007c804>] (T.277+0x38/0xd8) from [<c007c9f8>] (__out_of_memory+0x154/0x178)
[<c007c9f8>] (__out_of_memory+0x154/0x178) from [<c007ca9c>] (out_of_memory+0x80/0xb4)
[<c007ca9c>] (out_of_memory+0x80/0xb4) from [<c007f208>] (__alloc_pages_nodemask+0x418/0x514)
[<c007f208>] (__alloc_pages_nodemask+0x418/0x514) from [<c0095678>] (__vmalloc_area_node+0xbc/0x11c)
[<c0095678>] (__vmalloc_area_node+0xbc/0x11c) from [<c00958b8>] (vmalloc+0x24/0x2c)
[<c00958b8>] (vmalloc+0x24/0x2c) from [<bf000140>] (vmalloc_procwrite+0xf4/0x190 [kvalloc])
[<bf000140>] (vmalloc_procwrite+0xf4/0x190 [kvalloc]) from [<c00dd100>] (proc_file_write+0x3c/0x58)
[<c00dd100>] (proc_file_write+0x3c/0x58) from [<c00d94ec>] (proc_reg_write+0x40/0x54)
[<c00d94ec>] (proc_reg_write+0x40/0x54) from [<c009e490>] (vfs_write+0xac/0x154)
[<c009e490>] (vfs_write+0xac/0x154) from [<c009e5e4>] (sys_write+0x3c/0x68)
[<c009e5e4>] (sys_write+0x3c/0x68) from [<c0028dc0>] (ret_fast_syscall+0x0/0x2c)
Mem-info:
Normal per-cpu:
CPU    0: hi:   90, btch:  15 usd:  97
...
...
0 pages swap cached
Out of memory: kill process 991 (sh) score 65 or a child
Killed process 2533 (test_kvalloc.ar) vsz:2676kB, anon-rss:72kB, file-rss:0kB
echo invoked oom-killer: gfp_mask=0xd2, order=0, oom_adj=0
[<c002e3a0>] (unwind_backtrace+0x0/0xd4) from [<c007c700>] (T.280+0x3c/0x108)
[<c007c700>] (T.280+0x3c/0x108) from [<c007c804>] (T.277+0x38/0xd8)
[<c007c804>] (T.277+0x38/0xd8) from [<c007c9f8>] (__out_of_memory+0x154/0x178)
[<c007c9f8>] (__out_of_memory+0x154/0x178) from [<c007ca9c>] (out_of_memory+0x80/0xb4)
...
...
Out of memory: kill process 991 (sh) score 44 or a child
Killed process 991 (sh) vsz:2852kB, anon-rss:112kB, file-rss:24kB
echo invoked oom-killer: gfp_mask=0xd2, order=0, oom_adj=0
...
...
Out of memory: kill process 986 (syslogd) score 42 or a child
Killed process 986 (syslogd) vsz:2740kB, anon-rss:60kB, file-rss:64kB
...
...

So, we can see from the above that the 225 MB allocation attempt failed; the kernel OOM-killer jumped in and rapidly finished off the shell script, the shell process, syslogd…
After that, I aborted the script; the login process re-ran (because init would have re-spawned it) and I received the login prompt again.

=============================================================================================================================================

As an aside, this kernel module serves as a decent example of writing some pure kernel code (not driver related): it is concurrent-safe, and implements the useful notion of a global “context / device” structure accessed via a single global pointer that is passed around. This helps with code maintenance and debugging as well. It is also small enough to read quickly and grasp the essentials.

Please do comment with your inputs!

Download the source tarball (tar.gz) here.

9 thoughts on “kmalloc and vmalloc : Linux kernel memory allocation API Limits”

  1. Hi, I’m writing a virtual video device driver, and for this I need to allocate around 64MB-265MB of video memory in kernel space. Do you think this is too much? Should I set an acceptable limit inside this interval?

    1. In general, yeah, it is too much. Still, this depends on how much memory your system has…
      Perhaps you’re doing so to emulate a large video framebuffer?
      Limiting it would be a good idea.
      Can you allocate a smaller amount and reuse it?

  2. You didn’t mention that in 32bit systems, vmalloc is severely limited by its virtual memory area. For a 32bit x86 machine, with 1GB RAM or more, vmalloc is limited to 128MB (for all allocations together, not just for one).

  3. What else does the “convenient.h” file contain?
    I could simulate the #define MSG
    but when I passed the first echo, my machine hung up.

    1. Hi @Utkarsh: I’ve updated the source zip file to contain the convenient.h header now. Sorry for the delay! Pl check it out & let me know if you still have issues…thx.
