Category Archives: Linux Programming

Setting up Kdump and Crash for ARM-32 – an Ongoing Saga

Author: Kaiwan N Billimoria, kaiwanTECH
Date: 13 July 2017

DUT (Device Under Test):
Hardware platform: Qemu-virtualized Versatile Express Cortex-A9.
Software platform: mainline linux kernel ver 4.9.1, kexec-tools, crash utility.

First, my attempt at setting up the Raspberry Pi 3 failed; mostly due to recurring issues with the bloody MMC card; probably a power issue! (see this link).

Anyway. Then switched to doing the same on the always-reliable Qemu virtualizer; I prefer to setup the Vexpress-CA9.

In fact, a supporting project I maintain on github – the SEALS project – is proving extremely useful for building the ARM-32 hardware/software platform quickly and efficiently. (Fun fact: SEALS = Simple Embedded Arm Linux System).

So, I cloned the above-mentioned git repo for SEALS into a new working folder.

The way SEALS work is simple: edit a configuration file (build.config) to your satisfaction, to reflect the PATH to and versions of the cross-compiler, kernel, kernel command-line parameters, busybox, rootfs size, etc.

Setup the SEALS build.config file.

Screenshot: the script initial screen displaying the current build config:kdumpcr1

Relevant Info reproduced below for clarity:

Toolchain prefix : arm-none-linux-gnueabi-
Toolchain version: (Sourcery CodeBench Lite 2014.05-29) 4.8.3 20140320 (prerelease)

Staging folder : <…>/SEALS_staging
ARM Platform : Versatile Express (A9)

Platform RAM : 512 MB
RootFS force rebuild : 0
RootFS size : 768 MB

Linux kernel to use : 4.9.1
Linux kernel codebase location : <…>/SEALS_staging/linux-4.9.1
Kernel command-line : “console=ttyAMA0 root=/dev/mmcblk0 init=/sbin/init crashkernel=32M”

Busybox to use : 1.26.2
Busybox codebase location : <…>/SEALS_staging/busybox-1.26.2


Screenshot: second GUI screen, allowing the user to select actions to takekdumpcr2

Upon clicking ‘OK’, the build process starts:

I Boot Kernel Setup

  • kernel config: must carefully configure the Linux kernel. Please follow the kernel documentation in detail: [1]In brief, ensure these are set:
    CONFIG_SYSFS=y << should be >>

Dump-capture kernel config options (Arch Dependent, arm)
To use a relocatable kernel, Enable “AUTO_ZRELADDR” support under “Boot” options:      


  • kexec
    • We require to build kexec (kexec-tools is the package). But: the package does not seem to be directly available for A-32 (ARM-32), so had to build from source.
    • Did not succeed
    • then saw this gist:

which succinctly got it working!

  • Copy the ‘kexec’ binary into the root filesystem (staging tree) under it’s sbin/ folder
  • We build a relocatable kernel so that we can use the same ‘zImage’ 
    for the dump kernel as well as the primary boot kernel:
     “Or use the system kernel binary itself as dump-capture kernel and there is 
    no need to build a separate dump-capture kernel. 
    This is possible  only with the architectures which support a 
    relocatable kernel. As  of today, i386, x86_64, ppc64, ia64 and 
    arm architectures support relocatable kernel. ...”
  • the build system will proceed to build the kernel using the cross-compiler specified
  • went through just fine.

II Load dump-capture (or kdump) kernel into boot kernel’s RAM

Do read [1], but to cut a long story short

  • Create a small shell script - a wrapper over kexec – in the root filesystem:
    DUMPK_CMDLINE="console=ttyAMA0 root=/dev/mmcblk0 rootfstype=ext4 rootwait init=/sbin/init maxcpus=1 reset_devices"
    kexec --type zImage \
    -p ./zImage-4.9.1-crk \
    --dtb=./vexpress-v2p-ca9.dtb \
    [ $? -ne 0 ] && { 
        echo "kexec failed." ; exit 1
    echo "$0: kexec: success, dump kernel loaded."
    exit 0
  • Run it. It will only work (in my experience) when:
    • you’ve passed the kernel parameter ‘crashkernel=32M’
    • verified that indeed the boot kernel has reserved 32MB RAM for the dump-capture kernel/system:
RUN: Running qemu-system-arm now ...

qemu-system-arm -m 512 -M vexpress-a9 -kernel <...>/images/zImage \
-drive file=<...>/images/rfs.img,if=sd,format=raw \
-append "console=ttyAMA0 root=/dev/mmcblk0 init=/sbin/init crashkernel=32M" \
-nographic -no-reboot -dtb <...>/linux-4.9.1/arch/arm/boot/dts/vexpress-v2p-ca9.dtb

Booting Linux on physical CPU 0x0
Linux version 4.9.1-crk (hk@hk) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-29) ) #2 SMP Wed Jul 12 19:41:08 IST 2017
CPU: ARMv7 Processor [410fc090] revision 0 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
OF: fdt:Machine model: V2P-CA9
ARM / $ dmesg |grep -i crash
Reserving 32MB of memory at 1920MB for crashkernel (System RAM: 512MB)
Kernel command line: console=ttyAMA0 root=/dev/mmcblk0 init=/sbin/init crashkernel=32M
ARM / $ id
uid=0 gid=0
ARM / $ ./
./ kexec: success, dump kernel loaded.
ARM / $ 

Ok, the dump-capture kernel has loaded up.
Now to test it!

III Test the soft boot into the dump-capture kernel

On the console of the (emulated) ARM-32:

ARM / $ echo c > /proc/sysrq-trigger 
sysrq: SysRq : Trigger a crash
Unhandled fault: page domain fault (0x81b) at 0x00000000
pgd = 9ee44000
[00000000] *pgd=7ee30831, *pte=00000000, *ppte=00000000
Internal error: : 81b [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 724 Comm: sh Not tainted 4.9.1-crk #2
Hardware name: ARM-Versatile Express
task: 9f589600 task.stack: 9ee40000
PC is at sysrq_handle_crash+0x24/0x2c
LR is at arm_heavy_mb+0x1c/0x38
pc : [<804060d8>] lr : [<80114bd8>] psr: 60000013
sp : 9ee41eb8 ip : 00000000 fp : 00000000


[<804060d8>] (sysrq_handle_crash) from [<804065bc>] (__handle_sysrq+0xa8/0x170)
[<804065bc>] (__handle_sysrq) from [<80406ab8>] (write_sysrq_trigger+0x54/0x64)
[<80406ab8>] (write_sysrq_trigger) from [<80278588>] (proc_reg_write+0x58/0x90)
[<80278588>] (proc_reg_write) from [<802235c4>] (__vfs_write+0x28/0x10c)
[<802235c4>] (__vfs_write) from [<80224098>] (vfs_write+0xb4/0x15c)
[<80224098>] (vfs_write) from [<80224d30>] (SyS_write+0x40/0x80)
[<80224d30>] (SyS_write) from [<801074a0>] (ret_fast_syscall+0x0/0x3c)

Code: f57ff04e ebf43aba e3a03000 e3a02001 (e5c32000) 

Loading crashdump kernel...
Booting Linux on physical CPU 0x0

Linux version 4.9.1-crk (hk@hk) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-29) ) #2 SMP Wed Jul 12 19:41:08 IST 2017
CPU: ARMv7 Processor [410fc090] revision 0 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
OF: fdt:Machine model: V2P-CA9
OF: fdt:Ignoring memory range 0x60000000 - 0x78000000
Memory policy: Data cache writeback
CPU: All CPU(s) started in SVC mode.
percpu: Embedded 14 pages/cpu @81e76000 s27648 r8192 d21504 u57344
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 7874
Kernel command line: console=ttyAMA0 root=/dev/mmcblk0 rootfstype=ext4 rootwait 
init=/sbin/init maxcpus=1 reset_devices elfcorehdr=0x79f00000 mem=31744K

ARM / $ ls -l /proc/vmcore            << the dump image (480 MB here) >>
-r-------- 1 0 0 503324672 Jul 13 12:22 /proc/vmcore
ARM / $ 

Copy the dump file (with cp or scp, whatever), 
get it to the host system.

cp /proc/vmcore <dump-file>
ARM / $ halt
ARM / $ EXT4-fs (mmcblk0): re-mounted. Opts: (null)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system halt
reboot: System halted
QEMU: Terminated
^A-X  << type Ctrl-a followed by x to exit qemu >>
... and done. all done, exiting.
Thank you for using SEALS! We hope you like it.
There is much scope for improvement of course; would love to hear your feedback, ideas, and contribution!
Please visit : . 

IV Analyse the kdump image with the crash utility


The core analysis suite is a self-contained tool that can be used to
investigate either live systems, kernel core dumps created from dump
creation facilities such as kdump, kvmdump, xendump, the netdump and
diskdump packages offered by Red Hat, the LKCD kernel patch, the mcore
kernel patch created by Mission Critical Linux, as well as other formats
created by manufacturer-specific firmware.


A whitepaper with complete documentation concerning the use of this utility
can be found here: [3]

The crash binary can only be used on systems of the same architecture as
the host build system. There are a few optional manners of building the
crash binary:

o On an x86_64 host, a 32-bit x86 binary that can be used to analyze
32-bit x86 dumpfiles may be built by typing "make target=X86".
o On an x86 or x86_64 host, a 32-bit x86 binary that can be used to analyze
 32-bit arm dumpfiles may be built by typing "make target=ARM".

Ah. To paraphrase, Therein lies the devil, in the details.

[UPDATE : 14 July ’17
I do have it building successfully now. The trick apparently – on x86_64 Ubuntu 17.04 – was to install the 
lib32z1-dev package! Once I did, it built just fine. Many thanks to Dave Anderson (RedHat) who promptly replied to my query on the crash mailing list.]

I cloned the ‘crash’ git repo, did ‘make target=ARM’, it fails with:

 ../readline/libreadline.a ../opcodes/libopcodes.a ../bfd/libbfd.a
../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -ldl
-lncurses -lm ../libiberty/libiberty.a build-gnulib/import/libgnu.a
 -lz -ldl -rdynamic
/usr/bin/ld: cannot find -lz
collect2: error: ld returned 1 exit status
Makefile:1174: recipe for target 'gdb' failed

Still trying to debug this!

Btw, if you’re unsure, pl see crash’s github Readme on how to build it.
So, now, with a ‘crash’ binary that works, lets get to work:

$ file crash
crash: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/, for GNU/Linux 2.6.32, …

$ ./crash

crash 7.1.9++
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation

crash: compiled for the ARM architecture

To examine a kernel dump (kdump) file, invoke crash like so:

crash <path-to-vmlinux-with-debug-symbols> <path-to-kernel-dumpfile>

$ <...>/crash/crash \
  <...>/SEALS_staging/linux-4.9.1/vmlinux ./kdump.img

crash 7.1.9++
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
WARNING: cannot find NT_PRSTATUS note for cpu: 1
WARNING: cannot find NT_PRSTATUS note for cpu: 2
WARNING: cannot find NT_PRSTATUS note for cpu: 3

 KERNEL: <...>/SEALS_staging/linux-4.9.1/vmlinux
 DUMPFILE: ./kdump.img
 DATE: Thu Jul 13 00:38:39 2017
 UPTIME: 00:00:42
LOAD AVERAGE: 0.00, 0.00, 0.00
 TASKS: 56
 NODENAME: (none)
 RELEASE: 4.9.1-crk
 VERSION: #2 SMP Wed Jul 12 19:41:08 IST 2017
 MACHINE: armv7l (unknown Mhz)
 PANIC: "sysrq: SysRq : Trigger a crash"
 PID: 735
 COMMAND: "echo"
 TASK: 9f6af900 [THREAD_INFO: 9ee48000]
 CPU: 0

crash> ps
 0 0 0 80a05c00 RU 0.0 0 0 [swapper/0]
> 0 0 1 9f4ab700 RU 0.0 0 0 [swapper/1]
> 0 0 2 9f4abc80 RU 0.0 0 0 [swapper/2]
> 0 0 3 9f4ac200 RU 0.0 0 0 [swapper/3]
 1 0 0 9f4a8000 IN 0.1 3344 1500 init
722 2 0 9f6ac200 IN 0.0 0 0 [ext4-rsv-conver]
728 1 0 9f6ab180 IN 0.1 3348 1672 sh
> 735 728 0 9f6af900 RU 0.1 3344 1080 echo
crash> bt
PID: 735 TASK: 9f6af900 CPU: 0 COMMAND: "echo"
 #0 [<804060d8>] (sysrq_handle_crash) from [<804065bc>]
 #1 [<804065bc>] (__handle_sysrq) from [<80406ab8>]
 #2 [<80406ab8>] (write_sysrq_trigger) from [<80278588>]
 #3 [<80278588>] (proc_reg_write) from [<802235c4>]
 #4 [<802235c4>] (__vfs_write) from [<80224098>]
 #5 [<80224098>] (vfs_write) from [<80224d30>]
 #6 [<80224d30>] (sys_write) from [<801074a0>]
 pc : [<76e8d7ec>] lr : [<0000f9dc>] psr: 60000010
 sp : 7ebdcc7c ip : 00000000 fp : 00000000
 r10: 0010286c r9 : 7ebdce68 r8 : 00000020
 r7 : 00000004 r6 : 00103008 r5 : 00000001 r4 : 00102e2c
 r3 : 00000000 r2 : 00000002 r1 : 00103008 r0 : 00000001
 Flags: nZCv IRQs on FIQs on Mode USER_32 ISA ARM

And so on …

Another thing we can do is use gdb – to a limited extent – to analyse the dump file:

From [1]:

Before analyzing the dump image, you should reboot into a stable kernel.

You can do limited analysis using GDB on the dump file copied out of
/proc/vmcore. Use the debug vmlinux built with -g and run the following
  gdb vmlinux <dump-file>

Stack trace for the task on processor 0, register display, and memory
display work fine.

Also, [3] is an excellent whitepaper on using crash. Do read it.

All right, hope that helps!

Interesting Numbers

This article delves into looking up Interesting Numbers within (as of now) the following sections:

  • Networking
    • Numbers (with sheet screenshot)
    • Mitigation / Solutions
    • Resources
  • SLOCs – Source Lines Of Code
    • Cars
    • OS’s
  • Powers of 2  [Edit: 09 Jun 2015]



  • In general, one requires 1 MHz CPU power to drive 1 Mbps of data (or put another way, 1 CPU cycle per bit of data)
  • Given the (heavy legacy baggage) fact that the standard Ethernet MTU (Maximum Transmission Unit) size is typically 1500 bytes:
    • A 10 Gbps network link running at wire speed, will require to transfer over 800,000 packets per second
    • The table below enumerates the story for differing Ethernet packet sizes and wire rate to be maintained

Screenshot of a sheet describing the relationship between Ethernet frame size and Line Rate to Maintain (click to enlarge)

Screenshot from 2015-05-01 17:28:49
Relationship between Ethernet frame size and network Line Rate

Below Source: “Diving into Linux Networking Stack I”, MJ Schultz

Thus, at a rate of 10 Gbps, for MTU-size packets, we require to sustain a rate of processing approximately 1 packet per microsecond! (and that’s half-duplex, effectively cutting the time down to half for full-duplex)!

How is this possible?

Well, it’s not- not over sustained periods. For one, the interrupt load would be far too high for the processor to effectively handle (leading to’s called “receive livelock”). For another, the IP (and above) protocol stack processing would also be hard put to sustain these rates.

The solution is two-fold:

  • hardware interrupt mitigation is achieved via the NAPI technique (which many modern drivers use as the default processing mode, switching to interrupt mode only when there are no or few packets left to process)
  • Modern hardware NICs and operating systems use high performance offloading techniques (TSO / LRO / GRO). These essentially offload work from the host processor to the hardware NIC, and effectively allow large packet sizes as well.

TSO effectively lets us offload 64KB of data (to the hardware NIC for segmentation and processing). If the host did the usual TCP processing at typical MSS sizes, this works out to approximately (MTU-40)-sized segments ~= 1460 bytes.
Thus, with TSO, we get an ~ > 40X saving (65536/1460 = 44.88) on CPU utilization!


NIC Adapter Time available between packets for MTU-size (1538 bytes) packets Packets per second (pps)
10 Gbps 1,230 ns (1.23 us) 813,008 (~ 0.8 M pps)
40 Gbps 120 ns 8,333,333 (~ 8M pps)
100 Gbps ~ 48 ns ~ 20,833,333 (nearly 21M pps) !

[Update: 25 May 2016]:

See this presentation made by Jasper D Brouer, Principal Kernel Engineer, RedHat at DevConf, Feb 2016 : Kernel network stack challenges at increasing speeds [ODP]



[Update: 09 Aug 2015]:

[Inputs below from this LWN article “Improving Linux Networking Performance”, Jan 2015]

Latency-sensitive workloads:
So, we’ve got approx 48 ns to process a packet on a 100 Gbe capable network adapter. Assuming we have a 3 GHz processor, it give us:

  • ~ 200 cycles to process each packet
  • a cache miss will take about 32ns to resolve
  • an SKB on 64-bit requires around 4 cache lines, and they’re written to during packet handling
  • thus, more than 2 cache misses will wipe out the available time budget!
  • what makes it worse: critical code sections require locking
    • the Intel LOCK prefix instruction (used to implement atomic operations at the machine level via cmpxchg or similar) takes ~ 8.25 ns
    • thus a spin_lock/spin_unlock section will take at least 16 ns
  • System call
    • with SELinux and auditing support- ~ 75 ns
    • without SELinux and auditing support- ~ just under 42 ns

“The (Linux) kernel, today, can only forward something between 1M (M=million) and 2M packets per core every second, while some of the bypass alternatives approach a rate of 15M packets per core per second.” Source: Improving Linux networking performance”, Jon Corbet, Jan 2015.



Presentation slides by Jasper D Brouer, Principal Kernel Engineer, RedHat at DevConf, Feb 2016 : Kernel network stack challenges at increasing speeds [ODP]

Large Segmentation Offload (LSO) on Wikipedia

“Improving Linux networking performance”, LWN, Jon Corbet, Jan 2015

JLS2009: Generic receive offload

Linux and TCP Offload Engines [LWN]

Whitepaper: “Introduction to TCP Offload Engines” (Dell)

Whitepaper: “Boosting Data Transfer with TCP Offload Engine Technology” (Dell, Broadcom, MS; benchmarks displayed here)

“The Ethernet standard assumes it will take roughly 50 microseconds for a signal to reach its destination.” – Source: Basic-Networking-Tutorial

SLOCs – Source Lines Of Code

First, please view this brilliant infographic from the “informationisbeautiful” book (and website).
And, here’s the same numbers in a Google sheet!


Below snippet directly quoted from “This Car Runs on Code”

“The avionics system in the F-22 Raptor, the current U.S. Air Force frontline jet fighter, consists of about 1.7 million lines of software code. The F-35 Joint Strike Fighter, scheduled to become operational in 2010, will require about 5.7 million lines of code to operate its onboard systems. And Boeing’s new 787 Dreamliner, scheduled to be delivered to customers in 2010, requires about 6.5 million lines of software code to operate its avionics and onboard support systems.

These are impressive amounts of software, yet if you bought a premium-class automobile recently, ”it probably contains close to 100 million lines of software code,” says Manfred Broy, a professor of informatics at Technical University, Munich, and a leading expert on software in cars. All that software executes on 70 to 100 microprocessor-based electronic control units (ECUs) networked throughout the body of your car.


Edit: 04 jan 2017
“Car Software: 100M Lines of Code and Counting”
– Article on LinkedIn.


Operating Systems

Source: Wikipedia article on SLOCs

… According to Vincent Maraia,[1] the SLOC values for various operating systems in Microsoft‘s Windows NT product line are as follows:

Year Operating System SLOC (Million)
1993 Windows NT 3.1 4-5[1]
1994 Windows NT 3.5 7–8[1]
1996 Windows NT 4.0 11–12[1]
2000 Windows 2000 more than 29[1]
2001 Windows XP 45[2][3]
2003 Windows Server 2003 50[1]

David A. Wheeler studied the Red Hat distribution of the Linux operating system, and reported that Red Hat Linux version 7.1[4] (released April 2001) contained over 30 million physical SLOC. He also extrapolated that, had it been developed by conventional proprietary means, it would have required about 8,000 man-years of development effort and would have cost over $1 billion (in year 2000 U.S. dollars).

A similar study was later made of Debian GNU/Linux version 2.2 (also known as “Potato”); this operating system was originally released in August 2000. This study found that Debian GNU/Linux 2.2 included over 55 million SLOC, and if developed in a conventional proprietary way would have required 14,005 man-years and cost $1.9 billion USD to develop. Later runs of the tools used report that the following release of Debian had 104 million SLOC, and as of year 2005, the newest release is going to include over 213 million SLOC.

One can find figures of major operating systems (the various Windows versions have been presented in a table above).

Year Operating System SLOC (Million)
2000 Debian 2.2 55–59[5][6]
2002 Debian 3.0 104[6]
2005 Debian 3.1 215[6]
2007 Debian 4.0 283[6]
2009 Debian 5.0 324[6]
2012 Debian 7.0 419[7]
2009 OpenSolaris 9.7
FreeBSD 8.8
2005 Mac OS X 10.4 86[8][n 1]
2001 Linux kernel 2.4.2 2.4[4]
2003 Linux kernel 2.6.0 5.2
2009 Linux kernel 2.6.29 11.0
2009 Linux kernel 2.6.32 12.6[9]
2010 Linux kernel 2.6.35 13.5[10]
2012 Linux kernel 3.6 15.9[11]

Powers of 2

Often, especially for nerdy programmers, it’s a good idea to be familiar with powers of 2. I won’t bore you with the “usual” ones (do it yourself IOW 🙂 ).

^2 Quick Summary:

1000 kB kilobyte
10002 MB megabyte
10003 GB gigabyte
10004 TB terabyte
10005 PB petabyte
10006 EB exabyte
10007 ZB zettabyte
10008 YB yottabyte
1024 KiB kibibyte KB kilobyte
10242 MiB mebibyte MB megabyte
10243 GiB gibibyte GB gigabyte
10244 TiB tebibyte
10245 PiB pebibyte
10246 EiB exbibyte
10247 ZiB zebibyte
10248 YiB yobibyte

For example, on an x86_64 running the Linux OS (kernel ver >= 2.6.x), the memory management layer divides the 64-bit process VAS (Virtual Address Space) into two regions:

  • a 128 TB region at the low end for Userland (this includes the text, data, library/memory mapping and stack segments)
  • a 128 TB region at the upper end for kernel VAS (the kernel segment)

How large is the entire VAS?
It’s 2^64 of course, which is 18,446,744,073,709,551,616 bytes !
Wow. What the heck’s that, you ask??
Ok easier: it’s 16 EB (exabytes)    🙂
(see the Summary Table below too).

From the Wikipedia page on Powers of 2 :

The first 96 powers of two
(sequence A000079 in OEIS)

20 = 1 216 = 65,536 232 = 4,294,967,296 248 = 281,474,976,710,656 264 = 18,446,744,073,709,551,616 280 = 1,208,925,819,614,629,174,706,176
21 = 2 217 = 131,072 233 = 8,589,934,592 249 = 562,949,953,421,312 265 = 36,893,488,147,419,103,232 281 = 2,417,851,639,229,258,349,412,352
22 = 4 218 = 262,144 234 = 17,179,869,184 250 = 1,125,899,906,842,624 266 = 73,786,976,294,838,206,464 282 = 4,835,703,278,458,516,698,824,704
23 = 8 219 = 524,288 235 = 34,359,738,368 251 = 2,251,799,813,685,248 267 = 147,573,952,589,676,412,928 283 = 9,671,406,556,917,033,397,649,408
24 = 16 220 = 1,048,576 236 = 68,719,476,736 252 = 4,503,599,627,370,496 268 = 295,147,905,179,352,825,856 284 = 19,342,813,113,834,066,795,298,816
25 = 32 221 = 2,097,152 237 = 137,438,953,472 253 = 9,007,199,254,740,992 269 = 590,295,810,358,705,651,712 285 = 38,685,626,227,668,133,590,597,632
26 = 64 222 = 4,194,304 238 = 274,877,906,944 254 = 18,014,398,509,481,984 270 = 1,180,591,620,717,411,303,424 286 = 77,371,252,455,336,267,181,195,264
27 = 128 223 = 8,388,608 239 = 549,755,813,888 255 = 36,028,797,018,963,968 271 = 2,361,183,241,434,822,606,848 287 = 154,742,504,910,672,534,362,390,528
28 = 256 224 = 16,777,216 240 = 1,099,511,627,776 256 = 72,057,594,037,927,936 272 = 4,722,366,482,869,645,213,696 288 = 309,485,009,821,345,068,724,781,056
29 = 512 225 = 33,554,432 241 = 2,199,023,255,552 257 = 144,115,188,075,855,872 273 = 9,444,732,965,739,290,427,392 289 = 618,970,019,642,690,137,449,562,112
210 = 1,024 226 = 67,108,864 242 = 4,398,046,511,104 258 = 288,230,376,151,711,744 274 = 18,889,465,931,478,580,854,784 290 = 1,237,940,039,285,380,274,899,124,224
211 = 2,048 227 = 134,217,728 243 = 8,796,093,022,208 259 = 576,460,752,303,423,488 275 = 37,778,931,862,957,161,709,568 291 = 2,475,880,078,570,760,549,798,248,448
212 = 4,096 228 = 268,435,456 244 = 17,592,186,044,416 260 = 1,152,921,504,606,846,976 276 = 75,557,863,725,914,323,419,136 292 = 4,951,760,157,141,521,099,596,496,896
213 = 8,192 229 = 536,870,912 245 = 35,184,372,088,832 261 = 2,305,843,009,213,693,952 277 = 151,115,727,451,828,646,838,272 293 = 9,903,520,314,283,042,199,192,993,792
214 = 16,384 230 = 1,073,741,824 246 = 70,368,744,177,664 262 = 4,611,686,018,427,387,904 278 = 302,231,454,903,657,293,676,544 294 = 19,807,040,628,566,084,398,385,987,584
215 = 32,768 231 = 2,147,483,648 247 = 140,737,488,355,328 263 = 9,223,372,036,854,775,808 279 = 604,462,909,807,314,58

Some selected powers of two

28 = 256
The number of values represented by the 8 bits in a byte, more specifically termed as an octet. (The term byte is often defined as a collection of bits rather than the strict definition of an 8-bit quantity, as demonstrated by the term kilobyte.)
210 = 1,024
The binary approximation of the kilo-, or 1,000 multiplier, which causes a change of prefix. For example: 1,024 bytes = 1 kilobyte (or kibibyte).
This number has no special significance to computers, but is important to humans because we make use of powers of ten.
212 = 4,096
The hardware page size of Intel x86 processor.
216 = 65,536
The number of distinct values representable in a single word on a 16-bit processor, such as the original x86 processors.[4]
The maximum range of a short integer variable in the C#, and Java programming languages. The maximum range of a Word or Smallint variable in the Pascal programming language.
220 = 1,048,576
The binary approximation of the mega-, or 1,000,000 multiplier, which causes a change of prefix. For example: 1,048,576 bytes = 1 megabyte (or mibibyte).
This number has no special significance to computers, but is important to humans because we make use of powers of ten.
224 = 16,777,216
The number of unique colors that can be displayed in truecolor, which is used by common computer monitors.
This number is the result of using the three-channel RGB system, with 8 bits for each channel, or 24 bits in total.
230 = 1,073,741,824
The binary approximation of the giga-, or 1,000,000,000 multiplier, which causes a change of prefix. For example, 1,073,741,824 bytes = 1 gigabyte (or gibibyte).
This number has no special significance to computers, but is important to humans because we make use of powers of ten.
231 = 2,147,483,648
The number of non-negative values for a signed 32-bit integer. Since Unix time is measured in seconds since January 1, 1970, it will run out at 2,147,483,647 seconds or 03:14:07 UTC on Tuesday, 19 January 2038 on 32-bit computers running Unix, a problem known as the year 2038 problem.
232 = 4,294,967,296
The number of distinct values representable in a single word on a 32-bit processor. Or, the number of values representable in a doubleword on a 16-bit processor, such as the original x86 processors.[4]
The range of an int variable in the Java and C# programming languages.
The range of a Cardinal or Integer variable in the Pascal programming language.
The minimum range of a long integer variable in the C and C++ programming languages.
The total number of IP addresses under IPv4. Although this is a seemingly large number, IPv4 address exhaustion is imminent.
240 = 1,099,511,627,776
The binary approximation of the tera-, or 1,000,000,000,000 multiplier, which causes a change of prefix. For example, 1,099,511,627,776 bytes = 1 terabyte (or tebibyte).
This number has no special significance to computers, but is important to humans because we make use of powers of ten.
250 = 1,125,899,906,842,624
The binary approximation of the peta-, or 1,000,000,000,000,000 multiplier. 1,125,899,906,842,624 bytes = 1 petabyte (or pebibyte).
260 = 1,152,921,504,606,846,976
The binary approximation of the exa-, or 1,000,000,000,000,000,000 multiplier. 1,152,921,504,606,846,976 bytes = 1 exabyte (or exbibyte).
264 = 18,446,744,073,709,551,616
The number of distinct values representable in a single word on a 64-bit processor. Or, the number of values representable in a doubleword on a 32-bit processor. Or, the number of values representable in a quadword on a 16-bit processor, such as the original x86 processors.[4]
The range of a long variable in the Java and C# programming languages.
The range of a Int64 or QWord variable in the Pascal programming language.
The total number of IPv6 addresses generally given to a single LAN or subnet.
One more than the number of grains of rice on a chessboard, according to the old story, where the first square contains one grain of rice and each succeeding square twice as many as the previous square. For this reason the number 264 – 1 is known as the “chess number”.
270 = 1,180,591,620,717,411,303,424
The binary approximation of yotta-, or 1,000,000,000,000,000,000,000 multiplier, which causes a change of prefix. For example, 1,180,591,620,717,411,303,424 bytes = 1 Yottabyte (or yobibyte).
286 = 77,371,252,455,336,267,181,195,264
286 is conjectured to be the largest power of two not containing a zero.[5]
296 = 79,228,162,514,264,337,593,543,950,336
The total number of IPv6 addresses generally given to a local Internet registry. In CIDR notation, ISPs are given a /32, which means that 128-32=96 bits are available for addresses (as opposed to network designation). Thus, 296 addresses.
2128 = 340,282,366,920,938,463,463,374,607,431,768,211,456
The total number of IP addresses available under IPv6. Also the number of distinct universally unique identifiers (UUIDs).
2333 =
The smallest power of 2 which is greater than a googol (10100).
21024 ≈ 1.7976931348E+308
The maximum number that can fit in an IEEE double-precision floating-point format, and hence the maximum number that can be represented by many programs, for example Microsoft Excel.
257,885,161 = 581,887,266,232,246,442,175,100,…,725,746,141,988,071,724,285,952
One more than the largest known prime number as of 2013. It has more than 17 million digits.[6]

Again, from the Wikipedia page on Terabyte:


Illustrative usage examples

Examples of the use of terabyte to describe data sizes in different fields are:

  • Library data: The U.S. Library of Congress Web Capture team claims that as of March 2014 “the Library has collected about 525 terabytes of web archive data” and that it adds about 5 terabytes per month.[20]
  • Online databases: claims approximately 600 TB of genealogical data with the inclusion of US Census data from 1790 to 1930.[21]
  • Computer hardware: Hitachi introduced the world’s first one terabyte hard disk drive in 2007.[22]
  • Historical Internet traffic: In 1993, total Internet traffic amounted to approximately 100 TB for the year.[23] As of June 2008, Cisco Systems estimated Internet traffic at 160 TB/s (which, assuming to be statistically constant, comes to 5 zettabytes for the year).[24] In other words, the amount of Internet traffic per second in 2008 exceeded all of the Internet traffic in 1993.
  • Social networks: As of May 2009, Yahoo! Groups had “40 terabytes of data to index”.[25]
  • Video: Released in 2009, the 3D animated film Monsters vs. Aliens used 100 TB of storage during development.[26]
  • Usenet: In October 2000, the Deja News Usenet archive had stored over 500 million Usenet messages which used 1.5 TB of storage.[27]
  • Encyclopedia: In January 2010, the database of Wikipedia consists of a 5.87 terabyte SQL dataset.[28]
  • Climate science: In 2010, the German Climate Computing Centre (DKRZ) was generating 10000 TB of data per year, from a supercomputer with a 20 TB memory and 7000 TB disk space.[29]
  • Audio: One terabyte of audio recorded at CD quality contains approx. 2000 hours of audio. Additionally, one terabyte of compressed audio recorded at 128 kB/s contains approx. 17,000 hours of audio.
  • The Hubble Space Telescope has collected more than 45 terabytes of data in its first 20 years of observations.[30]
  • The IBM computer Watson, against which Jeopardy! contestants competed in February 2011, has 16 terabytes of RAM.[31]


A Header of Convenience

Over the years, we tend to collect little snippets of code and routines that we use, like, refine and reuse.

I’ve done so, for (mostly) user-space and kernel programming on the 2.6 / 3.x Linux kernel. Feel free to use it. Please do get back with any bugs you find, suggestions, etc.

License: GPL / LGPL

Click here to view the code!

There are macros / functions to:

  • make debug prints along with function name and line# info (via the usual printk() or trace_printk()) – (only if DEBUG mode is On)
    • [EDIT] : rate-limiting turned Off by default (else we risk missing some prints)
      -will preferably use rate-limited printk’s 
  • dump the kernel-mode stack
  • print the current context (process or interrupt along with flags in the form that ftrace uses)
  • a simple assert() macro (!)
  • a cpu-intensive DELAY_LOOP (useful for test rigs that must spin on the processor)
  • an equivalent to usermode sleep functionality
  • a function to calculate the time delta given two timestamps (timeval structs)
  • convert decimal to binary, and
  • a few more.

Whew 🙂

Edit: removed the header listing inline here; it’s far more convenient to just view it online here (on

Linux Tools for the serious Systems Programmer

Tools that help. When developing code (systems programming) on the Linux OS: a compilation by Kaiwan N Billimoria :


Tool Type


ARM support (on target)?



find/grep Source Code browsers Y -busybox Source; reqd on host dev system only
cscope NA
ctags NA
Source Code static analysis. FOSS NA
splint (prev LCLint) NA
Coverity / Klocwork / etc Commercial ?
strace Application trace Y
ltrace Y
[f]printf Application – simple instrumentation Y Code-based
My “MSG” and other macros  Header file  Useful Y
gdb Source-level debuggers Y Usually on host dev system only
ddd ?
Insight ?
ps Process state Y -busybox
pgrep, pkill Y -busybox
pstree ?
top Y
pidstat ?
procfs System state / performance tuning
vmstat generic Y
dstat  Tip:
dstat –time –top-io-adv –top-cpu –top-mem 5
(every 5s)
iotop, iostat, ionice disk IO Y buildroot
sar ? package: sysstat
lsof ?
Valgrind Memory Checkers and analysis Considered the best OSS memory checker suite Y -ver 3.7 on buildroot; only for Cortex A8/A9 && kernel ver < 3.x
Electric Fence ?
Dmalloc Y
mtrace Y
iftop Network monitoring, etc ?
iptraf ?
netstat Y -netstat-nat
ethtool Y
tcpdump Y
wireshark Ethernet, USB sniffer N GUI- on host
 Also, BTW, here’s a nice link :

16 commands to check hardware information on Linux


printk Kernel – simple instrumentation Y Kernel code-based debugging techniques [note: recommend you use debugfs and not procfs for debug-related stuff].
My “MSG” and other macros  Header file  Useful Y
procfs Kernel Analysis & Tuning w/ sysctl Y
ioctl Y
debugfs Recommended Y
Magic SysRq During development / system lockups Y
gdb with proc/kcore Kernel lookup Y
KGDB Kernel development debugging Y
KProbes, JProbes Non-intrusive kernel hooks  V useful; for learning / debugging Y
SystemTap Kernel scriptable tracing/probing instrumentation tool  (AFAIK, layered on Kprobes) ?
Ftrace Kernel trace framework Y
OProfile Kernel and App profiler ?
LTTng Linux Trace Toolkit next gen – Instrumentation ?
Kdump, Kexec and Crash Crash dump and analysis Y -kexec crash -on host
Perf / Perfmon2 HW-based performance monitoring Y (limited?) Arch-independent
cpufreq Power Management
CGroups Scheduler Y
Proc – sysctl Y
chrt Y buildroot
cpuset, taskset Y buildroot
sparse Kernel-space static code analysis NA -src Reqd on dev host only
QEMU Virtualization, open source Y
VirtualBox ?
Tip: Using buildroot,enable the packages/features you want for embedded!
Kaiwan N Billimoria, kaiwanTECH.

A quick-ref pic from Brendan Gregg’s fantastic site on Linux Performance tools (and Linux performance monitoring in general):

Click to zoom


Linux Kernel Online and Book Resources collection

Working on the Linux kernel is challenging stuff, no doubt about that. Thus, the hunt for good technical articles, documentation, tips and gotchas on the subject quickly becomes part and parcel of the kernel developer’s work. This page is an attempt to collate and aggregate quality online (and offline – book lists) about the Linux kernel. It’s certainly not  the first and won’t be the last such attempt. Nevertheless, hope you find it useful! Kindly comment and let me know what I inadvertently missed out. Here goes:

  • Perhaps the best all-in-one or starting point website to begin digging up practical (and theoretical) information on the Linux kernel: 

The Wikipedia “Portal:Linux” page linuxportal Continue reading Linux Kernel Online and Book Resources collection

Exploring Linux procfs via shell scripts

Very often, while working on a Linux project, we’d like information about the system we’re working on: both at a global scope and a local (process) scope.

Have we not wondered: is there a quick way to query which kernel version am using, what interrupts are enabled & hit, what my processor(s) are, details about kernel subsystems, memory usage, file, network, IPC usage, etc etc. Linux’s proc filesystem makes this easy.

So what exactly is the proc filesystem all about?

Essentially, some quick salient points about the proc filesystem:

  • it’s a RAM-based filesystem (think ramdisk; yup, it’s volatile)
  • it’s a kernel feature, not userspace – proc is a filesystem supported by the Linux kernel VFS
  • it serves two primary purposes
    • proc serves as a “view” deep into the kernel internals; we can see details about hardware and software subsystems that userspace otherwise would have no access to (no syscalls)
    • certain “files” under proc, typically anchored under /proc/sys, can be written into: these basically are the “tuning knobs” of the Linux kernel. Sysads, developers, apps, etc exploit this feature
  • proc is mounted on start-up under /proc
  • a quick peek under /proc will show you several “files” and “folders”. These are pseudo-entries in the sense that they exist only in RAM while power is applied. The “folders” that are numbers are in fact the PID of each process that’s alive when you typed ‘ls’! it’s a snapshot of the system at that moment in time..
  • in fact, the name “proc” suggests “process”

At this point, and if you’re not really familiar with this stuff, I’d urge you to peek around /proc on your Linux box, cat-ting stuff as you go. (Also, lest i forget, it’s better to run as root (sudo /bin/bash) so that we don’t get annoying ‘permission denied’ messages). Of course, be careful when you run as root!!!

For example, to get one started off:

Continue reading Exploring Linux procfs via shell scripts

kmalloc and vmalloc : Linux kernel memory allocation API Limits

The Intent

To determine how large a memory allocation can be made from within the kernel, via the “usual suspects” – the kmalloc and vmalloc kernel memory allocation APIs, in a single call.

Lets answer this question using two approaches: one, reading the source, and two, trying it out empirically on the system.
(Kernel source from kernel ver 3.0.2; tried out on kernel ver 2.6.35 on an x86 PC and on the (ARM) BeagleBoard).

Quick Summary

For the impatient:

The upper limit (number of bytes that can be allocated in a single kmalloc request), is a function of:

  • the processor – really, the page size – and
  • the number of buddy system freelists (MAX_ORDER).

On both x86 and ARM, with a standard page size of 4 Kb and MAX_ORDER of 11, the kmalloc upper limit is 4 MB!

The vmalloc upper limit is, in theory, the amount of physical RAM on the system.
In practice, the kernel allocates an architecture (cpu) specific “range” of virtual memory for the purpose of vmalloc: from VMALLOC_START to VMALLOC_END.

In practice, it’s usually a lot less. A useful comment by ugoren points out that:
” in 32bit systems, vmalloc is severely limited by its virtual memory area. For a 32bit x86 machine, with 1GB RAM or more, vmalloc is limited to 128MB (for all allocations together, not just for one).

[EDIT/UPDATE #3 : July ’17]
I wrote a simple kernel module (can download the source code, see the link at the end of this article), to test the kmalloc/vmalloc upper limits; the results are what we expect:
for kmalloc, 4 MB is the upper limit with a single call; for vmalloc, it depends on the vmalloc range.

Also, please realize, the actual amount you can acquire at runtime depends on the amount of physically contiguous RAM available at that moment in time; this can and does vary widely.

Finally, what if one require more than 4 MB of physically contiguous memory? That’s pretty much exactly the reason behind CMA – the Contiguous Memory Allocator! Details on CMA and using it are in this excellent LWN article here. Note that CMA was integrated into mainline Linux in v3.17 (05 Oct 2014). Also, the recommended API interface to use CMA is the ‘usual’ DMA [de]alloc APIs (kernel documentation here and here); don’t try and use them directly.

I kmalloc Limit Tests

First, lets check out the limits for kmalloc :

Continue reading kmalloc and vmalloc : Linux kernel memory allocation API Limits

SIGRTMIN or SIGRTMAX – who’s higher up (in signal delivery priority order)?

Linux supports Realtime (RT) signals (part of the Posix IEEE 1003.1b [POSIX.1b Realtime] draft standard). The only other GPOS (General Purpose OS) that does so is Solaris (TODO- verify. Right? Anyone??).

The Linux RT signals can be seen with the usual ‘kill’ command (‘kill -l’; that’s the letter lowercase L).

Continue reading SIGRTMIN or SIGRTMAX – who’s higher up (in signal delivery priority order)?