Inside the Linux Kernel Build Process

Understanding the Linux kernel can be a difficult task: the source code is far too large to simply read through and follow what is happening. Multithreading and preemption add to the complexity of analysis. Even locating the entry point (the first line of code executed upon entry to the kernel) can be challenging.

A simple and useful way to understand the structure of a large binary image is to examine its build components. So let's first try to understand the kernel build system. The kernel build system produces several common files, as well as one or more architecture-specific binary modules. Common files are always built regardless of the architecture. Two of the common files are System.map and vmlinux, which are present in the top-level kernel directory once the build is completed.
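To get a feel for what System.map holds, here is a minimal Python sketch that parses lines of the form "address type name" and looks up a symbol. The sample addresses and the helper names parse_system_map and lookup are made up for this illustration; the System.map from your own build will contain different values.

```python
# Minimal sketch: parse System.map-style text ("address type name" per line).
# The sample content is hypothetical, not taken from a real build.

SAMPLE_SYSTEM_MAP = """\
80008000 T stext
80008000 T _text
8000808c t __create_page_tables
80600000 D init_thread_union
"""


def parse_system_map(text):
    """Return a list of (address, symbol_type, name) tuples."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, sym_type, name = parts
        symbols.append((int(addr, 16), sym_type, name))
    return symbols


def lookup(symbols, name):
    """Return the address of the first symbol called `name`, or None."""
    for addr, _sym_type, sym_name in symbols:
        if sym_name == name:
            return addr
    return None


symbols = parse_system_map(SAMPLE_SYSTEM_MAP)
print(hex(lookup(symbols, "stext")))
```

To try it on a real build, read the System.map file from the top-level kernel directory and pass its contents to parse_system_map.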

Recall the commands from my previous tutorial about kernel compilation:

$ make ARCH=arm CROSS_COMPILE=(path to)/Binary_images/ARM_Cross_Tools/arm-2014.05/bin/arm-none-linux-gnueabi- vexpress_defconfig

This command does not build the kernel; it prepares the kernel source tree for the vexpress ARM platform (ARCH=arm). It generates a default configuration (the dot-config file) based on the defaults found in the vexpress_defconfig file. The desired architecture (ARCH=arm) and the toolchain (CROSS_COMPILE=arm-none-linux-gnueabi-) are specified on the command line. This forces the make utility to use the arm-none-linux-gnueabi- toolchain to build the kernel image and to use the arm-specific branch of the kernel source tree for the architecture-dependent portions of the build. (I am using the CodeSourcery toolchain; you can also use other toolchains such as Linaro, Buildroot, or crosstool-NG.)

The dot-config file is the configuration blueprint for building a Linux kernel image. The output of this configuration exercise is written to a file named .config, located in the top-level Linux source directory, and this file drives the kernel build.

fig.1
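The .config file is plain text: enabled options appear as CONFIG_NAME=value, and disabled ones as a "# CONFIG_NAME is not set" comment. The Python sketch below reads such text into a dictionary. The sample options and the helper name parse_dotconfig are assumptions for illustration only; a real .config has thousands of entries.

```python
# Minimal sketch: read .config-style text into a dict.
# The sample lines are hypothetical and far shorter than a real .config.

SAMPLE_DOTCONFIG = """\
#
# Automatically generated file; DO NOT EDIT.
#
CONFIG_ARM=y
CONFIG_ARCH_VEXPRESS=y
# CONFIG_MODULES is not set
CONFIG_LOG_BUF_SHIFT=14
CONFIG_CMDLINE="console=ttyAMA0"
"""


def parse_dotconfig(text):
    """Map option names to values; options that are 'not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            name = line[len("# "):-len(" is not set")]
            options[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value.strip('"')  # drop quotes around strings
    return options


options = parse_dotconfig(SAMPLE_DOTCONFIG)
for name in sorted(options):
    print(name, "=", options[name])
```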

$ make ARCH=arm help

This lists the available make targets, including the architecture-specific targets for the selected architecture (here, ARM).

$ make ARCH=arm CROSS_COMPILE=(path to)/Binary_images/ARM_Cross_Tools/arm-2014.05/bin/arm-none-linux-gnueabi- menuconfig

menuconfig: update the current configuration using a menu-based program.

$ make ARCH=arm CROSS_COMPILE=(path to)/Binary_images/ARM_Cross_Tools/arm-2014.05/bin/arm-none-linux-gnueabi- zImage

The top-level kernel directory contains a Makefile that drives the complete compilation of the kernel. Each subsystem and subdirectory contains its own Makefile, which compiles the files there and generates object code. The top-level Makefile descends recursively into the subsystems and subdirectories, invoking the corresponding Makefile in each to build the object code, the modules, and finally the kernel images: vmlinux and, from it, zImage.
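Inside those per-directory Makefiles, Kbuild uses lines such as obj-y += foo.o and obj-$(CONFIG_FOO) += foo.o: the option value (y, m, or unset) decides whether an object is built in, built as a module, or skipped, and an entry ending in "/" tells make to descend into that subdirectory. The Python sketch below is only a simplified model of that selection, not how Kbuild is implemented; the Makefile fragment, the option names, and the function classify are all hypothetical.

```python
import re

# Hypothetical Kbuild Makefile fragment and configuration values.
SAMPLE_MAKEFILE = """\
obj-y += core.o sub/
obj-$(CONFIG_FOO) += foo.o
obj-$(CONFIG_BAR) += bar.o
obj-$(CONFIG_BAZ) += baz/
"""
SAMPLE_CONFIG = {"CONFIG_FOO": "y", "CONFIG_BAR": "m"}  # CONFIG_BAZ is unset

# Matches "obj-y", "obj-m" or "obj-$(CONFIG_X)" followed by "+=" or "=".
LINE_RE = re.compile(r"^obj-(y|m|\$\((CONFIG_\w+)\))\s*\+?=\s*(.+)$")


def classify(makefile_text, config):
    """Sort entries into built-in objects, modules, and subdirs to descend into."""
    result = {"builtin": [], "module": [], "subdirs": []}
    for line in makefile_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        literal, option, targets = match.groups()
        # obj-$(CONFIG_X) behaves like obj-y, obj-m, or obj- (ignored).
        state = config.get(option, "") if option else literal
        if state not in ("y", "m"):
            continue  # option disabled: the entry is skipped
        for target in targets.split():
            if target.endswith("/"):
                result["subdirs"].append(target)
            elif state == "y":
                result["builtin"].append(target)
            else:
                result["module"].append(target)
    return result


result = classify(SAMPLE_MAKEFILE, SAMPLE_CONFIG)
print(result)
```

In this model, the "builtin" objects are the ones that would end up in the directory's built-in.o, which is what the final vmlinux link consumes.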

Kconfig: Most subdirectories also have a Kconfig file, written in the Kconfig configuration language. It contains the configuration entries that are read by configuration targets such as make menuconfig to present a menu-like structure. For more detail, have a look at Documentation/kbuild/ in the kernel source tree.

Many architectures and machine types require binary targets specific to the architecture and bootloader in use. One of the more common architecture-specific targets is zImage. In many architectures, this is the default target image that can be loaded and run on the target embedded system. One of the common mistakes that newcomers make is to specify bzImage as the make target. The bzImage target is specific to the x86/PC architecture; the name stands for "big zImage".

At the end of compilation, vmlinux is generated in the top-level kernel directory.

fig.2

The Kernel Proper: vmlinux
The vmlinux file is the actual kernel proper. It is a fully stand-alone, monolithic ELF image. That is, the vmlinux binary contains no unresolved external references.
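Because vmlinux is an ELF image, its first bytes carry the standard ELF identification. The Python sketch below decodes a few of those header fields. The sample header is constructed by hand for the example (it is not read from a real vmlinux), and describe_elf is a hypothetical helper name; the field offsets and the value 40 for ARM come from the ELF specification.

```python
import struct

EM_ARM = 40  # e_machine value for ARM in the ELF specification


def describe_elf(header):
    """Decode a few fields from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None  # not an ELF image
    bits = {1: 32, 2: 64}.get(header[4])         # EI_CLASS
    byte_order = {1: "<", 2: ">"}.get(header[5])  # EI_DATA
    if bits is None or byte_order is None:
        return None
    # e_type and e_machine are two 16-bit fields right after e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "bits": bits,
        "little_endian": byte_order == "<",
        "type": e_type,
        "machine": e_machine,
    }


# Hand-made header: 32-bit, little-endian, executable (2), ARM (40).
SAMPLE_HEADER = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, EM_ARM)

info = describe_elf(SAMPLE_HEADER)
print(info)
```

On a real build you could read the first 20 bytes of vmlinux in binary mode and pass them to describe_elf.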


Component                          Description
---------------------------------  -----------------------------------------------------------------
arch/arm/kernel/head.o             Kernel-architecture-specific startup code.
arch/arm/kernel/init_task.o        Initial thread and task structs required by the kernel.
init/built-in.o                    Main kernel initialization code.
usr/built-in.o                     Built-in initramfs image.
arch/arm/kernel/built-in.o         Architecture-specific kernel code.
arch/arm/mm/built-in.o             Architecture-specific memory-management code.
arch/arm/common/built-in.o         Architecture-specific generic code. Varies by architecture.
arch/arm/mach-ixp4xx/built-in.o    Machine-specific code, usually initialization (the mach-* directory depends on the board).
arch/arm/nwfpe/built-in.o          Architecture-specific floating-point emulation code.
kernel/built-in.o                  Common components of the kernel itself.
mm/built-in.o                      Common components of memory-management code.
fs/built-in.o                      File system code.
ipc/built-in.o                     Interprocess communications, such as SysV IPC.
security/built-in.o                Linux security components.
crypto/built-in.o                  Cryptographic API.
block/built-in.o                   Kernel block layer core code.
arch/arm/lib/lib.a                 Architecture-specific common facilities. Varies by architecture.
lib/lib.a                          Common kernel helper functions.
arch/arm/lib/built-in.o            Architecture-specific helper routines.
lib/built-in.o                     Common library functions.
drivers/built-in.o                 All the built-in drivers. Does not include loadable modules.
sound/built-in.o                   Sound drivers.
firmware/built-in.o                Driver firmware objects.
net/built-in.o                     Linux networking.
.tmp_kallsyms2.o                   Kernel symbol table.


More details about the build can be found in the Documentation directory of the kernel source.

Kernel Kbuild documentation in Linux kernel source tree
linux-4.4.1/Documentation/kbuild/*
linux-4.4.1/Documentation/kbuild/makefiles.txt
linux-4.4.1/Documentation/kbuild/kconfig-language.txt

From the figure above (fig.2) we can see the components that make up the Linux kernel image. One of the common files built for every architecture is the ELF binary named vmlinux. This binary file is the monolithic kernel itself, or what we have been calling the kernel proper. The link stage of vmlinux, visible in the compilation log in the figure below, also shows where to look for the first line of code. In most architectures, it is found in an assembly language source file called head.S or a similarly named file. On ARM, the startup code of the kernel proper is arch/arm/kernel/head.S (the head.o at the top of the component list), while the compressed zImage starts executing in arch/arm/boot/compressed/head.S, which initializes the processor.

[figure: compilation log showing the link stage of vmlinux]

 

First the vmlinux image (the kernel proper) is linked. Following that, a number of additional object modules are processed, including head.o, piggy.o, and the architecture-specific head.S. The figure below shows the image components and their metamorphosis during the build process, leading up to a bootable kernel image.

fig.3

After the vmlinux kernel ELF file has been built, the kernel build system continues to process the targets shown in the figure above (fig.3). The Image object is created from the vmlinux object. Image is basically the vmlinux ELF file converted to a raw binary and stripped of redundant sections (notes and comments) and of any debugging symbols that might have been present.

To see the exact commands in the build output, pass V=1 to make while compiling. The following command is used to create Image:

arm-none-linux-gnueabi-objcopy -O binary -R .comment -S vmlinux arch/arm/boot/Image

The -O option tells objcopy to generate a binary file; the -R option removes the ELF section named .comment; and the -S option strips symbol and relocation information, including debugging symbols.

Once Image is built, a number of small modules are compiled. These include assembly language and C files (head.o, piggy.gzip.o, misc.o, decompress.o, etc.) that perform low-level architecture- and processor-specific tasks.

Once Image is ready, the Image file (the binary kernel image) is compressed using this gzip command:

cat arch/arm/boot/compressed/../Image | gzip -n -f -9 > arch/arm/boot/compressed/piggy.gzip

This creates a new file called piggy.gzip, which is simply a compressed version of the binary kernel Image.
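The effect of this step can be imitated in a few lines of Python. The sketch below compresses a stand-in buffer at level 9 with a zeroed timestamp, which is similar in spirit to gzip -n -9, and then checks that decompression gives back the original bytes. The buffer fake_image and the function make_piggy are made up for the example, and the mtime argument of gzip.compress assumes Python 3.8 or newer.

```python
import gzip


def make_piggy(image_bytes):
    """Compress at level 9 with a zero timestamp, roughly like `gzip -n -9`."""
    return gzip.compress(image_bytes, compresslevel=9, mtime=0)


# Stand-in for the contents of arch/arm/boot/Image (hypothetical data).
fake_image = bytes(range(256)) * 64

piggy = make_piggy(fake_image)
print(len(fake_image), "bytes before,", len(piggy), "bytes after")

# Decompressing the payload must give back the original image.
restored = gzip.decompress(piggy)
print("round trip ok:", restored == fake_image)
```

Zeroing the timestamp means the compressed bytes do not depend on when the build ran, which is the point of passing -n to gzip here.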

Next, an assembly language file called piggy.gzip.S is assembled, which contains a reference to the compressed piggy.gzip. In essence, the binary kernel image is being piggybacked as payload into a low-level assembly language bootstrap loader. This bootstrap loader initializes the processor and required memory regions, decompresses the binary kernel image, and loads it into the proper place in system memory before passing control to it.

fig.4

Bootstrap Loader: The bootstrap loader is concatenated to the kernel image for loading. A bootloader and a bootstrap loader are different things. Many architectures use a bootstrap loader (or second-stage loader) to load the Linux kernel image into memory. Some bootstrap loaders perform checksum verification of the kernel image, and most decompress and relocate the kernel image.

The difference between a bootloader and a bootstrap loader in this context is simple: the bootloader controls the board upon power-up and does not rely on the Linux kernel in any way. In contrast, the bootstrap loader’s primary purpose is to act as the glue between a bare metal bootloader and the Linux kernel. It is the bootstrap loader’s responsibility to provide a proper context for the kernel to run in, as well as perform the necessary steps to decompress and relocate the kernel binary image. It is similar to the concept of a primary and secondary loader found in the PC architecture.

The functions performed by this bootstrap loader include the following:

1) Low-level assembly language processor initialization, which includes support for enabling the processor’s internal instruction and data caches, disabling interrupts, and setting up a C runtime environment. This is embodied in head.o.
2) Decompression and relocation code, embodied in decompress.o and misc.o.
3) Optimized ARM division routines (ARM only), embodied in lib1funcs.o.

ARMed with this basic knowledge, we are ready for the next tutorial, a code walkthrough of how the kernel is initialized. So stay tuned 😉

References:
1) http://free-electrons.com/
2) Embedded Linux Primer
3) Linux Documentation

 

About VinayMahadev

I am passionate about Embedded Linux systems. I believe in "If you want to learn something, read about it. If you want to understand something, write about it. If you want to master something, teach it". Here I am just trying to connect the dots.