ARM Architecture Basics


ARM originally stood for Acorn RISC Machine, after Acorn Computers, the company that started designing the architecture back in 1983.
ARM Holdings’ primary business is selling IP cores, which licensees use to create microcontrollers (MCUs) and CPUs based on those cores.
In this article we will study the ARM7TDMI in detail, since covering every ARM core would be a little too much. The ARM7TDMI is one of the most successful ARM implementations, with hundreds of millions sold, and many later ARM cores build on it.

I. Features

  • RISC (Reduced Instruction Set Computer)
  • High performance, low power and small size
  • Load/store architecture
  • Pipelining
  • Uniform and fixed length instructions
  • ALU and Shifter control
  • Multiple load/store register instructions
  • Coprocessor instruction interface
  • THUMB support (16-bit dense compressed instruction set)
  • 7 Processor Modes


II. Pipelining

ARM7TDMI instructions are executed in 3 stages :

1. Fetch : the instruction is fetched from memory into the pipeline
2. Decode : the instruction is decoded and the registers it needs are identified
3. Execute : the operands are read, the shifter and ALU operate, and the result is written to the destination register

Later cores such as ARM9 add two more stages, Memory access and Write back.
So what is pipelining ? Let’s understand.
The hardware that fetches an instruction would otherwise sit idle during the decode and execute phases of that instruction. This leaves room to start fetching the next instruction before the first one has finished its decode or execute phase.
So in the optimized case, while the first instruction is being executed, the second instruction can be decoded and a third can be fetched. This is what pipelining is. Simple !! :)
The figure below will help you remember it.
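The overlap described above can also be sketched as a tiny simulation. This is only a model of an ideal 3-stage pipeline with no stalls or branches, and the instruction names are placeholders rather than a real program.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def pipeline_schedule(instructions):
    """Return one dict per clock cycle, mapping stage name -> instruction."""
    n_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth  # which instruction occupies this stage
            if 0 <= index < len(instructions):
                row[stage] = instructions[index]
        schedule.append(row)
    return schedule


program = ["ADD", "SUB", "MOV"]  # hypothetical instruction stream
schedule = pipeline_schedule(program)
for cycle, row in enumerate(schedule, start=1):
    print(cycle, row)
```

In the third cycle all three stages are busy at once: ADD is executing, SUB is being decoded and MOV is being fetched.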


III. Processor Modes


  1. USER
  2. Fast Interrupt FIQ
  3. Interrupt IRQ
  4. Supervisor SVC
  5. Abort ABT
  6. Undefined UND
  7. System SYS

Let’s get a quick understanding of these modes:
Most application programs run in USER mode. A program in USER mode cannot access protected system resources; to use them, the mode has to be changed from USER to some other mode by raising an exception.
Modes other than USER are called privileged modes.
Modes other than USER and SYSTEM are called exception modes.
The processor enters an exception mode when the corresponding exception occurs.
Exception modes have a few additional registers of their own, so that the USER state registers are not corrupted when an exception occurs.
SYSTEM mode uses the same registers as USER mode.
[Figure: summary of processor modes]


IV. Registers

ARM7TDMI has 37 registers, each 32 bits wide:

30 – General purpose
5 – SPSR (Saved Program Status Register)
1 – CPSR (Current Program Status Register)
1 – PC (program counter)

General purpose registers :

At most 15 general purpose registers, R0 to R14, are visible in any one mode (for example in USER mode).
R0 to R7 are unbanked registers (i.e. the same physical register in every mode). R8 to R14 are banked registers (i.e. some modes have their own separate copy): FIQ mode banks R8 to R14, and the other exception modes bank only R13 and R14.
The thing to remember is that the contents of banked registers are preserved across a mode change, so there is no need to save their data.
R13 is used as the stack pointer, commonly known as SP.
R14 is used as the link register (LR) to store the return address of an exception or subroutine. With multiple nested levels, the previous return address is pushed onto the stack pointed to by R13, and the most recent one is kept in R14.
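To see how banking adds up to 30 general purpose registers, here is a small sketch. It assumes the ARM7TDMI banking scheme (FIQ has its own R8 to R14, the other exception modes have their own R13 and R14); the `_fiq`, `_irq` style suffixes are just naming for this example.

```python
EXCEPTION_MODES = ["FIQ", "IRQ", "SVC", "ABT", "UND"]
ALL_MODES = ["USR", "SYS"] + EXCEPTION_MODES


def physical_register(mode, n):
    """Name the physical register behind Rn (0..14) in the given mode."""
    if not 0 <= n <= 14:
        raise ValueError("only R0..R14 are modelled here")
    if mode == "FIQ" and n >= 8:
        return f"R{n}_fiq"  # FIQ has private copies of R8..R14
    if mode in EXCEPTION_MODES and n >= 13:
        return f"R{n}_{mode.lower()}"  # other exception modes bank R13, R14
    return f"R{n}"  # shared with USER and SYSTEM


general_purpose = {physical_register(m, n) for m in ALL_MODES for n in range(15)}
total = len(general_purpose) + 1 + 1 + len(EXCEPTION_MODES)  # + PC + CPSR + SPSRs
print(len(general_purpose), total)
```

The set holds 15 shared registers, 7 FIQ copies and 8 copies of R13/R14 for the remaining exception modes; adding the PC, the CPSR and five SPSRs gives the 37 registers listed above.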

Program Counter:

R15 is known as the PC. It holds the address of the instruction being fetched. As each instruction gets fetched, the program counter increments by 4 bytes in ARM state and by 2 bytes in THUMB state.
Due to pipelining, the instruction currently executing is at PC-8 in ARM state and PC-4 in THUMB state.
In ARM state, bits 1 and 0 of the PC are always 0, because instructions are word aligned.
In THUMB state, bit 0 is always 0, because instructions are halfword aligned.
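The PC arithmetic above is easy to check with a couple of helper functions. This is just a sketch; the function names and the sample addresses are made up for the example.

```python
def executing_address(pc, thumb=False):
    """Address of the instruction in Execute, given the value read from PC."""
    size = 2 if thumb else 4  # instruction size in bytes
    return pc - 2 * size      # PC runs two instructions ahead


def aligned_pc(value, thumb=False):
    """Clear the low PC bits that instruction alignment forces to zero."""
    mask = 0xFFFFFFFE if thumb else 0xFFFFFFFC
    return value & mask


print(hex(executing_address(0x8008)))              # ARM state
print(hex(executing_address(0x8004, thumb=True)))  # THUMB state
```

Both calls refer to the instruction at 0x8000: the PC is two instructions ahead in either state, only the instruction size differs.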

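CPSR (Current Program Status Register):

As the name suggests, the CPSR holds the current state of the processor: the condition flags (N, Z, C, V), the interrupt disable bits (I, F), the THUMB state bit (T) and the current mode bits.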
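As a rough illustration, the CPSR can be decoded field by field. The bit positions and mode encodings below follow the ARM7TDMI layout as I understand it (flags in bits 31 to 28, I/F/T in bits 7 to 5, mode in bits 4 to 0); the sample value is arbitrary.

```python
MODE_BITS = {
    0b10000: "USR", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "SVC",
    0b10111: "ABT", 0b11011: "UND", 0b11111: "SYS",
}


def bit(value, n):
    """Return bit n of value as a bool."""
    return bool((value >> n) & 1)


def decode_cpsr(value):
    """Split a 32-bit CPSR value into its named fields."""
    return {
        "N": bit(value, 31), "Z": bit(value, 30),
        "C": bit(value, 29), "V": bit(value, 28),
        "I": bit(value, 7),   # IRQ disabled when set
        "F": bit(value, 6),   # FIQ disabled when set
        "T": bit(value, 5),   # THUMB state when set
        "mode": MODE_BITS.get(value & 0x1F, "invalid"),
    }


state = decode_cpsr(0x600000D3)  # arbitrary sample value
print(state)
```

The sample value decodes to Supervisor mode with both interrupts disabled, ARM state, and the Z and C flags set.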


SPSR (Saved Program Status Register):

Used to store the CPSR when an exception occurs. Each exception mode has its own SPSR. USER mode and SYSTEM mode don’t have an SPSR, as they are not entered through an exception.

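Thumb State :

The THUMB instruction set is a 16-bit subset of the ARM instruction set. In THUMB state most instructions can only use R0 to R7; R8 to R12 are reachable only through a few instructions such as MOV, ADD and CMP.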


[Figure: ARM registers]

V. Exceptions

When the processor enters an exception mode, some registers are automatically switched to that mode’s banked copies. This ensures that the task state is not corrupted by the exception.
When an exception occurs, ARM completes its current instruction, then :

Step 1 : saves the return address in the new mode’s LR (R14)
Step 2 : saves the CPSR in the new mode’s SPSR
Step 3 : changes the mode to the one corresponding to the exception
Step 4 : disables interrupts (IRQ always, FIQ as well on FIQ or reset)
Step 5 : loads the exception vector address into the PC, which leads to the exception handler or ISR

A unique address is predefined for each exception; the address to which the processor is forced to branch is called the exception/interrupt vector.

Exception/Interrupt Vector:


Once the exception has been handled by the exception handler, the mode is changed back to the interrupted mode (typically USER) and the interrupted task is resumed. The handler must restore the state exactly as it was before the exception.
Any modified register must be restored from the handler stack.
CPSR must be restored from its SPSR.
PC must be set back to where execution was interrupted; LR (R14) helps here.
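The entry steps and the restore rules above can be modelled in a few lines. Treat this as a simplified sketch: the vector addresses are the standard ARM7TDMI ones as far as I know, the CPSR is a plain dict instead of a bit field, and the return address is stored as-is, whereas real handlers have to adjust LR by an offset that depends on the exception type.

```python
# exception name -> (mode entered, vector address)
EXCEPTIONS = {
    "undefined": ("UND", 0x04),
    "swi": ("SVC", 0x08),
    "prefetch_abort": ("ABT", 0x0C),
    "data_abort": ("ABT", 0x10),
    "irq": ("IRQ", 0x18),
    "fiq": ("FIQ", 0x1C),
}


class Cpu:
    def __init__(self):
        self.pc = 0x8000
        self.cpsr = {"mode": "USR", "irq_off": False,
                     "fiq_off": False, "thumb": False}
        self.lr = {}    # banked R14, one per exception mode
        self.spsr = {}  # one SPSR per exception mode

    def enter_exception(self, name, return_address):
        mode, vector = EXCEPTIONS[name]
        self.lr[mode] = return_address     # step 1: return address -> LR
        self.spsr[mode] = dict(self.cpsr)  # step 2: CPSR -> SPSR
        self.cpsr["mode"] = mode           # step 3: switch mode
        self.cpsr["thumb"] = False         # handlers start in ARM state
        self.cpsr["irq_off"] = True        # step 4: mask IRQ
        if name == "fiq":
            self.cpsr["fiq_off"] = True    # FIQ entry also masks FIQ
        self.pc = vector                   # step 5: branch to the vector

    def return_from_exception(self):
        mode = self.cpsr["mode"]
        self.pc = self.lr[mode]            # PC restored from LR
        self.cpsr = dict(self.spsr[mode])  # CPSR restored from SPSR


cpu = Cpu()
cpu.enter_exception("irq", return_address=0x8004)
print(hex(cpu.pc), cpu.cpsr["mode"])
cpu.return_from_exception()
print(hex(cpu.pc), cpu.cpsr["mode"])
```

Because the saved CPSR carries the old mode and interrupt masks, restoring it brings back the pre-exception state in a single step.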

If multiple exceptions occur at the same time, they are serviced in order of their priority.
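A sketch of that arbitration is below. The ordering is the ARM7TDMI priority list as I remember it from the reference manual (reset highest, undefined instruction and SWI lowest), so treat it as an assumption and check it against the manual for your core.

```python
# Highest priority first. Undefined instruction and SWI share the lowest
# priority on the real core; they cannot occur together, so the relative
# order of the last two entries here is arbitrary.
PRIORITY = ["reset", "data_abort", "fiq", "irq",
            "prefetch_abort", "undefined", "swi"]


def service_order(pending):
    """Sort simultaneously pending exceptions, highest priority first."""
    return sorted(pending, key=PRIORITY.index)


print(service_order({"irq", "fiq", "data_abort"}))
```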



The core has two main blocks: the data path and the decoder.
The register bank has two read ports, feeding the A bus and the B bus, and one write port fed from the ALU.
Barrel shifter : shifts or rotates the 2nd operand by any number of bits
ALU : performs arithmetic and logic functions
Address register and address incrementer : hold either the PC address or an operand address
Data registers : hold data read from or written to memory
Instruction decoder : decodes machine code into control signals
In a single cycle, data values are read onto buses A and B, and the result from the ALU is written back to the registers.
The ARM7 core has a von Neumann architecture, which means a single 32-bit data bus carries both data and instructions. Later cores such as ARM9 implement a Harvard architecture, which means separate buses for data and instructions.
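To make the barrel shifter and ALU path concrete, here is a small model of one data processing instruction: the second operand goes through the shifter, then both operands go through the ALU. It is only a sketch; flags, carry-out and the special shift-by-zero encodings are left out, and the register values are made up.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def barrel_shift(value, kind, amount):
    """Shift or rotate a 32-bit value by 0..31 bits."""
    value &= MASK
    if kind == "LSL":
        return (value << amount) & MASK
    if kind == "LSR":
        return value >> amount
    if kind == "ASR":
        if value & 0x80000000:
            value -= 1 << 32  # reinterpret as a negative number
        return (value >> amount) & MASK
    if kind == "ROR":
        return ((value >> amount) | (value << (32 - amount))) & MASK
    raise ValueError(f"unknown shift {kind}")


def alu(op, a, b):
    """A few ALU operations on 32-bit operands."""
    if op == "ADD":
        return (a + b) & MASK
    if op == "SUB":
        return (a - b) & MASK
    if op == "AND":
        return a & b
    if op == "ORR":
        return a | b
    raise ValueError(f"unknown operation {op}")


def data_processing(op, rn, rm, kind="LSL", amount=0):
    """Bus A carries Rn directly, bus B carries Rm through the shifter."""
    return alu(op, rn & MASK, barrel_shift(rm, kind, amount))


# Models ADD r0, r1, r2, LSL #2 with r1 = 0x1000 and r2 = 3
print(hex(data_processing("ADD", 0x1000, 3, "LSL", 2)))
```

This is why an array index can be scaled and added to a base address in a single ARM instruction.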

That was a top-level view of the ARM processor; hope this helps.
Please let me know your feedback or concerns in the comment section below.

Saurabh Sengar


* all images used in this blog are from google images search, and I don’t own them


Device Tree Tutorial (ARM)



The Linux kernel needs a complete description of the hardware: which board it is booting on (machine type), which devices it uses along with their addresses (device/bus addresses), their interrupt numbers (irq) and MFP pin configuration (pin muxing/GPIOs), plus board level information such as memory size, kernel command line and so on.

Before device tree, all this information used to be set in a huge cluster of board files. Information like the command line and memory size used to be passed by the bootloader as part of ATAGS through register R2 (ARM), and the machine type used to be set separately in register R1 (ARM).
At that time each kernel build was for only one specific chip and one specific board.

So there was a long pending wish to compile the kernel for all ARM processors, and let the kernel somehow detect its hardware and apply the right drivers as needed just like your PC.
But how? On a PC, the initial registers are hardcoded, and the rest of the information is supplied by the BIOS. But ARM processors don’t have a BIOS.
The solution chosen was device tree, also referred to as Open Firmware (abbreviated OF) or Flattened Device Tree (FDT). This is essentially a data structure in byte code format which contains information that is helpful to the kernel when booting up.

The bootloader now loads two binaries: the kernel image and the DTB.
The DTB is the device tree blob. The bootloader passes the DTB address through R2 instead of ATAGS, and the machine type in R1 is no longer required.

For a one line bookish definition: “A device tree is a tree data structure with nodes that describe the physical devices in a system”.

Currently device tree is supported by ARM, x86, Microblaze, PowerPC, and Sparc architectures.


I. Device Tree Compilation

The device tree compiler and its source code are located at scripts/dtc/.
On ARM, all device tree sources are located at arch/arm/boot/dts/.
The Device Tree Blob (.dtb) is produced by the compiler, and it is the binary that gets loaded by the bootloader and parsed by the kernel at boot time.

$ scripts/dtc/dtc -I dts -O dtb -o /path/my_tree.dtb arch/arm/boot/dts/my_tree.dts

This produces my_tree.dtb.

To recreate the dts from a dtb:

$ scripts/dtc/dtc -I dtb -O dts -o /path/my_tree.dts /path/my_tree.dtb

This produces my_tree.dts.
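If you want to sanity-check a blob without decompiling it, the header can be read directly. The sketch below assumes the flattened device tree header layout from the Devicetree Specification (ten big-endian 32-bit fields, magic 0xd00dfeed); the sample header is built by hand with made-up offsets just to exercise the parser.

```python
import struct

FDT_MAGIC = 0xD00DFEED
HEADER = struct.Struct(">10I")  # ten big-endian 32-bit fields
FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version",
    "boot_cpuid_phys", "size_dt_strings", "size_dt_struct",
)


def parse_fdt_header(blob):
    """Decode the fixed-size header at the start of a DTB."""
    if len(blob) < HEADER.size:
        raise ValueError("too short to be a DTB")
    header = dict(zip(FIELDS, HEADER.unpack_from(blob)))
    if header["magic"] != FDT_MAGIC:
        raise ValueError("bad magic, not a DTB")
    return header


# Hand-made header with hypothetical offsets and sizes, not a real blob.
sample = HEADER.pack(FDT_MAGIC, 72, 56, 64, 40, 17, 16, 0, 8, 8)
header = parse_fdt_header(sample)
print(header["version"], header["totalsize"])
```

For a real file you would pass the bytes read from it, for example `parse_fdt_header(open("/path/my_tree.dtb", "rb").read())`.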


 II. Device Tree Basics


Each module in the device tree is defined by a node, and all its properties are defined under that node. Depending on the hardware, a node can have child nodes and a parent node.
For example, a device connected to an i2c bus has the i2c node as its parent and is one of that node’s children; the i2c node may in turn have the APB bus as its parent, and so on. Everything leads up to the root node, which is the parent of all. (Don’t worry, the example after this section will make it clearer.)
Under the root of the Device Tree, one typically finds the following most common top-level nodes:

  • cpus : its sub-nodes describe each CPU in the system.
  • memory : defines the location and size of the RAM.
  • chosen : defines parameters chosen or defined by the system firmware at boot time. In practice, one of its uses is to pass the kernel command line.
  • aliases : shortcuts to certain nodes.
  • One or more nodes defining the buses in the SoC
  • One or more nodes defining on-board devices


III. Device Tree Structure example

We will use a dummy dts as an example for the explanation:

#include "pxa910.dtsi"

/ {
    compatible = "mrvl,pxa910-dkb", "mrvl,pxa910";

    chosen {
        bootargs = "<boot args here>";
    };

    memory {
        reg = <0x00000000 0x10000000>;
    };

    soc {
        apb@d4000000 {
            uart1: uart@d4017000 {
                status = "okay";
            };

            twsi1: i2c@d4011000 {
                #address-cells = <1>;
                #size-cells = <0>;
                status = "okay";

                /* PMIC on the i2c bus, slave address 0x34 */
                pmic: 88pm860x@34 {
                    compatible = "marvell,88pm860x";
                    reg = <0x34>;
                    interrupts = <4>;
                    interrupt-parent = <&intc>;
                    #interrupt-cells = <1>;
                };
            };
        };
    };
};
Figure 1

Each module is defined within one pair of curly brackets under its node; any sub-modules are defined further inside.

Explaining the above tree, starting from the first line :

#include : includes another file, just like in a C file
.dtsi : device tree source include file. A single dts can include any number of dtsi files, but cannot include another dts file
/ : root node, the device tree structure starts here
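The nesting in Figure 1 can be mirrored with plain nested dictionaries, which makes the parent/child relationships easy to see. This is only an illustration of the structure, with properties left out; it is not how the kernel stores the tree.

```python
# Node names taken from Figure 1; each dict holds the node's children.
tree = {
    "chosen": {},
    "memory": {},
    "soc": {
        "apb@d4000000": {
            "uart@d4017000": {},
            "i2c@d4011000": {"88pm860x@34": {}},
        },
    },
}


def node_paths(children, prefix=""):
    """Yield the full path of every node below the root."""
    for name, sub in children.items():
        path = f"{prefix}/{name}"
        yield path
        yield from node_paths(sub, path)


def parent_of(path):
    """Return the path of a node's parent ('/' is the root)."""
    parent = path.rsplit("/", 1)[0]
    return parent or "/"


paths = list(node_paths(tree))
for path in paths:
    print(path)
```

The PMIC ends up at /soc/apb@d4000000/i2c@d4011000/88pm860x@34, so every bus it sits behind is visible in its path.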

IV. Properties


Data is defined in the dts in the form of properties, which are read by the kernel code. Let’s go through some of the major properties.

compatible :

The top-level compatible property typically defines a compatible string for the board. The strings are listed from most specific first to least specific last. It is used to match against the dt_compat field of the DT_MACHINE structure.
Inside a device or bus node it is the most crucial property, as it is the link between the hardware and its driver. Each node carries a compatible string, and the kernel matches a device driver with its device tree node based only on that string.
The connection between a kernel driver and the “compatible” entries it should be attached to is made by a code segment like the following in the driver’s source code:

static const struct of_device_id dummy_of_match[] = {
	{ .compatible = "marvell,88pm860x", },
	{ /* sentinel: an empty entry terminates the table */ },
};
MODULE_DEVICE_TABLE(of, dummy_of_match);

The above code in the driver matches it to the pmic node of the device tree structure shown in Figure 1.
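The matching idea can be sketched outside the kernel as well. This is a simplification and not the kernel’s actual algorithm: it walks the node’s compatible list from most specific to least specific and returns the first driver whose table contains the string. The driver names and the “vendor,…” strings are made up.

```python
def match_driver(node_compatible, drivers):
    """Return (driver name, matched string), or None if nothing matches.

    node_compatible: the node's compatible strings, most specific first.
    drivers: maps a driver name to the compatible strings in its table.
    """
    for compat in node_compatible:
        for name, table in drivers.items():
            if compat in table:
                return name, compat
    return None


# Hypothetical drivers and their match tables
drivers = {
    "dummy": ["marvell,88pm860x"],
    "generic-pmic": ["vendor,generic-pmic"],
}

print(match_driver(["marvell,88pm860x"], drivers))
print(match_driver(["vendor,new-chip", "vendor,generic-pmic"], drivers))
```

The second call shows why the order matters: no driver knows the specific chip, so the node falls back to its more generic string.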


reg : defines the address for that node/device


#address-cells : indicates how many cells (i.e. 32-bit values) are needed to form the base address part of the reg property


#size-cells : indicates how many cells are needed to form the size part of the reg property


interrupt-controller : a boolean property that indicates that the current node is an interrupt controller


#interrupt-cells : indicates the number of cells in the interrupts property for the interrupts managed by the selected interrupt controller


interrupt-parent : a phandle that points to the interrupt controller for the current node. There is generally a top-level interrupt-parent definition for the main interrupt controller.
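How the cell counts carve up a reg property is easiest to see with numbers. The sketch below assumes the usual convention that a node’s reg is interpreted using the #address-cells and #size-cells of its parent, and that multi-cell values are joined most significant cell first.

```python
def decode_reg(cells, address_cells, size_cells):
    """Split the 32-bit cells of a reg property into (address, size) pairs."""
    stride = address_cells + size_cells
    if stride == 0 or len(cells) % stride:
        raise ValueError("reg length does not match the cell counts")

    def join(parts):
        value = 0
        for cell in parts:  # most significant cell first
            value = (value << 32) | cell
        return value

    entries = []
    for i in range(0, len(cells), stride):
        address = join(cells[i:i + address_cells])
        size = join(cells[i + address_cells:i + stride])
        entries.append((address, size))
    return entries


# pmic node in Figure 1: parent i2c has #address-cells = 1, #size-cells = 0
print(decode_reg([0x34], 1, 0))
# memory node in Figure 1, assuming one address cell and one size cell
print(decode_reg([0x00000000, 0x10000000], 1, 1))
```

With #size-cells set to 0 the whole reg is just an address, which is why the i2c device only lists its slave address.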

The label and node name

First, the label (“pmic”) and the node name (“88pm860x@34”). The label could have been omitted altogether, but the node name should stick to this format (some-name@address). This tells the kernel that the device is named 88pm860x and is connected to its parent bus (i2c in this case) at address 0x34 (the i2c slave address here). pmic is the label, which can be used as a phandle to refer to this node inside the dts.
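Splitting a node name into its two halves takes one line, shown here as a sketch. It assumes the unit address is a single hexadecimal number written without a 0x prefix, which holds for the nodes in Figure 1 but not for every bus.

```python
def split_node_name(name):
    """Split 'some-name@address' into (name, unit address or None)."""
    base, sep, unit = name.partition("@")
    return base, (int(unit, 16) if sep else None)


print(split_node_name("88pm860x@34"))
print(split_node_name("chosen"))
```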


 V. Getting the resources from DTS


Below are a few of the major APIs in the current kernel (4.3) for reading the various properties from the DTS.

of_address_to_resource: Reads the memory address of the device, as defined by its reg property, into a struct resource

irq_of_parse_and_map: Parses the interrupts and interrupt-parent properties and maps the interrupt to a Linux IRQ number, which can then be used to attach the interrupt handler

of_find_property(np, propname, NULL): Finds out whether the property named in argument 2 is present or not

of_property_read_bool: Reads the bool property named in argument 2. As it is a bool property, this just checks whether the property is present or not. Returns true or false

of_get_property: Reads any property named in argument 2

of_property_read_u32: Reads a 32-bit property and stores it in the 3rd argument. Leaves the 3rd argument untouched in case of error

of_property_read_string: Reads a string property

of_match_device: Sanity check that the device matches the node. Highly optional; I don’t see much use for it.


Let me know in the comment section below, or by a personal email to me, if you have any doubts related to device tree.


Saurabh Singh Sengar



Kernel Patch Submission tutorial

Getting your patch accepted into the Linux kernel can be one of the most satisfying achievements for a newbie Linux kernel developer. Until my first patch got into mainline I did not know that this could be as easy a task as it turned out to be.

There are many ways to submit a patch to the Linux community, but I will not discuss all of them as that could be very confusing. I have gone through them all and settled on the one procedure I find the most reliable.

OK, so here is a to-the-point, step-by-step guide to submitting your first Linux kernel patch and having your name in the Linux kernel 🙂

I am assuming you are doing all this on an Ubuntu machine; for other flavors please tweak the commands accordingly.

 I. Tools Set up

    1) Install git : version control system for your Linux kernel

sudo apt-get install git

    2) Install git-email : you will need this to send patches to the Linux community

sudo apt-get install git-email

    3) Configure git : create the file ~/.gitconfig in your home directory (vim ~/.gitconfig) and put the details below in it

[user]
	name = Saurabh Sengar
	email =
[sendemail]
	smtpencryption = tls
	smtpserver =
	smtpuser =
	smtpserverport = 587
	smtppass = XXXXXX

The above example is my gmail configuration for git. Replace my name and email ids with yours, and change the smtp server if you are using an email provider other than gmail, but hey ! who doesn’t have a gmail id ? Keep it simple :). And yes, don’t forget to write your actual gmail login password in place of XXXXXX; that’s right, you heard it correctly 😉

    4) Enable access for less secure apps in gmail : This is a very important step, otherwise gmail will not let you send mail via the git email client. To do this, log in to your gmail id and, in the same browser, enter the url below.

It will show the option to ‘Turn on’ access for less secure apps; select that !

II. Download Linux kernel

Go to your home directory, or any folder where you want to keep your Linux kernel repository, and run the command below

git clone git://

This will take some time, go watch some movie and come back, don’t worry I will be waiting here.

 III. Creating a Patch

Do you have your changes ready ? In case you are not sure what to submit, go to section V to see how to begin creating patches for the Linux kernel.

Say you want to change the file “drivers/staging/iio/iio_simple_dummy_buffer.c”

1) Make your changes in the file and save it

vim drivers/staging/iio/iio_simple_dummy_buffer.c

2) Stage the changes in git

git add drivers/staging/iio/iio_simple_dummy_buffer.c

3) Commit your changes

git commit -s -v

You need to add a subject and a description of the changes made in the patch. Run git log on the file you have changed to see the subject and description format used for that particular file.

Make sure that your sign-off tag (Signed-off-by) is present in the patch; the -s argument of the above command takes care of it, so don’t worry.
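If you like to double check before sending, a few lines of script can confirm that the sign-off line is present. This is a rough sketch: the regular expression only looks for the usual “Signed-off-by: Name <email>” shape, and the commit message, name and address below are made up.

```python
import re

# Matches a line such as "Signed-off-by: Jane Doe <jane@example.com>"
SIGNOFF = re.compile(
    r"^Signed-off-by: \S.* <[^<>\s@]+@[^<>\s@]+>$", re.MULTILINE
)


def has_signoff(message):
    """Return True if the commit message carries a sign-off line."""
    return SIGNOFF.search(message) is not None


# Hypothetical commit message
message = """staging: iio: fix a typo in a comment

Correct a spelling mistake in iio_simple_dummy_buffer.c.

Signed-off-by: Jane Doe <jane@example.com>
"""

print(has_signoff(message))
```

The same check can be run on the text of a generated patch file, since the sign-off line is copied into it unchanged.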

4) Whom to send : Get the maintainers’ email ids by running the get_maintainer script on your patch or on the file you changed

./script/ -f  drivers/staging/iio/iio_simple_dummy_buffer.c

./script/ --patch /tmp/0001-test.patch

Copy these mail ids.

The next section is about sending your patch to the Linux community, but before that make sure your changes don’t have any coding style issues. This can be checked with the script ./script/

./script/ -f drivers/staging/iio/iio_simple_dummy_buffer.c

./script/ --patch /tmp/0001-test.patch


IV. Sending The Patch

This is easy

git send-email --annotate HEAD^1

It will prompt for the email ids to send to; paste the email ids copied in the “Whom to send” step. It may also ask for ‘in reply to’; for a first patch you can skip it by pressing enter, as this option is for replying to an existing mail chain. Then it will ask for final confirmation: press ‘y’ … hit enter …. and YOU ARE DONE !! 🙂

After some 10-15 minutes you can see your patch on the site, good start :). And if everything goes fine, your patch will soon be merged into mainline. Depending on the complexity of the patch and the availability of maintainers, feedback can arrive anywhere from 1 minute to 20 days later. In case you don’t get any feedback, ping them again, but give the Linux kernel maintainers at least around 20 days to respond.

V. What to submit

I would suggest starting your patch submissions in the drivers/staging directory, until you are confident about how to send patches.

You can run a static analyzer tool and fix the errors it reports, but keep in mind that these tools can produce many false positives too, so be careful !

Run on kernel, and fix the errors/warnings reported by it, these are many.

Run coccicheck:

After that you can grep for “TODO” and “FIXME” in the kernel code and implement the missing logic; be careful here.

And if you are confident in some module or framework of the Linux kernel, go review it, and in case you feel you can make it better, go ahead and make the changes.

And as a last piece of advice for interacting with Linux kernel developers on lkml: be patient and be humble 🙂

Get your hands dirty and let me know in the comments section if you need any help.


Hope this helps,

Saurabh Singh Sengar
