Programming from the Ground up

28 minute read

Title: Programming from the Ground up

Author: Jonathan Bartlett

Chapter 2: Computer Architecture

Modern computer architecture is based on Von Neumann architecture where the computer is made up of the memory and CPU.

Structure of Computer Memory

The computer memory is made up of a sequence fixed-size storage and each storage has a unique address for it.

  • A memory address refers to a single byte storage in main memory.
  • Aside from program data, normal programs also reside in memory
  • Number attached to each storage is called address (aka Pointer)
  • The size of each storage is a byte: on x86 a byte is number between 0 and 255

CPU

Main responsibility: read instructions from memory one at a time and execute them.

Fetch-execute cycle:

  • Program Counter:
    • Tell the CPU where to fetch the next instruction from. Holds memory address of the next instruction.
    • Instructions are stored in binary form (numbers)
  • Instruction decoder:
    • Decode the binary form of instruction to actual instruction (add/sub/mult/…)
    • Computer instruction includes both the actual instruction and a list of memory involved.
  • Data bus:
    • From the list of memory address, fetch the actual memory that will be used.
    • Actual wire that connects the CPU and memory
  • Registers:
    • High speed memory locations (store bytes)
    • Types of registers:
      • General Purpose: perform main action (add/sub/mult/…)
      • Special purpose: TODO
    • Registers will be used to store temporary data during computation
    • Due to lack of registers, data is brought from main memory into registers, CPU will use the data in registers to perform computation and once done, the data is put back into main memory.
  • Arithmetic and Logic Unit (ALU):
    • where instruction is actually executed -> result of execution will be written to register or main memory.

Computer word size: the size of a register (on x86_64 => 8 bytes word size)

  • An address is also word size => store an address in the register

Note:

  • A CPU cannot tell the difference between a number, address, character. What type of data it is depends on the context => are you treating the data as a memory address?
  • Anything variable length in a struct is stored else where and the struct will have a pointer to it.

Data Addressing Methods

  1. Immediate Mode:
    • Data to access is in the instruction itself (skip dereferencing an address)
  2. Register addressing mode:
    • Instruction contains a register that stores the data.
    • Treat another register as a memory storage.
  3. Direct addressing mode:
    • Instruction contains a memory address that stores the data
  4. Indexed addressing mode:
    • Instruction contains a memory address and an index register that stores the offset of an address.
    • a multiplier can be specified in the instruction to state the multiplier to the offset
    • ie: memory address in instruction is 2002 and the index register has 3 and the multiplier is 4 then the memory address is 2002 + 3 * 4 = 2014
  5. Indirect address mode:
    • Instruction contains a register that contains a memory address
  6. Base pointer address mode:
    • Instruction contains a register that contains a memory address and the instruction can include a offset to add to the memory address in the register memory address value.

Chapter 3: Your First Programs

  • The source code is in assembly language - human readable form
  • Use as exit.s -o exit.o to compile the assembly code into machine code - the output of this is known as an object file
    • Each assembly file will generate a single a single object file and we will need to link them into a single executable
  • Use ld exit.o -o exit to link the object file into a executable

Outline of an Assembly Language Program

Comments: anything that starts with #

Assembler directives or pseudo-operations: anything that starts with period (.).

  • Instruction (line) that will not be directly directly translated to machine instruction.
  • Used by the assembler itself

Section (.section):

  • .section: breaks the program into sections
  • .section .data: starts the data section - where you should list all the memory storage you will need for data.
    • klement: this should not include the stack and heap but instead the static duration storage? Stack and heap memory requirements are not constant at compile time?
  • .section .text: the text section of a program - where the program instruction lives

Global (.global):

  • .global SYMBOL Indicate to the assembler that it should not discard the symbol after assembly.
  • .global _start: _start is a special symbol that marks the location of the start of the program
    • Needs _start so that the symbol will still be present after assembly (in object file)
    • Without _start being present, the computer will not know which instruction to start at
    • Without a _start symbol the linker will throw “ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000”

Symbols:

  • A variable/text that will be replaced by something else during assembly/linking
  • label: defines the value of a symbol
  • Syntax: _start: - a symbol followed by a colon
  • When assembler is assembling a program, it will need to assign each data value and instruction (line) an address
    • Labels tell the assembler that the symbol value is where the data element or address will be (allow you to change position of the label)
    • klement: does this mean all symbols will be replaced with the address of the value and not the value itself?

movl $1, %eax

  • transfers the number 1 into the %eax register
    • Instructions in assembly can have operands (arguments), movl has the source (first operand) and destination (second operand). Some operand might be hard coded (require the value of operand to be in a prefixed register).
    • $ (dollar) sign in front means that immediate mode addressing is used (literal). This means the operand for the instruction is the value 1
      • Without dollar sign it would be direct addressing => read the value at address 1
    • Setting the register %eax to 1 value will make the kernel execute system call 1 (exit) when the program interrupt and hand over to the kernel

movl $0, %ebx

  • %ebx register stores the exit code for the kernel to read.
  • Loading value to %ebx does not do anything but with the system call 1 (exit), the kernel will read whatever value in that register to set the exit code.

int $0x80

  • send an interrupt to the program => transfer control from program to Linux (kernel)
  • int => interrupt.

System call:

  • Operating system features (file io) are accessed through system calls
  • System calls are invoked by issuing int $0x80 and setting %eax with the system call number
  • Some system call might require additional parameters which can be provided by setting certain registers (ie %ebx)

Finding a Maximum Value

This section covers an assembly program that returns the larges number in a list as the program exit code.

  • Use 0 to represent the end of the list

data_items

  • For this program we will need to store the list of number in a memory
  • Use data_items: (label) that refers to the memory location in the next line => the memory location of the first element in the list
    • data_items == address of 3
  • .long X,Y,... tells the assembler to reserve 4 bytes for each element in the list
    • In this example, the assembler will reserve 14 * 4 = 56 bytes and initialize the memory storage with the provided data.
  • Last element of the list 0 to indicate the end of the list (we need some sort of way to signal that the list has ended.)
  • Do not need to mark the data_items label as the symbol is not needed after the assembly is done => only for convenience

Variable:

  • In assembly, a variable is a dedicated storage (can be assigned to a register) for specific purpose
  • For this problem, we need a variable to store for:
    • %ebx: the current maximum number
    • %edi: the index of the current element in the list
    • %eax: the value of the current element.

start:

line 23 movl $0, %edi:

  • Initialized the current element index to 0

line 24 movl data_items(,%edi,4), %eax:

  • movl: move(aka copy) a long (4 bytes) from source to destination
  • data_items is the address of the first element in the list
  • %edi will be 0 at this point => move the element at data_items + 0 => first element
  • 4 is the multiplier to the offset

line 25 movl %eax, %ebx:

  • set the first element to be the largest

start_loop

Responsibility in the loop:

  1. Check if the current value is 0, if so break from the loop
  2. Load the next value in the list
  3. If the next value is larger than the previous max => update the max value variable
  4. Return back to the start of the loop

line 29,30 cmpl $0, %eax => je loop_exit:

  • cmpl: compares both 0 and eax and store the result in %eflags (hardcoded into the instruction - cannot be specified)
    • This comparison is the three way comparison <=>
    • %eflags is the status register
  • je end_loop: jump to the end_loop label if the result %eflags is compared equal
    • Other conditional jumps jg (if second is greater)…

line 31, 32 incl %edi => movl data_items(,%edi,4), %eax

  • incl: increment the value at register %edi
  • movl: copy the values at data_items + %edi*4 into %eax => %eax will have the new value that %edi points to

line 33, 34 cmpl %ebx, %eax => jle start_loop

  • cmpl: three-way compare %ebx%eax
  • jle start_loop: if the current element (%eax) is smaller than the current max (%ebx) move back to the start of the loop
    • Immediately move to start of loop as it does not need to perform anything if the current element is smaller

line 36, 37 movl %eax, %ebx => jmp start_loop

  • movl: at this point we know that the current element is the maximum => copy the current element to the current maximum register
  • jmp start_loop: jump back to the start of the loop

end_loop

line 42, 43 movl $1, %eax

  • movl: setting %eax to indicate to the kernel that we would like to exit (1). Program end and the current element value does not matter anymore
    • %ebx already populated with the current maximum => exit status will be this
  • int 0x80: wake up the kernel

Addressing Modes

General form of memory address format:

ADDRESS_OR_OFFSET(%BASE_OR_OFFSET,%INDEX,MULTIPLIER)
  • ADDRESS_OR_OFFSET and MULTIPLIER must be constants (labels to constants)
  • The other 2 must be registers
  • All the fields are optional

Final Address:

FINAL_ADDRESS = ADDRESS_OR_OFFSET + % BASE_OR_OFFSET + MULTIPLIER * %INDEX

Direct addressing mode

movl ADDRESS, %eax

  • Having a constant/label as one of the operand means that it is treated as an absolute address
  • This example will load whatever value (long) into %eax

Indexed addressing mode

movl string_start(,%ecx,1), %eax

  • Get the memory address from the ADDRESS_OR_OFFSET part and move it by %INDEX multiply by MULTIPLIER bytes.
  • As the name suggest it is similar to how we access elements in an array
  • This example will load whatever that is at string_start + 1 * %ecx

Indirect addressing mode

movl (%eax), %ebx

  • Load whatever value that is in the address stored in %eax into %ebx

Base pointer addressing mode

movl 4(%eax), %ebx

  • Get the memory address from the register %eax and adds it by ADDRESS_OR_OFFSET
  • Treats ADDRESS_OR_OFFSET as the offset as the address is derived from %eax
  • This example will load whatever that is at %eax + 4

Immediate mode

movl $12, %eax

  • Loads a literal into the register
  • Use a dollar ($) sign in front of the literal, without it direct address mode will be used

Register addressing modde

movl %eax,...

  • treats the register as a memory address

Note: every mode can be used as source and destination operand except for immediate mode. Immediate mode can only be used as source (cannot load a new value into a literal).

Storage size

  
______%eax_______
[   |   |%ah|%al]
        ___%ax___
  • movl treats the storage size as a word long
  • movb treats the storage size as a byte
    • cannot be used with a full register (ie %eax) as it does not make sense to move a byte into a word storag
  • To move 2 bytes to a register you will move to the 2 least significant bytes (%ax)
  • To move 1 byte you can move the least significant byte in %al or the most significant byte in %ah
    • Note: %ax is the concat of %ah and %al
  • loading a value into %eax will wipe the values in ax, ah and al and loading to either of the smaller address will corrupt eax

Chapter 4: All About Functions

How Functions Work

Components of a function:

  • function name:
    • A symbol that represents the starting address of the function code - assembly label
  • function parameters:
    • Data to be passed to the function
  • local variable:
    • Data storage that the function but thrown away after the function returns
    • Not accessible by other functions
  • static variables (function static variable):
    • Data storage that the function uses but not thrown away after the function returns
    • Static variable will be reused every time the function is called
  • global variable:
    • Data storage that is managed outside of the function
  • return address:
    • A function parameter that tells the function where to resume after returning.
    • Usually stores the address of the next instruction on the call site.
      • Allow the function to know what should be executed after returning
    • call instruction will pass the return address to the function for you
    • ret instruction allows the flow to be passed back to where the function was called from
    • (klement: this is something new that I did not know previously)
  • return value:
    • Transfer data from the function (callee) back to the caller

Calling Convention: is the way the implementation choose to store and handles the various components of a function.

Assembly-Language Functions using the C Calling convention

Stack:

  • a region of memory
  • Stack is at the very top of address (highest address)
  • Grow downwards: pushing into the stack will decrease the top of the stack address
  • %esp: register that stores the address of the top of the stack
  • Assembly provides pushl (push long) and popl (pop long) to push/pop 1 word size of data on/off the stack
    • %esp will automatically get subtracted (push) and added (pop)
    • Each take one operand - register to push/pop
  • Accessing the stack
    • indirect: movl (%esp), %eax - access the word that is at the top of the stack
    • base pointer addressing mode: movl 4(%esp), %eax - access the word that is 4 ahead of the top of stack

Step 1 - prepare params (caller):

  • Caller will push all parameters onto the stack in reverse order (last parameter lowest)
  • Caller execute call instruction:
    1. Push the return address (address of next instruction on caller) onto the stack
    2. Modifies %eip (instruction pointer) to point to the start of the callee function
      • Allow the machine to know which instruction to execute next
  • View of stack right after call is called:
    Parameter #N
    ...
    Parameter 2
    Parameter 1
    Return Address <-- (%esp)
    

Step 2 - base pointer (callee)

  • Control flow is on the callee
  • Base pointer %ebp: a special register used for accessing the function parameters and local variables (current stack frame)
    • Cannot use esp at it could change when we are trying to call other functions in the callee which will push the args onto the stack => esp change
    • We can treat %ebp as a constant whenever the control flow is in the callee function
  • Parts:
    1. Save the current base pointer (%ebp):
      • At this point the base pointer (%ebp) points to the base of caller stack.
      • The callee will push the data in %ebp onto the stack - pushl %ebp
      • This will allow the %ebp to return to the caller’s stack frame when the callee returns
    2. Copy stack pointer to the base pointer:
      • At this point the top of the stack is where the %ebp should be - by C convention
      • Top of stack stores the caller’s %ebp
      • Update the %ebp to the current (callee) stack frame - movl %esp, %ebp
      • This will allow us to easily access the current stack using base address mode
  • The view of the stack:
    Parameter #N   <-- N*4+4(%ebp)
    ...
    Parameter 2    <-- 12(%ebp)
    Parameter 1    <-- 8(%ebp)
    Return Address <-- 4(%ebp)
    Old %ebp       <-- (%esp) and (%ebp)
    

Step 3 - local variables (callee)

  • Control flow is on the callee
  • Reserve space right above the base pointer for local variables
    • subl $8, %esp - if local variables are 8 bytes
  • Local variables reserved at the “bottom” (right above ebp) vs pushl when the variable should be initialized
    • We could call other function between variable initializations => could result stack being (local variable, params, local variable) which will result in the params being unable to be poped when the callee returns
  • When function return, the memory after the ebp will be removed => local variable (life time as long as the function)
  • View of the stack:
    Parameter #N     <-- N*4+4(%ebp)
    ...
    Parameter 2      <-- 12(%ebp)
    Parameter 1      <-- 8(%ebp)
    Return Address   <-- 4(%ebp)
    Old %ebp         <-- (%esp)
    Local Variable 1 <-- -4(%ebp)
    Local Variable 2 <-- -8(%ebp) and (%esp)
    
  • Note: static and global variables are stored in the data segment.

Step 4 - returning from a function (callee)

  1. Stores it’s return value in %eax
    • (klement: what if we want to return more than 1 word?)
  2. Rests the stack to what it was right when the control is passed to the callee
    • Stack only has callee params at the top of the stack
    • Parts:
    1. Pop all local variables.
      • By reseting %esp to %ebp (stack pointer will be pushed to before the local variables)
      • movl %ebp, %esp
    2. Restore %ebp to the top of old %ebp on the stack and pop.
      • At this point the top of the stack is the old %ebp
      • Restore the %ebp to old %ebp - popl %ebp
  3. Return control back to the caller.
    • At this point the stack only params and caller return address:
       Parameter #N
       ...
       Parameter 2
       Parameter 1
       Return Address <-- (%esp)
      
    • Done by calling ret instruction. Pop the top of stack (return address) to %eip => allow the control to be passed back to the caller.

Instructions for proper return:

movl %ebp, %esp
popl %ebp
ret

Step 5 - after function return (caller)

  • Control now is on the caller
  • Caller can examine the return value by looking at the %eax register
  • Calling code needs to pop off all of the parameters/move stack pointer back to before parameters
  • Destruction of register:
    • To save the current registers values, the caller will need to push the registers value on to the stack before.
    • %eax is guaranteed to be overwritten with the return value
    • Except for %ebp which will be the same (point to the same stack frame).

Pow Function Example

Problem: program to calculate \(2^3 + 5^2\)

  • Main program:
    • Push the arguments (base_number, exponent) for power onto the stack - pushing the last argument first
    • Save the result of the first function (stored in %eax) onto the stack - no guarantee that data will remain the same for all registers (except for %ebp)
  • Function (power):
    • .type power,@function: tells the linker that the symbol should be treated as a function - only relevant when linking > 1 object file.
    • jmp power vs call power: call will add the return address to the stack but jmp does not.
    • C calling convention (callee)
      • Code:
        pushl %ebp
        movl %esp, %ebp
        subl $4, %esp   # reserve space for local variable (current result)
        
      • Stack view:
        Exponent       <-- 12(%ebp)
        Power          <-- 8(%ebp)
        Return Address <-- 4(%ebp)
        Old %ebp       <-- (%ebp)
        Current result <-- -4(%ebp) and (%esp)
        
    • Why local variable (on stack) vs register:
      • When passing data to another function, you can pass an address on the stack but cannot pass an address to a register.
      • Not enough register to store all of the local variables.
    • imull: multiplies operand 1 and 2 but store the result in operand 2

Recursive Factorial Function

Main Function

_start:
 pushl $4
 call factorial
  • Push the value 4 onto the stack and call factorial
addl $4, %esp
movl %eax, %ebx
movl $1, %eax
int $0x80
  • At this point the function factorial has return (all recursive call completed)
  • addl $4, %esp: move the stack pointer back by 4 bytes (1 word) to clean up the factorial parameters
    • Should always clean up stack after a function returns
  • movl %eax, %ebx: move the return value from factorial to the register that will be used for exit code

Factorial Function

.type factorial,@function
factorial:
  • .type directive tells the linker that the factorial symbol is a function
pushl %ebp
movl %esp, %ebp
  • Creates a stack frame and save the caller stack frame
  • Standard C calling convention - should be executed in all instructions
movl 8(%ebp), %eax
  • Stores the argument passed by the caller (at address %ebp + 8) in %eax register
cmpl $1, %eax
je end_factorial
  • %eax now stores n
  • Check for base case when n == 1, if true then move to end_factorial routine
decl %eax
  • At this point %eax will not equal to 1
  • decrease %eax (n) by one so that it can be passed to the next recursive call
pushl %eax
call factorial
  • Add %eax as an argument for the next factorial call and call the function
movl 8(%ebp), %ebx
  • After the inner call to factorial, the register that stores n (%eax) will be over written by the inner call
  • Copy back the original parameter (n) in %ebx
imull %ebx, %eax
  • Multiply n (%ebx) with the result of the recursive call %eax and store in %eax
end_factorial:
 movl %ebp, %esp
 popl %ebp
 ret
  • Return from function by resetting the stack pointer to base pointer (local variables will be cleaned up)
  • Reset to the caller’s base pointer and return to the next instruction in the caller.

Chapter 5: Dealing wit Files

UNIX File Concept

Overview: when we want access a file, the operating system will give us a “number” (file descriptor) which we will use to perform any file operation. Once the we are done , the file will be closed and the file descriptor will be useless.

Using asm to perform file operations:

  1. File open:
    • To open a file, the program will need to execute a open system call, provide the filename and the number to represent the open mode
    • %eax will hold the system call number 5 (open file)
    • %ebx will contain the address of the first character in the file name (const char*)
    • %ecx will contain the file access mode (read/write)
    • %edx store the permission set
  2. Linux will return the file descriptor in %eax
  3. Perform read/write (system call) on the file:
    • Read is system call 3 => move $3 into %eax
    • Place the file descriptor from #2 into %ebx
    • Address of the read/write buffer into %ecx
    • The size of the buffer in %edx
    • OS will return the number of bytes read/written in %eax
  4. Close the file by executing the system call 6 (move $6 into %eax)

Buffers and .bss

  • Continuous block of bytes used for bulk data transfer
  • Using data segment as a buffer:
    • Data segment is static data that is initialized using the source code
      • Source will specifically specify the value of each individual byte in the data segment of the code
      • ie .byte 0 0 0 0
    • Will require the users to type in the individual elements in the source code (redudant as we do not care what the initial value of the buffer)
    • The size of the data section directly translate to the size of executable
      • Executable will contain the same bytes as what the data segement should be initialized to
      • (klement: can imagine the OS performing a memcpy from the executable into data segment)
  • .bss:
    • Similar to the data segment
    • Does not take up space in the executable
    • Specificy the size of the memory storage but unable to specific the initial value (uninitialized)
      • The memory location is initialized to 0
    • The buffer can be initialized/modified during runtime but not at initialization of the program
        .section .bss
         .lcomm my_buffer, 500
      
    • .lcomm will create a symbol my_buffer that points to a 500 byte storage location.

Capitalizing all characters in a file

Overview: a program that takes a path to a file and capitalize all the characters in it

  1. A function that takes a block of memory and converts it to upper-case
    • Takes the address of the block of memory and size
  2. Repeatedly read from into the buffer (read system call might not read all the bytes of a file)
  3. Open and close the necessary files
  • Setup
    • .equ directive allow us to name literals (ie $8)
    • STACK_POSITIONS: store the offsets from base pointer as constants
  • convert_to_upper:
      movl ST_BUFFER(%ebp), %eax 
      movl ST_BUFFER_LEN(%ebp), %ebx
    
    • In C calling convention, all the arguments are saved on the stack. Using the predefined constant %ebp offset, move the arguments into the registers
       cmpl $0, %ebx 
       je end_convert_loop
      
    • Exit the loop if the provided buffer is 0 in size
        movb (%eax, %edi, 1), %cl
      
    • Using indexed indirect addressing mode, copy the character at %eax buffer with offset %edi and multiplier 1 => %eax + 1*%edi
  • Opening files:
    • Command line arguments are pushed onto the stack before the first instruction
      • argv and argc will be stored
    • We will first save the current stack position in %ebp => used to easily access all the cli arguments

Chapter 6: Reading and Writing Simple Records

Reading and writing a simple fixed length record:

  • First name - 40 bytes
  • Lastname - 40 bytes
  • Address - 240 bytes
  • Age - 4 bytes

To reduce code, we can define constants in a separate file and include it when needed.

record-def.s

linux.s

Function to read a record from file into a buffer

Pramas:

  • Location of the buffer that can be used to read the records into
  • File descriptor to read from

read-record.s:

Function to write a record from a buffer into file

Pramas:

  • Location of the buffer that contains the records
  • File descriptor to write into

write-record.s:

Function: Counting number of characters in a buffer

A function to count the number of records in a buffer (null terminated string)

Function to write new line

Writing Records into a file

write-records.s

  • .include is similar to c++ header inclusion, the file will be copy and pasted into that part of the code
  • Calling of externally defined functions (ie write_record) will be linked by the linker and will not need to .include the file
  • pushl $record1 vs pushl record1: the former uses immediate mode and push the address of the label onto the stack without deferencing while the latter will use direct addressing mode and deference the first long at the address and push onto the stack
  • .rept: repeat the instruction between .rept and .endr.
    • Adds to the executable space (unlike bss) but will save writing the same code

Reading records from file

  • pushl $RECORD_FIRSTNAME + record_buffer: assembler knows that this is refering to push the address of record_buffer with offset at $RECORD_FIRSTNAME

Chapter 7: Developing Robust Programs

(klement: this chapter covers very generic SWE principals)

Chapter 8: Sharing Functions with Code Libraries

Using a Shared Library

Shared Libraries: allow functions/labels to be shared across different source files.

as helloworld-lib.s -o helloworld-lib.o
ld -dynamic-linker /lib/ld-linux.so.2 \
   -o hellowrold-lib helloworld-lib.o -lc
  • -dynamic-linker /lib/ld-linux.so.2: allow the program to be linked with dynamic program using the dynamic-linker (not the shared library) specified
    • Loads the external shared libraries
    • It is a tool that is used by the executable to be able to linked with shared libraries
  • -lc: link to the shared c library
    • Links the executable to the libc.so shared library (prepends lib and appends .so)
    • Format -lYOUR_SHARED_LIBRARY

Dynamic vs Statically linked library

  • Static: During link-time the linker will resolve all the undefined names (external reference) in each object files with all other object files.
    • Resolve by replacing the undefined names with the address (could address of function)
    • the undefined name will not be in the final executable
  • Dynamic:
    • During link-time, the undefined name will remain unresolved in the executable.
    • When the program runs, the dynamic linker will be load the shared library listed in -lXXX and resolve all undefined names.
    • Undefined names will be replaced with the actual address at the end of the start up

How Shared Library work

Process:

  • All the shared library code does not need to be included in the executable
  • During link-time, the linker will note in the executable on where to find the shared library (ie libc.so).
  • When the program starts the dynamic linker will look for the shared library (libXXX.so) in standard places: /etc/ld.so.conf and LD_LIBRARY_PATH
  • Loads the library into the program’s virtual memory
  • Replace all the undefined names with the actual addresses.

(note: structs are passed by pointer instead of actual argument (pushing on stack))

Chapter 9: Intermediate Memory Topics

Terminology:

  • Byte: size of a single storage location
  • Word: size of a register
  • Address: number that refers to a byte (storage location in memory)
    • asm label directive: symbol will be replace directly with the address of the next instruction (maybe first element of the array)
    • Doing movl $label %eax => movl $SOMEADDR %eax, replace label with a number and copy into the register.
  • Pointer: word whose value is an address.

Memory Layout

 0xbfffffff
 ┌────────────────────────────────┐
 │           Env Variables        │
 ├────────────────────────────────┤
 │          Program Name          │
 ├────────────────────────────────┤
 │            CLI Args            │
 ├────────────────────────────────┤
 │              Stack             │
 │                                │
 │                                │
 ├────────────────────────────────┤
 │                                │
 │                                │
 │                                │
 │                                │
 │           Unmapped Memory      │
 │                                │
 │                                │
 │                                │
 ├────────────────────────────────┤ <-- Current Break Point after sbrk
 │                                │
 │              Heap              │
 │                                │
 ├────────────────────────────────┤ <-- Oringal Break Point
 │                                │
 │                                │
 │              BSS               │
 │                                │
 ├────────────────────────────────┤
 │                                │
 │              DATA              │
 │                                │
 │                                │
 ├────────────────────────────────┤
 │                                │
 │             Text               │
 │                                │
 │                                │
 └────────────────────────────────┘
 0x08048000
  • Memory layout is linear: from top to bottom
  • Constants: Text (instruction), Data (global constants), BSS (uninitialized constants)
    • Allocated at the bottom of the memory
  • Stack: grows from top down with only Env variables, cli args (argv/c) and program name at the
  • Heap: grows like a stack (only increase) from the bottom up
  • Memory between the stack and heap is unmapped → the page is not assign a frame on physical memory/disk

Getting more memory

sbrk (set break point):

  • Tell linux (kernel) to move the new break point by N bytes
    • linux will return the new break point (last addressable memory)
    • Could be larger than expected (round up to the nearest page - 4096 byte)
  • Using sbrk will naively act as linear allocator (only increase linearly) but does not support deallocation
  • Require a memory allocator (malloc) to allocate and deallocate memory.

Simple Memory Manager

  • Global Variable
    • heap_begin: to mark the beginning of the heap (dynamic value - how many instructions,…)
    • current_break: similar to heap but mark the byte after the last addressable byte
  • Constants:
    • .equ HEADER_SIZE, 8: the header size of the memory block
    • .equ HDR_AVAIL_OFFSET, 0, .equ HDR_SIZE_OFFSET, 4: the offset in the header member variables.

allocate_init:

  • Purpose: to set the heap_begin and current_break global varialbe
  • Use the trick of calling sbrk with $0: return the current break point

allocate:

  • allocate_loop_begin:
    1. Check if the current pointer in the heap has reached the end → move the break point if so
    2. The current pointer will point to a memory block header => check if the block is AVAILABLE
      • Check if the block is big enough -> is so call allocate_here
    3. Move to the next_location:
      • Add advance the pointer by the size of header and size of memory
  • allocate_here:
    • Mark the memory as unavailable and returns the usable memory (right after header)
  • move_break:
    • At this point we know the memory size needed.
    • Calculate the size to sbrk by - size of memory requested + block header
    • Check if sbrk returns an error
    • else: mark the newly sbrk-ed memory block as UNAVAILABLE and move %eax to the useable memory (right after the header)
    • Update the program break

deallocate: setting the avail member of the header to AVAILABLE

Chapter 10: Counting Like a Computer

(klement: most content here are very basic computer organization topics)

Truth, Falsehood, and Binary Numbers

Interesting topics

  • Bitwise AND,OR,NOT,XOR can be done between words in 2 register in a single instruction
    • These operations are very fast
    • Doing XOR with the register itself is faster than loading $0 into the register
  • Program Status Register: will store the carry if you add 2 register that overflows.
    • Signify presence of a carry by setting the carry flag

Floating-point numbers

  • Represented using sign, exponent and mantissa
    • sign: MSB set if negative number
    • exponent: exponent of the multiplier (next 8 bits)
      • Exponent with base 2 => 2^n
      • Unsigned value buy offset by -127: 2^(exponent - 127) => represent both positive and negative exponent
    • matissa: repsent the floating point + 1 (always fixed)
      • ie: 1 (fixed) + 1/2^1 * (nth bit of mattisa) + 1/2^2 * (nth -1 bit of mattisa)

Negative Numbers

  • twos complement: get a negative number (x -1)
    • Perform a NOT operation on the number
    • add one to the resulting number
  • For positive number: lowest (0) 0000... and maximum keep adding one bit until (2^(31-1) - 1) 01111..
    • -1 because 0 is “included” in the positive side
    • Always contain 0 in the MSB
  • For negative number: smallest (-1) 11111... and keep subtracting one bit until (-2^(31-1)) 1000...
    • Always contain 1 in the MSB
  • In absolute terms there is one more negative number than positive number.
  • Sign Extension: to change a 4 bit 2s complement repsentation to 32 bit 2s complement represenation, we cannot simply pad the extra MS bits with 0.
    • Pad the extra MS bits with the MSB value of the smaller representation: 1101 -> 111111111...1101

Chapter 11: High-Level Languages

(klement: skipping this chapter as it is very generic)

Chapter 12: Optimization

Local Optimization (by compiler):

  • Precomputing Calculation: compiler might take all combination of the parameters and precompute the result and place them in a lookup table
  • Cache results of fucntions
  • Locality of Reference: bundle operation on memory that are close together. These operation might share the same page and will not require 2 separate fetching of frame of disk into memory.
  • Register usage: operator on registers instead of memory.
  • Inline Function
  • Optimized Instructions: use more optimized instructions
  • Addressing Modes: fastest mode are Immediate, Register Addressing Mode > Direct Addressing Mode > Base Pointer Indexed Indirect
  • Data Alignment: Some processor can access data on word-aligned memory faster than non-word aligned

Updated: