
How a Program Binary Becomes a Running Process
- April 21, 2025
- 12 min read
- Operating systems, Programming concepts
Have you ever stopped to think about what really happens when you run a program?
Not just clicking “Run” or executing a command in the terminal, but what goes on under the hood—from the executable file sitting on your disk to a fully running process in memory?
In this article, we’ll explore how processes are created, how memory is organized for them, and how this works on both Windows and Unix-like systems.
Whether you’re writing code in C, Python, or Rust, understanding this flow can make you a smarter, more efficient programmer.
Executable Files: The Beginning of a Process
At the heart of every running process is an executable file—a binary file containing instructions the CPU can understand and execute natively.
If you’re writing programs in compiled languages like C, C++, Rust, or Go, your code is translated into a binary executable by a compiler. This binary is what gets loaded and run by the operating system.
In contrast, for interpreted or virtual-machine-based languages like Python, JavaScript, or Java, your code runs through an interpreter or a virtual machine.
These interpreters themselves are compiled binaries, so when you execute a script, it’s actually the interpreter that becomes the process.
Each operating system uses a different format for executable files:
- Windows: PE (Portable Executable)
- Linux: ELF (Executable and Linkable Format)
- macOS: Mach-O (Mach Object)
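Each of these formats starts with a recognizable signature, so a tool can often guess the format from the first few bytes of a file. The sketch below is a minimal illustration, not how any real loader is implemented; `guess_format` is a hypothetical helper name, and it only covers the most common signatures.

```c
#include <stddef.h>
#include <string.h>

/* Guess an executable format from the first bytes of a file.
 * Hypothetical helper for illustration; real loaders validate far more. */
const char *guess_format(const unsigned char *buf, size_t len)
{
    static const unsigned char elf_magic[] = {0x7f, 'E', 'L', 'F'};
    /* Mach-O magic 0xfeedface (32-bit) and 0xfeedfacf (64-bit),
     * as stored on disk by little-endian machines. */
    static const unsigned char macho32[] = {0xce, 0xfa, 0xed, 0xfe};
    static const unsigned char macho64[] = {0xcf, 0xfa, 0xed, 0xfe};

    if (len >= 4 && memcmp(buf, elf_magic, 4) == 0)
        return "ELF";
    if (len >= 4 && (memcmp(buf, macho32, 4) == 0 ||
                     memcmp(buf, macho64, 4) == 0))
        return "Mach-O";
    /* PE files begin with a DOS header whose first two bytes are "MZ". */
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return "PE";
    return "unknown";
}
```

To try it on a real file, read the first four bytes with `fread()` and pass the buffer to the function. A stricter PE check would also follow the DOS header to the `PE\0\0` signature; this sketch stops at `MZ`.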

Executable files include structured sections, each serving a specific role. While names differ slightly across formats (ELF on Linux, PE on Windows, Mach-O on macOS), the concepts are largely the same. The ELF name is listed first, followed by its PE and Mach-O counterparts:
- Code & Data
  - .text → machine instructions
    - PE: .text, Mach-O: __TEXT,__text
  - .data → initialized variables
    - PE: .data, Mach-O: __DATA,__data
  - .bss → uninitialized variables
    - PE: .bss (often merged into .data by the linker), Mach-O: __DATA,__bss
  - .rodata → read-only data (e.g., string literals)
    - PE: .rdata, Mach-O: __TEXT,__const (string literals usually go in __TEXT,__cstring)
- Symbol & String Tables
  - .symtab → symbol table
    - PE: COFF symbols or external .pdb, Mach-O: via LC_SYMTAB
  - .strtab → string table (symbol names)
    - PE: part of COFF, Mach-O: also via LC_SYMTAB
- Relocation Info
  - .rel.* / .rela.* → relocation entries, used for dynamic linking and address adjustment
    - PE: .reloc, Mach-O: section-specific relocation entries
…and more, depending on the platform and compilation settings.
Some sections are read-only (such as .rodata and .text, which is also executable); others can be both read and written (such as .bss and .data).
The OS uses this structure to map the executable file into memory correctly.
In this example:
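const int G_ARGV_INDEX = 1;   // constant: .rodata
char g_element;               // uninitialized global: .bss
int g_index = 1;              // global with a nonzero initializer: .data

int main(int argc, char *argv[]) {
    const char *match = "Hello";   // the literal itself lives in .rodata
    if (argc <= G_ARGV_INDEX) {    // argv[G_ARGV_INDEX] must exist
        return 1;
    }
    // ...
    return 0;
}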
The .text section would contain the compiled machine code for the main() function and any other functions.
The .data section would hold initialized global/static variables like g_index, provided the initializer is nonzero (compilers such as GCC typically place zero-initialized globals in .bss), while g_element would be in the .bss section since it’s uninitialized.
The .rodata section would contain the string literal "Hello" and the constant G_ARGV_INDEX.
The .symtab and .strtab sections would include the symbol names and their addresses, allowing the linker to resolve references between different parts of the program.
Note
Unlike in the program source code, where variables are referred to by name, in the executable file they are referred to by address. The OS loader uses this information to map the sections into memory correctly.
​
What Is a Process?
When an executable file runs, the operating system transforms it into a process. A process is more than just code—it’s a running instance that has its own memory and control structures.
Each process has two key components:
​
1. Memory Space
This is where the contents of the executable file live once loaded into RAM. It includes the same sections we discussed (.text, .data, .bss, and so on). The OS loader handles copying and mapping them into memory.

At the end of this memory layout, the system also stores:
- Environment variables (like MY_ENV=hello when you run $ MY_ENV=hello python hello.py)
- Program arguments (like hello.py when you run $ python hello.py)
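A new process receives its environment as a NULL-terminated array of "NAME=value" strings. The sketch below walks such an array by hand to show the layout; `find_env` is a hypothetical helper written for this illustration, and real code would normally just call `getenv()`.

```c
#include <stddef.h>
#include <string.h>

/* Look up `name` in a NULL-terminated array of "NAME=value" strings.
 * Returns a pointer to the value, or NULL if the name is not present.
 * Hypothetical helper for illustration only. */
const char *find_env(char *const envp[], const char *name)
{
    size_t name_len = strlen(name);

    for (size_t i = 0; envp[i] != NULL; i++) {
        /* Match the whole name and require '=' right after it,
         * so that "MY" does not match "MY_ENV=hello". */
        if (strncmp(envp[i], name, name_len) == 0 && envp[i][name_len] == '=')
            return envp[i] + name_len + 1;
    }
    return NULL;
}
```

On POSIX systems the array for the current process is available as `extern char **environ;`, so `find_env(environ, "MY_ENV")` would return "hello" for the command shown above. Program arguments arrive the same way, as the `argv` array passed to `main()`.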
Between the loaded sections and the arguments and environment are three regions that are essential to program execution: the stack, the heap, and memory maps.
- Stack: This is where local variables and function call information are stored. It grows and shrinks as functions are called and return.
- Heap: This is used for dynamic memory allocation. It grows and shrinks as you allocate and free memory.
- Memory Maps: This is where shared libraries and other dynamically loaded resources are mapped into the process’s address space.
The OS uses a virtual memory system to manage this space, allowing processes to run in isolation and preventing them from interfering with each other.
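One way to get a feel for this layout is to record an address from each region of a running program. The following is a rough sketch under the assumption of a POSIX-like platform; the struct and function names are made up for this example, and the actual addresses change from run to run when address space layout randomization is enabled.

```c
#include <stdint.h>
#include <stdlib.h>

int g_initialized = 42;   /* expected to land in .data */
int g_uninitialized;      /* expected to land in .bss, so it starts at 0 */

/* One representative address per region (hypothetical struct). */
struct layout {
    uintptr_t text;
    uintptr_t data;
    uintptr_t bss;
    uintptr_t heap;
    uintptr_t stack;
};

/* Fill `out` with sample addresses. Returns 0 on success, -1 on failure. */
int sample_layout(struct layout *out)
{
    int local = 0;              /* local variable: on the stack */
    void *block = malloc(16);   /* dynamic allocation: on the heap */
    if (block == NULL)
        return -1;

    out->text  = (uintptr_t)&sample_layout;   /* code address */
    out->data  = (uintptr_t)&g_initialized;
    out->bss   = (uintptr_t)&g_uninitialized;
    out->heap  = (uintptr_t)block;
    out->stack = (uintptr_t)&local;

    free(block);
    return 0;
}
```

Printing the five fields with `printf("%#lx\n", (unsigned long)l.stack)` and so on makes the separation visible. On a typical Linux system the stack address tends to be the largest of the five, but that ordering is a platform detail rather than a guarantee.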
​
2. Process Control Block (PCB)
The PCB is a data structure maintained by the operating system that holds all metadata about the process.
This includes:
- Process ID (PID)
- Parent Process PCB pointer
- Exit code
- Process state
- Open file descriptors
- CPU registers
- Program counter
- Signal handlers
- Process priority
- Current working directory
- …and much more
On Linux systems, the PCB is represented by a C structure called task_struct, defined in the kernel source. This struct contains all the necessary fields the kernel needs to manage and schedule the process.
It is stored in kernel space, not user space, ensuring that only the operating system has access to it.
All the PCBs are stored in a system-wide structure called the process table, which keeps track of all running processes.
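User code cannot read the PCB directly, but several of its fields are exposed through system calls. As a small sketch, assuming a POSIX system, the function below collects a few of them for the calling process; `struct proc_info` and `snapshot_self` are hypothetical names for this example and have nothing to do with the kernel's own structures.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/* A user-space snapshot of a few PCB-backed fields (hypothetical struct). */
struct proc_info {
    pid_t pid;          /* process ID */
    pid_t parent_pid;   /* parent's process ID */
    int   nice_value;   /* scheduling priority ("nice" value) */
    char  cwd[4096];    /* current working directory */
};

/* Fill `out` for the calling process. Returns 0 on success, -1 on error. */
int snapshot_self(struct proc_info *out)
{
    out->pid = getpid();
    out->parent_pid = getppid();

    /* getpriority() can legitimately return -1, so errno must be checked. */
    errno = 0;
    out->nice_value = getpriority(PRIO_PROCESS, 0);
    if (out->nice_value == -1 && errno != 0)
        return -1;

    if (getcwd(out->cwd, sizeof out->cwd) == NULL)
        return -1;
    return 0;
}
```

Each call here is answered by the kernel from the per-process data it keeps, which is why no user-space bookkeeping is needed.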
And what’s even more interesting? Each process is usually created by another process. It’s like a chain: one process spawns the next, and so on.
For example, here’s a simplified process tree on a Linux system:
user@host:~$ pstree -as 16980
init(Ubuntu-20.
  └─SessionLeader
      └─Relay(11773)
          └─bash
              └─python
(You can run pstree yourself to explore how processes are connected on your own machine!)
When you run the Python interpreter, it’s actually a child process of the shell (like bash or zsh).
The shell itself is a descendant of the init process (or systemd on modern Linux systems), which is the very first process that starts when your system boots.
This hierarchy naturally emerges as the operating system creates and manages processes.
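The same chain can be followed programmatically. The sketch below is Linux-specific and assumes /proc is mounted: it reads the PPid line from /proc/<pid>/status to find a process's parent, then repeats until it runs out of ancestors. The function names are hypothetical, and this is not how pstree itself is implemented.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the parent PID of `pid` by reading /proc/<pid>/status,
 * or -1 if the file cannot be read. Linux-specific sketch. */
pid_t parent_of(pid_t pid)
{
    char path[64];
    char line[256];
    long ppid = -1;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        /* The line of interest looks like "PPid:\t1234". */
        if (sscanf(line, "PPid: %ld", &ppid) == 1)
            break;
    }
    fclose(f);
    return (pid_t)ppid;
}

/* Print this process and each of its ancestors, one PID per line. */
void print_ancestry(void)
{
    for (pid_t pid = getpid(); pid > 0; pid = parent_of(pid))
        printf("%ld\n", (long)pid);
}
```

The loop stops when a parent PID of 0 is reached (the first process has no parent) or when the lookup fails.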
Understanding this parent-child chain is key — but how exactly does a new process get created in the first place? Let’s break it down step by step.
​
Creating a Process: Step by Step
So, how does the OS go from an executable file to a new running process?
Here are the main steps on Unix-like systems:
- Forking the Parent
  A new process is created by copying the memory space and PCB of an existing (parent) process. Fields like the PID and parent process are updated to reflect the new identity of the child process. Other fields, like the program counter and open file descriptors, are typically set to the same values as the parent.
- Loading the Executable
  The new process then replaces its memory space with the contents of the executable file. The sections are mapped as specified in the file format.
Copying a large memory space during the fork step can be costly. That’s why operating systems often use Copy-On-Write (COW): the parent and child initially share memory pages.
If either process modifies a page, only then does the OS create a new copy for the writing process. This saves both time and memory.
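Copy-on-write itself is invisible to a program; what can be observed is its effect, namely that a write in the child never shows up in the parent. The following sketch, assuming a POSIX system, checks exactly that. The function name is made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write made by the child is invisible to the parent,
 * 0 if the parent saw the change, and -1 on error. */
int child_write_is_private(void)
{
    volatile int value = 1;   /* volatile so the reads are not optimized away */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's own copy of the page. */
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    /* Parent: wait for the child, then inspect our own copy. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value == 1;
}
```

The child uses `_exit()` rather than `return` so that it leaves immediately instead of continuing to run the parent's remaining code.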
​
How a Parent Monitors Child Processes
When a parent process creates a child process (such as by calling fork()), it often needs to monitor the child’s status. Typically, the parent:
- Waits for the child process to finish execution.
- Retrieves the child’s exit code.
On Unix-like systems, system calls like wait() and waitpid() are used for this purpose.
They allow the parent to block execution until the child terminates, and then check how the child process exited (whether it succeeded, failed, or was terminated by a signal).
On Windows, similar functionality is provided by WaitForSingleObject(), which waits for a process handle to signal completion, and GetExitCodeProcess(), which obtains the child process’s exit code.
This monitoring is crucial for resource management, error handling, and ensuring that no zombie processes (children that have exited but were never waited for) or unintended orphans are left behind.
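On the Unix side, the status returned by waitpid() is decoded with a small set of macros. The sketch below, assuming a POSIX system, shows one way to tell a normal exit from a signal death; `reap` and `run_child` are hypothetical helper names, and reporting a signal as 128 plus its number simply mirrors the convention shells use.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for `pid` and summarize how it ended: the exit code for a normal
 * exit, 128 + signal number if a signal killed it, or -1 on error. */
int reap(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}

/* Fork a child that exits with `code`, or that kills itself with `sig`
 * when sig is nonzero. Returns whatever reap() reports. */
int run_child(int code, int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);
        _exit(code);
    }
    return reap(pid);
}
```

For example, `run_child(7, 0)` reports 7, while `run_child(0, SIGKILL)` reports 128 + SIGKILL because the child never reaches `_exit()`.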
How to Create Processes in Code
​
On Unix/Linux/macOS (POSIX systems)
You can create a new process using the fork() system call. It duplicates the calling process.
Here’s a simple example in C:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid < 0) {
        // Fork failed
        fprintf(stderr, "Fork failed\n");
        return 1;
    }
    // Both the parent and the child reach this line.
    printf("The value of pid is %d.\n", (int)pid);
    return 0;
}
The output will look like this (the PID will differ on your machine, and the order of the two lines is not guaranteed):
The value of pid is 2623.
The value of pid is 0.
Both the parent and child continue executing the same code, but they can tell themselves apart by the return value of fork():
- The child sees 0
- The parent sees the child’s PID
To load a different executable into the child process, use one of the exec() functions. These replace the current process image with a new one loaded from an executable file.
For example, you could use the printenv utility in a child process to print an environment variable like ENV_1.
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork failed\n");
        return 1;
    }
    if (pid == 0) {
        // Child: replace this process image with printenv.
        char *args[] = {"printenv", "ENV_1", NULL};
        char *envp[] = {"ENV_1=Child: env var 1", "ENV_2=2", NULL};
        execve("/usr/bin/printenv", args, envp);
        // execve() returns only if it failed.
        perror("execve");
        _exit(127);
    } else {
        printf("Parent: Hello (child pid is %d).\n", (int)pid);
        wait(NULL); // Parent waits for the child to complete
    }
    return 0;
}
The output will show the value of ENV_1 set in the child process, while the parent process prints its own message:
Parent: Hello (child pid is 7703).
Child: env var 1
There are several variants of exec() depending on the arguments and behavior you need (e.g., execl, execv, and execvp).
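The letters in each variant's name hint at what it does: "l" takes the arguments as a list, "v" takes them as an array (vector), "p" searches PATH for the program, and "e" accepts an explicit environment. As a minimal sketch assuming a POSIX system, here is a helper built on execvp(); `run_program` is a hypothetical name, and returning 127 when the program cannot be started follows the shell convention rather than anything mandated by exec itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up in PATH) as a child process and return its exit
 * code. Returns 127 if the program could not be executed, -1 on other
 * errors or if the child was killed by a signal. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);   /* returns only on failure */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with `{"sh", "-c", "exit 3", NULL}` would return 3. Because execvp() searches PATH and inherits the caller's environment, no absolute path or envp array is needed here, unlike the execve() example above.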
Other Process Creation Mechanisms
While fork() combined with exec() is the traditional and widely used method on Unix-like systems, there are other calls designed for more specialized process creation needs:
- clone(): Available on Linux, clone() is a more flexible relative of fork(), allowing finer control over resource sharing between parent and child processes, such as memory, file descriptors, and namespaces.
  Unlike fork(), which gives the child its own copy of the entire address space, clone() lets you specify which parts of the process should be shared or isolated.
  It underpins thread creation (such as pthread_create()) and container technologies.
- posix_spawn(): Part of the POSIX standard, posix_spawn() combines fork() and exec() into a single function call.
  It is especially useful in performance-sensitive environments (e.g., macOS, embedded systems) where the overhead of a traditional fork() may be undesirable.
These alternatives are generally used in more specialized scenarios, but understanding them highlights the flexibility of process management in Unix-like operating systems.
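To give a sense of how posix_spawn() looks in practice, here is a minimal sketch using its PATH-searching sibling posix_spawnp(). It assumes a POSIX system and passes NULL for the file actions and attributes, which means the child simply inherits the defaults; `spawn_and_wait` is a hypothetical helper name.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;   /* the caller's environment, passed through unchanged */

/* Start argv[0] (looked up in PATH) with posix_spawnp() and wait for it.
 * Returns the child's exit code, or -1 on error. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;

    /* posix_spawnp() returns 0 on success and an errno value on failure. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Compared with the fork() and exec() pair, there is no window in which the child runs a copy of the parent's code, so there is no child branch to write. The parent still has to wait for the child as before.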
​
On Windows
Windows handles this with the CreateProcessA() function. It:
- Creates a new process
- Starts its main thread
- Allocates a new memory space
- Sets up a new PCB
#include <stdio.h>
#include <windows.h>

int main(void) {
    STARTUPINFOA si = { sizeof(si) };
    PROCESS_INFORMATION pi;
    if (!CreateProcessA("C:\\Windows\\System32\\notepad.exe", NULL, NULL, NULL,
                        FALSE, 0, NULL, NULL, &si, &pi)) {
        fprintf(stderr, "CreateProcessA failed (error %lu)\n", GetLastError());
        return 1;
    }
    // Block until the child process exits.
    WaitForSingleObject(pi.hProcess, INFINITE);
    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    return 0;
}
You provide the path to the executable file and optionally pass command-line arguments and environment variables.
The function requires several parameters, but it gives you fine-grained control over the process being created.
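There are several variants of CreateProcess (such as CreateProcessW() for wide-character strings and CreateProcessAsUser() for running under another user’s token) depending on the level of control you need over the new process.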
Final Thoughts
Understanding the distinction between a process’s memory space and its PCB is crucial for systems programming, debugging, and even writing efficient applications.
This isn’t just academic: it makes you far more confident when working close to the metal.
And knowing how different programming languages interact with the system—compiled vs. interpreted—helps you reason better about performance and behavior.
Whether you’re digging into system calls, learning how the OS works, or just trying to write better code, this knowledge is a powerful tool.
Next time you hit “Run” or programmatically create a new process, you’ll know exactly what’s going on beneath the surface.
​