7 Basic C Programming Facts you Need to Know
- December 30, 2024
- 15 min read
- C programming
​
Introduction
C is one of the most influential programming languages ever created. It’s the backbone of countless popular software applications.
C was also one of the first programming languages I learned.
In this article, I’m sharing 7 essential facts about the C programming language.
Whether you’re just starting out or you’ve been using C for years, you’re sure to find something new and interesting.
​
Fact 1: C Has Simple Syntax
C does not support object-oriented programming, anonymous functions, exceptions, list comprehensions, and many other advanced programming concepts found in other popular languages.
As a result, C’s syntax is relatively straightforward and minimalistic.
The complete documentation for the current version of C (C23), including explanations and examples, is remarkably compact.
The language syntax reference takes up less than 160 pages.
Even the standard library, which is available to all C programs, is relatively small, adding around 370 pages to the documentation.
All this makes C quick to learn and use.
​
Fact 2: C Programs Execute Natively
Before your C code can be executed, it gets transformed by a compiler infrastructure into executable code that runs directly on the system’s hardware.
This transformation involves several stages:
- Preprocessing: From the written source code, the preprocessor handles directives like `#include` and `#define`. For example, it replaces `#include <stdio.h>` with the content of the `stdio.h` file, incorporating the declaration of the function `puts`.
- Compiling: The compiler translates the preprocessed code into assembly code specific to the target platform, optimizing it if needed.
- Assembling: The assembler converts the assembly code into machine code, generating object files that the system’s hardware can understand.
- Linking: The linker combines object files and any necessary libraries into a final executable program.
The result is a binary file that the operating system can run directly, without the need for a virtual machine or interpreter.
This also means that compiled C programs are platform-dependent, so you need to build separate binaries for different architectures such as x86 and ARM.
Different operating systems favor different default compiler infrastructures to transform C code into binary: typically GCC on Linux, Clang on macOS, and MSVC on Windows.
Except for MSVC, these compiler infrastructures can be used across different platforms, giving you flexibility in your development environment.
The big advantage of C is performance. Since it’s compiled to native machine code, a C program typically runs much faster than the same program written in an interpreted language like Python or JavaScript.
For example, a simple matrix multiplication with 1000 rows and columns can be orders of magnitude faster in C.
Language | Run time (seconds) |
---|---|
C | 2.31 |
Python | 296.12 |
JavaScript | 20.98 |
So, if you need your programs to run quickly and efficiently, C’s native execution is a great choice.
​
Fact 3: C Has Multiple Provided Libraries
One of the great strengths of C is its extensive support for provided libraries. There are three primary libraries you’ll encounter: the standard library, the system calls interface and the POSIX library.
These libraries provide header files (`.h`) that define macros, global variables, custom types, and declare functions. The actual implementations are in library files (`.so`, `.dylib`, `.dll`, etc.).
​
Standard Library
The C standard defines the standard library, a collection of essential functions and macros for convenient coding, available to all C programs.
Many header files are ready for you to use in your C programs. Each header provides specific functionalities, for example:
- `<stdio.h>`: Input and output functions.
- `<stdlib.h>`: General utilities like memory allocation and process control.
- `<string.h>`: String handling functions.
- `<math.h>`: Mathematical functions.
Note
In freestanding C environments (systems without a full operating system, often found in embedded systems or low-level programming contexts such as the Linux kernel), only a minimal subset of the standard library is provided.
This section focuses on hosted environments, which are systems with a full operating system like Linux, macOS and Windows, where the full standard library must be available.
On Linux, the standard library is typically part of the GNU C Library (glibc). On macOS, it’s part of the Apple C Library (libSystem), and on Windows, it’s part of the Microsoft C Runtime Library (MSVCRT).
Each operating system has its own standard library implementation because they do not share the same system API or system call interface.
​
System Calls
System calls are the interface between your C programs and the operating system. Standard library functions often rely on them internally, bridging the gap between user-space programs and the operating system.
Since most popular operating systems, like Linux, Windows, and macOS, are largely written in C, they provide extensive system call interfaces in C.
For instance:
- On Linux, you might use the definitions in the `<linux/io_uring.h>` header to work with asynchronous I/O.
- On macOS, you might use the `<sys/event.h>` header file.
- On Windows, you would use the `<Windows.h>` header file.
Your C program can directly make system calls declared in kernel headers.
However, this can lead to portability issues because operating systems do not provide the same interfaces. Consequently, each program would have to handle all targeted operating systems specifically.
This situation arises because the C standard library, intentionally small, does not implement all features provided by operating systems through system calls, focusing only on the most common system-related features.
​
POSIX Library
To reduce these portability issues, the POSIX library, standardized by IEEE and the Open Group, was created to unify Unix-like operating systems by providing a standard set of APIs. Think of it as a complementary C library to the standard library, offering features common across UNIX-like systems, but not mandatory.
The POSIX standard includes, among others, headers for:
- `<pthread.h>`: Thread management.
- `<unistd.h>`: Declarations for various system calls like `fork` and `read`.
- `<aio.h>`: Asynchronous input and output.
- `<arpa/inet.h>`: Network operations definitions.
- `<glob.h>`: Pathname pattern-matching.
POSIX ensures that programs can be more portable across different Unix-like systems, reducing the differences that developers need to handle.
Many operating systems, including Linux and macOS, support POSIX. Linux and macOS POSIX libraries are included in glibc and libSystem, respectively.
By using the POSIX library in your code, you can address portability concerns when targeting systems that are POSIX compliant.
To keep your C code portable, use the C standard library first, the POSIX library if needed, and direct system calls only as a last resort.
​
Fact 4: C Gives System Understanding
Learning C helps programmers understand how computers work. C gives access to system calls and provides developers with the freedom to manage memory.
​
A. Operating System API
As you learn and use C to develop software, you will need to use the operating system interface or the POSIX library to make low-level calls. This will help you gain a better understanding of the operating system and its interface.
​
B. Computer Memory
Learning C can help you understand program memory. Here are a few aspects of it:
​
1. Stack Memory Layout
We can understand how variables are laid out on the memory stack with the following example:

    #include <stdio.h>

    int main() {
        int x = 10;
        int y = 20;

        int *p = &x;
        printf("Initially: x=%d, y=%d\n", x, y);

        *p = 100;
        printf("x changed through p: x=%d, y=%d\n", x, y);

        /* Undefined behavior: writes past x, done here only to probe the layout. */
        *(p + 1) = 200;
        printf("y changed through p: x=%d, y=%d\n", x, y);

        return 0;
    }

The execution outputs:
Initially: x=10, y=20
x changed through p: x=100, y=20
y changed through p: x=100, y=200
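In the program, two `int` variables, `x` and `y`, are defined, with `x` defined before `y`. A pointer `p` is assigned the address of `x`.
By manipulating and accessing memory through the pointer `p`, we can observe the layout of variables on the stack.
We modify the value of the memory location at the address following `x` and observe that the value of `y` changes.
This suggests that `y` is laid out at the higher address right after `x`.
Note that writing through `p + 1` is undefined behavior in C: the result shown here depends on the compiler, its optimization level, and the platform, so treat it as an experiment rather than a guarantee.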
​
2. Memory Representation
We can also understand how integer data is represented in memory with the following example:

    #include <stdio.h>

    int main() {
        int v = 0x41424344;
        char *p = (char*)&v;

        printf("v's representation is: %c%c%c%c\n", *p, *(p+1), *(p+2), *(p+3));

        return 0;
    }

The program assigns the hexadecimal value `0x41424344` (representing ‘A’, ‘B’, ‘C’, ‘D’) to an integer `v`. A `char*` pointer `p` is then used to access each byte of `v`.
By printing `*p`, `*(p+1)`, `*(p+2)`, and `*(p+3)`, you can see how the integer is laid out in memory.
The output reveals whether your system uses little-endian or big-endian byte order, in other words, whether the integer’s least significant byte is stored at the smallest address or the highest address.
Running this example on my x86 computer outputs DCBA, suggesting that it is little-endian.
​
Fact 5: C Can be Challenging to Use
C’s simplicity comes with trade-offs, which can make developing software in C challenging. These challenges are related to safety (the ability to easily write reliable code) and features (things that increase productivity).
​
Safety
- No Built-in Memory Safety: C does not provide built-in memory safety, which means developers need to manually manage memory. This can lead to common issues like:
  - buffer overflows, where a memory location outside the allocated block is accessed through a pointer.

        int *p = (int*)malloc(sizeof(int));
        *(p + 1) = 3;
        free(p);

  - use-after-free errors, where an address is used after the underlying memory is freed.

        int *p = (int*)malloc(sizeof(int));
        free(p);
        *p = 3;

  - memory leaks, where allocated memory is not freed.

        int *p = (int*)malloc(sizeof(int));
        // read or write *p ...
        p = NULL; // the only pointer to the block is lost before free(p)

  These bugs are often difficult to detect and can lead to security vulnerabilities.
- Unsafe Standard Library Functions: C’s standard library contains functions, like `printf` and `strcpy`, that can be easily misused. For example, passing an arbitrary string as the format argument of `printf` leads to undefined behavior when that string contains format specifiers with no matching arguments.

    #include <stdio.h>

    void echo(const char *s) {
        printf(s);
    }

    int main() {
        echo("format is %s\n");
        return 0;
    }

Outputs something like:
format is pvK�
Similarly, `strcpy` does not check the size of the destination buffer, which can cause buffer overflows (here, the copied string, longer than `a`, overruns it and overwrites `b`).

    #include <stdio.h>
    #include <string.h>

    int main() {
        char a[3] = {'\0'};
        char b[10] = {'\0'};
        strcpy(a, "Hello World!");
        puts(a);
        puts(b);
        return 0;
    }

Outputs:
Hello World!
lo World!
- No Standard Error Handling: Error handling in C can be confusing because there is no standard way to handle errors.
  Different libraries and functions use different conventions, such as returning `-1` or `NULL` to indicate an error, which can lead to bugs if not handled consistently.
  For example, the POSIX `mkdir` call returns 0 on success and -1 on failure, while the OpenSSL `TYPE_print_ctx` function returns 1 on success and 0 on failure.
  A developer working with both can easily confuse the error codes.
​
Features and Tooling
- Lack of Modern Features: C lacks features like object-oriented programming (OOP), templates, and traits, which can make it more difficult to build complex applications.
  Developers often have to implement these features manually, which can be error-prone and time-consuming.
- Reliance on Third-Party Libraries: C has a very small standard library, so developers often rely on third-party libraries for data structures (like hash maps and dynamic lists), web development, and more.
  This can lead to dependency management issues and potential security vulnerabilities if the libraries are not well-maintained.
- No Default Testing Tools: C does not come with built-in testing tools, which means developers need to create their own testing frameworks or use third-party tools.
  This absence of integrated solutions can introduce extra complexity and effort into projects.
- No Default Package Manager or Build Tool: C does not have a standard package manager or build tool, which means developers need to choose from various options like vcpkg, Conan, and even system package managers, as well as Autotools, CMake, or the Visual Studio build tools.
  This can lead to inconsistencies and compatibility issues across different environments.
​
Fact 6: C Has a Rich History
C was originally developed at Bell Labs by Dennis Ritchie in the early 1970s to enable UNIX’s transition from assembly to a higher-level language, a task carried out with Ken Thompson.
It was first released around 1972. Early on, features were added to the language as needed, based on UNIX development requirements. However, there was no standardization, and different compilers implemented their own versions of C, leading to code portability issues.
In 1978, Brian Kernighan and Dennis M. Ritchie wrote the original description of C in a book that became widely recognized: The C Programming Language, also known as “the White Book.” This book served as the informal standard for C, referred to as K&R C.
The first formal standard for C was established in 1989 by the American National Standards Institute (ANSI) and called ANSI C (also known as C89).
It introduced several changes to K&R C, such as the automatic definition of the `__STDC__` macro, which indicates when a program is compiled using standard C. This macro is defined by the compiler for standard C code but is not set for non-standard variants like K&R C.

    #include <stdio.h>

    int main(){

    #ifdef __STDC__
        printf("Compiler supports standard C (C89 or later)\n");
    #else
        printf("Compiler does not support standard C\n");
    #endif

        return 0;
    }

Shortly after, ANSI C became ISO C (C90) when the International Organization for Standardization (ISO) took over the C language standard. For most compilers, C89 and C90 are effectively identical.
In 1995, a new version of the C standard, referred to as C94, C95, or AMD1, was published. It introduced a few amendments to the C89/C90 standard, including the `__STDC_VERSION__` macro, which identifies the C standard version being used.
Versions of C from C95 onward define this macro, setting its value to a `long` integer that corresponds to the year and month of the standard.

    #include <stdio.h>

    int main() {

    #if __STDC_VERSION__ >= 199409L
        printf("C95 or newer\n");
    #else
        printf("C89/C90/ANSI C or pre-standard C.\n");
    #endif

        return 0;
    }

This version also introduced digraphs, which are special character pairs used to replace certain symbols. For keyboards missing keys like [, ], {, }, and even the hash sign (#), digraphs provide a convenient alternative, as shown in this table, where a symbol is replaced by a pair of characters.
Digraph | Equivalent Character |
---|---|
<: | [ |
:> | ] |
<% | { |
%> | } |
%: | # |

    #include <stdio.h>
    int main(void)
    {
        puts("Hello, World!");
        return 0;
    }

The above simple “Hello World” program would be typed like this using digraphs.

    %:include <stdio.h>
    int main(void)
    <%
        puts("Hello, World!");
        return 0;
    %>

In 1999, major updates to the C standard were made, resulting in C99. This version introduced several new features, such as `inline` functions, variable-length arrays, single-line comments (`//`), boolean macros, and improved support for floating-point arithmetic. It also removed implicit function declarations.

    #include <stdbool.h>

    // Single line comment

    inline _Bool function(int n)
    {
        int arr[n];

        /* Use arr ... */

        return true;
    }

In 2011, another standard, C11, was released. It was marked by the removal of the unsafe `gets` function. Additionally, several features were added, including compile-time assertions.
The next standard, C17, was released in 2018 as a corrected version of C11. It introduced no significant new features or removals, aside from updates to the `__STDC_VERSION__` macro value.
For most compilers, such as GCC, the only difference between C11 and C17 is this updated macro value.
The current version of the C standard, C23, was released in October 2024. This version included numerous deletions and additions.
For example, the K&R-style function definition—where parameter types are specified after the parentheses—was removed. At the same time, new constants, preprocessor macros, and keywords were added.
To date, there are five major C language standard versions: C89, C95, C99, C11, and C23. Future versions are expected to continue evolving the language.
When compiling a program, you can specify the desired standard using the corresponding compiler flag.
Some compilers, like GCC, implement extensions to the C standard, such as GNU C. These extensions can also be specified when compiling programs, provided the compiler supports them.
​
Fact 7: C Is Distinct from C++, C#, Objective-C
Ever wonder why C is often confused with C++, C#, and Objective-C? While they share the common heritage of C, they’re fundamentally different programming languages.
​
Comparing C with C#
C# draws inspiration from C but is tailored for the .NET framework. It’s a completely different language, with no compatibility with C.
C# is compiled into an intermediate code and runs on a virtual machine, offering a different execution model from C. It incorporates modern features designed to simplify development and enhance productivity.
C# is mainly used to develop Windows applications.
​
Comparing C with C++ and Objective-C
Both C++ and Objective-C were designed as extensions of C, adding new features while retaining compatibility with most C code. Objective-C is a strict superset of C, while C++ rejects a few valid C constructs, such as implicit conversions from `void *`.
They both introduce object-oriented programming (OOP) and exception handling in different forms, allowing developers to create more complex and modular applications. They also provide additional I/O features, like C++’s `cout` and Objective-C’s `NSLog`.
Objective-C additionally brings dynamic typing, messaging and more. Its `id` type adds flexibility, allowing for more dynamic coding practices. Objective-C is mainly used for macOS and iOS development.
C++, on the other hand, brings in features like templates.
One interesting aspect of C++ is name mangling. When you compile a function in C++, it encodes additional information into the name of the function in assembly code, including the length of the function name and the identifiers of the parameter types.
For example, compiling the following function to assembly:

    void function() {}

gives:

    _Z8functionv:
        push rbp
        mov rbp, rsp
        nop
        pop rbp
        ret

This allows for function overloading, meaning you can have multiple functions with the same name but different parameters—something you can’t do in C.
When referring to C libraries from C++ code, you often need to use `extern "C"` to prevent name mangling and ensure the C library functions are correctly linked.
While C++ and Objective-C retain compatibility with C code, the reverse is not true. C++ and Objective-C code cannot be compiled as C due to their additional features and syntax.
​
Conclusion
To wrap up, here are the key takeaways from this article:
- Simple Syntax: C’s minimalistic design makes it accessible and efficient.
- Native Execution: Its machine-code compilation ensures high performance.
- Extensive Libraries: Offers flexibility through standard, system, and POSIX libraries.
- Rich History: Decades of evolution have refined and standardized the language.
- System Insights: Learning C deepens your understanding of computer systems.
- Programming Challenges: Its simplicity comes at the cost of safety and modern features.
- Distinct Language: C stands apart from C++, C#, and Objective-C, despite their shared origins in the C programming language.
With its balance of power and simplicity, C remains a cornerstone in programming and will continue to be relevant for years to come. Thanks for reading, and I hope you’ve gained valuable insights into this foundational language.
Feel free to share your favorite aspects of C in the comments below.