- About the Project
- Features
- Architecture Overview
- Project Structure
- Data Structures
- Pipeline: How It Works
- Built-in Commands
- Supported Operators
- Signal Handling
- Getting Started
- Usage Examples
- Technical Details
Minishell is a fully functional Unix shell built entirely in C as part of the 42 School curriculum. It replicates core behaviors of the Bash shell — from tokenizing raw input to forking child processes, managing pipes, and handling I/O redirections.
This project explores the internals of how a shell truly works: every keystroke, every |, every > is handled manually without relying on any existing shell implementation. The result is a deep understanding of POSIX process management, file descriptors, signal handling, and environment variable management.
"To understand a tool, build it yourself."
| Feature | Description |
|---|---|
| 🔄 REPL Loop | Interactive prompt powered by GNU Readline with command history |
| 🔤 Tokenization | Parses raw input into a structured token stream |
| 🔍 Syntax Validation | Detects unclosed quotes, misplaced operators, and invalid syntax |
| 🌍 Variable Expansion | Resolves $VAR, $? (exit status), and ~ (home directory) |
| 💬 Quote Handling | Full support for single (') and double (") quote semantics |
| 🔗 Pipelines | Chains multiple commands: cmd1 \| cmd2 \| cmd3 |
| Operator | Direction | Behavior |
|---|---|---|
| < | Input | Read stdin from file |
| > | Output | Write stdout to file (truncate) |
| >> | Append | Write stdout to file (append) |
| << | Heredoc | Read stdin until a delimiter is reached |
- Forking: each external command runs in a dedicated child process
- Waiting: the parent collects exit statuses from all children
- Exit status propagation: $? always reflects the last command's result
- Pipe chaining: file descriptors are passed seamlessly across the pipeline
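The fork / wait / exit-status steps listed above can be sketched in a few lines of C. This is a minimal illustration, not the project's code: `run_and_wait` is a hypothetical name, and it uses `execvp()` (which searches `PATH` itself) where the project resolves a path and calls `execve()`.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run one external command and return its status
   the way a shell would report it in $?. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return 1;                       /* fork failed */
    if (pid == 0)
    {
        execvp(argv[0], argv);          /* only returns on failure */
        _exit(127);                     /* conventional "command not found" */
    }
    if (waitpid(pid, &status, 0) < 0)
        return 1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* normal exit code */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* e.g. 130 when killed by SIGINT */
    return 1;
}
```

The `128 + signal number` convention is what produces the 130 and 131 values mentioned in the signal handling table.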
┌──────────────────────────────────────────────────────────────────────────┐
│ MINISHELL ARCHITECTURE │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ READLINE │───▶│ TOKENIZATION │───▶│ SYNTAX │ │
│ │ (input) │ │ (tokens) │ │ CHECK │ │
│ └─────────────┘ └──────────────┘ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ EXECUTION │◀───│ CMD STRUCT │◀───│ EXPANSION │ │
│ │ PIPELINE │ │ GENERATION │ │ ($VAR, ~, $?)│ │
│ └──────┬───────┘ └──────────────┘ └──────────────┘ │
│ │ │
│ ├──▶ BUILTIN? ──▶ exec_builtin() │
│ │ (cd, echo, pwd, export, unset, env, exit) │
│ │ │
│ ├──▶ EXTERNAL? ──▶ fork() ──▶ execve() │
│ │ │
│ ├──▶ PIPE? ──▶ pipe() ──▶ fork() ──▶ dup2() ──▶ execve() │
│ │ │
│ └──▶ REDIRECTION? ──▶ open() ──▶ dup2() ──▶ execute │
│ │
│ PARENT WAITS ──▶ COLLECTS EXIT STATUS ──▶ UPDATE $? ──▶ PROMPT │
└──────────────────────────────────────────────────────────────────────────┘
minishell/
│
├── main.c # Entry point: initializes env & starts loop
├── utils.c # Shell loop (REPL), input handler
├── minishell.h # Central header: all structs, enums, prototypes
│
├── parsing/
│ ├── tokens/
│ │ ├── tokenization.c # Converts raw input into token list
│ │ ├── tokenization_utils.c # Token classification helpers
│ │ ├── tokenization_utils1.c
│ │ ├── tokenization_utils2.c
│ │ └── pipex_utils.c
│ │
│ ├── parser/
│ │ ├── parser.c # Syntax validation engine
│ │ ├── parser_utils.c # Parser helpers (quotes, parentheses, ops)
│ │ ├── parser_utils1.c
│ │ ├── parser_utils2.c
│ │ ├── expand.c # Variable & special character expansion
│ │ ├── expand_utils.c
│ │ ├── expand_utils1.c
│ │ ├── expand_utils2.c
│ │ └── expand_utils3.c
│ │
│ ├── generate_struct/
│ │ ├── generate_struct.c # Builds t_cmd linked list from tokens
│ │ ├── utils.c
│ │ ├── utils1.c
│ │ ├── utils2.c
│ │ └── utils3.c
│ │
│ ├── minishel_utils/
│ │ ├── minishell_utils.c # Misc parsing utilities
│ │ └── clean.c # Memory cleanup / garbage collector
│ │
│ └── signal/
│ └── signal_handle.c # SIGINT / SIGQUIT signal handlers
│
├── execution/
│ ├── builtins/
│ │ ├── cd/ # ft_cd() — directory navigation
│ │ ├── echo/ # ft_echo() — print with -n support
│ │ ├── env/ # ft_env() — display environment
│ │ ├── exit/ # ft_exit() — exit with status code
│ │ ├── export/ # ft_export() — set / display env vars
│ │ ├── pwd/ # ft_pwd() — print working directory
│ │ └── unset/ # ft_unset() — remove env vars
│ │
│ └── exec/
│ ├── check/
│ │ └── check_cmd_name.c # PATH resolution for external commands
│ └── execute/
│ ├── exec_command.c # Execution orchestrator (builtin or execve)
│ ├── child_process.c # Child process setup (fd, signals, exec)
│ ├── child_process_utils.c
│ ├── handle_pipe.c # Pipe creation and management
│ ├── redirections.c # File I/O redirection logic
│ ├── utils.c
│ ├── utils1.c
│ ├── utils2.c
│ └── utils3.c
│
└── Include/
├── libftt/ # Custom libft (ft_* string/memory helpers)
│ ├── libft.h
│ └── *.c
└── gnl/ # Get Next Line library
├── get_next_line.h
├── get_next_line.c
└── get_next_line_utils.c
All core data structures are defined in minishell.h.
Represents a single lexical unit produced by the tokenizer.
typedef struct token {
t_token_type type; // TOKEN_COMMAND, TOKEN_PIPE, TOKEN_REDIR_OUT ...
char *value; // Raw string value
char **expanded_value; // Value(s) after variable expansion
struct token *next;
struct token *previous;
} t_token;
Token Types:
typedef enum t_TokenType {
TOKEN_TILDLE, // ~
TOKEN_PIPE, // |
TOKEN_REDIR_IN, // <
TOKEN_REDIR_OUT, // >
TOKEN_REDIR_APPEND, // >>
TOKEN_HERE_DOC, // <<
TOKEN_DOUBLE_QUOTED, // "..."
TOKEN_SINGLE_QUOTED, // '...'
TOKEN_COMMAND, // external command
TOKEN_BUILT_IN, // cd, echo, pwd, export, unset, env, exit
TOKEN_ARGUMENT, // command argument
TOKEN_OPTION, // flag (e.g., -n)
TOKEN_UNKNOWN
} t_token_type;
Represents a single command segment in a pipeline.
typedef struct s_cmd {
t_type type; // Command type (PIPE, RE_OUT, RE_IN ...)
char **arguments; // argv-style argument array
t_file *file; // Linked list of file redirections
int builtin; // -1 = not a builtin, else builtin index
char *cmd_path; // Resolved absolute path for execve()
int pipe_fd[2]; // Pipe file descriptors [read, write]
int pid; // Child process ID
int in_fd, out_fd; // Actual stdin/stdout after redirection
int stop; // Error / stop flag
int is_herdoc_end; // Heredoc termination flag
struct s_cmd *prev, *next;
} t_cmd;
A doubly-linked list node representing a single environment variable.
typedef struct s_env {
char *name; // Variable name (e.g., "HOME")
char *vale; // Variable value (e.g., "/root")
struct s_env *next;
struct s_env *prv;
} t_envi;
Linked list of redirections attached to a command.
typedef struct s_file {
char *filename; // Target filename
int type; // RE_IN, RE_OUT, RE_APPEND, RE_HEREDOC
char *red; // Operator string ("<", ">", ">>", "<<")
struct s_file *next;
} t_file;
A single global struct tracking the entire shell's runtime state.
struct s_global {
int exit_status; // Last command exit status ($?)
t_envi *envp; // Environment variable linked list
t_gc *head; // Garbage collector head
int *pid_array; // Array of child process IDs
int pre_pipe_infd; // Input fd from the previous pipe
int fd_heredoc; // Heredoc file descriptor
// ... and more runtime state fields
} g_var;
Every line you type goes through a 5-stage pipeline before any process is spawned:
Input: echo "hello world" | grep -i hello > output.txt
Tokens: [BUILT_IN:"echo"] [DOUBLE_QUOTED:"hello world"] [PIPE:"|"]
[COMMAND:"grep"] [OPTION:"-i"] [ARGUMENT:"hello"]
[REDIR_OUT:">"] [ARGUMENT:"output.txt"]
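A simplified tokenizer in the spirit of the example above might look like the sketch below. It only splits the line into strings; the type classification (`TOKEN_BUILT_IN`, `TOKEN_OPTION`, ...) is left out, quotes are kept in the token text, and the names `tokenize`, `dup_range` and `MAX_TOKENS` are assumptions for illustration rather than the project's API.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 64

/* Copy n bytes of s into a fresh NUL-terminated string. */
static char *dup_range(const char *s, size_t n)
{
    char *out = malloc(n + 1);

    if (!out)
        return NULL;
    memcpy(out, s, n);
    out[n] = '\0';
    return out;
}

/* Split line into tokens. Returns the token count, or -1 on an unclosed
   quote, too many tokens, or allocation failure (tokens built so far
   are not freed in this sketch). */
int tokenize(const char *line, char *tokens[MAX_TOKENS])
{
    int count = 0;
    size_t i = 0;

    while (line[i])
    {
        size_t start;

        if (line[i] == ' ' || line[i] == '\t') { i++; continue; }
        if (count == MAX_TOKENS)
            return -1;
        start = i;
        if (strchr("|<>", line[i]))
        {
            /* ">>" and "<<" are two-character operators */
            if (line[i] != '|' && line[i + 1] == line[i])
                i++;
            i++;
        }
        else
        {
            while (line[i] && !strchr(" \t|<>", line[i]))
            {
                if (line[i] == '\'' || line[i] == '"')
                {
                    char quote = line[i++];

                    while (line[i] && line[i] != quote)
                        i++;
                    if (!line[i])
                        return -1;      /* unclosed quote */
                }
                i++;
            }
        }
        tokens[count] = dup_range(line + start, i - start);
        if (!tokens[count])
            return -1;
        count++;
    }
    return count;
}
```

Because a quoted run is consumed as part of the surrounding word, spaces and operators inside quotes never split a token.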
Validates the structure before any execution:
- ✅ Checks for matching quotes (" and ')
- ✅ Checks for balanced parentheses
- ✅ Rejects leading/trailing pipe operators (| cmd, cmd |)
- ✅ Rejects consecutive operators (>> >, | |)
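Two of the checks above (matching quotes and misplaced pipes) are small enough to sketch directly. The function names are hypothetical, and the pipe check assumes the input has already been split into an array of token strings.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if every quote opened in s is closed, 0 otherwise.
   A quote character inside the other kind of quotes is literal. */
int quotes_balanced(const char *s)
{
    char open = 0;                  /* 0 = outside quotes, else ' or " */

    for (size_t i = 0; s[i]; i++)
    {
        if (open == 0 && (s[i] == '\'' || s[i] == '"'))
            open = s[i];
        else if (s[i] == open)
            open = 0;
    }
    return open == 0;
}

/* Returns 1 if no "|" token is first, last, or directly followed
   by another "|". */
int pipes_well_placed(char *const tokens[], int count)
{
    for (int i = 0; i < count; i++)
    {
        if (strcmp(tokens[i], "|") != 0)
            continue;
        if (i == 0 || i == count - 1)
            return 0;
        if (strcmp(tokens[i + 1], "|") == 0)
            return 0;
    }
    return 1;
}
```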
Before: echo "Hello $USER, your home is ~"
After: echo "Hello samia, your home is /home/samia"
Special:
$? → last exit status (e.g., "0" or "130")
~ → $HOME value
$VAR → looked up in env linked list
Inside 'single quotes', no expansion occurs (everything is literal). Inside "double quotes", $VAR and ~ are expanded. Note that Bash itself leaves ~ unexpanded inside double quotes.
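The quoting rules for `$VAR` and `$?` can be illustrated with a short sketch. It makes several assumptions: it looks variables up with `getenv()` instead of the project's `t_envi` list, writes into a fixed-size caller buffer, leaves the quote characters in place, and does not handle `~`. `expand_vars` and `append` are hypothetical names.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append src to dst (capacity cap, always NUL-terminated); returns new length. */
static size_t append(char *dst, size_t len, size_t cap, const char *src)
{
    while (*src && len + 1 < cap)
        dst[len++] = *src++;
    dst[len] = '\0';
    return len;
}

/* Expand $NAME and $? in `in`; text inside single quotes is left untouched. */
void expand_vars(const char *in, int last_status, char *out, size_t cap)
{
    size_t len = 0;
    int in_single = 0, in_double = 0;

    if (cap == 0)
        return;
    out[0] = '\0';
    for (size_t i = 0; in[i]; i++)
    {
        if (in[i] == '\'' && !in_double)
            in_single = !in_single;
        else if (in[i] == '"' && !in_single)
            in_double = !in_double;

        if (in[i] == '$' && !in_single && in[i + 1] == '?')
        {
            char num[16];

            snprintf(num, sizeof num, "%d", last_status);
            len = append(out, len, cap, num);
            i++;                                   /* skip '?' */
        }
        else if (in[i] == '$' && !in_single
                 && (isalpha((unsigned char)in[i + 1]) || in[i + 1] == '_'))
        {
            char name[128];
            size_t n = 0;
            const char *val;

            while ((isalnum((unsigned char)in[i + 1]) || in[i + 1] == '_')
                   && n + 1 < sizeof name)
                name[n++] = in[++i];
            name[n] = '\0';
            val = getenv(name);                    /* unset expands to "" */
            len = append(out, len, cap, val ? val : "");
        }
        else
        {
            char one[2] = { in[i], '\0' };

            len = append(out, len, cap, one);
        }
    }
}
```

Tracking `in_single` and `in_double` separately is what lets a `'` inside double quotes stay literal without switching expansion off.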
The token list is converted into a t_cmd doubly-linked list, one node per pipe-separated command. Each node stores:
- Resolved argument array
- File redirections (type + filename)
- Heredoc state
- Builtin flag
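As a rough sketch of this step, the function below splits a NULL-terminated token array into one list node per pipe-separated command. The `t_demo_cmd` type is a cut-down stand-in for `t_cmd` (redirections, heredoc state and the builtin flag are omitted), and `build_cmd_list` is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

typedef struct s_demo_cmd {
    char **arguments;               /* argv-style, NULL-terminated slice */
    int argc;
    struct s_demo_cmd *prev, *next;
} t_demo_cmd;

/* Build a doubly-linked list with one node per pipe-separated command.
   Each "|" entry in tokens is overwritten with NULL so that every
   arguments slice is NULL-terminated without copying. */
t_demo_cmd *build_cmd_list(char **tokens)
{
    t_demo_cmd *head = NULL, *tail = NULL;
    int i = 0;

    while (tokens[i])
    {
        t_demo_cmd *node = calloc(1, sizeof *node);

        if (!node)
            return head;            /* sketch: return what was built so far */
        node->arguments = &tokens[i];
        while (tokens[i] && strcmp(tokens[i], "|") != 0)
        {
            node->argc++;
            i++;
        }
        if (tokens[i])              /* stopped on a pipe */
            tokens[i++] = NULL;
        node->prev = tail;
        if (tail)
            tail->next = node;
        else
            head = node;
        tail = node;
    }
    return head;
}
```

Reusing the token array as argument storage keeps the sketch short; a real implementation would more likely copy the strings so tokens and commands can be freed independently.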
For each t_cmd node:
Is it a builtin?
└─ YES → execute directly in current process (or subshell for pipes)
└─ NO → fork() a child process
child → apply redirections → execve(cmd_path, argv, envp)
parent → record pid, save pipe fd for next stage
After all commands are launched:
→ waitpid() for all children
→ collect exit statuses
→ update g_var.exit_status ($?)
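For the pipe case, the flow above reduces to `pipe()`, two `fork()` calls, `dup2()` in each child, and `waitpid()` in the parent. The sketch below handles exactly two stages and uses `execvp()` for brevity; `spawn_stage` and `run_pipeline2` are hypothetical names, not functions from this repository.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one stage with stdin/stdout wired to in_fd/out_fd.
   unused_fd is the pipe end this stage must not keep open (-1 if none). */
static pid_t spawn_stage(char *const argv[], int in_fd, int out_fd, int unused_fd)
{
    pid_t pid = fork();

    if (pid == 0)
    {
        if (in_fd != STDIN_FILENO)
        {
            dup2(in_fd, STDIN_FILENO);
            close(in_fd);
        }
        if (out_fd != STDOUT_FILENO)
        {
            dup2(out_fd, STDOUT_FILENO);
            close(out_fd);
        }
        if (unused_fd >= 0)
            close(unused_fd);
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }
    return pid;
}

/* Run `left | right` and return the exit status of `right`, as $? would. */
int run_pipeline2(char *const left[], char *const right[])
{
    int fd[2], status = 0;
    pid_t p1, p2;

    if (pipe(fd) < 0)
        return 1;
    p1 = spawn_stage(left, STDIN_FILENO, fd[1], fd[0]);
    p2 = spawn_stage(right, fd[0], STDOUT_FILENO, fd[1]);
    close(fd[0]);   /* parent closes both ends, or the reader never sees EOF */
    close(fd[1]);
    if (p1 > 0)
        waitpid(p1, NULL, 0);
    if (p2 < 0 || waitpid(p2, &status, 0) < 0)
        return 1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return 128 + WTERMSIG(status);
}
```

Both children are started before either is waited for, which is what lets the stages run concurrently.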
All built-ins are implemented from scratch without calling any external binary.
| Command | Syntax | Description |
|---|---|---|
| echo | echo [-n] [args...] | Print arguments to stdout. -n suppresses the trailing newline. |
| cd | cd [path] | Change the current working directory. Updates PWD and OLDPWD. |
| pwd | pwd | Print the absolute path of the current working directory. |
| export | export [NAME=value...] | Set or update environment variables. Without arguments, prints all exported vars sorted alphabetically. |
| unset | unset [NAME...] | Remove one or more environment variables. |
| env | env | Print all currently set environment variables. |
| exit | exit [code] | Exit the shell. The optional numeric code sets the exit status. |
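To give a feel for how small a builtin can be, here is a sketch of `echo` with `-n` support as described in the table. It is not the project's `ft_echo()`: the name `echo_builtin` and the `FILE *` parameter are assumptions made so the function is easy to test, and only a single leading `-n` is recognized.

```c
#include <stdio.h>
#include <string.h>

/* argv[0] is "echo"; the remaining arguments are printed separated by
   single spaces. A leading "-n" suppresses the trailing newline. */
int echo_builtin(char *const argv[], FILE *out)
{
    int i = 1;
    int newline = 1;

    if (argv[i] && strcmp(argv[i], "-n") == 0)
    {
        newline = 0;
        i++;
    }
    for (; argv[i]; i++)
    {
        fputs(argv[i], out);
        if (argv[i + 1])
            fputc(' ', out);
    }
    if (newline)
        fputc('\n', out);
    return 0;                       /* echo reports success */
}
```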
cat /etc/passwd | grep root | wc -l
Commands are connected with |. The stdout of each command becomes the stdin of the next. All processes in a pipeline run concurrently.
sort < unsorted.txt
The command reads from unsorted.txt instead of the keyboard.
echo "hello" > greeting.txt
Creates or truncates greeting.txt and writes the output into it.
echo "world" >> greeting.txt
Appends the output to greeting.txt, creating the file if it does not exist; existing content is preserved.
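At the system-call level, the difference between `>` and `>>` is a single `open()` flag. The sketch below shows that, plus the `dup2()` step a child would perform before `execve()`; `open_output` and `redirect_stdout` are hypothetical helper names and the `0644` mode is an assumption.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open path for ">" (append == 0) or ">>" (append != 0).
   Returns the file descriptor, or -1 on error. */
int open_output(const char *path, int append)
{
    int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);

    return open(path, flags, 0644);
}

/* Point stdout at fd, as a child would do before execve().
   Returns 0 on success, -1 on error. */
int redirect_stdout(int fd)
{
    if (dup2(fd, STDOUT_FILENO) < 0)
        return -1;
    if (fd != STDOUT_FILENO)
        close(fd);                  /* the duplicate is enough */
    return 0;
}
```

Input redirection (`<`) follows the same pattern with `O_RDONLY` and `STDIN_FILENO`.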
cat << EOF
line one
line two
EOF
Reads lines from the terminal (or stdin) until the delimiter EOF is matched. The collected input is provided as stdin to the command.
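The heredoc loop can be sketched as "copy lines to a file descriptor until one equals the delimiter". The version below uses stdio's `fgets()` for brevity, whereas the project reads heredoc input with its own `get_next_line()`; `read_heredoc` is a hypothetical name and lines longer than the 1024-byte buffer are not handled specially.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copy lines from `in` to out_fd until a line equal to delim is read.
   Returns the number of lines copied, or -1 on a write error.
   Reaching end of input before the delimiter also ends the heredoc. */
int read_heredoc(FILE *in, const char *delim, int out_fd)
{
    char line[1024];
    int copied = 0;

    while (fgets(line, sizeof line, in))
    {
        size_t len = strlen(line);
        size_t body = (len > 0 && line[len - 1] == '\n') ? len - 1 : len;

        if (body == strlen(delim) && strncmp(line, delim, body) == 0)
            break;                  /* delimiter line is not copied */
        if (write(out_fd, line, len) < 0)
            return -1;
        copied++;
    }
    return copied;
}
```

A typical caller would pass the write end of a pipe (or a temporary file) as `out_fd` and later `dup2()` the read side onto the command's stdin.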
| Signal | Key Combo | Interactive Behavior | In Child Process |
|---|---|---|---|
| SIGINT | Ctrl+C | Cancels current input, prints new prompt, sets $? to 130 | Terminates child |
| SIGQUIT | Ctrl+\ | Ignored in the interactive shell | Terminates child, sets $? to 131 |
| SIGTERM | (none) | Default behavior | Terminates child |
Signal handling is context-sensitive: the interactive shell suppresses SIGQUIT, while child processes restore the default handlers before executing external programs.
- A Unix-like operating system (Linux / macOS)
- cc or gcc compiler
- GNU readline development library
# Ubuntu / Debian
sudo apt-get install libreadline-dev
# macOS (Homebrew)
brew install readline
git clone https://github.com/Samia-Hb/minishell.git
cd minishell
make
./minishell
You will be greeted with:
minishell > _
| Target | Description |
|---|---|
| make / make all | Compile the minishell binary |
| make clean | Remove all object files |
| make fclean | Remove object files and the binary |
| make re | fclean then all |
# Basic commands
minishell > echo "Hello, World!"
Hello, World!
minishell > pwd
/home/samia/minishell
minishell > cd /tmp
minishell > pwd
/tmp
# Environment variables
minishell > export MY_VAR=42
minishell > echo $MY_VAR
42
minishell > unset MY_VAR
minishell > echo $MY_VAR
← (empty)
# Pipes
minishell > printf "apple\nbanana\ncherry\n" | grep an
banana
minishell > ls -la | sort | head -5
# Redirections
minishell > echo "log entry" >> app.log
minishell > cat < app.log
log entry
# Heredoc
minishell > cat << END
> hello
> from heredoc
> END
hello
from heredoc
# Exit status
minishell > ls /nonexistent
ls: cannot access '/nonexistent': No such file or directory
minishell > echo $?
2
# Signal behavior
minishell > sleep 10
^C ← Ctrl+C terminates sleep
minishell > echo $?
130
CC = cc
CFLAGS = -Wall -Wextra -Werror
LFLAGS = -lreadline
- Strict compilation with zero-warning tolerance (-Werror)
- Linked against GNU Readline for interactive input with history
A custom garbage collector (t_gc) tracks all heap allocations throughout the parsing and execution lifecycle. On each iteration of the shell loop, memory is freed in bulk — preventing leaks from complex branching paths.
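The idea of tracking allocations and releasing them in bulk can be shown with a very small linked list. The real `t_gc` layout is not shown in this README, so the `t_demo_gc` struct and the `gc_malloc` / `gc_free_all` names below are assumptions for illustration only.

```c
#include <stdlib.h>

typedef struct s_demo_gc {
    void *ptr;                      /* tracked allocation */
    struct s_demo_gc *next;
} t_demo_gc;

/* Allocate size bytes and remember the pointer in the list at *head.
   Returns NULL if either allocation fails. */
void *gc_malloc(t_demo_gc **head, size_t size)
{
    t_demo_gc *node = malloc(sizeof *node);

    if (!node)
        return NULL;
    node->ptr = malloc(size);
    if (!node->ptr)
    {
        free(node);
        return NULL;
    }
    node->next = *head;
    *head = node;
    return node->ptr;
}

/* Free every tracked allocation; returns how many were released. */
int gc_free_all(t_demo_gc **head)
{
    int freed = 0;

    while (*head)
    {
        t_demo_gc *next = (*head)->next;

        free((*head)->ptr);
        free(*head);
        *head = next;
        freed++;
    }
    return freed;
}
```

Calling `gc_free_all()` once per loop iteration means error paths in the parser or executor can simply return without freeing anything themselves.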
| Library | Purpose |
|---|---|
| libft (custom) | String manipulation, memory utils (ft_split, ft_strjoin, etc.) |
| gnl (custom) | get_next_line(), used for heredoc input reading |
| readline (system) | Interactive prompt with line editing and history |
| Metric | Value |
|---|---|
| Source files | ~46 .c files |
| Lines of code | ~5,900 (excl. libft/gnl) |
| Built-in commands | 7 |
| I/O operators | 5 (< > >> << \|) |
| Data structures | 7+ |
| Compiler flags | -Wall -Wextra -Werror |
Made with 💻 and ☕ as part of the 1337 school curriculum.
"The shell is the window to the OS — so we built the window."