Minishell

A custom Unix shell built from scratch in C — because why use bash when you can build it?



📖 Table of Contents

  1. About the Project
  2. Features
  3. Architecture Overview
  4. Project Structure
  5. Data Structures
  6. Pipeline: How It Works
  7. Built-in Commands
  8. Supported Operators
  9. Signal Handling
  10. Getting Started
  11. Usage Examples
  12. Technical Details

🐚 About the Project

Minishell is a fully functional Unix shell built entirely in C as part of the 42 School curriculum. It replicates core behaviors of the Bash shell — from tokenizing raw input to forking child processes, managing pipes, and handling I/O redirections.

This project explores how a shell works internally: every |, every >, and every quote is parsed and handled by hand, without relying on any existing shell implementation (only line editing and history are delegated to GNU Readline). The result is a deep understanding of POSIX process management, file descriptors, signal handling, and environment variable management.

"To understand a tool, build it yourself."


✨ Features

Core Shell Capabilities

| Feature | Description |
| --- | --- |
| 🔄 REPL Loop | Interactive prompt powered by GNU Readline with command history |
| 🔤 Tokenization | Parses raw input into a structured token stream |
| 🔍 Syntax Validation | Detects unclosed quotes, misplaced operators, and invalid syntax |
| 🌍 Variable Expansion | Resolves $VAR, $? (exit status), and ~ (home directory) |
| 💬 Quote Handling | Full support for single (') and double (") quote semantics |
| 🔗 Pipelines | Chains multiple commands: cmd1 \| cmd2 \| cmd3 |

I/O Redirections

| Operator | Direction | Behavior |
| --- | --- | --- |
| < | Input | Read stdin from file |
| > | Output | Write stdout to file (truncate) |
| >> | Append | Write stdout to file (append) |
| << | Heredoc | Read stdin until a delimiter is reached |

Process Management

  • Forking: each external command runs in a dedicated child process
  • Waiting: the parent collects exit statuses from all children
  • Exit status propagation: $? always reflects the last command's result
  • Pipe chaining: each command's stdout is connected to the next command's stdin through pipe file descriptors

🏛️ Architecture Overview

┌──────────────────────────────────────────────────────────────────────────┐
│                         MINISHELL ARCHITECTURE                           │
│                                                                          │
│  ┌─────────────┐    ┌──────────────┐    ┌──────────────┐                │
│  │   READLINE  │───▶│ TOKENIZATION │───▶│    SYNTAX    │                │
│  │   (input)   │    │  (tokens)    │    │    CHECK     │                │
│  └─────────────┘    └──────────────┘    └──────┬───────┘                │
│                                                 │                        │
│                                                 ▼                        │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐               │
│  │  EXECUTION   │◀───│  CMD STRUCT  │◀───│  EXPANSION   │               │
│  │  PIPELINE    │    │ GENERATION   │    │ ($VAR, ~, $?)│               │
│  └──────┬───────┘    └──────────────┘    └──────────────┘               │
│         │                                                                │
│         ├──▶ BUILTIN? ──▶ exec_builtin()                                │
│         │                 (cd, echo, pwd, export, unset, env, exit)     │
│         │                                                                │
│         ├──▶ EXTERNAL? ──▶ fork() ──▶ execve()                         │
│         │                                                                │
│         ├──▶ PIPE? ──▶ pipe() ──▶ fork() ──▶ dup2() ──▶ execve()       │
│         │                                                                │
│         └──▶ REDIRECTION? ──▶ open() ──▶ dup2() ──▶ execute            │
│                                                                          │
│  PARENT WAITS ──▶ COLLECTS EXIT STATUS ──▶ UPDATE $? ──▶ PROMPT        │
└──────────────────────────────────────────────────────────────────────────┘

📁 Project Structure

minishell/
│
├── main.c                          # Entry point: initializes env & starts loop
├── utils.c                         # Shell loop (REPL), input handler
├── minishell.h                     # Central header: all structs, enums, prototypes
│
├── parsing/
│   ├── tokens/
│   │   ├── tokenization.c          # Converts raw input into token list
│   │   ├── tokenization_utils.c    # Token classification helpers
│   │   ├── tokenization_utils1.c
│   │   ├── tokenization_utils2.c
│   │   └── pipex_utils.c
│   │
│   ├── parser/
│   │   ├── parser.c                # Syntax validation engine
│   │   ├── parser_utils.c          # Parser helpers (quotes, parentheses, ops)
│   │   ├── parser_utils1.c
│   │   ├── parser_utils2.c
│   │   ├── expand.c                # Variable & special character expansion
│   │   ├── expand_utils.c
│   │   ├── expand_utils1.c
│   │   ├── expand_utils2.c
│   │   └── expand_utils3.c
│   │
│   ├── generate_struct/
│   │   ├── generate_struct.c       # Builds t_cmd linked list from tokens
│   │   ├── utils.c
│   │   ├── utils1.c
│   │   ├── utils2.c
│   │   └── utils3.c
│   │
│   ├── minishel_utils/
│   │   ├── minishell_utils.c       # Misc parsing utilities
│   │   └── clean.c                 # Memory cleanup / garbage collector
│   │
│   └── signal/
│       └── signal_handle.c         # SIGINT / SIGQUIT signal handlers
│
├── execution/
│   ├── builtins/
│   │   ├── cd/                     # ft_cd() — directory navigation
│   │   ├── echo/                   # ft_echo() — print with -n support
│   │   ├── env/                    # ft_env() — display environment
│   │   ├── exit/                   # ft_exit() — exit with status code
│   │   ├── export/                 # ft_export() — set / display env vars
│   │   ├── pwd/                    # ft_pwd() — print working directory
│   │   └── unset/                  # ft_unset() — remove env vars
│   │
│   └── exec/
│       ├── check/
│       │   └── check_cmd_name.c    # PATH resolution for external commands
│       └── execute/
│           ├── exec_command.c      # Execution orchestrator (builtin or execve)
│           ├── child_process.c     # Child process setup (fd, signals, exec)
│           ├── child_process_utils.c
│           ├── handle_pipe.c       # Pipe creation and management
│           ├── redirections.c      # File I/O redirection logic
│           ├── utils.c
│           ├── utils1.c
│           ├── utils2.c
│           └── utils3.c
│
└── Include/
    ├── libftt/                     # Custom libft (ft_* string/memory helpers)
    │   ├── libft.h
    │   └── *.c
    └── gnl/                        # Get Next Line library
        ├── get_next_line.h
        ├── get_next_line.c
        └── get_next_line_utils.c

🧱 Data Structures

All core data structures are defined in minishell.h.

Token (t_token)

Represents a single lexical unit produced by the tokenizer.

typedef struct token {
    t_token_type  type;             // TOKEN_COMMAND, TOKEN_PIPE, TOKEN_REDIR_OUT ...
    char         *value;            // Raw string value
    char        **expanded_value;   // Value(s) after variable expansion
    struct token *next;
    struct token *previous;
} t_token;

Token Types:

typedef enum t_TokenType {
    TOKEN_TILDLE,           // ~
    TOKEN_PIPE,             // |
    TOKEN_REDIR_IN,         // <
    TOKEN_REDIR_OUT,        // >
    TOKEN_REDIR_APPEND,     // >>
    TOKEN_HERE_DOC,         // <<
    TOKEN_DOUBLE_QUOTED,    // "..."
    TOKEN_SINGLE_QUOTED,    // '...'
    TOKEN_COMMAND,          // external command
    TOKEN_BUILT_IN,         // cd, echo, pwd, export, unset, env, exit
    TOKEN_ARGUMENT,         // command argument
    TOKEN_OPTION,           // flag (e.g., -n)
    TOKEN_UNKNOWN
} t_token_type;
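
As an illustration of how a doubly-linked token list like this can be built, here is a minimal sketch. The reduced enum, the `t_node` type, and the `token_append` helper are hypothetical names chosen for the example and are not taken from minishell.h.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Reduced, hypothetical token kinds (the real enum has more members). */
typedef enum e_tok { TOK_WORD, TOK_PIPE } t_tok;

typedef struct s_node {
    t_tok          type;
    char          *value;
    struct s_node *next;
    struct s_node *previous;
} t_node;

/* Append a token at the tail of the list.
   Returns the new node, or NULL if an allocation fails. */
t_node *token_append(t_node **head, t_tok type, const char *value)
{
    t_node *node = calloc(1, sizeof(*node));
    t_node *tail;

    if (!node)
        return NULL;
    node->type = type;
    node->value = strdup(value);
    if (!node->value)
    {
        free(node);
        return NULL;
    }
    if (!*head)
    {
        *head = node;
        return node;
    }
    tail = *head;
    while (tail->next)
        tail = tail->next;
    tail->next = node;
    node->previous = tail;   /* back link lets later stages inspect the preceding token */
    return node;
}
```

The `previous` pointer is what makes checks such as "is the token before this one an operator?" cheap during syntax validation.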

Command (t_cmd)

Represents a single command segment in a pipeline.

typedef struct s_cmd {
    t_type   type;              // Command type (PIPE, RE_OUT, RE_IN ...)
    char   **arguments;         // argv-style argument array
    t_file  *file;              // Linked list of file redirections
    int      builtin;           // -1 = not a builtin, else builtin index
    char    *cmd_path;          // Resolved absolute path for execve()
    int      pipe_fd[2];        // Pipe file descriptors [read, write]
    int      pid;               // Child process ID
    int      in_fd, out_fd;     // Actual stdin/stdout after redirection
    int      stop;              // Error / stop flag
    int      is_herdoc_end;     // Heredoc termination flag
    struct s_cmd *prev, *next;
} t_cmd;

Environment Variable (t_envi)

A doubly-linked list node representing a single environment variable.

typedef struct s_env {
    char          *name;        // Variable name  (e.g., "HOME")
    char          *vale;        // Variable value (e.g., "/root")
    struct s_env  *next;
    struct s_env  *prv;
} t_envi;
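
A lookup over this list could look like the sketch below. The struct is copied from above (including the `vale` spelling); `env_get` is a hypothetical helper name, and the real project may organize its lookup differently.

```c
#include <stddef.h>
#include <string.h>

typedef struct s_env {
    char          *name;
    char          *vale;   /* spelled as in the struct shown above */
    struct s_env  *next;
    struct s_env  *prv;
} t_envi;

/* Hypothetical helper: return the value stored for `name`,
   or NULL when the variable is not set. */
const char *env_get(const t_envi *env, const char *name)
{
    while (env)
    {
        if (strcmp(env->name, name) == 0)
            return env->vale;
        env = env->next;
    }
    return NULL;
}
```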

File Redirection (t_file)

Linked list of redirections attached to a command.

typedef struct s_file {
    char          *filename;    // Target filename
    int            type;        // RE_IN, RE_OUT, RE_APPEND, RE_HEREDOC
    char          *red;         // Operator string ("<", ">", ">>", "<<")
    struct s_file *next;
} t_file;

Global State (g_var)

A single global struct tracking the entire shell's runtime state.

struct s_global {
    int    exit_status;         // Last command exit status ($?)
    t_envi *envp;               // Environment variable linked list
    t_gc   *head;               // Garbage collector head
    int    *pid_array;          // Array of child process IDs
    int     pre_pipe_infd;      // Input fd from the previous pipe
    int     fd_heredoc;         // Heredoc file descriptor
    // ... and more runtime state fields
} g_var;

⚙️ Pipeline: How It Works

Every line you type goes through a 5-stage pipeline. The first four stages run before any process is spawned; the fifth performs the execution:

Stage 1 — Tokenization

Input:  echo "hello world" | grep -i hello > output.txt

Tokens: [BUILT_IN:"echo"] [DOUBLE_QUOTED:"hello world"] [PIPE:"|"]
        [COMMAND:"grep"]  [OPTION:"-i"]  [ARGUMENT:"hello"]
        [REDIR_OUT:">"]   [ARGUMENT:"output.txt"]
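
The splitting step can be sketched as a small scanner. This is a simplified illustration, not the project's tokenizer: it distinguishes only words and operators (the finer BUILT_IN / OPTION / ARGUMENT classification is left out), keeps quote characters inside the word, and the names `t_kind` and `next_token` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef enum e_kind {
    K_END, K_WORD, K_PIPE, K_IN, K_OUT, K_APPEND, K_HEREDOC
} t_kind;

/* Read one token from *s into buf and advance *s past it.
   buf must hold at least 3 bytes; longer words are truncated to size - 1. */
t_kind next_token(const char **s, char *buf, size_t size)
{
    const char *p = *s;
    size_t      n = 0;
    char        quote = 0;
    t_kind      kind = K_WORD;

    while (*p == ' ' || *p == '\t')
        p++;
    if (*p == '\0')
        kind = K_END;
    else if (*p == '|')
    {
        kind = K_PIPE;
        buf[n++] = *p++;
    }
    else if (*p == '<' || *p == '>')
    {
        char op = *p;

        buf[n++] = *p++;
        if (*p == op)            /* doubled operator: << or >> */
        {
            buf[n++] = *p++;
            kind = (op == '<') ? K_HEREDOC : K_APPEND;
        }
        else
            kind = (op == '<') ? K_IN : K_OUT;
    }
    else
    {
        /* A word ends at unquoted whitespace or at an unquoted operator. */
        while (*p && (quote || !strchr(" \t|<>", *p)))
        {
            if (quote && *p == quote)
                quote = 0;
            else if (!quote && (*p == '\'' || *p == '"'))
                quote = *p;
            if (n + 1 < size)
                buf[n++] = *p;
            p++;
        }
    }
    buf[n] = '\0';
    *s = p;
    return kind;
}
```

Calling `next_token` in a loop until it returns `K_END` yields the token sequence for one input line.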

Stage 2 — Syntax Validation

Validates the structure before any execution:

  • ✅ Checks for matching quotes (" and ')
  • ✅ Checks for balanced parentheses
  • ✅ Rejects leading/trailing pipe operators (| cmd, cmd |)
  • ✅ Rejects consecutive operators (>> >, | |)
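
Two of these checks can be sketched directly on the raw line. The function names are hypothetical, and the real parser works on the token list rather than on characters, so treat this only as an illustration of the rules.

```c
#include <stdbool.h>

/* True if every ' or " that is opened in `line` is also closed. */
bool quotes_balanced(const char *line)
{
    char quote = 0;

    for (; *line; line++)
    {
        if (quote && *line == quote)
            quote = 0;
        else if (!quote && (*line == '\'' || *line == '"'))
            quote = *line;
    }
    return quote == 0;
}

/* True if no unquoted '|' is leading, trailing, or directly repeated. */
bool pipes_well_placed(const char *line)
{
    char quote = 0;
    bool expect_cmd = true;    /* no command text seen since start or last '|' */
    bool seen_pipe = false;

    for (; *line; line++)
    {
        if (quote)
        {
            if (*line == quote)
                quote = 0;
        }
        else if (*line == '\'' || *line == '"')
        {
            quote = *line;
            expect_cmd = false;
        }
        else if (*line == '|')
        {
            if (expect_cmd)
                return false;  /* "| cmd" or "cmd | | cmd" */
            expect_cmd = true;
            seen_pipe = true;
        }
        else if (*line != ' ' && *line != '\t')
            expect_cmd = false;
    }
    return !(expect_cmd && seen_pipe);   /* rejects a trailing "cmd |" */
}
```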

Stage 3 — Variable Expansion

Before:  echo "Hello $USER, your home is ~"
After:   echo "Hello samia, your home is /home/samia"

Special:
  $?    →  last exit status (e.g., "0" or "130")
  ~     →  $HOME value
  $VAR  →  looked up in env linked list

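Inside 'single quotes', no expansion occurs (everything is literal). Inside "double quotes", $VAR and ~ are expanded. Bash itself does not expand ~ inside double quotes, so this is one place where the behavior described here departs from Bash.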
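
The quote-dependent rule for `$VAR` and `$?` can be sketched as follows. This is an illustration under stated assumptions: `getenv()` stands in for the shell's own environment list, `~` handling and quote removal are left out, and `expand_line` is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy `in` to `out`, replacing $NAME and $? unless inside single quotes.
   Output is truncated to size - 1 bytes and always NUL-terminated. */
void expand_line(const char *in, char *out, size_t size, int last_status)
{
    size_t n = 0;
    char   quote = 0;

    if (size == 0)
        return;
    while (*in && n + 1 < size)
    {
        if (quote && *in == quote)
            quote = 0;
        else if (!quote && (*in == '\'' || *in == '"'))
            quote = *in;
        if (*in == '$' && quote != '\'' && in[1] == '?')
        {
            n += (size_t)snprintf(out + n, size - n, "%d", last_status);
            in += 2;
        }
        else if (*in == '$' && quote != '\''
            && (isalpha((unsigned char)in[1]) || in[1] == '_'))
        {
            char        name[128];
            size_t      len = 0;
            const char *value;

            in++;
            while ((isalnum((unsigned char)*in) || *in == '_')
                && len + 1 < sizeof name)
                name[len++] = *in++;
            name[len] = '\0';
            value = getenv(name);          /* stand-in for the env list lookup */
            n += (size_t)snprintf(out + n, size - n, "%s", value ? value : "");
        }
        else
            out[n++] = *in++;
        if (n >= size)                     /* snprintf reported a truncated write */
            n = size - 1;
    }
    out[n] = '\0';
}
```

An unset variable expands to the empty string, which matches the `echo $MY_VAR` behavior after `unset`.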

Stage 4 — Command Structure Generation

The token list is converted into a t_cmd doubly-linked list, one node per pipe-separated command. Each node stores:

  • Resolved argument array
  • File redirections (type + filename)
  • Heredoc state
  • Builtin flag

Stage 5 — Execution

For each t_cmd node:

  Is it a builtin?
  └─ YES → execute directly in current process (or subshell for pipes)
  └─ NO  → fork() a child process
             child  → apply redirections → execve(cmd_path, argv, envp)
             parent → record pid, save pipe fd for next stage

After all commands are launched:
  → waitpid() for all children
  → collect exit statuses
  → update g_var.exit_status ($?)
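
The fork, exec, and wait sequence, together with the conversion of a wait status into a `$?` value, can be sketched like this. It assumes `execvp()` as a stand-in for the project's own PATH resolution followed by `execve()`, and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Convert a waitpid() status into the value a shell stores in $?. */
int status_to_exit_code(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);   /* e.g. SIGINT (2) gives 130 */
    return 1;
}

/* Fork, run argv[0] in the child, wait for it, and return its $? value. */
int run_external(char *const argv[])
{
    int   status;
    pid_t pid = fork();

    if (pid < 0)
        return 1;
    if (pid == 0)
    {
        execvp(argv[0], argv);   /* only returns on failure */
        _exit(127);              /* conventional "command not found" status */
    }
    if (waitpid(pid, &status, 0) < 0)
        return 1;
    return status_to_exit_code(status);
}
```

The `128 + signal number` convention is why a child killed by Ctrl+C reports 130 and one killed by Ctrl+\ reports 131.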

🛠 Built-in Commands

All built-ins are implemented from scratch without calling any external binary.

| Command | Syntax | Description |
| --- | --- | --- |
| echo | echo [-n] [args...] | Print arguments to stdout. -n suppresses the trailing newline. |
| cd | cd [path] | Change the current working directory. Updates PWD and OLDPWD. |
| pwd | pwd | Print the absolute path of the current working directory. |
| export | export [NAME=value...] | Set or update environment variables. Without arguments, prints all exported vars sorted alphabetically. |
| unset | unset [NAME...] | Remove one or more environment variables. |
| env | env | Print all currently set environment variables. |
| exit | exit [code] | Exit the shell. The optional numeric code sets the exit status. |
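
To give a feel for what a built-in involves, here is a minimal sketch of `echo` with `-n` handling. It is not the project's `ft_echo()`: the name `echo_sketch`, the `FILE *` parameter, and the acceptance of repeated flags such as `-nnn` (which Bash also accepts) are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* True for "-n", "-nn", ...: a dash followed only by one or more 'n'. */
static int is_n_flag(const char *arg)
{
    size_t i = 1;

    if (arg[0] != '-' || arg[1] != 'n')
        return 0;
    while (arg[i] == 'n')
        i++;
    return arg[i] == '\0';
}

/* argv[0] is "echo". Writes the remaining arguments to `out`,
   separated by single spaces, and returns exit status 0. */
int echo_sketch(char *const argv[], FILE *out)
{
    int i = 1;
    int newline = 1;

    while (argv[i] && is_n_flag(argv[i]))   /* flags are only honored up front */
    {
        newline = 0;
        i++;
    }
    while (argv[i])
    {
        fputs(argv[i], out);
        if (argv[i + 1])
            fputc(' ', out);
        i++;
    }
    if (newline)
        fputc('\n', out);
    return 0;
}
```

Because the function runs inside the shell process, no `fork()` is needed unless the built-in is part of a pipeline.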

🔀 Supported Operators

Pipe

cat /etc/passwd | grep root | wc -l

Commands are connected with |. The stdout of each command becomes the stdin of the next. All processes in a pipeline run concurrently.
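
For the two-command case, the wiring can be sketched as below. This is a simplified illustration with a hypothetical function name; it uses `execvp()` instead of a resolved path with `execve()`, and a real pipeline generalizes this by carrying the previous read end from one stage to the next.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `left | right` and return the exit code of `right`. */
int run_pipeline2(char *const left[], char *const right[])
{
    int   fd[2];
    int   status = 0;
    pid_t lpid;
    pid_t rpid;

    if (pipe(fd) < 0)
        return 1;
    lpid = fork();
    if (lpid == 0)
    {
        dup2(fd[1], STDOUT_FILENO);   /* stdout now feeds the pipe */
        close(fd[0]);
        close(fd[1]);
        execvp(left[0], left);
        _exit(127);
    }
    rpid = fork();
    if (rpid == 0)
    {
        dup2(fd[0], STDIN_FILENO);    /* stdin now reads from the pipe */
        close(fd[0]);
        close(fd[1]);
        execvp(right[0], right);
        _exit(127);
    }
    /* The parent must close both ends, otherwise `right` never sees EOF. */
    close(fd[0]);
    close(fd[1]);
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid < 0 || waitpid(rpid, &status, 0) < 0)
        return 1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return 128 + WTERMSIG(status);
}
```

Both children are forked before either is waited on, which is what lets the two commands run concurrently.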

Input Redirection (<)

sort < unsorted.txt

The command reads from unsorted.txt instead of the keyboard.

Output Redirection (>)

echo "hello" > greeting.txt

Creates or truncates greeting.txt and writes the output into it.

Append Redirection (>>)

echo "world" >> greeting.txt

Creates or appends to greeting.txt — existing content is preserved.
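
The three file redirections differ only in the flags passed to `open()`. The sketch below shows one plausible mapping; the function names are hypothetical and the 0644 creation mode is an assumption, not something the project documents.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Open `path` as required by the operator `op` ("<", ">", ">>").
   Returns a file descriptor, or -1 on error or unknown operator. */
int open_redirection(const char *op, const char *path)
{
    if (strcmp(op, "<") == 0)
        return open(path, O_RDONLY);
    if (strcmp(op, ">") == 0)
        return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (strcmp(op, ">>") == 0)
        return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    return -1;
}

/* Make `target` (STDIN_FILENO or STDOUT_FILENO) refer to the file.
   Meant to run in the child, just before the command is executed. */
int apply_redirection(const char *op, const char *path, int target)
{
    int fd = open_redirection(op, path);

    if (fd < 0)
        return -1;
    if (fd != target)
    {
        if (dup2(fd, target) < 0)
        {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return 0;
}
```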

Heredoc (<<)

cat << EOF
line one
line two
EOF

Reads lines from the terminal (or stdin) until the delimiter EOF is matched. The collected input is provided as stdin to the command.
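
One way to collect heredoc input is to copy lines into a pipe and hand the read end to the command as stdin. The sketch below assumes POSIX `getline()` in place of the project's `get_next_line()`, and `collect_heredoc` is a hypothetical name. A pipe has a limited kernel buffer, so a very large heredoc would need a temporary file or a separate writer process instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy lines from `in` into a pipe until `delim` appears alone on a line.
   Returns the pipe's read end (to be dup2'ed onto stdin), or -1 on error. */
int collect_heredoc(FILE *in, const char *delim)
{
    int     fd[2];
    char   *line = NULL;
    size_t  cap = 0;
    ssize_t len;

    if (pipe(fd) < 0)
        return -1;
    while ((len = getline(&line, &cap, in)) > 0)
    {
        size_t body = (size_t)len;

        if (line[body - 1] == '\n')
            body--;                       /* compare without the newline */
        if (strlen(delim) == body && strncmp(line, delim, body) == 0)
            break;                        /* delimiter line is not copied */
        if (write(fd[1], line, (size_t)len) < 0)
            break;
    }
    free(line);
    close(fd[1]);                         /* reader sees EOF after the body */
    return fd[0];
}
```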


📡 Signal Handling

| Signal | Key Combo | Interactive Behavior | In Child Process |
| --- | --- | --- | --- |
| SIGINT | Ctrl+C | Cancels current input, prints new prompt, sets $? to 130 | Terminates child |
| SIGQUIT | Ctrl+\ | Ignored in the interactive shell | Terminates child, sets $? to 131 |
| SIGTERM | (none) | Default behavior | Terminates child |

Signal handling is context-sensitive: the interactive shell suppresses SIGQUIT, while child processes restore the default handlers before executing external programs.
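
The two contexts can be sketched with `sigaction()`. The names below are hypothetical, and the handler only sets a flag; an interactive handler in a Readline-based shell would typically also redraw the prompt, which is left out here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag the shell loop can poll after a Ctrl+C. */
volatile sig_atomic_t g_sigint_seen = 0;

static void on_sigint(int signo)
{
    (void)signo;
    g_sigint_seen = 1;
}

/* Interactive shell: catch SIGINT, ignore SIGQUIT. */
int setup_interactive_signals(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = on_sigint;
    if (sigaction(SIGINT, &sa, NULL) < 0)
        return -1;
    sa.sa_handler = SIG_IGN;
    return sigaction(SIGQUIT, &sa, NULL);
}

/* Child process, just before exec: restore the default dispositions
   so that Ctrl+C and Ctrl+\ terminate the external program. */
int restore_default_signals(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = SIG_DFL;
    if (sigaction(SIGINT, &sa, NULL) < 0)
        return -1;
    return sigaction(SIGQUIT, &sa, NULL);
}
```

Caught signals are reset to their defaults across `execve()` automatically, but ignored ones stay ignored, so the explicit restore matters mainly for SIGQUIT.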


🚀 Getting Started

Prerequisites

  • A Unix-like operating system (Linux / macOS)
  • cc or gcc compiler
  • GNU readline development library

# Ubuntu / Debian
sudo apt-get install libreadline-dev

# macOS (Homebrew)
brew install readline

Build

git clone https://github.com/Samia-Hb/minishell.git
cd minishell
make

Run

./minishell

You will be greeted with:

minishell > _

Makefile Targets

| Target | Description |
| --- | --- |
| make / make all | Compile the minishell binary |
| make clean | Remove all object files |
| make fclean | Remove object files and the binary |
| make re | Run fclean, then all |

💡 Usage Examples

# Basic commands
minishell > echo "Hello, World!"
Hello, World!

minishell > pwd
/home/samia/minishell

minishell > cd /tmp
minishell > pwd
/tmp

# Environment variables
minishell > export MY_VAR=42
minishell > echo $MY_VAR
42

minishell > unset MY_VAR
minishell > echo $MY_VAR
                          ← (empty)

# Pipes
minishell > printf "apple\nbanana\ncherry\n" | grep an
banana

minishell > ls -la | sort | head -5

# Redirections
minishell > echo "log entry" >> app.log
minishell > cat < app.log
log entry

# Heredoc
minishell > cat << END
> hello
> from heredoc
> END
hello
from heredoc

# Exit status
minishell > ls /nonexistent
ls: cannot access '/nonexistent': No such file or directory
minishell > echo $?
2

# Signal behavior
minishell > sleep 10
^C                        ← Ctrl+C terminates sleep
minishell > echo $?
130

🔬 Technical Details

Compilation

CC     = cc
CFLAGS = -Wall -Wextra -Werror
LFLAGS = -lreadline

  • Strict compilation with zero-warning tolerance (-Werror)
  • Linked against GNU Readline for interactive input with history

Memory Management

A custom garbage collector (t_gc) tracks all heap allocations throughout the parsing and execution lifecycle. On each iteration of the shell loop, memory is freed in bulk — preventing leaks from complex branching paths.
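
The idea of tracking allocations and releasing them in bulk can be sketched as follows. The layout of `t_gc` and the names `gc_malloc` and `gc_free_all` are assumptions made for this example; the project's actual collector may be structured differently.

```c
#include <stdlib.h>

/* Assumed layout: one list node per tracked allocation. */
typedef struct s_gc {
    void         *ptr;
    struct s_gc  *next;
} t_gc;

/* Allocate `size` bytes and record the pointer in the list at *head.
   Returns NULL (and tracks nothing) if either allocation fails. */
void *gc_malloc(t_gc **head, size_t size)
{
    t_gc *node = malloc(sizeof(*node));

    if (!node)
        return NULL;
    node->ptr = malloc(size);
    if (!node->ptr)
    {
        free(node);
        return NULL;
    }
    node->next = *head;
    *head = node;
    return node->ptr;
}

/* Free every tracked allocation and reset the list.
   Returns the number of allocations that were released. */
size_t gc_free_all(t_gc **head)
{
    size_t count = 0;

    while (*head)
    {
        t_gc *next = (*head)->next;

        free((*head)->ptr);
        free(*head);
        *head = next;
        count++;
    }
    return count;
}
```

Calling `gc_free_all` once at the end of each prompt iteration means error paths in the parser do not need their own cleanup code.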

External Libraries Used

| Library | Purpose |
| --- | --- |
| libft (custom) | String manipulation, memory utils (ft_split, ft_strjoin, etc.) |
| gnl (custom) | get_next_line(), used for heredoc input reading |
| readline (system) | Interactive prompt with line editing and history |

Code Statistics

| Metric | Value |
| --- | --- |
| Source files | ~46 .c files |
| Lines of code | ~5,900 (excl. libft/gnl) |
| Built-in commands | 7 |
| I/O operators | 5 (< > >> << \|) |
| Data structures | 7+ |
| Compiler flags | -Wall -Wextra -Werror |

Made with 💻 and ☕ as part of the 1337 school curriculum.

"The shell is the window to the OS — so we built the window."
