Skip to content

Latest commit

 

History

History
1604 lines (1328 loc) · 39.9 KB

File metadata and controls

1604 lines (1328 loc) · 39.9 KB

🎯 Top SQL Interview Questions and Answers

A comprehensive collection of the most frequently asked SQL interview questions with detailed answers, examples, and explanations for all skill levels.

📋 Table of Contents


🔰 Fundamental SQL Concepts

1. What is SQL and its main commands?

Answer: SQL (Structured Query Language) is a domain-specific language for managing relational databases.

Main Command Categories:

DDL (Data Definition Language)

-- CREATE: Create database objects
CREATE TABLE Employees (
    ID INT PRIMARY KEY,
    Name VARCHAR(50),
    Salary DECIMAL(10,2)
);

-- ALTER: Modify existing objects
ALTER TABLE Employees ADD Email VARCHAR(100);

-- DROP: Delete database objects
DROP TABLE Employees;

DML (Data Manipulation Language)

-- SELECT: Retrieve data
SELECT Name, Salary FROM Employees WHERE Salary > 50000;

-- INSERT: Add new records
INSERT INTO Employees (ID, Name, Salary) VALUES (1, 'John', 60000);

-- UPDATE: Modify existing records
UPDATE Employees SET Salary = 65000 WHERE ID = 1;

-- DELETE: Remove records
DELETE FROM Employees WHERE ID = 1;

2. Difference between SQL and NoSQL databases

Aspect SQL NoSQL
Structure Table-based, fixed schema Document, key-value, graph, column-family
ACID Full ACID compliance Eventual consistency (BASE)
Scalability Vertical scaling Horizontal scaling
Query Language Standardized SQL Varies by database
Examples MySQL, PostgreSQL, SQL Server MongoDB, Cassandra, Redis
-- SQL Example
SELECT c.CustomerName, COUNT(o.OrderID) as OrderCount
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerName;
// NoSQL (MongoDB) Example
db.customers.aggregate([
  {
    $lookup: {
      from: "orders",
      localField: "customerId",
      foreignField: "customerId",
      as: "orders"
    }
  },
  {
    $project: {
      customerName: 1,
      orderCount: { $size: "$orders" }
    }
  }
]);

3. Database terminology explained

Answer:

  • Database: Collection of related tables and objects
  • Table: Structure that holds data in rows and columns
  • Field/Column: Vertical entity representing an attribute
  • Row/Record: Horizontal entity representing a single data entry
  • Column: Vertical attribute of the table
-- Example table structure
CREATE TABLE Students (
    StudentID INT,      -- Field/Column
    FirstName VARCHAR(50),  -- Field/Column
    LastName VARCHAR(50),   -- Field/Column
    Age INT                 -- Field/Column
);

-- Each INSERT creates a new row/record
INSERT INTO Students VALUES (1, 'John', 'Doe', 20);  -- Row 1
INSERT INTO Students VALUES (2, 'Jane', 'Smith', 19); -- Row 2

4. Primary Key, Foreign Key, and Constraints

Primary Key

-- Single column primary key
CREATE TABLE Departments (
    DeptID INT PRIMARY KEY,
    DeptName VARCHAR(50)
);

-- Composite primary key
CREATE TABLE Enrollments (
    StudentID INT,
    CourseID INT,
    EnrollDate DATE,
    PRIMARY KEY (StudentID, CourseID)
);

Foreign Key

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES Departments(DeptID)
);

Constraints

CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(100) NOT NULL,      -- NOT NULL constraint
    Price DECIMAL(10,2) CHECK (Price > 0),  -- CHECK constraint
    ProductCode VARCHAR(20) UNIQUE,         -- UNIQUE constraint
    CategoryID INT,
    FOREIGN KEY (CategoryID) REFERENCES Categories(CategoryID)
);

5. JOINs and their types

Sample Tables

-- Employees table
CREATE TABLE Employees (
    EmployeeID INT,
    Name VARCHAR(50),
    DepartmentID INT
);

-- Departments table
CREATE TABLE Departments (
    DepartmentID INT,
    DepartmentName VARCHAR(50)
);

-- Sample data
INSERT INTO Employees VALUES 
(1, 'John', 10), (2, 'Jane', 20), (3, 'Bob', NULL);

INSERT INTO Departments VALUES 
(10, 'IT'), (20, 'HR'), (30, 'Finance');

INNER JOIN

-- Returns only matching records from both tables
SELECT e.Name, d.DepartmentName
FROM Employees e
INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID;

-- Result: John-IT, Jane-HR

LEFT JOIN

-- Returns all records from left table
SELECT e.Name, d.DepartmentName
FROM Employees e
LEFT JOIN Departments d ON e.DepartmentID = d.DepartmentID;

-- Result: John-IT, Jane-HR, Bob-NULL

RIGHT JOIN

-- Returns all records from right table
SELECT e.Name, d.DepartmentName
FROM Employees e
RIGHT JOIN Departments d ON e.DepartmentID = d.DepartmentID;

-- Result: John-IT, Jane-HR, NULL-Finance

FULL OUTER JOIN

-- Returns all records from both tables
SELECT e.Name, d.DepartmentName
FROM Employees e
FULL OUTER JOIN Departments d ON e.DepartmentID = d.DepartmentID;

-- Result: John-IT, Jane-HR, Bob-NULL, NULL-Finance

6. Difference between WHERE and HAVING

-- WHERE: Filters rows before grouping
SELECT DepartmentID, COUNT(*) as EmployeeCount
FROM Employees
WHERE Salary > 50000  -- Filters individual rows
GROUP BY DepartmentID;

-- HAVING: Filters groups after aggregation
SELECT DepartmentID, COUNT(*) as EmployeeCount
FROM Employees
GROUP BY DepartmentID
HAVING COUNT(*) > 5;  -- Filters grouped results

7. Indexes in SQL

Purpose: Improve query performance by creating a separate structure that points to table rows.

-- Create index
CREATE INDEX idx_employee_name ON Employees(LastName, FirstName);

-- Unique index
CREATE UNIQUE INDEX idx_employee_email ON Employees(Email);

-- Composite index
CREATE INDEX idx_employee_dept_salary ON Employees(DepartmentID, Salary);

-- Check index usage
EXPLAIN SELECT * FROM Employees WHERE LastName = 'Smith';

Benefits:

  • Faster SELECT queries
  • Faster WHERE, ORDER BY, GROUP BY operations
  • Faster JOIN operations

Drawbacks:

  • Additional storage space
  • Slower INSERT/UPDATE/DELETE operations

8. Normalization

Definition: Process of organizing data to reduce redundancy and improve data integrity.

Normal Forms:

1NF (First Normal Form):

-- Violates 1NF (multiple values in single field)
CREATE TABLE Students_Bad (
    StudentID INT,
    Name VARCHAR(50),
    Subjects VARCHAR(200)  -- 'Math, Science, English'
);

-- Follows 1NF (atomic values)
CREATE TABLE Students_Good (
    StudentID INT,
    Name VARCHAR(50)
);

CREATE TABLE StudentSubjects (
    StudentID INT,
    Subject VARCHAR(50)
);

3NF (Third Normal Form):

-- Violates 3NF (transitive dependency)
CREATE TABLE Employees_Bad (
    EmployeeID INT,
    DepartmentID INT,
    DepartmentName VARCHAR(50),  -- Depends on DepartmentID, not EmployeeID
    DepartmentLocation VARCHAR(50)
);

-- Follows 3NF
CREATE TABLE Employees (
    EmployeeID INT,
    DepartmentID INT
);

CREATE TABLE Departments (
    DepartmentID INT,
    DepartmentName VARCHAR(50),
    DepartmentLocation VARCHAR(50)
);

9. ACID Properties

A - Atomicity: All operations in a transaction succeed or fail together

BEGIN TRANSACTION;
    UPDATE Account SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE Account SET Balance = Balance + 100 WHERE AccountID = 2;
COMMIT;  -- Both updates succeed or both fail

C - Consistency: Database remains in valid state before and after transaction

-- Constraint ensures consistency
ALTER TABLE Account ADD CONSTRAINT chk_balance CHECK (Balance >= 0);

I - Isolation: Concurrent transactions don't interfere with each other

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

D - Durability: Committed changes persist even after system failure

COMMIT;  -- Changes are permanently saved

10. Views, Stored Procedures, and Triggers

Views

-- Create view
CREATE VIEW EmployeeDetails AS
SELECT e.Name, e.Salary, d.DepartmentName
FROM Employees e
JOIN Departments d ON e.DepartmentID = d.DepartmentID
WHERE e.Salary > 50000;

-- Use view
SELECT * FROM EmployeeDetails WHERE DepartmentName = 'IT';

Stored Procedures

-- Create stored procedure
CREATE PROCEDURE GetEmployeesByDepartment
    @DepartmentID INT
AS
BEGIN
    SELECT * FROM Employees WHERE DepartmentID = @DepartmentID;
END;

-- Execute procedure
EXEC GetEmployeesByDepartment @DepartmentID = 10;

Triggers

-- Create trigger
CREATE TRIGGER tr_audit_salary_changes
ON Employees
AFTER UPDATE
AS
BEGIN
    IF UPDATE(Salary)
    BEGIN
        INSERT INTO SalaryAudit (EmployeeID, OldSalary, NewSalary, ChangeDate)
        SELECT i.EmployeeID, d.Salary, i.Salary, GETDATE()
        FROM INSERTED i
        JOIN DELETED d ON i.EmployeeID = d.EmployeeID;
    END
END;

🎯 Essential Query-Based Problems

1. Find the Third-Highest Salary

Method 1: Using ROW_NUMBER()

WITH RankedSalaries AS (
    SELECT Salary, ROW_NUMBER() OVER (ORDER BY Salary DESC) as rn
    FROM Employees
)
SELECT Salary 
FROM RankedSalaries 
WHERE rn = 3;

Method 2: Using DENSE_RANK()

WITH RankedSalaries AS (
    SELECT Salary, DENSE_RANK() OVER (ORDER BY Salary DESC) as rank
    FROM Employees
)
SELECT DISTINCT Salary 
FROM RankedSalaries 
WHERE rank = 3;

Method 3: Using Subquery

SELECT DISTINCT Salary
FROM Employees e1
WHERE 2 = (
    SELECT COUNT(DISTINCT Salary)
    FROM Employees e2
    WHERE e2.Salary > e1.Salary
);

2. Find Duplicate Emails

Sample Table

CREATE TABLE Person (
    Id INT,
    Email VARCHAR(100)
);

INSERT INTO Person VALUES 
(1, 'a@b.com'), (2, 'c@d.com'), (3, 'a@b.com');

Solution

-- Method 1: Using GROUP BY
SELECT Email
FROM Person
GROUP BY Email
HAVING COUNT(*) > 1;

-- Method 2: Using Window Function
SELECT DISTINCT Email
FROM (
    SELECT Email, COUNT(*) OVER (PARTITION BY Email) as cnt
    FROM Person
) t
WHERE cnt > 1;

3. Customers Who Never Placed an Order

Sample Tables

CREATE TABLE Customers (
    CustomerID INT,
    CustomerName VARCHAR(50)
);

CREATE TABLE Orders (
    OrderID INT,
    CustomerID INT
);

Solutions

-- Method 1: Using LEFT JOIN
SELECT c.CustomerName
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.CustomerID IS NULL;

-- Method 2: Using NOT IN
SELECT CustomerName
FROM Customers
WHERE CustomerID NOT IN (
    SELECT CustomerID 
    FROM Orders 
    WHERE CustomerID IS NOT NULL
);

-- Method 3: Using NOT EXISTS
SELECT CustomerName
FROM Customers c
WHERE NOT EXISTS (
    SELECT 1 
    FROM Orders o 
    WHERE o.CustomerID = c.CustomerID
);

4. Employees Hired in Current Year

-- Method 1: Using YEAR function
SELECT *
FROM Employees
WHERE YEAR(HireDate) = YEAR(GETDATE());

-- Method 2: Using date range (more efficient)
SELECT *
FROM Employees
WHERE HireDate >= DATEFROMPARTS(YEAR(GETDATE()), 1, 1)
  AND HireDate < DATEFROMPARTS(YEAR(GETDATE()) + 1, 1, 1);

-- Method 3: Using DATEPART
SELECT *
FROM Employees
WHERE DATEPART(YEAR, HireDate) = DATEPART(YEAR, GETDATE());

5. Employees with Salary > Company Average

-- Method 1: Using subquery
SELECT Name, Salary
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);

-- Method 2: Using window function
SELECT Name, Salary, AvgSalary
FROM (
    SELECT Name, Salary, 
           AVG(Salary) OVER () as AvgSalary
    FROM Employees
) t
WHERE Salary > AvgSalary;

6. Customers with More Than 5 Orders

SELECT c.CustomerName, COUNT(o.OrderID) as OrderCount
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID, c.CustomerName
HAVING COUNT(o.OrderID) > 5;

7. Movies Query (Odd ID, Not Boring, Sorted by Rating)

CREATE TABLE Cinema (
    id INT,
    movie VARCHAR(50),
    description VARCHAR(100),
    rating DECIMAL(2,1)
);

SELECT *
FROM Cinema
WHERE id % 2 = 1                           -- Odd ID
  AND description NOT LIKE '%boring%'       -- Not boring
ORDER BY rating DESC;                       -- Sorted by rating descending

8. Count Employees by Department

SELECT d.DepartmentName, COUNT(e.EmployeeID) as EmployeeCount
FROM Departments d
LEFT JOIN Employees e ON d.DepartmentID = e.DepartmentID
GROUP BY d.DepartmentID, d.DepartmentName
ORDER BY EmployeeCount DESC;

9. Top N Earners in Each Department

-- Top 3 earners in each department
WITH RankedEmployees AS (
    SELECT Name, Salary, DepartmentID,
           DENSE_RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) as rank
    FROM Employees
)
SELECT d.DepartmentName, r.Name, r.Salary, r.rank
FROM RankedEmployees r
JOIN Departments d ON r.DepartmentID = d.DepartmentID
WHERE r.rank <= 3
ORDER BY d.DepartmentName, r.rank;

10. Find Duplicate Records Using GROUP BY

-- Find duplicate employees (same name and department)
SELECT FirstName, LastName, DepartmentID, COUNT(*) as DuplicateCount
FROM Employees
GROUP BY FirstName, LastName, DepartmentID
HAVING COUNT(*) > 1;

-- Get all records of duplicates
SELECT e.*
FROM Employees e
JOIN (
    SELECT FirstName, LastName, DepartmentID
    FROM Employees
    GROUP BY FirstName, LastName, DepartmentID
    HAVING COUNT(*) > 1
) duplicates ON e.FirstName = duplicates.FirstName 
              AND e.LastName = duplicates.LastName 
              AND e.DepartmentID = duplicates.DepartmentID;

🚀 Intermediate & Advanced SQL Queries

1. Managers with at Least 5 Direct Reports

-- Sample table with manager hierarchy
CREATE TABLE Employee (
    Id INT,
    Name VARCHAR(50),
    Department VARCHAR(50),
    ManagerId INT
);

-- Solution
SELECT m.Name as ManagerName, COUNT(e.Id) as DirectReports
FROM Employee m
JOIN Employee e ON m.Id = e.ManagerId
GROUP BY m.Id, m.Name
HAVING COUNT(e.Id) >= 5;

2. Users with Balance > $10,000

CREATE TABLE Users (
    UserId INT,
    UserName VARCHAR(50)
);

CREATE TABLE Transactions (
    TransactionId INT,
    UserId INT,
    Amount DECIMAL(10,2),
    TransactionType VARCHAR(10)  -- 'CREDIT' or 'DEBIT'
);

-- Solution
SELECT u.UserName, SUM(
    CASE 
        WHEN t.TransactionType = 'CREDIT' THEN t.Amount
        WHEN t.TransactionType = 'DEBIT' THEN -t.Amount
        ELSE 0
    END
) as Balance
FROM Users u
JOIN Transactions t ON u.UserId = t.UserId
GROUP BY u.UserId, u.UserName
HAVING SUM(
    CASE 
        WHEN t.TransactionType = 'CREDIT' THEN t.Amount
        WHEN t.TransactionType = 'DEBIT' THEN -t.Amount
        ELSE 0
    END
) > 10000;

3. Median Salary by Company (Without Built-in Functions)

WITH OrderedSalaries AS (
    SELECT CompanyId, Salary,
           ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY Salary) as rn,
           COUNT(*) OVER (PARTITION BY CompanyId) as total_count
    FROM Employees
),
MedianValues AS (
    SELECT CompanyId, Salary
    FROM OrderedSalaries
    WHERE rn IN (
        (total_count + 1) / 2,      -- For odd count
        (total_count + 2) / 2       -- For even count
    )
)
SELECT CompanyId, AVG(Salary) as MedianSalary
FROM MedianValues
GROUP BY CompanyId;

4. Recursive Query for Organizational Hierarchy

WITH EmployeeHierarchy AS (
    -- Anchor: Top-level managers (no manager)
    SELECT EmployeeId, Name, ManagerId, 0 as Level, 
           CAST(Name AS VARCHAR(1000)) as HierarchyPath
    FROM Employee
    WHERE ManagerId IS NULL
    
    UNION ALL
    
    -- Recursive: Employees with managers
    SELECT e.EmployeeId, e.Name, e.ManagerId, eh.Level + 1,
           CAST(eh.HierarchyPath + ' -> ' + e.Name AS VARCHAR(1000))
    FROM Employee e
    JOIN EmployeeHierarchy eh ON e.ManagerId = eh.EmployeeId
)
SELECT EmployeeId, Name, Level, HierarchyPath
FROM EmployeeHierarchy
ORDER BY Level, Name;

5. Pivot Without PIVOT Keyword

-- Sample data: Sales by quarter
CREATE TABLE Sales (
    SalesPersonId INT,
    Quarter VARCHAR(2),
    Amount DECIMAL(10,2)
);

-- Pivot using CASE statements
SELECT SalesPersonId,
       SUM(CASE WHEN Quarter = 'Q1' THEN Amount ELSE 0 END) as Q1,
       SUM(CASE WHEN Quarter = 'Q2' THEN Amount ELSE 0 END) as Q2,
       SUM(CASE WHEN Quarter = 'Q3' THEN Amount ELSE 0 END) as Q3,
       SUM(CASE WHEN Quarter = 'Q4' THEN Amount ELSE 0 END) as Q4
FROM Sales
GROUP BY SalesPersonId;

6. Window Functions Examples

SELECT EmployeeId, Name, Salary, DepartmentId,
       -- Ranking functions
       ROW_NUMBER() OVER (ORDER BY Salary DESC) as RowNum,
       RANK() OVER (ORDER BY Salary DESC) as Rank,
       DENSE_RANK() OVER (ORDER BY Salary DESC) as DenseRank,
       
       -- Partition by department
       ROW_NUMBER() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) as DeptRank,
       
       -- Aggregate functions
       SUM(Salary) OVER () as TotalSalary,
       AVG(Salary) OVER (PARTITION BY DepartmentId) as DeptAvgSalary,
       
       -- Running totals
       SUM(Salary) OVER (ORDER BY EmployeeId) as RunningTotal,
       
       -- Lead and Lag
       LAG(Salary, 1) OVER (ORDER BY Salary) as PrevSalary,
       LEAD(Salary, 1) OVER (ORDER BY Salary) as NextSalary,
       
       -- First and Last value
       FIRST_VALUE(Salary) OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) as HighestInDept,
       LAST_VALUE(Salary) OVER (PARTITION BY DepartmentId ORDER BY Salary DESC 
                                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as LowestInDept
FROM Employees;

7. Find Gaps in Sequential IDs

-- Find missing IDs in sequence
WITH NumberSequence AS (
    SELECT MIN(Id) as start_id, MAX(Id) as end_id
    FROM TableWithIds
),
AllNumbers AS (
    SELECT start_id as num
    FROM NumberSequence
    UNION ALL
    SELECT num + 1
    FROM AllNumbers
    WHERE num < (SELECT end_id FROM NumberSequence)
)
SELECT num as MissingId
FROM AllNumbers
WHERE num NOT IN (SELECT Id FROM TableWithIds);

-- Alternative method using LAG
SELECT Id + 1 as GapStart, 
       NextId - 1 as GapEnd
FROM (
    SELECT Id, 
           LEAD(Id) OVER (ORDER BY Id) as NextId
    FROM TableWithIds
) t
WHERE NextId - Id > 1;

8. Find Alternate Rows (Even/Odd Logic)

-- Get odd rows
SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Id) as rn
    FROM Employees
) t
WHERE rn % 2 = 1;

-- Get even rows
SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Id) as rn
    FROM Employees
) t
WHERE rn % 2 = 0;

9. Employees Who Are Also Managers

SELECT DISTINCT e1.EmployeeId, e1.Name
FROM Employee e1
JOIN Employee e2 ON e1.EmployeeId = e2.ManagerId;

-- Alternative using EXISTS
SELECT EmployeeId, Name
FROM Employee e1
WHERE EXISTS (
    SELECT 1 
    FROM Employee e2 
    WHERE e2.ManagerId = e1.EmployeeId
);

10. Deadlock Detection and Error Handling

-- Error handling with TRY-CATCH
BEGIN TRY
    BEGIN TRANSACTION;
        UPDATE Account SET Balance = Balance - 100 WHERE AccountId = 1;
        UPDATE Account SET Balance = Balance + 100 WHERE AccountId = 2;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    
    -- Handle specific errors
    IF ERROR_NUMBER() = 1205  -- Deadlock
    BEGIN
        PRINT 'Deadlock detected. Transaction rolled back.';
        -- Retry logic can be implemented here
    END
    ELSE
    BEGIN
        PRINT 'Error: ' + ERROR_MESSAGE();
    END
END CATCH;

🏢 Scenario and Company-Specific Questions

Amazon: Most Viewed Product in Each Category

CREATE TABLE ProductViews (
    ProductId INT,
    CategoryId INT,
    Views INT
);

-- Solution
WITH RankedProducts AS (
    SELECT ProductId, CategoryId, Views,
           ROW_NUMBER() OVER (PARTITION BY CategoryId ORDER BY Views DESC) as rn
    FROM ProductViews
)
SELECT ProductId, CategoryId, Views
FROM RankedProducts
WHERE rn = 1;

TCS: Remove Duplicate Rows Using ROW_NUMBER

-- Delete duplicates keeping the first occurrence
WITH DuplicateRows AS (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY FirstName, LastName, Email 
        ORDER BY EmployeeId
    ) as rn
    FROM Employees
)
DELETE FROM DuplicateRows WHERE rn > 1;

-- Alternative: Create new table without duplicates
SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY FirstName, LastName, Email 
        ORDER BY EmployeeId
    ) as rn
    FROM Employees
) t
WHERE rn = 1;

Teradata: QUALIFY and CASE Examples

-- QUALIFY to filter ranked results (Teradata syntax)
SELECT EmployeeId, Name, Salary, DepartmentId
FROM Employees
QUALIFY ROW_NUMBER() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) <= 3;

-- CASE in SELECT clause
SELECT EmployeeId, Name, Salary,
       CASE 
           WHEN Salary > 100000 THEN 'Senior'
           WHEN Salary > 70000 THEN 'Mid-level'
           WHEN Salary > 40000 THEN 'Junior'
           ELSE 'Intern'
       END as Level,
       CASE DepartmentId
           WHEN 1 THEN 'Engineering'
           WHEN 2 THEN 'Sales'
           WHEN 3 THEN 'Marketing'
           ELSE 'Other'
       END as Department
FROM Employees;

Oracle: ROWNUM and Hierarchical Queries

-- ROWNUM example (Oracle)
SELECT * FROM (
    SELECT *, ROWNUM as rn
    FROM Employees
    ORDER BY Salary DESC
) 
WHERE rn <= 10;  -- Top 10 highest paid

-- Hierarchical query with CONNECT BY (Oracle)
SELECT LEVEL, EmployeeId, Name, ManagerId,
       SYS_CONNECT_BY_PATH(Name, '/') as HierarchyPath
FROM Employee
START WITH ManagerId IS NULL
CONNECT BY PRIOR EmployeeId = ManagerId
ORDER SIBLINGS BY Name;

🛠️ Database Design & Optimization

1. Query Optimization Techniques

Slow Query Example

-- Slow query
SELECT e.Name, d.DepartmentName, COUNT(p.ProjectId)
FROM Employees e
JOIN Departments d ON e.DepartmentId = d.DepartmentId
LEFT JOIN ProjectAssignments pa ON e.EmployeeId = pa.EmployeeId
LEFT JOIN Projects p ON pa.ProjectId = p.ProjectId
WHERE e.HireDate > '2020-01-01'
GROUP BY e.EmployeeId, e.Name, d.DepartmentName;

Optimization Steps

1. Add Indexes:

CREATE INDEX idx_employee_hiredate ON Employees(HireDate);
CREATE INDEX idx_employee_dept ON Employees(DepartmentId);
CREATE INDEX idx_project_assign_emp ON ProjectAssignments(EmployeeId);

2. Rewrite Query:

-- Optimized query
WITH RecentEmployees AS (
    SELECT EmployeeId, Name, DepartmentId
    FROM Employees
    WHERE HireDate > '2020-01-01'
)
SELECT re.Name, d.DepartmentName, COUNT(p.ProjectId)
FROM RecentEmployees re
JOIN Departments d ON re.DepartmentId = d.DepartmentId
LEFT JOIN ProjectAssignments pa ON re.EmployeeId = pa.EmployeeId
LEFT JOIN Projects p ON pa.ProjectId = p.ProjectId
GROUP BY re.EmployeeId, re.Name, d.DepartmentName;

3. Check Execution Plan:

-- SQL Server
SET STATISTICS IO ON;
EXPLAIN QUERY PLAN;

-- MySQL
EXPLAIN SELECT ...;

-- PostgreSQL
EXPLAIN ANALYZE SELECT ...;

2. Data Security Best Practices

Encryption

-- Column-level encryption (SQL Server)
CREATE TABLE SecureEmployees (
    EmployeeId INT,
    Name VARCHAR(50),
    SSN VARBINARY(256)  -- Encrypted column
);

-- Insert encrypted data
INSERT INTO SecureEmployees 
VALUES (1, 'John Doe', EncryptByKey(Key_GUID('SSN_Key'), '123-45-6789'));

-- Decrypt data
SELECT EmployeeId, Name,
       CONVERT(VARCHAR, DecryptByKey(SSN)) as SSN
FROM SecureEmployees;

Access Control

-- Create roles
CREATE ROLE db_reader;
CREATE ROLE db_writer;

-- Grant permissions
GRANT SELECT ON Employees TO db_reader;
GRANT SELECT, INSERT, UPDATE ON Employees TO db_writer;

-- Create users and assign roles
CREATE USER 'analyst'@'localhost' IDENTIFIED BY 'password';
GRANT db_reader TO 'analyst'@'localhost';

3. Scalable Database Schema Design

Partitioning Example

-- Horizontal partitioning by date
CREATE TABLE Orders_2023 (
    OrderId INT,
    OrderDate DATE,
    CustomerId INT,
    Amount DECIMAL(10,2)
) PARTITION BY RANGE (YEAR(OrderDate)) (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);

Sharding Strategy

-- Shard by customer ID
-- Shard 1: Customer IDs 1-1000000
-- Shard 2: Customer IDs 1000001-2000000
-- etc.

CREATE TABLE Customer_Shard1 (
    CustomerId INT CHECK (CustomerId BETWEEN 1 AND 1000000),
    CustomerName VARCHAR(100),
    -- other columns
);

4. Clustered vs Non-Clustered Indexes

-- Clustered index (determines physical storage order)
CREATE CLUSTERED INDEX idx_employee_id ON Employees(EmployeeId);

-- Non-clustered index (separate structure pointing to data)
CREATE NONCLUSTERED INDEX idx_employee_name ON Employees(LastName, FirstName);

-- Covering index (includes additional columns)
CREATE NONCLUSTERED INDEX idx_employee_dept_salary 
ON Employees(DepartmentId) 
INCLUDE (Salary, HireDate);

🎲 Real-World and Tricky Query Cases

1. Swap Alternate Students' Seat IDs

CREATE TABLE Seat (
    id INT,
    student VARCHAR(50)
);

INSERT INTO Seat VALUES 
(1, 'Abbot'), (2, 'Doris'), (3, 'Emerson'), (4, 'Green'), (5, 'Jeames');

-- Solution using CASE
SELECT id,
       CASE 
           WHEN id % 2 = 1 AND id < (SELECT MAX(id) FROM Seat) THEN 
               (SELECT student FROM Seat s2 WHERE s2.id = Seat.id + 1)
           WHEN id % 2 = 0 THEN 
               (SELECT student FROM Seat s2 WHERE s2.id = Seat.id - 1)
           ELSE student
       END as student
FROM Seat
ORDER BY id;

-- Alternative using ROW_NUMBER
WITH SwappedSeats AS (
    SELECT id, student,
           CASE 
               WHEN ROW_NUMBER() OVER (ORDER BY id) % 2 = 1 
               THEN LEAD(student, 1, student) OVER (ORDER BY id)
               ELSE LAG(student) OVER (ORDER BY id)
           END as new_student
    FROM Seat
)
SELECT id, new_student as student
FROM SwappedSeats
ORDER BY id;

2. Consecutive IDs with High People Count

CREATE TABLE Stadium (
    id INT,
    visit_date DATE,
    people INT
);

-- Find 3+ consecutive rows with people >= 100
WITH ConsecutiveGroups AS (
    SELECT id, visit_date, people,
           id - ROW_NUMBER() OVER (ORDER BY id) as grp
    FROM Stadium
    WHERE people >= 100
),
GroupCounts AS (
    SELECT grp, COUNT(*) as consecutive_count,
           MIN(id) as start_id, MAX(id) as end_id
    FROM ConsecutiveGroups
    GROUP BY grp
    HAVING COUNT(*) >= 3
)
SELECT s.*
FROM Stadium s
JOIN GroupCounts gc ON s.id BETWEEN gc.start_id AND gc.end_id
ORDER BY s.id;

3. Employee Count by Gender with Date Range

SELECT 
    Gender,
    COUNT(*) as TotalCount,
    COUNT(CASE WHEN DOB BETWEEN '1990-01-01' AND '1999-12-31' THEN 1 END) as Count_90s,
    COUNT(CASE WHEN DOB BETWEEN '1980-01-01' AND '1989-12-31' THEN 1 END) as Count_80s,
    COUNT(CASE WHEN DOB BETWEEN '2000-01-01' AND '2009-12-31' THEN 1 END) as Count_2000s
FROM Employees
GROUP BY Gender;

4. Create Empty Table with Same Structure

-- Method 1: Using LIKE (MySQL)
CREATE TABLE Employees_Backup LIKE Employees;

-- Method 2: Using SELECT with FALSE condition
CREATE TABLE Employees_Backup AS
SELECT * FROM Employees WHERE 1 = 0;

-- Method 3: Explicit structure copy (SQL Server)
SELECT TOP 0 * 
INTO Employees_Backup 
FROM Employees;

5. Customers by Month with Product Count

WITH MonthlyData AS (
    SELECT c.CustomerId, c.CustomerName,
           YEAR(c.JoinDate) as JoinYear,
           MONTH(c.JoinDate) as JoinMonth,
           COUNT(DISTINCT o.ProductId) as ProductCount
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
    GROUP BY c.CustomerId, c.CustomerName, YEAR(c.JoinDate), MONTH(c.JoinDate)
)
SELECT JoinYear, JoinMonth,
       COUNT(*) as CustomerCount,
       AVG(ProductCount) as AvgProductsPerCustomer,
       SUM(ProductCount) as TotalProducts
FROM MonthlyData
GROUP BY JoinYear, JoinMonth
ORDER BY JoinYear, JoinMonth;

6. Detect and Handle Duplicate Values

-- Detect columns with duplicates
SELECT 'FirstName' as ColumnName, COUNT(DISTINCT FirstName) as UniqueValues, COUNT(*) as TotalRows
FROM Employees
UNION ALL
SELECT 'LastName', COUNT(DISTINCT LastName), COUNT(*)
FROM Employees
UNION ALL
SELECT 'Email', COUNT(DISTINCT Email), COUNT(*)
FROM Employees;

-- Handle duplicates by adding sequence number
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY EmployeeId) as DuplicateSequence
FROM Employees;

-- Update duplicate emails
UPDATE Employees 
SET Email = Email + '_' + CAST(ROW_NUMBER() OVER (PARTITION BY Email ORDER BY EmployeeId) AS VARCHAR)
WHERE Email IN (
    SELECT Email 
    FROM Employees 
    GROUP BY Email 
    HAVING COUNT(*) > 1
);

📝 Frequently Asked Short/Conceptual Questions

Quick Reference Table

Question SQL Example Key Concept
Find employees whose first name starts with "S" SELECT * FROM Employee WHERE fname LIKE 'S%' LIKE operator
List departments where avg male salary > overall avg SELECT dept FROM emp WHERE gender='M' GROUP BY dept HAVING AVG(sal) > (SELECT AVG(sal) FROM emp) Subquery with HAVING
Ranking customers by balance with skipped ranks SELECT *, RANK() OVER (ORDER BY balance DESC) FROM customers Window functions
UNION vs UNION ALL UNION removes duplicates, UNION ALL keeps all Set operations
JOIN on three tables FROM A JOIN B ON... JOIN C ON... Multi-table joins
Managers with 5+ employees SELECT mgr FROM emp GROUP BY mgr HAVING COUNT(*) > 5 GROUP BY + HAVING
Remove all rows (TRUNCATE vs DELETE) TRUNCATE TABLE emp vs DELETE FROM emp DDL vs DML
Most recent hire per department SELECT dept, MAX(hire_date) FROM emp GROUP BY dept Aggregate functions

Detailed Examples

1. Employees Starting with "S"

-- Case-sensitive search
SELECT * FROM Employee WHERE FirstName LIKE 'S%';

-- Case-insensitive search
SELECT * FROM Employee WHERE UPPER(FirstName) LIKE 'S%';

-- Multiple patterns
SELECT * FROM Employee WHERE FirstName LIKE 'S%' OR FirstName LIKE 'A%';

2. Departments with High Male Average Salary

WITH OverallAvg AS (
    SELECT AVG(Salary) as avg_salary FROM Employee
),
DeptMaleAvg AS (
    SELECT DepartmentId, AVG(Salary) as male_avg_salary
    FROM Employee
    WHERE Gender = 'M'
    GROUP BY DepartmentId
)
SELECT d.DepartmentName, dma.male_avg_salary
FROM DeptMaleAvg dma
JOIN Departments d ON dma.DepartmentId = d.DepartmentId
CROSS JOIN OverallAvg oa
WHERE dma.male_avg_salary > oa.avg_salary;

3. Customer Ranking with RANK vs DENSE_RANK

SELECT CustomerId, Balance,
       RANK() OVER (ORDER BY Balance DESC) as rank_with_gaps,
       DENSE_RANK() OVER (ORDER BY Balance DESC) as rank_no_gaps,
       ROW_NUMBER() OVER (ORDER BY Balance DESC) as row_number
FROM Customers
ORDER BY Balance DESC;

4. UNION vs UNION ALL Demonstration

-- UNION (removes duplicates)
SELECT EmployeeId, FirstName FROM Employees WHERE DepartmentId = 1
UNION
SELECT EmployeeId, FirstName FROM Employees WHERE Salary > 50000;

-- UNION ALL (keeps duplicates)
SELECT EmployeeId, FirstName FROM Employees WHERE DepartmentId = 1
UNION ALL
SELECT EmployeeId, FirstName FROM Employees WHERE Salary > 50000;

5. Three-Table JOIN

SELECT e.EmployeeName, d.DepartmentName, p.ProjectName
FROM Employees e
JOIN Departments d ON e.DepartmentId = d.DepartmentId
JOIN Projects p ON e.ProjectId = p.ProjectId
WHERE e.Status = 'Active';

6. Tree Node Classification

-- Identify root, leaf, and inner nodes
SELECT Id,
       CASE 
           WHEN ParentId IS NULL THEN 'Root'
           WHEN Id NOT IN (SELECT DISTINCT ParentId FROM Tree WHERE ParentId IS NOT NULL) THEN 'Leaf'
           ELSE 'Inner'
       END as NodeType
FROM Tree;

🔬 Advanced Topics & Performance

1. Deadlock Prevention and Detection

Understanding Deadlocks

-- Scenario that can cause deadlock
-- Session 1:
BEGIN TRANSACTION;
UPDATE Account SET Balance = Balance - 100 WHERE AccountId = 1;
-- Session 1 waits for Session 2 to release lock on AccountId = 2

-- Session 2:
BEGIN TRANSACTION;
UPDATE Account SET Balance = Balance - 50 WHERE AccountId = 2;
-- Session 2 waits for Session 1 to release lock on AccountId = 1
-- DEADLOCK occurs here

Prevention Strategies

-- 1. Always access resources in same order
BEGIN TRANSACTION;
UPDATE Account SET Balance = Balance - 100 WHERE AccountId = 1;  -- Always update lower ID first
UPDATE Account SET Balance = Balance + 100 WHERE AccountId = 2;
COMMIT;

-- 2. Use shorter transactions
BEGIN TRANSACTION;
    DECLARE @balance1 DECIMAL(10,2), @balance2 DECIMAL(10,2);
    SELECT @balance1 = Balance FROM Account WHERE AccountId = 1;
    SELECT @balance2 = Balance FROM Account WHERE AccountId = 2;
    
    IF @balance1 >= 100
    BEGIN
        UPDATE Account SET Balance = @balance1 - 100 WHERE AccountId = 1;
        UPDATE Account SET Balance = @balance2 + 100 WHERE AccountId = 2;
    END
COMMIT;

-- 3. Use appropriate isolation levels
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

2. Window Functions Deep Dive

-- Advanced window function examples
SELECT EmployeeId, Name, Salary, DepartmentId, HireDate,
       
       -- Running calculations
       SUM(Salary) OVER (ORDER BY HireDate ROWS UNBOUNDED PRECEDING) as RunningPayroll,
       AVG(Salary) OVER (ORDER BY HireDate ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) as MovingAvg,
       
       -- Percentile functions
       PERCENT_RANK() OVER (ORDER BY Salary) as SalaryPercentile,
       CUME_DIST() OVER (ORDER BY Salary) as CumulativeDistribution,
       NTILE(4) OVER (ORDER BY Salary) as SalaryQuartile,
       
       -- Advanced LAG/LEAD
       LAG(Salary, 2, 0) OVER (PARTITION BY DepartmentId ORDER BY HireDate) as SalaryTwoPositionsBack,
       LEAD(HireDate, 1) OVER (PARTITION BY DepartmentId ORDER BY HireDate) as NextHireDate,
       
       -- Frame specifications
       COUNT(*) OVER (PARTITION BY DepartmentId) as DeptSize,
       MIN(Salary) OVER (PARTITION BY DepartmentId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as MinSalaryUpToNow
FROM Employees;

3. Running Totals and Rolling Averages

-- Sales data example
CREATE TABLE DailySales (
    SaleDate DATE,
    Amount DECIMAL(10,2)
);

SELECT SaleDate, Amount,
       -- Running total
       SUM(Amount) OVER (ORDER BY SaleDate) as RunningTotal,
       
       -- 7-day rolling average
       AVG(Amount) OVER (ORDER BY SaleDate ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as RollingAvg7Day,
       
       -- Month-to-date total
       SUM(Amount) OVER (PARTITION BY YEAR(SaleDate), MONTH(SaleDate) ORDER BY SaleDate) as MTDTotal,
       
       -- Year-over-year comparison
       LAG(Amount, 365) OVER (ORDER BY SaleDate) as PriorYearAmount,
       Amount - LAG(Amount, 365) OVER (ORDER BY SaleDate) as YoYChange
FROM DailySales;

4. Materialized Views and Query Optimization

-- Create materialized view (PostgreSQL syntax)
CREATE MATERIALIZED VIEW mv_department_stats AS
SELECT d.DepartmentId, d.DepartmentName,
       COUNT(e.EmployeeId) as EmployeeCount,
       AVG(e.Salary) as AvgSalary,
       SUM(e.Salary) as TotalSalary,
       MIN(e.HireDate) as EarliestHire,
       MAX(e.HireDate) as LatestHire
FROM Departments d
LEFT JOIN Employees e ON d.DepartmentId = e.DepartmentId
GROUP BY d.DepartmentId, d.DepartmentName;

-- Refresh materialized view
REFRESH MATERIALIZED VIEW mv_department_stats;

-- Query using materialized view
SELECT * FROM mv_department_stats WHERE AvgSalary > 60000;

5. Query Execution Plans

-- SQL Server execution plan
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
EXPLAIN QUERY PLAN 
SELECT e.Name, d.DepartmentName
FROM Employees e
JOIN Departments d ON e.DepartmentId = d.DepartmentId
WHERE e.Salary > 50000;

-- Analysis points:
-- 1. Index Seek vs Index Scan vs Table Scan
-- 2. Join algorithms (Nested Loop, Hash Join, Merge Join)
-- 3. Sort operations
-- 4. Key Lookups
-- 5. Estimated vs Actual rows

6. Advanced Optimization Techniques

Partitioning Strategy

-- Range partitioning by date
CREATE TABLE Orders (
    OrderId INT,
    OrderDate DATE,
    CustomerId INT,
    Amount DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(OrderDate)) (
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025)
);

-- Hash partitioning by customer
CREATE TABLE CustomerData (
    CustomerId INT,
    CustomerName VARCHAR(100),
    Region VARCHAR(50)
)
PARTITION BY HASH(CustomerId) PARTITIONS 4;

Index Strategies

-- Covering index
CREATE INDEX idx_employee_covering 
ON Employees(DepartmentId, Salary) 
INCLUDE (FirstName, LastName, HireDate);

-- Filtered index
CREATE INDEX idx_active_employees 
ON Employees(DepartmentId, Salary) 
WHERE Status = 'Active';

-- Columnstore index (SQL Server)
CREATE COLUMNSTORE INDEX idx_sales_columnstore 
ON Sales(ProductId, SaleDate, Amount, Quantity);

7. Full-Text Search and Spatial Queries

Full-Text Search

-- Create full-text catalog and index
CREATE FULLTEXT CATALOG ProductCatalog;

CREATE FULLTEXT INDEX ON Products(ProductName, Description)
KEY INDEX PK_Products
ON ProductCatalog;

-- Full-text search queries
SELECT * FROM Products 
WHERE CONTAINS(ProductName, 'wireless AND bluetooth');

SELECT * FROM Products 
WHERE FREETEXT(Description, 'high quality audio device');

-- Ranking results
SELECT ProductId, ProductName,
       RANK = KEY_TBL.RANK
FROM Products
INNER JOIN CONTAINSTABLE(Products, ProductName, 'wireless') AS KEY_TBL
ON Products.ProductId = KEY_TBL.[KEY]
ORDER BY KEY_TBL.RANK DESC;

Spatial Queries (SQL Server)

-- Create table with spatial data
CREATE TABLE Locations (
    LocationId INT,
    LocationName VARCHAR(100),
    Coordinates GEOGRAPHY
);

-- Insert spatial data
INSERT INTO Locations VALUES 
(1, 'Store A', GEOGRAPHY::Point(40.7128, -74.0060, 4326)),  -- NYC
(2, 'Store B', GEOGRAPHY::Point(34.0522, -118.2437, 4326)); -- LA

-- Find locations within distance
DECLARE @center GEOGRAPHY = GEOGRAPHY::Point(40.7589, -73.9851, 4326); -- Times Square

SELECT LocationName, 
       Coordinates.STDistance(@center) / 1000 as DistanceKM
FROM Locations
WHERE Coordinates.STDistance(@center) <= 50000  -- Within 50km
ORDER BY Coordinates.STDistance(@center);

🎯 Conclusion

This comprehensive guide covers the most important SQL interview questions from basic concepts to advanced topics. Practice these queries regularly and understand the underlying concepts to excel in SQL interviews.

🔑 Key Takeaways

  1. Master the Fundamentals: JOINs, subqueries, and window functions
  2. Practice Problem-Solving: Work through query-based problems regularly
  3. Understand Performance: Learn about indexes, execution plans, and optimization
  4. Know Your Database: Each RDBMS has specific features and syntax
  5. Real-World Application: Focus on practical scenarios and business problems

📚 Study Strategy

  1. Start with Basics: Ensure solid understanding of fundamental concepts
  2. Practice Coding: Write queries by hand and test them
  3. Analyze Performance: Always consider efficiency and optimization
  4. Learn from Examples: Study well-written queries and understand patterns
  5. Mock Interviews: Practice explaining your solutions clearly

🚀 Next Steps

  • Practice on platforms like LeetCode, HackerRank, and SQLBolt
  • Set up your own test database with sample data
  • Study database-specific documentation
  • Join SQL communities and forums
  • Build projects that demonstrate your SQL skills

Good luck with your SQL interviews! 🌟

Remember: Understanding the 'why' behind each solution is more important than memorizing queries.