🎧 Splicing and Copy-Move Audio Forgery Dataset Generator

This project contains two audio forgery dataset generators built on the TIMIT speech corpus. They simulate splicing and copy-move forgeries for training and evaluating audio forensic systems.

🛠️ Overview

The dataset generation process involves applying transformations to authentic audio files from TIMIT using two distinct methods:

🔀 1. RandomPosition Method

This method simulates forgeries by:

Selecting a random segment from the original audio.
Inserting that segment at a random new position.
Reconstructing the audio so that the inserted segment appears naturally within the waveform.

📌 Forgery Sample Generation

Original A: ---[Original Audio A]---
Original B: ---[Original Audio B]---
Forgery: ---[Segment from A]---[Segment from B]---[Remaining A]---
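The steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the project's actual code: the function name `insert_random_segment`, its parameters, and the use of a fixed segment length in samples are all assumptions made for the example. Passing the same recording as both `host` and `donor` gives a copy-move forgery; passing a different recording gives a splice.

```python
import numpy as np

def insert_random_segment(host, donor, seg_len, rng):
    """Hypothetical helper: insert a random slice of `donor` into `host`.

    Returns the forged signal plus (insert position, donor start), which
    can serve as ground-truth labels for the forged region.
    """
    if seg_len <= 0 or seg_len > len(donor):
        raise ValueError("seg_len must be between 1 and len(donor)")
    # Random start of the segment inside the donor recording.
    src = int(rng.integers(0, len(donor) - seg_len + 1))
    # Random insertion point inside the host recording (ends included).
    pos = int(rng.integers(0, len(host) + 1))
    segment = donor[src:src + seg_len]
    # Host before the cut, then the inserted segment, then the rest of the host.
    forged = np.concatenate([host[:pos], segment, host[pos:]])
    return forged, (pos, src)

# Synthetic stand-ins for two one-second recordings at 16 kHz.
rng = np.random.default_rng(0)
audio_a = rng.standard_normal(16000).astype(np.float32)
audio_b = rng.standard_normal(16000).astype(np.float32)

spliced, (pos, src) = insert_random_segment(audio_a, audio_b, 4000, rng)
copy_moved, _ = insert_random_segment(audio_a, audio_a, 4000, rng)
```

Returning the insertion offsets alongside the waveform is a design choice of this sketch; it makes it easy to store localisation labels next to each forged file.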

🔁 2. Concatenation Method

Based on the paper:
"Autoencoder for Audio Forgery Detection using Spliced and Copy-Move Audio",
📄 Shaikh et al., 2021
Read the paper here

This method simulates forgeries by:

Extracting 2-second, 1-second, and 0.5-second segments from each audio file.
Concatenating them in different combinations to simulate forged samples.
Producing:
- 3-second forged audio
- 2-second forged audio

📌 Forgery Sample Generation

Forgery: 2s [Segment from A] + 1s [Segment from B] → 3s [Forged Audio]
Forgery: 1s [Segment from A] + 1s [Segment from B] → 2s [Forged Audio]
Forgery: 1s [Segment from A] + 1s [Segment from B] + 1s [Segment from A] → 3s [Forged Audio]
Forgery: 0.5s [Segment from A] + 1s [Segment from B] + 0.5s [Segment from A] → 2s [Forged Audio]
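The four patterns above could be built roughly as follows. This is a sketch under stated assumptions, not the repository's implementation: the helper names `take` and `concat_forgeries` are hypothetical, segments are assumed to be cut from the start of each recording, and the second segment from A is assumed to continue where the first one ended. The 16 kHz rate matches TIMIT's sampling rate.

```python
import numpy as np

SR = 16000  # TIMIT recordings are sampled at 16 kHz

def take(signal, seconds, offset_s=0.0, sr=SR):
    """Hypothetical helper: cut `seconds` of audio starting at `offset_s`."""
    start = int(round(offset_s * sr))
    n = int(round(seconds * sr))
    if start + n > len(signal):
        raise ValueError("recording too short for the requested segment")
    return signal[start:start + n]

def concat_forgeries(a, b, sr=SR):
    """Build the four concatenation patterns from recordings a and b."""
    return {
        # 2 s of A followed by 1 s of B -> 3 s
        "2sA+1sB": np.concatenate([take(a, 2.0, sr=sr), take(b, 1.0, sr=sr)]),
        # 1 s of A followed by 1 s of B -> 2 s
        "1sA+1sB": np.concatenate([take(a, 1.0, sr=sr), take(b, 1.0, sr=sr)]),
        # 1 s of A, 1 s of B, then the next 1 s of A -> 3 s
        "1sA+1sB+1sA": np.concatenate([
            take(a, 1.0, sr=sr), take(b, 1.0, sr=sr),
            take(a, 1.0, offset_s=1.0, sr=sr)]),
        # 0.5 s of A, 1 s of B, then the next 0.5 s of A -> 2 s
        "0.5sA+1sB+0.5sA": np.concatenate([
            take(a, 0.5, sr=sr), take(b, 1.0, sr=sr),
            take(a, 0.5, offset_s=0.5, sr=sr)]),
    }

# Synthetic three-second stand-ins for two recordings.
a = np.arange(3 * SR, dtype=np.float32)
b = -np.arange(3 * SR, dtype=np.float32)
forged = concat_forgeries(a, b)
durations = {name: len(x) / SR for name, x in forged.items()}
```

With these inputs, `durations` maps each pattern name to its length in seconds, which should match the 3 s and 2 s targets listed above.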

📂 Output

From the original audio files, the tool generates three datasets:

Original audio dataset
Copy-move forgeries dataset
Splicing forgeries dataset

📌 Use Cases

Training deep learning models for audio forgery detection
Evaluating robustness of audio forensic systems
Dataset creation for research in speech integrity
