GitHub - Jina-Kim/11775-hw1: Audio-based Multimedia Event Detection

HW1: Audio-based Multimedia Event Detection

Feature extraction (run.feature.sh)
- MFCC
  - extract the audio track (.wav) from each video file, then compute MFCC features from the audio
  - train a k-means model on the MFCC frames and represent each video as a k-dimensional vector (count how many frames in the video are assigned to each cluster, then normalize the counts)
- ASR transcriptions
  - preprocess the transcription .txt files and convert the dialog in each video into a word vector (CountVectorizer, TfidfVectorizer, and Doc2Vec)
- MFCC + ASR transcriptions
  - concatenate the two extracted feature vectors
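
The counting and concatenation steps above can be sketched in a few lines. This is a minimal illustration rather than the code in `scripts`: it assumes the per-frame cluster IDs have already been predicted by a fitted k-means model, it assumes the normalization is L1 (counts divided by the number of frames), and the function names `bag_of_audio_words` and `concat_features` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def bag_of_audio_words(cluster_ids: Sequence[int], k: int) -> List[float]:
    """Turn the per-frame cluster assignments of one video into a k-dim histogram.

    cluster_ids: cluster index predicted for each MFCC frame (0 <= id < k).
    Returns the counts divided by the number of frames (assumed L1 normalization);
    a video with no frames maps to an all-zero vector.
    """
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    if total == 0:
        return [0.0] * k
    return [counts.get(i, 0) / total for i in range(k)]


def concat_features(audio_vec: Sequence[float], text_vec: Sequence[float]) -> List[float]:
    """Concatenate the audio histogram and the transcript vector of one video."""
    return list(audio_vec) + list(text_vec)


# Made-up example: 6 MFCC frames assigned to k = 4 clusters.
frame_clusters = [0, 2, 2, 3, 2, 0]
audio_vec = bag_of_audio_words(frame_clusters, k=4)

# Made-up 3-term transcript vector standing in for the ASR features.
text_vec = [0.0, 0.5, 0.5]
mixed_vec = concat_features(audio_vec, text_vec)  # length 4 + 3 = 7
```

Because every video ends up with a vector of the same length k regardless of its duration, the histogram can be fed directly to a classifier or joined with the transcript vector.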

Multimedia event detection classification (run.med.sh)
- Classifiers
  - train each classifier (SVM and XGBoost) on the training set and compute average precision on the validation/test set
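
For reference, average precision for a single event can be computed from the classifier scores as shown below. This is a sketch of the standard ranking-based definition, not necessarily the evaluation code this repository uses; the function name `average_precision` is hypothetical, and tied scores are simply kept in input order here, which can differ from how library implementations such as scikit-learn's `average_precision_score` handle ties.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of precision-at-rank taken at each positive video.

    labels: 1 for videos that contain the event, 0 otherwise.
    scores: classifier confidence per video (higher means more likely positive).
    Returns 0.0 when there are no positive videos.
    """
    # Rank videos from highest to lowest score (stable for tied scores).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision among the top `rank` videos
    return precision_sum / hits if hits else 0.0


# Made-up example: positives are ranked 1st and 3rd.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)  # (1/1 + 2/3) / 2
```

A score of 1.0 means every positive video is ranked above every negative one, so the metric rewards classifiers that order videos well rather than ones that only pick a good threshold.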

Repository contents
- .idea
- asr_pred
- mfcc_pred
- mixed_pred
- scripts
- README.md
- run.feature.sh
- run.med.sh