Jina-Kim/11775-hw1

HW1: Audio-based Multimedia Event Detection

CMU 11-775 Large-Scale Multimedia Analysis (Spring 2020)

Pipeline

  • Feature extraction (run.feature.sh)

    • MFCC
      • extract the audio track (.wav) from each video file and compute MFCC features from it
      • train a k-means model and encode each video's MFCC frames as a k-dimensional vector (count the predicted cluster assignments in each video file and normalize the counts)
    • ASR Transcriptions
      • preprocess the transcription .txt files and convert the dialog in each video into word vectors (CountVectorizer, TfidfVectorizer, and Doc2Vec)
    • MFCC + ASR Transcriptions
      • concatenate the two extracted feature vectors (MFCC and ASR) for each video
  • Multimedia event detection classification (run.med.sh)

    • Classifiers
      • train each classifier (SVM and XGBoost) on the training set and compute average precision on the validation/test sets
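
The MFCC quantization step in the pipeline can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes scikit-learn's `KMeans`, L1 normalization (the README only says "normalize"), and synthetic random frames in place of real MFCCs. The helper names `fit_codebook` and `bag_of_audio_words` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def fit_codebook(frames, k, seed=0):
    """Fit k-means on MFCC frames pooled from the training videos."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(frames)


def bag_of_audio_words(codebook, frames):
    """Encode one video's (n_frames, n_mfcc) matrix as a k-dim histogram."""
    k = codebook.n_clusters
    if len(frames) == 0:
        # A video with no audio frames gets an all-zero vector.
        return np.zeros(k)
    labels = codebook.predict(frames)
    counts = np.bincount(labels, minlength=k).astype(float)
    # Assumption: L1 normalization, so the entries sum to 1.
    return counts / counts.sum()


# Synthetic stand-in for per-video MFCC matrices (hypothetical data).
rng = np.random.default_rng(0)
videos = [rng.normal(size=(n, 13)) for n in (120, 80, 200)]

codebook = fit_codebook(np.vstack(videos), k=5)
features = np.stack([bag_of_audio_words(codebook, v) for v in videos])
print(features.shape)  # (3, 5)
```

Normalizing the counts makes videos of different lengths comparable, since a longer clip would otherwise produce a larger histogram simply because it has more frames.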

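The transcript features, feature concatenation, and classification steps can be sketched together. This is an illustrative sketch under stated assumptions rather than the repository's implementation: the transcripts, audio histograms, and labels are hypothetical toy data, the SVM uses a linear kernel with default settings, and XGBoost is left out to keep the example short. Any classifier that returns a per-video score can be evaluated the same way.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import average_precision_score
from sklearn.svm import SVC

# Hypothetical toy data: one transcript and one audio histogram per video.
train_text = [
    "happy birthday to you",
    "blow out candles on cake",
    "birthday party singing",
    "dog is barking",
    "traffic noise street",
    "",  # a video with no recognized speech
]
train_audio = np.array([
    [0.7, 0.3, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
    [0.8, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.7],
    [0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.2, 0.8],
])
y_train = np.array([1, 1, 1, 0, 0, 0])  # 1 = target event

val_text = ["singing happy birthday", "dog barking loudly",
            "birthday cake candles", "street traffic"]
val_audio = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.4, 0.6],
])
y_val = np.array([1, 0, 1, 0])

# Fit the vocabulary on training transcripts only, then reuse it.
vectorizer = TfidfVectorizer()
train_tfidf = vectorizer.fit_transform(train_text).toarray()
val_tfidf = vectorizer.transform(val_text).toarray()

# MFCC + ASR: concatenate the two feature vectors per video.
X_train = np.hstack([train_audio, train_tfidf])
X_val = np.hstack([val_audio, val_tfidf])

# Assumption: linear-kernel SVM with default hyperparameters.
clf = SVC(kernel="linear").fit(X_train, y_train)

# Average precision is computed from ranking scores, not hard labels.
scores = clf.decision_function(X_val)
ap = average_precision_score(y_val, scores)
print(f"validation AP: {ap:.3f}")
```

Fitting the vectorizer on the training transcripts alone keeps validation vocabulary from leaking into the features; words unseen at training time are simply ignored by `transform`.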