ssbd

clip-video-classifier

Install

git clone https://github.com/sizhky/ssbd
cd ssbd
pip install -e .

How to use

Use

clip-video-classifier --help
clip-video-classifier COMMAND --help

to see help for all commands

To train a transformer model on SSBD videos using CLIP as feature extractor

First, set up your annotations

DATA_DIR="/mnt/347832F37832B388/ml-datasets/ssbd"
RAW_VIDEO_DIR="$DATA_DIR/ssbd-raw-videos"
clip-video-classifier setup-annotations $DATA_DIR
clip-video-classifier download-raw-videos $DATA_DIR/annotations.csv $RAW_VIDEO_DIR
clip-video-classifier setup-annotations $DATA_DIR --fill-gaps --videos-folder $RAW_VIDEO_DIR

Next, extract frames for each video

# change num-frames-per-sec in the script below if needed
chmod +x scripts/extract_frames.sh
./scripts/extract_frames.sh
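
To illustrate what a frames-per-second setting implies, here is a minimal sketch of picking which frames of a video to keep when sampling at a lower rate. It is not the logic of scripts/extract_frames.sh, which is not reproduced here; the function name `sampled_frame_indices` and its parameters are hypothetical.

```python
from typing import List


def sampled_frame_indices(total_frames: int, source_fps: float, target_fps: float) -> List[int]:
    """Return indices of the frames to keep when downsampling a video.

    Hypothetical helper for illustration only; the real extraction script
    may select frames differently.
    """
    if total_frames < 0 or source_fps <= 0 or target_fps <= 0:
        raise ValueError("total_frames must be >= 0 and both rates must be > 0")
    if target_fps >= source_fps:
        # Nothing to drop: the video already has no more frames than requested.
        return list(range(total_frames))

    step = source_fps / target_fps  # source frames per kept frame
    indices = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while int(i * step) < total_frames:
        indices.append(int(i * step))
        i += 1
    return indices
```

For a 2-second clip recorded at 30 fps (60 frames) and a target of 5 fps, this keeps every sixth frame, 10 frames in total.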

Now extract embeddings from each saved frames.tensor file

clip-video-classifier frames-to-embeddings "/mnt/347832F37832B388/ml-datasets/ssbd/ssbd-frames/5fps" "/mnt/347832F37832B388/ml-datasets/ssbd/ssbd-embeddings/5fps" "ViT-B/32" "cuda"

Finally, run the notebook nbs/models/02_transformer_clip.ipynb, pointing it to the appropriate ssbd-embeddings folder and setting the hyperparameters you want

API

You can run the notebook nbs/model/03_infer.ipynb directly to load the deep learning model and make predictions on every 5-second interval
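
The idea of predicting on 5-second intervals can be sketched as below. This is only an illustration under stated assumptions, not the notebook's code: the function name `interval_slices` is hypothetical, and the default of 5 frames per second is assumed from the 5fps folder used in the embedding step.

```python
from typing import List, Tuple


def interval_slices(num_frames: int, fps: int = 5, seconds: int = 5) -> List[Tuple[int, int]]:
    """Split a frame sequence into consecutive fixed-length windows.

    Returns (start, end) index pairs with end exclusive. The last window is
    shorter when num_frames is not a multiple of fps * seconds.
    Hypothetical helper for illustration only.
    """
    if num_frames < 0 or fps <= 0 or seconds <= 0:
        raise ValueError("num_frames must be >= 0; fps and seconds must be > 0")
    window = fps * seconds  # frames per interval, 25 with the defaults
    return [
        (start, min(start + window, num_frames))
        for start in range(0, num_frames, window)
    ]
```

Each (start, end) pair can then be used to slice the per-frame embeddings of one video, so the model sees one window at a time.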

You can also launch a FastAPI server with uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload and send it a POST request
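
A POST request to the running server could look roughly like the sketch below, which uses only the Python standard library. The route `/predict` and the `video_path` field are assumptions made for illustration; check app/main.py for the actual route and request schema before using it.

```python
import json
import urllib.request


def build_prediction_request(base_url: str, video_path: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request.

    ASSUMPTION: the "/predict" route and the "video_path" field are
    hypothetical; the real API may use different names.
    """
    body = json.dumps({"video_path": video_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the response, assuming a JSON object body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the uvicorn server is running on port 8000:
#   request = build_prediction_request("http://localhost:8000", "/path/to/video.mp4")
#   print(send(request))
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before contacting the server.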
