This wraps Meta's TRIBE v2 model, which predicts fMRI brain responses to video, audio, and text by combining LLaMA 3.2, V-JEPA 2, and Wav2Vec2-BERT encoders, with predictions mapped onto the cortical surface (fsaverage5, roughly 20k vertices). You'd use it for in-silico neuroscience work, like predicting which brain regions light up when someone watches a video or hears speech, without needing actual fMRI data. The inference API is clean: pass a video path, get back a timesteps-by-vertices array you can plot on a brain surface or slice by ROI (see the sketch after the install command below). Training support is included if you have your own fMRI datasets. Fair warning: you'll need HuggingFace auth for LLaMA access and enough GPU memory, though CPU works for small jobs.
npx skills add https://github.com/aradotso/trending-skills --skill tribev2-brain-encoding
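
The skill's exact Python surface isn't spelled out above, so here's a minimal sketch of the call pattern described: video path in, timesteps-by-vertices array out. The module name `tribev2_brain_encoding` and the functions `load_model` and `predict` are hypothetical placeholders, and the left-hemisphere-first vertex ordering is an assumption; only the nilearn calls at the end are real library API.

```python
# Minimal usage sketch -- NOT the wrapper's documented API. load_model and
# predict are hypothetical stand-ins for whatever the skill actually exposes.
import numpy as np
from nilearn import datasets, plotting

from tribev2_brain_encoding import load_model, predict  # hypothetical import

# LLaMA 3.2 weights are gated, so authenticate first (huggingface-cli login).
model = load_model(device="cuda")  # device="cpu" for small jobs

# Video in, predicted brain activity out: (timesteps, vertices) on fsaverage5.
pred = predict(model, "stimulus.mp4")
print(pred.shape)  # e.g. (T, 20484) for the full fsaverage5 surface

# fsaverage5 has 10,242 vertices per hemisphere; assuming the wrapper orders
# left-hemisphere vertices first, split the array like this:
left = np.asarray(pred)[:, :10242]

# Plot one predicted timestep on an inflated left-hemisphere surface.
fsaverage = datasets.fetch_surf_fsaverage("fsaverage5")
plotting.plot_surf_stat_map(fsaverage.infl_left, left[0], hemi="left",
                            colorbar=True)
plotting.show()
```

Whatever the real entry points are called, the shape contract is the useful part: a (timesteps, vertices) array on fsaverage5 slices cleanly against any ROI mask defined on that surface.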