To produce our Fanboys podcast I have to sync my local recordings with those of others participating via FaceTime. I usually do this by lining up the tracks in an audio editor. After years of doing that, today I had the idea to finally automate this with a simple Python script. Yay!

It works like this: First I calculate a minimal overlapping area of audio (because sometimes some recordings start way earlier than others), and pick a small area to sample in the audio I need to sync. That sample gets further reduced to a 2 second snippet where someone is actually speaking, found via voice segmentation. After that I match for that snippet in the master track via FFT convolution. Then it's just calculating offsets and cutting wav files: Fully automatic double-ender. I'm so happy!
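
For anyone curious what the matching step can look like, here is a minimal sketch and not the actual script. It assumes both recordings are already loaded as mono NumPy arrays at the same sample rate, and the names `find_offset` and `track_offset` are made up for this example. It uses plain, unnormalized cross-correlation computed via FFT, so very loud passages in the master can bias the match; a real script might normalize first.

```python
import numpy as np

def find_offset(master, snippet):
    """Return the sample index in `master` where `snippet` matches best.

    Cross-correlation is computed as an FFT convolution of the master
    with the time-reversed snippet. Assumes len(master) >= len(snippet).
    """
    master = np.asarray(master, dtype=np.float64)
    snippet = np.asarray(snippet, dtype=np.float64)
    n = len(master) + len(snippet) - 1      # length of the full convolution
    size = 1 << (n - 1).bit_length()        # next power of two for the FFT
    spectrum = np.fft.rfft(master, size) * np.fft.rfft(snippet[::-1], size)
    corr = np.fft.irfft(spectrum, size)[:n]
    # Index k of the full result corresponds to lag k - (len(snippet) - 1).
    # Keep only lags where the snippet lies completely inside the master.
    valid = corr[len(snippet) - 1 : len(master)]
    return int(np.argmax(valid))

def track_offset(master, track, snippet_start, snippet_len):
    """Offset (in samples) of `track` relative to `master`.

    `snippet_start` and `snippet_len` are hypothetical parameters that
    stand in for the result of the voice segmentation step.
    A positive result means track sample 0 lines up with that master sample.
    """
    snippet = track[snippet_start : snippet_start + snippet_len]
    return find_offset(master, snippet) - snippet_start
```

With a 2 second snippet at 44.1 kHz, `snippet_len` would be 88200 samples. Once the offset is known, the track can be padded or trimmed by that many samples before writing the wav file.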

@map very nice! if I understand correctly that does account for clocks that speed up / slow down, right? e.g. the temperature changes and one track might be in sync in the beginning and end but drift out-of-sync in the middle

@map we had this phenomenon even sitting at the same desk with 4 USB microphones. I could have used a solution like this to reduce manual stretching and compressing of time

@lastfuture no, I don't control for stretching as it isn't an issue usually™. This takes care of syncing up tracks to a master so I have a synced separate track for each speaker so I can do crossgating and stuff like that.
