It works like this: First I calculate a minimal overlapping area of audio (because sometimes some recordings start way earlier than others), and pick a small area to sample in the audio I need to sync. That sample gets further reduced, via voice segmentation, to a 2-second snippet where someone is actually speaking. After that I search for that snippet in the master track via FFT convolution. Then it's just calculating offsets and cutting wav files: Fully automatic double-ender. I'm so happy!
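For anyone curious, the matching step could look roughly like the sketch below. This is not the actual script, just a minimal illustration assuming both tracks are already loaded as mono NumPy arrays at the same sample rate. The function name `find_offset` is made up for this example. It convolves the master with the time-reversed snippet via FFT (which is the same as cross-correlating them) and takes the position of the peak as the offset.

```python
import numpy as np

def find_offset(master, snippet):
    """Return the sample index in `master` where `snippet` matches best.

    Hypothetical helper: plain (unnormalized) cross-correlation via FFT.
    """
    master = np.asarray(master, dtype=float)
    snippet = np.asarray(snippet, dtype=float)

    # Length of the full linear convolution, padded up to a power of two
    # so the circular FFT product does not wrap around.
    n = len(master) + len(snippet) - 1
    size = 1 << (n - 1).bit_length()

    # Convolving with the time-reversed snippet equals cross-correlation.
    spectrum = np.fft.rfft(master, size) * np.fft.rfft(snippet[::-1], size)
    corr = np.fft.irfft(spectrum, size)[:n]

    # Index k of the full convolution corresponds to lag k - (len(snippet) - 1).
    return int(np.argmax(corr)) - (len(snippet) - 1)
```

The returned value is in samples, so dividing by the sample rate gives the offset in seconds. Because the correlation here is not normalized, a very loud passage in the master can produce a false peak; picking a snippet with clear speech, as described above, helps with that.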
@map very nice! if I understand correctly that does account for clocks that speed up / slow down, right? e.g. the temperature changes and one track might be in sync in the beginning and end but drift out-of-sync in the middle
@map we had this phenomenon even sitting at the same desk with 4 USB microphones. I could have used a solution like this to reduce manual stretching and compressing of time
@lastfuture no, I don't control for stretching as it isn't an issue usually™. This takes care of syncing up tracks to a master so I have a synced separate track for each speaker so I can do crossgating and stuff like that.