You could always replace all the sound in post: re-record any dialogue, get some ambience recordings of the locations, and fill in some music that you can get the rights to (local bands are usually pretty eager for exposure, or you can find a composer who can do "sound-alike" club music, etc.). You'd lose some of the authenticity (though as a Herzog fan I like the idea of blurring fact and fiction), but you would gain a lot more control over the mix, which would be really nice if you want your audience to understand what people are saying.
However, I assume the people you are following will not be actors, so getting them to re-record their speech will not be easy. It could easily end up looking really bad, but depending on the direction you are going with the documentary, it could make for a neat, surreal, off-kilter effect.