My suggested approach takes hashing as the baseline (identifying files that are actually identical, which Dropbox is apparently already doing) and goes beyond it (identifying files that are identical in source and intent, but differ owing to random trivial errors).
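To make the two tiers concrete, here is a rough Python sketch. The first function is the hash baseline; the second is only one possible way to score "nearly identical" files, by comparing fixed-size blocks at the same offsets. The function names, the 4096-byte block size, and the block-comparison measure itself are my own assumptions for illustration, not anything Dropbox is known to do.

```python
import hashlib
from collections import defaultdict


def group_identical(files: dict[str, bytes]) -> list[list[str]]:
    """Baseline: group file names whose contents share a SHA-256 digest."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only groups with more than one member are duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def block_similarity(a: bytes, b: bytes, block_size: int = 4096) -> float:
    """Hypothetical measure: fraction of aligned blocks that match exactly."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty files are trivially identical
    offsets = range(0, longest, block_size)
    same = sum(a[i:i + block_size] == b[i:i + block_size] for i in offsets)
    return same / len(offsets)
```

A score close to 1.0 with a non-matching whole-file hash would flag a pair as "same file, slightly damaged". Note that comparing aligned blocks only catches in-place corruption (a few flipped or overwritten bytes); a single inserted or deleted byte shifts every later block, so a real implementation would need something more robust.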