Well, it's pure speculation on my part what the root cause is, but I think OpenAI is already trying to ensure the network generalises. It's just common behaviour for neural networks to memorise frequent samples, so I think my guess is quite realistic. I doubt OpenAI would fail to notice large-scale memorisation in their model. But as long as they don't publish more details, it's just guesswork.
Just keep in mind that it's a statistical tool. You can't really formally prove that it won't memorise, but I think with enough work you can make it unlikely enough that it won't matter. It's their first iteration.