It looks like they didn't split the two sets (criminal/noncriminal) into separate training and test sets?
That would explain this 'paradox'. It's just overfitting:
>The seeming paradox that Sc [the criminal set] and Sn [the noncriminal set] can be classified but the average faces of Sc [the criminal set] and Sn [the noncriminal set] appear almost the same can be explained, if the data distributions of Sc [the criminal set] and Sn [the noncriminal set] are heavily mingled and yet separable.
They're heavily mingled because the two distributions are identical, and they only look separable because you're testing your predictions on your training data.
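To make the point concrete, here is a toy sketch of what scoring a classifier on its own training data can do. Everything in it is my own assumption, not the paper's setup: the data is random noise, the labels are coin flips, and the classifier is a plain 1-nearest-neighbour written by hand. The two "classes" are drawn from exactly the same distribution, so there is nothing real to learn.

```python
import random

def nearest_label(x, train):
    # 1-nearest-neighbour: return the label of the closest training point
    # (squared Euclidean distance).
    best = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])))
    return best[1]

def accuracy(points, train):
    # Fraction of points whose predicted label matches their true label.
    hits = sum(nearest_label(x, train) == y for x, y in points)
    return hits / len(points)

rng = random.Random(0)

# Hypothetical data: 400 points with 5 Gaussian features each.
# Labels are coin flips, so both "classes" share one distribution.
data = [([rng.gauss(0, 1) for _ in range(5)], rng.randint(0, 1))
        for _ in range(400)]
train, test = data[:200], data[200:]

train_acc = accuracy(train, train)  # scored on the data it memorised
test_acc = accuracy(test, train)    # scored on held-out data

print(f"train accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

Scored on its own training data, the classifier gets every point right, because each point is its own nearest neighbour. On the held-out half it should land somewhere near chance. That is the 'mingled yet separable' pattern with no real signal behind it.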