<Three-layer network: input - hidden - output>
If the hidden layer is too small relative to the input size, learning saturates early.
The loss plateaus at a relatively high value.
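The plateau can be illustrated with a minimal sketch, under the assumption of a purely linear AE (no nonlinearity), where the best reachable loss is known in closed form from the singular values of the data. The function name `linear_ae_loss_floor` and the random data are hypothetical, and a nonlinear AE is not bound by this exact value; the sketch only shows how the loss floor rises as the hidden layer shrinks.

```python
import numpy as np

def linear_ae_loss_floor(X, hidden_size):
    """Lowest mean squared reconstruction error that a *linear* AE with
    `hidden_size` hidden units can reach on X (rows are samples)."""
    Xc = X - X.mean(axis=0)                   # biases can absorb the mean
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    # Energy in the directions the bottleneck cannot represent
    # (best rank-k approximation, Eckart-Young).
    return float(np.sum(s[hidden_size:] ** 2) / Xc.size)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))                # hypothetical data, input size 32
floors = {h: linear_ae_loss_floor(X, h) for h in (2, 8, 16, 32)}
for h, floor in floors.items():
    print(f"hidden={h:2d}  loss floor={floor:.4f}")
```

No amount of extra training moves a linear AE below these values, so a small hidden layer shows up as a loss curve that flattens early and stays high.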
The ideal hidden/input size ratio grows as the input size increases.
To build autoencoders (AEs) with a large input size, train layer by layer, from the largest (outermost) layer in toward the central bottleneck layer.
Ex. To train an n1 - n2 - n3 - n4 - n5 network, first train n1 - n2 - n5, then train n2 - n3 - n4 on the hidden (n2) activations of the first network, and finally train n1 - n2 - n3 - n4 - n5 as a whole.
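The three stages can be sketched in plain NumPy as follows. This is only an illustration under stated assumptions, not a reference implementation: the AE is taken to be symmetric (n4 = n2, n5 = n1), hidden layers use tanh, the output layer is linear, training is full-batch gradient descent on mean squared error, and the helper names (`init_layer`, `forward`, `mse`, `train`), the layer sizes and the toy data are all hypothetical.

```python
import numpy as np

def init_layer(rng, n_in, n_out):
    """One dense layer as a mutable [weights, bias] pair."""
    W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_in, n_out))
    return [W, np.zeros(n_out)]

def forward(layers, tanh_flags, X):
    """Return the activations of every layer, the input included."""
    acts = [X]
    for (W, b), use_tanh in zip(layers, tanh_flags):
        z = acts[-1] @ W + b
        acts.append(np.tanh(z) if use_tanh else z)
    return acts

def mse(layers, tanh_flags, X, target):
    return float(np.mean((forward(layers, tanh_flags, X)[-1] - target) ** 2))

def train(layers, tanh_flags, X, target, epochs=400, lr=0.05):
    """Full-batch gradient descent on MSE; updates `layers` in place."""
    for _ in range(epochs):
        acts = forward(layers, tanh_flags, X)
        grad = 2.0 * (acts[-1] - target) / target.size   # dLoss/dOutput
        for i in reversed(range(len(layers))):
            W, b = layers[i]
            if tanh_flags[i]:
                grad = grad * (1.0 - acts[i + 1] ** 2)   # back through tanh
            grad_in = grad @ W.T          # gradient passed to the layer below
            W -= lr * (acts[i].T @ grad)
            b -= lr * grad.sum(axis=0)
            grad = grad_in
    return mse(layers, tanh_flags, X, target)

rng = np.random.default_rng(0)
n1, n2, n3 = 16, 8, 4     # assumed symmetric AE: n4 = n2, n5 = n1
# Hypothetical toy data with low-rank structure.
X = 0.3 * rng.normal(size=(200, n3)) @ rng.normal(size=(n3, n1))

# Stage 1: train the outer AE n1 - n2 - n5 to reconstruct X.
outer = [init_layer(rng, n1, n2), init_layer(rng, n2, n1)]
loss_outer = train(outer, [True, False], X, X)

# Stage 2: train the inner AE n2 - n3 - n4 to reconstruct the n2 activations.
H = forward(outer[:1], [True], X)[-1]
inner = [init_layer(rng, n2, n3), init_layer(rng, n3, n2)]
loss_inner = train(inner, [True, True], H, H)

# Stage 3: stack the pretrained layers and fine-tune n1 - n2 - n3 - n4 - n5.
# The stacked list shares the layer objects, so fine-tuning updates them in place.
stacked = [outer[0], inner[0], inner[1], outer[1]]
flags = [True, True, True, False]
loss_before = mse(stacked, flags, X, X)
loss_after = train(stacked, flags, X, X)
print(loss_outer, loss_inner, loss_before, loss_after)
```

The inner AE uses tanh on its output as well, so that after stacking the n4 layer produces values in the same range as the n2 activations the outer decoder was trained on.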