ResNet
Is learning better networks as easy as stacking more layers?
- Vanishing/exploding gradients → hamper convergence from the beginning.
- Plain (non-residual) networks: the deeper network has higher training error, not just higher test error (the degradation problem, which is not explained by overfitting).
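ResNet's answer to the degradation problem is the residual block: instead of learning a mapping H(x) directly, the stacked layers learn the residual F(x) = H(x) - x, and an identity shortcut adds x back. A minimal sketch, assuming PyTorch (channel counts and layer sizes are illustrative, not the paper's exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Residual block: output = ReLU(F(x) + x), with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))   # first conv + BN + ReLU
        out = self.bn2(self.conv2(out))         # second conv + BN, no ReLU yet
        return F.relu(out + x)                  # add the identity shortcut, then ReLU
```

The identity shortcut adds no extra parameters, so a deeper stack of such blocks can at worst imitate its shallower counterpart by driving the residuals toward zero.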
Batch Normalization: forces the input of each layer to maintain a stable, consistent distribution, which largely addresses vanishing/exploding gradients.
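A minimal sketch of the training-time computation, assuming NumPy; `gamma` and `beta` are the learnable per-feature scale and shift:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    x: array of shape (batch, features); gamma, beta: per-feature parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * x_hat + beta              # restore representational freedom
```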
SGD: Stochastic Gradient Descent
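For reference, the update SGD applies on each mini-batch (a minimal sketch; `lr` is an illustrative learning rate):

```python
def sgd_step(params, grads, lr=0.1):
    """One SGD step: theta <- theta - lr * grad, using the mini-batch gradient."""
    return [p - lr * g for p, g in zip(params, grads)]
```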
ReLU: Rectified Linear Unit
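ReLU keeps positive activations and zeroes out the rest; a one-line sketch with NumPy:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: max(0, x)."""
    return np.maximum(x, 0)
```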