From LeNet to SENet: A Convolutional Neural Network Review
Classic Networks
GoogLeNet and ResNet
The Inception Family
The ResNet Family
Mobile Networks
SENet
Summary
AI Technology Review’s Note:
This article was written by Zhejiang University’s Star Fan as an exclusive piece for AI Technology Review. Unauthorized reproduction is strictly prohibited.
Starting from the early 1990s, convolutional neural networks (CNNs) have evolved significantly, transforming how we approach complex pattern recognition tasks. From the pioneering work of LeNet in 1998 to the groundbreaking advancements of ResNet and SENet in recent years, the journey through the development of CNNs is nothing short of remarkable.
Classic Networks
LeNet
Modern CNNs date back to 1998, when LeNet introduced the fundamental building blocks of convolution, activation, and pooling. These components still form the backbone of today’s CNN architectures. Deep learning itself, however, saw no major breakthrough until the mid-2000s.
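For reference, here is a minimal LeNet-5-style model sketched in PyTorch. PyTorch is our assumption (the article shows no code), as is using tanh activations with average pooling to approximate the original’s subsampling layers:

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    """A LeNet-5-style sketch for 1x32x32 inputs and 10 classes."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),   # 32x32 -> 28x28
            nn.AvgPool2d(2),                             # -> 14x14
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),  # -> 10x10
            nn.AvgPool2d(2),                             # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```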
AlexNet
In 2012, AlexNet brought about a paradigm shift. With an ImageNet Top5 error rate of 16.4%, it surpassed the previous state of the art by a wide margin. Its key innovations included greater network depth, ReLU activation functions, Dropout regularization, GPU training, and techniques such as data augmentation and learning rate schedules.
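To make two of those tricks concrete, here is a PyTorch-style sketch of an AlexNet-like classifier head combining Dropout and ReLU (the layer sizes follow the commonly cited AlexNet configuration but are illustrative here, not a claim about the exact original):

```python
import torch.nn as nn

# Dropout randomly zeroes activations during training to fight overfitting;
# ReLU replaces the saturating sigmoid/tanh units of earlier networks.
classifier = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(inplace=True),
    nn.Linear(4096, 1000),  # 1000 ImageNet classes
)
```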
VGG
Following AlexNet, VGG networks (2014) took depth to the next level. Built almost entirely from stacked 3x3 convolutions, VGG reached 16 and 19 layers compared to AlexNet’s 8 and demonstrated the power of deeper, more uniform architectures. Despite the increased cost, rapid advances in hardware, particularly powerful GPUs, made these networks feasible.
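The VGG recipe fits in a few lines. The following PyTorch-style sketch builds one stage of stacked 3x3 convolutions followed by pooling (the helper name and channel counts are illustrative assumptions):

```python
import torch.nn as nn

def vgg_block(in_ch, out_ch, num_convs):
    """Stack `num_convs` 3x3 conv+ReLU pairs, then halve resolution."""
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch,
                             kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# e.g. the first two stages of a VGG-16-style network:
stage1 = vgg_block(3, 64, num_convs=2)
stage2 = vgg_block(64, 128, num_convs=2)
```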
GoogLeNet and ResNet
Inception (V1)
In 2014, GoogLeNet introduced the Inception module, which processes the same input with several filter sizes in parallel and concatenates the results. Inception V1 achieved an ImageNet Top5 error rate of 6.7%. The core innovation was the use of 1x1 convolutions to reduce channel counts, and hence computational cost, before the expensive larger filters.
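A minimal PyTorch-style sketch of an Inception V1 module follows. The branch widths are illustrative assumptions; the essential points are the four parallel branches, the 1x1 reductions, and the channel-wise concatenation:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch, c1, c3r, c3, c5r, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.ReLU(True))
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, c3r, 1), nn.ReLU(True),   # 1x1 reduce
                                nn.Conv2d(c3r, c3, 3, padding=1), nn.ReLU(True))
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c5r, 1), nn.ReLU(True),   # 1x1 reduce
                                nn.Conv2d(c5r, c5, 5, padding=2), nn.ReLU(True))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU(True))

    def forward(self, x):
        # All branches preserve spatial size, so outputs concatenate on channels.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```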
ResNet
In 2015, ResNet introduced residual connections, which mitigate the vanishing gradient problem and enable the training of extremely deep networks, up to 152 layers. A ResNet ensemble achieved an ImageNet Top5 error rate of 3.57%, showcasing how effectively shortcut connections ease the training of deeper networks.
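Sketched below, assuming PyTorch and a stride-1 block with matching channel counts, is the basic residual block: the stacked convolutions learn a residual F(x), and the identity shortcut adds x back so gradients flow through the addition unimpeded:

```python
import torch.nn as nn

class BasicBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # identity shortcut: learn F(x), output F(x) + x
```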
The Inception Family
Inception V2 to V4
Inception V2 (BN-Inception)
In 2015, BN-Inception introduced Batch Normalization to further stabilize and accelerate the training process. By normalizing the activations within each mini-batch, BN-Inception achieved an ImageNet Top5 error rate of 4.8%.
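The conv → batch norm → ReLU pattern that BN-Inception popularized looks roughly like this PyTorch-style sketch (the helper name is ours):

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, kernel_size, **kwargs):
    """Convolution, then per-channel normalization of each mini-batch's
    activations (with learned scale and shift), then the nonlinearity."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, bias=False, **kwargs),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```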
Inception V3
In 2015, Inception V3 refined the architecture further by factorizing large convolution kernels into multiple smaller ones: a 5x5 kernel becomes two stacked 3x3 kernels, and an nxn kernel becomes a 1xn followed by an nx1. This reduced computational cost and improved the ImageNet Top5 error rate to 3.5%.
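A PyTorch-style sketch of both factorizations, with illustrative channel counts:

```python
import torch.nn as nn

# Original: one 5x5 convolution.
five_by_five = nn.Conv2d(64, 64, kernel_size=5, padding=2)

# Factored: two stacked 3x3 convolutions have the same receptive field
# as one 5x5 but fewer multiply-adds, plus an extra nonlinearity.
factored = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
)

# Asymmetric: an nxn kernel split into 1xn followed by nx1 (here n = 7).
asymmetric = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 7), padding=(0, 3)),
    nn.Conv2d(64, 64, kernel_size=(7, 1), padding=(3, 0)),
)
```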
Inception V4 and Inception-ResNet V1/V2
In 2016, Inception V4 and Inception-ResNet V1/V2 continued to build upon the Inception architecture, incorporating residual connections to facilitate the training of even deeper networks. These networks achieved an ImageNet Top5 error rate of 3.08%.
The ResNet Family
WRN, DenseNet, ResNeXt, Xception
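WRN
In 2016, Wide Residual Networks (WRN) explored the direction orthogonal to depth: widening the residual blocks with more channels per layer. A shallower but wider ResNet can match the accuracy of far deeper ones while being faster to train.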
DenseNet
In 2016, DenseNet introduced dense connectivity: within a dense block, each layer receives the concatenated feature maps of all preceding layers as input. This promotes feature reuse and eases training, though the growing concatenations increase memory usage and computational cost.
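A minimal PyTorch-style sketch of a dense block; the BN-ReLU-Conv layer order and the growth-rate idea follow the paper, while the sizes are illustrative:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer consumes all earlier feature maps (concatenated) and
    contributes `growth_rate` new channels to the running collection."""
    def __init__(self, in_ch, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_ch + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_ch + i * growth_rate, growth_rate, 3, padding=1),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)
```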
ResNeXt
In 2017, ResNeXt revived grouped convolutions (first used in AlexNet), splitting the channels of each bottleneck into parallel groups, its “cardinality”, that are transformed independently. This delivers higher accuracy at roughly the same complexity as a plain ResNet. ResNeXt-101 (64x4d) achieved an ImageNet Top5 error rate of 3.03%.
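Here is a hedged PyTorch-style sketch of the residual branch of a ResNeXt-style bottleneck; the widths mimic a 32x4d block (32 groups of width 4), and the surrounding shortcut addition is omitted:

```python
import torch.nn as nn

def resnext_branch(in_ch=256, width=128, cardinality=32):
    """1x1 reduce, grouped 3x3 (32 independent paths), 1x1 expand."""
    return nn.Sequential(
        nn.Conv2d(in_ch, width, 1, bias=False),        # reduce channels
        nn.BatchNorm2d(width), nn.ReLU(inplace=True),
        nn.Conv2d(width, width, 3, padding=1,
                  groups=cardinality, bias=False),     # 32 parallel paths
        nn.BatchNorm2d(width), nn.ReLU(inplace=True),
        nn.Conv2d(width, in_ch, 1, bias=False),        # expand back
        nn.BatchNorm2d(in_ch),
    )
```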
Xception
In 2016, Xception pushed this idea to its extreme with depthwise separable convolutions, which fully decouple spatial filtering from cross-channel mixing. This reduces computational complexity while maintaining accuracy: with the same parameter count as Inception V3, Xception slightly outperformed it on ImageNet.
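A depthwise separable convolution, sketched in PyTorch (the helper name is ours):

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch):
    """Depthwise 3x3 (groups == channels, so each channel is filtered
    independently), then a pointwise 1x1 that mixes channels."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1,
                  groups=in_ch, bias=False),              # spatial filtering
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),  # channel mixing
    )
```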
Mobile Networks
SqueezeNet, MobileNet v1 and v2, ShuffleNet
MobileNet v1
In 2017, MobileNet v1 built its architecture almost entirely from the depthwise separable convolutions sketched above, trading a little accuracy for a large drop in computation. It reached accuracy comparable to Inception V1 (GoogLeNet) at a fraction of the cost, making it practical for mobile applications.
MobileNet v2
In 2018, MobileNet v2 introduced inverted residuals with linear bottlenecks, further optimizing the architecture for mobile devices and improving both accuracy and latency over v1.
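A stride-1, equal-width inverted residual block might look like the following PyTorch-style sketch; the 6x expansion factor follows the paper, everything else is illustrative:

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand with 1x1, filter with depthwise 3x3, project back with a
    linear (no ReLU) 1x1; the shortcut joins the narrow bottleneck ends."""
    def __init__(self, ch, expansion=6):
        super().__init__()
        hidden = ch * expansion
        self.block = nn.Sequential(
            nn.Conv2d(ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, ch, 1, bias=False),   # linear bottleneck
            nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        return x + self.block(x)  # shortcut between narrow representations
```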
ShuffleNet
In 2017, ShuffleNet combined grouped 1x1 convolutions with a channel shuffle operation that lets information flow across the groups. This allowed the network to achieve accuracy similar to AlexNet while running roughly 13 times faster in practice.
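The shuffle itself is just a reshape-transpose-reshape, as in this PyTorch-style sketch:

```python
import torch

def channel_shuffle(x, groups):
    """Reshape channels into (groups, channels_per_group), transpose, and
    flatten back, so the next grouped conv sees channels from every group."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

# e.g. shuffle a batch of 8-channel feature maps across 2 groups:
out = channel_shuffle(torch.randn(1, 8, 16, 16), groups=2)
```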
SENet
In addition to the networks above, SENet introduced the Squeeze-and-Excitation (SE) module, which recalibrates features channel by channel: a global average pooling step squeezes each channel to a single statistic, and a small pair of fully connected layers with a sigmoid gate then rescales the channels. This lets the network emphasize the most informative features at minimal extra computational cost; an SENet won the ILSVRC 2017 classification task with a Top5 error rate of 2.25%.
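A minimal PyTorch-style sketch of an SE block; the reduction ratio of 16 follows the paper, while the class name is ours:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: NxCxHxW -> NxCx1x1
        self.fc = nn.Sequential(              # excitation: per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        n, c, _, _ = x.size()
        scale = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * scale                      # rescale each channel
```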
Summary
Top5 ImageNet Error Rate Summary Table
| Year | Network | Top5 Error Rate |
|---|---|---|
| 2012 | AlexNet | 16.4% |
| 2014 | VGG | 7.3% |
| 2014 | Inception V1 (GoogLeNet) | 6.7% |
| 2015 | ResNet (ensemble) | 3.57% |
| 2016 | Inception V4 / Inception-ResNet V2 | 3.08% |
| 2017 | ResNeXt-101 (64x4d) | 3.03% |
| 2017 | SENet | 2.25% |

Mobile-oriented networks (MobileNet, ShuffleNet) are omitted here: they trade some accuracy for much lower computation, so their headline error rates are not directly comparable to this leaderboard.
Conclusion
Choosing the right network depends on your specific needs. For general applications, ResNet-50, Xception, or Inception V3 are solid defaults. When accuracy is the top priority, ResNeXt-101 (64x4d) or Inception-ResNet V2 are excellent choices. For mobile deployment, ShuffleNet or MobileNet v2 are highly recommended. An SE module can be added to most existing networks to improve accuracy with minimal additional computation.
Originally published: February 15, 2018