Rotation Weight Update:
Another Way Different from Dropout to Introduce Lazy Neurons in Image Recognition Tasks
Abstract
Numerous approaches have been developed to enhance the capabilities of deep learning in image recognition. We propose a method called "Rotational-update," which cyclically updates the weights of neurons. The method partitions the N neurons of a fully connected layer into equal-sized groups of √N neurons each and, for each mini-batch, updates the weights of only one group, rotating through the groups in turn. This selective updating aims to curb excessive learning, potentially reducing overfitting and improving validation accuracy. A notable property of the method is that it can be used together with other techniques, including batch normalization and dropout.
To validate the method's efficacy, we ran image recognition experiments on the CIFAR-10 dataset with three neural network architectures: VGG-16, ResNet-110, and ResNet-152. Our findings indicate that combining the proposed method with batch normalization yields higher accuracy than combining dropout with batch normalization. Specifically, Rotational-update improved accuracy by up to 5 percentage points on VGG-16 and by one percentage point on ResNet-110 compared to traditional methods. We therefore conclude that substituting the proposed method for dropout improves image recognition performance and reduces overfitting.
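The group rotation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`rotation_groups`, `update_mask`, `masked_sgd_step`) are hypothetical, and several details are assumptions the abstract does not specify: groups are formed from consecutive neuron indices, a layer whose size is not a perfect square gets a smaller final group, each neuron's weights are one row of the weight matrix, and the update rule is plain SGD.

```python
import math


def rotation_groups(n_neurons):
    """Split neuron indices 0..n_neurons-1 into consecutive groups of about sqrt(N).

    Assumption: if N is not a perfect square, the last group is smaller.
    """
    group_size = max(1, math.isqrt(n_neurons))
    return [
        list(range(start, min(start + group_size, n_neurons)))
        for start in range(0, n_neurons, group_size)
    ]


def update_mask(n_neurons, step):
    """Return a 0/1 mask marking the single group updated at mini-batch `step`."""
    groups = rotation_groups(n_neurons)
    active = groups[step % len(groups)]  # cycle through the groups in order
    mask = [0] * n_neurons
    for i in active:
        mask[i] = 1
    return mask


def masked_sgd_step(weights, grads, mask, lr=0.1):
    """Apply one SGD step, changing only the rows (neurons) whose mask entry is 1.

    `weights` and `grads` hold one row of incoming weights per neuron.
    """
    return [
        [w - lr * g * m for w, g in zip(w_row, g_row)]
        for w_row, g_row, m in zip(weights, grads, mask)
    ]


if __name__ == "__main__":
    # A 16-neuron layer yields 4 groups of 4; print which neurons learn at each step.
    for step in range(5):
        print(step, update_mask(16, step))
```

In this sketch every neuron still takes part in the forward pass; only its weight update is skipped while its group is inactive. That is the sense in which the approach differs from dropout, which removes units from the forward computation during training.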