References

Models articles:
- Deep Residual Learning for Image Recognition

Optimizer articles:
- AdaHessian: An Adaptive Second Order Optimizer for Machine Learning
- Combining Optimization Methods Using an Adaptive Meta Optimizer
- Decoupled Weight Decay Regularization
- Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

Other Python libraries:
- PyTorch: An Imperative Style, High-Performance Deep Learning Library
- tqdm