Section A

A Comparative Study of Alzheimer’s Disease Classification using Multiple Transfer Learning Models

Deekshitha Prakash1, Nuwan Madusanka1, Subrata Bhattacharjee1, Hyeon-Gyun Park1, Cho-Hee Kim2, Heung-Kook Choi1,*
1Department of Computer Engineering, Inje University, Gimhae, Korea
2Department of Digital Anti-Aging Healthcare, Inje University, Gimhae, Korea
*Corresponding Author: Heung-Kook Choi, Inje University, Gimhae, Republic of Korea, +82 55 320 3437

© Copyright 2019 Korea Multimedia Society. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Dec 11, 2019; Revised: Dec 20, 2019; Accepted: Dec 23, 2019

Published Online: Dec 31, 2019


Over the past decade, machine learning techniques, particularly predictive algorithms and automatic pattern recognition in medical imaging, have enabled researchers to solve complex medical problems and to gain a deeper understanding of them. In this study, transfer learning is used to classify Magnetic Resonance (MR) images with a pre-trained Convolutional Neural Network (CNN). Rather than training an entire model from scratch, the transfer learning approach fine-tunes a pre-trained CNN to classify MR images into Alzheimer’s disease (AD), mild cognitive impairment (MCI) and normal control (NC). The performance of this method was evaluated on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset while varying the learning rate of the model. To demonstrate the transfer learning approach, we use four pre-trained deep learning models, GoogLeNet, VGG-16, AlexNet and ResNet-18, and compare their performance in classifying AD. GoogLeNet achieved overall training and testing accuracies of 99.84% and 98.25%, respectively, higher than those of the other models.

Keywords: Alzheimer’s disease; CNN; MR images; Transfer learning


I. Introduction

Alzheimer’s disease (AD) is a neurological brain disorder that causes permanent damage to the brain cells associated with thinking and memory. The cognitive decline caused by this disorder ultimately leads to dementia. Studies indicate that the brain changes associated with Alzheimer’s may begin 20 or more years before symptoms appear [1], and there is currently no treatment for AD [2]. According to the 2018 Alzheimer’s disease facts and figures, AD is the sixth leading cause of death in the United States, and about 5.7 million Americans are living with AD [3].

While no cure exists for the disease yet, there is consensus on the need for, and the benefit of, early diagnosis of AD. Many neurologists and medical researchers have devoted considerable time to methods for the early detection of AD, and promising results continue to be reported [4]. Many studies have addressed the diagnosis of AD based on Magnetic Resonance Imaging (MRI) data, which plays a significant role in classifying AD. Various computer-assisted techniques have been proposed that classify features extracted from the input images. These features are usually extracted from regions of interest (ROI) or volumes of interest (VoI) [5], or combine several kinds of extracted features [6]. While most existing work has focused on binary classification, which only separates AD from NC, proper treatment requires distinguishing AD, MCI and NC. MCI is a stage prior to AD in which patients show mild symptoms of AD and bear the risk of progressing to dementia [7].

Recently, machine learning techniques, particularly deep learning, have shown great potential in aiding the diagnosis of AD from MRI scans. Deep learning methods such as CNNs have been shown to outperform existing machine learning methods [8-9]. Deep learning has made massive progress in image processing, mainly owing to the availability of large labeled datasets such as ImageNet, which allow models to be learned more accurately. ImageNet offers around 1.2 million natural images in 1000 distinct classes. A CNN trained on such images achieves high accuracy and can also improve medical image categorization. However, training a CNN from scratch has limitations, one of which is the need for a large dataset. Transfer learning is an alternative approach that overcomes this problem, since it requires far less data and less training time [10, 11]. Transfer learning is a machine learning method in which a pre-trained network is reused as the starting point for a model on a second task.

In this paper, we employ four pre-trained base models to illustrate how transfer learning can effectively classify AD. The main objective is to show that, instead of training a completely new model from scratch, we can apply transfer learning without any preprocessing of the MR images and still achieve high accuracy. We also compare the performance of different deep learning models, namely GoogLeNet, VGG-16, AlexNet and ResNet-18, under transfer learning.

II. Materials Used

The data used in the study were taken from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies and non-profit organizations, as a $60 million, 5-year public-private partnership. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). ADNI is the result of the efforts of many investigators, and subjects have been recruited from over 50 sites across the U.S. and Canada.

All subjects were required to be 60 years of age or older. The entire image set was classified subjectively by a neurologist, a radiologist, and a psychiatrist into the categories AD, MCI, and NC. The details of the MR images used in this study are shown in Table 1.

Table 1. Characteristics of selected subjects from the ADNI dataset.
Class | Male | Female | Age range (years)
AD | 160 | 170 | 70-90
MCI | 190 | 210 | 70-90
NC | 143 | 147 | 70-90

III. Classification using Transfer Learning

The proposed method uses transfer learning for 3-way classification of AD. The architecture of the transfer learning approach is shown in Fig. 1.

Fig. 1. Architectural representation of transfer learning approach.

The transfer learning approach is helpful when only a small training dataset is available for parameter learning [12]. We take a trained network, e.g., GoogLeNet, as the starting point for learning a new task. GoogLeNet pre-trained on ImageNet is taken as the base model and trained on brain MR images from the ADNI dataset. To apply transfer learning, the final fully-connected layer is removed, since its output covers the 1000 ImageNet categories, and is replaced by a new fully-connected layer followed by a softmax layer and an output layer for classifying 3 classes. Then, we train the network on the training set of MR images with the chosen training options. Next, we test the model and obtain its testing accuracy. Finally, we report the results using a confusion matrix.
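The layer replacement described above can be pictured with a minimal sketch in plain Python. It assumes the retained pre-trained layers have already produced a feature vector for one MR image. The feature values, the weights and the helper names (`new_head`, `softmax`) are hypothetical and chosen only for illustration; the study itself was carried out in MATLAB, not with this code.

```python
import math

CLASSES = ["AD", "MCI", "NC"]  # the three target classes of this study


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def new_head(features, weights, biases):
    """New fully-connected layer: one logit per class (3 instead of 1000)."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical 4-dimensional feature vector from the retained pre-trained layers.
features = [0.5, -1.2, 0.3, 2.0]

# Hypothetical weights of the new layer: 3 rows (classes) x 4 columns (features).
weights = [
    [0.2, -0.1, 0.0, 0.4],
    [-0.3, 0.5, 0.1, -0.2],
    [0.1, 0.1, -0.4, 0.0],
]
biases = [0.0, 0.1, -0.1]

probs = softmax(new_head(features, weights, biases))
predicted = CLASSES[probs.index(max(probs))]
print(dict(zip(CLASSES, probs)), predicted)
```

During fine-tuning, the weights of this new layer are learned from the MR images, while the earlier layers start from their ImageNet values.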

IV. Pre-trained CNN Architecture

4.1. Overview

Deep learning is a subfield of machine learning and a collection of algorithms that are inspired by the structure of the human brain and try to imitate its functions. A CNN is one such deep learning algorithm, in which the transformations are performed using the convolution operation. A typical CNN comprises three basic layers: a convolutional layer, a pooling layer and a fully-connected layer. However, activation, normalization and dropout layers also play a significant role in the deep architecture of a CNN model.

The convolutional layer is the core building block of a CNN and is responsible for most of the computation. It extracts features from the input image to be classified [13]. Its parameters consist of a set of kernels, or learnable filters. It performs the convolution operation, or filtering, over the input and forwards the response to the next layer as a feature map [14]. The pooling layer is used to reduce the spatial size of the representation and the amount of computation [15]. It performs the pooling operation on each of the sliced inputs, reducing the computational cost for the next convolutional layer. Applying convolutional and pooling layers therefore extracts features from the input images and reduces their size. The objective of a fully-connected layer is to take the output feature maps of the final convolutional or pooling layer and use them to assign the image a label.
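The convolution and pooling operations described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the implementation used by any of the networks in this study: the 5×5 image, the 2×2 kernel and the function names are assumptions, and, as in most deep learning libraries, the "convolution" is computed without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) to build a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Toy 5x5 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Toy kernel that responds to a dark-to-bright transition from left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool_2x2(feature_map)               # 2x2 after pooling
print(feature_map)
print(pooled)
```

The feature map is non-zero only where the edge lies, and pooling keeps that response while halving each spatial dimension.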

4.1.1. GoogLeNet

GoogLeNet has been trained on over a million images and can classify images into 1000 object categories. It was introduced by a team at Google and won ILSVRC-2014. The network is designed with computational efficiency and practicality in mind and is 22 layers deep, as shown in Fig. 2(a). All the convolutions, including those inside the Inception modules, use the rectified linear unit (ReLU) as the activation function. The network takes a 224×224 input in the RGB color space with zero mean. Hence, all images were cropped and converted to 224×224×3, which is a valid input size for the model. The distinguishing feature of GoogLeNet is its nine Inception modules [16]. Fig. 2(b) shows the detailed structure of an Inception module.

4.1.2. AlexNet

The original AlexNet architecture was trained on the ImageNet dataset [17], which comprises images belonging to 1000 object classes. It was designed by Alex Krizhevsky and won ILSVRC-2012 [18]. The architecture of AlexNet is depicted in Fig. 3. It contains 8 layers: the first 5 are convolutional and are followed by 3 fully-connected layers. AlexNet accepts input images of size 227×227 in the RGB color space. Therefore, all images were resized and converted to meet the network's input requirements before transfer learning.

4.1.3. VGG-16

VGG-16 is a CNN model proposed by K. Simonyan and A. Zisserman in 2014 [19]. As the name indicates, the VGG-16 model contains 16 layers in total. Since VGG-16 was trained on RGB, i.e., 3-channel, images, it accepts an input only if it has exactly 3 channels. Thus, the input to the first convolutional layer is a fixed-size 224×224 RGB image, obtained by cropping and converting the MR image. Fig. 4 shows the overall architecture of the VGG-16 model.

Fig. 2. Block diagram of CNN model (a) GoogLeNet architecture representing 22 layers, (b) Layers in all nine inception modules.
Fig. 3. Eight layers representing architecture of AlexNet model.
4.1.4. ResNet-18

Deep residual networks, ResNet for short, are built on the core idea of shortcut connections, also called skip connections. These connections provide an alternate pathway for data and gradients to flow, which makes very deep networks trainable. The simplest model is ResNet-18, which has 18 layers. The input image size of this model is 224×224×3. ResNet was developed by He et al. [20] and won ILSVRC-2015. The detailed architecture of ResNet-18, with its skip connections running in parallel to the main path, is shown in Fig. 5.
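The idea of a skip connection can be illustrated with a minimal sketch: the block's output is the transformed input plus the input itself. The function names and the two stand-in transforms below are hypothetical; in ResNet-18 the transform is a pair of convolutional layers rather than a simple scaling.

```python
def residual_block(x, transform):
    """Return transform(x) + x: the skip connection adds the input back unchanged."""
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]


def small_transform(x):
    # Hypothetical stand-in for the block's convolutional layers.
    return [0.1 * xi for xi in x]


def zero_transform(x):
    # A transform that has learned nothing contributes zero.
    return [0.0 for _ in x]


x = [1.0, -2.0, 3.0]
print(residual_block(x, small_transform))
print(residual_block(x, zero_transform))
```

When the transform contributes nothing, the block passes its input through unchanged, so the signal (and, during training, the gradient) always has a direct path through the network.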

Fig. 4. Architectural representation of 16 layered VGG model.
Fig. 5. Architecture of ResNet-18 with skip connections running parallel.
4.2. Creating Training and Testing Datasets

The ADNI dataset of 1020 images in total was shuffled and split into training and testing sets in the ratio 70:30. The training and testing sets used for 3-way classification (AD vs. MCI vs. NC) are the same for all four networks and are shown in Table 2.

Table 2. Training and testing dataset used for all models.
Class label | Training set size | Testing set size
AD | 231 | 99
MCI | 280 | 120
NC | 203 | 87
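The class counts in Table 2 are what a 70:30 split applied separately to each class would produce. The sketch below shows one way to obtain such a split; splitting per class is an assumption (the text only says the dataset was shuffled and split 70:30), and the image identifiers and the function name `split_per_class` are hypothetical.

```python
import random


def split_per_class(items_by_class, train_percent=70, seed=0):
    """Shuffle each class and split it into training and testing parts."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, test = {}, {}
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(shuffled) * train_percent // 100
        train[label] = shuffled[:n_train]
        test[label] = shuffled[n_train:]
    return train, test


# Hypothetical image identifiers; only the class totals come from Tables 1 and 2.
images = {
    "AD": [f"AD_{i:03d}" for i in range(330)],
    "MCI": [f"MCI_{i:03d}" for i in range(400)],
    "NC": [f"NC_{i:03d}" for i in range(290)],
}

train, test = split_per_class(images)
for label in images:
    print(label, len(train[label]), len(test[label]))
```

With these class totals the split yields 231/99, 280/120 and 203/87 images for AD, MCI and NC, matching Table 2.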

We evaluated the performance by changing the learning rate, which controls how much the model weights change in response to the estimated error each time they are updated. Choosing the learning rate is challenging: a value that is too small may result in a long training process, whereas a value that is too large may cause the model to settle on a sub-optimal set of weights too quickly or make training unstable. The learning rate also affects the accuracy of the model.
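The effect of the learning rate can be seen on a toy problem. The sketch below minimizes f(w) = w² with stochastic gradient descent with momentum; the loss, the momentum value of 0.9, the number of steps and the three learning rates are assumptions chosen for illustration, and the numerical values do not correspond to those appropriate for the networks in this study.

```python
def sgdm_final_weight(learning_rate, momentum=0.9, steps=200, start=1.0):
    """Minimize the toy loss f(w) = w**2 with gradient descent plus momentum."""
    w, velocity = start, 0.0
    for _ in range(steps):
        gradient = 2.0 * w  # derivative of w**2
        velocity = momentum * velocity - learning_rate * gradient
        w += velocity
    return w


for lr in (1e-4, 1e-1, 2.5):
    final = abs(sgdm_final_weight(lr))
    print(f"learning rate {lr:g}: |w| after 200 steps = {final:.3g}")
```

On this toy loss the smallest rate has barely moved the weight toward the minimum at zero after 200 steps, the middle rate converges, and the largest rate makes the weight grow without bound.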

V. Results and Discussion

The classification model was built using MATLAB 2018, which provides built-in support for transfer learning. The training options were: up to 100 epochs, with a validation patience of 5 as the early-stopping criterion; a learning rate of 0.0001; stochastic gradient descent with momentum (SGDM) as the optimizer, which minimizes the loss function by adjusting the weights and biases; and a mini-batch size of 12. These options were used to train on 621 images. One epoch is complete when the entire training set has been passed forward and backward through the network once; in our case, one epoch took 51 iterations. Validation patience stops training when the validation loss has not improved on its best value for the specified number of validation checks, which helps prevent the model from over-fitting.
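The iteration count and the early-stopping rule can be checked with a short sketch. It assumes that the final incomplete mini-batch of each epoch is dropped, which is consistent with 51 iterations for 621 images, and it uses one common reading of validation patience: stop once the validation loss has failed to fall below its best value a given number of times in a row. The function names and the list of validation losses are hypothetical.

```python
def iterations_per_epoch(num_images, batch_size):
    """Number of full mini-batches in one pass over the training set."""
    return num_images // batch_size


def stopping_check(validation_losses, patience):
    """Return the index of the validation check at which training stops, or None."""
    best = float("inf")
    not_improved = 0
    for index, loss in enumerate(validation_losses):
        if loss < best:
            best = loss
            not_improved = 0
        else:
            not_improved += 1
            if not_improved >= patience:
                return index
    return None


print(iterations_per_epoch(621, 12))

# Hypothetical validation losses, one per validation check.
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.42, 0.40, 0.43, 0.44, 0.50]
print(stopping_check(losses, patience=5))
```

Here the best loss is reached at the fourth check (index 3), and training stops at index 8 after five checks without improvement.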

Since accuracy is the primary evaluation metric, we analyzed the training and testing accuracy while changing the learning rate from 1e-2 to 1e-5 to see how the learning rate affects the accuracy and loss of the model. The models produced the best results at a learning rate of 1e-4. Therefore, we evaluated all models with the learning rate fixed at 1e-4. The accuracy was obtained from a confusion matrix, which describes the performance of a classification model. The confusion matrices of GoogLeNet, AlexNet, VGG-16 and ResNet-18 are shown in Tables 3 to 6. The training and validation progress of all the models used in this paper is shown in Figs. 6-9.

Table 3. Confusion matrix obtained by testing GoogLeNet.
Class label | AD | MCI | NC | Total data | Accuracy
AD | 96 | 0 | 3 | 99 | 97.05%
MCI | 0 | 120 | 0 | 120 | 100.00%
NC | 2 | 0 | 85 | 87 | 97.70%
Table 4. Confusion matrix obtained by testing AlexNet.
Class label | AD | MCI | NC | Total data | Accuracy
AD | 88 | 8 | 3 | 99 | 86.90%
MCI | 5 | 114 | 1 | 120 | 95.00%
NC | 0 | 0 | 87 | 87 | 100.00%
Table 5. Confusion matrix obtained by testing VGG-16.
Class label | AD | MCI | NC | Total data | Accuracy
AD | 76 | 0 | 23 | 99 | 76.80%
MCI | 4 | 107 | 9 | 120 | 89.2%
NC | 0 | 0 | 87 | 87 | 100.00%
Table 6. Confusion matrix obtained by testing ResNet-18.
Class label | AD | MCI | NC | Total data | Accuracy
AD | 92 | 5 | 2 | 99 | 92.90%
MCI | 0 | 120 | 0 | 120 | 100%
NC | 2 | 0 | 85 | 87 | 97.70%
Fig. 6. Training and validation progress of GoogLeNet for 1e-4 learning rate.
Fig. 7. Training and validation progress of AlexNet for 1e-4 learning rate.
Fig. 8. Training and validation progress of VGG-16 for 1e-4 learning rate.
Fig. 9. Training and validation progress of ResNet-18 for 1e-4 learning rate.

Figs. 6-9 show how each model progresses during training and how its accuracy and loss change. Although each model was configured to train for up to 100 epochs, training ended after fewer epochs because of early stopping, which prevents the model from overfitting. In Fig. 6, we can clearly see that the training loss of GoogLeNet declined steadily and almost reached zero while the accuracy of the model rose. A possible reason for this is the Inception modules used in GoogLeNet, which increase the depth of the model and learn more features from the image. Nevertheless, the other models were also able to reduce the loss while increasing accuracy. ResNet-18, in Fig. 9, also performed well, with the loss falling to almost zero. It was more efficient than AlexNet and VGG-16 owing to the skip connections used in the network.

In Table 3, we can clearly see that GoogLeNet classified successfully, with 97.05% of AD, 100% of MCI and 97.70% of NC images correctly classified. Thus, the overall testing accuracy of GoogLeNet was 98.25%. The other models, AlexNet, VGG-16 and ResNet-18, also achieved good accuracies, as shown in Tables 4, 5 and 6, respectively. Among these three models, ResNet-18 outperformed the other two, with 92.90% of AD, 100% of MCI and 97.70% of NC images correctly classified. However, GoogLeNet surpassed all the other models with the highest testing accuracy and is therefore the best model for classifying AD.
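The per-class and overall figures can be recomputed from a confusion matrix. In the sketch below, per-class accuracy is the share of each row that falls on the diagonal. Treating the overall accuracy as the mean of the per-class values is an inference from the reported numbers, for example (97.05 + 100 + 97.70) / 3 = 98.25 for GoogLeNet, and is not stated in the text. The pooled accuracy (all correct predictions over all test images) is shown for comparison. The function names are hypothetical, and recomputed values may differ from the tables in the second decimal place because of rounding.

```python
def per_class_accuracy(matrix):
    """Fraction of each class (row) that lands on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def mean_class_accuracy(matrix):
    """Unweighted mean of the per-class accuracies."""
    scores = per_class_accuracy(matrix)
    return sum(scores) / len(scores)


def pooled_accuracy(matrix):
    """All correct predictions divided by all test images."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Counts from Table 6 (ResNet-18); rows are the AD, MCI and NC class labels.
resnet18 = [
    [92, 5, 2],
    [0, 120, 0],
    [2, 0, 85],
]

for label, score in zip(["AD", "MCI", "NC"], per_class_accuracy(resnet18)):
    print(f"{label}: {100 * score:.2f}%")
print(f"mean of per-class accuracies: {100 * mean_class_accuracy(resnet18):.2f}%")
print(f"pooled accuracy: {100 * pooled_accuracy(resnet18):.2f}%")
```

The two overall measures differ slightly because the classes have different numbers of test images.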

From Table 7, we can conclude that GoogLeNet is a powerful deep learning model for medical image classification, specifically of MR images. GoogLeNet produced the highest training and testing accuracies of all the models. The second-highest accuracies were obtained by ResNet-18, which outperformed AlexNet and VGG-16.

Table 7. Comparison of different transfer learning model results.
Model | Training accuracy | Testing accuracy
GoogLeNet | 99.84% | 98.25%
AlexNet | 99.18% | 93.97%
VGG-16 | 98.37% | 88.66%
ResNet-18 | 99.02% | 96.8%


VI. Conclusion

The detection of Alzheimer’s disease remains a difficult problem, yet it is important for early diagnosis and proper treatment. Because there is currently no treatment for AD, early detection and classification are very important for managing the patient’s care. While many classification algorithms are in use today, classification using deep learning has attracted many researchers owing to its flexibility and its capacity to produce optimal results. However, in the medical field, acquiring enough data (images) is quite difficult, and training a model from scratch is time consuming. To overcome these problems, transfer learning is used, which requires only minimal data and takes only a few hours to classify MR images.

In this paper, we applied transfer learning with different deep learning models, namely GoogLeNet, AlexNet, VGG-16 and ResNet-18, as base models for classifying MR images into three classes: AD, MCI and NC. We analyzed these models by changing the learning rate, and the performance obtained at 1e-4 was better than at the other learning rates. The models were trained well on our dataset, and all of them classified effectively. Among all the models, GoogLeNet produced the highest training and testing accuracies, 99.84% and 98.25%, respectively. Therefore, transfer learning using GoogLeNet is an effective approach for classifying MR images into AD, MCI and NC.


This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2018R1D1A1A09083734).



[1] V. L. Villemagne, S. Burnham, P. Bourgeat and B. Brown, “Amyloid β deposition, neurodegeneration, and cognitive decline in sporadic Alzheimer’s disease: a prospective cohort study,” The Lancet Neurology, vol. 12, no. 4, pp. 357-367, Apr. 2013.

[2] S. Belleville, C. Fouquet, S. Duchesne and D. L. Collins, “Detecting early preclinical Alzheimer’s disease via cognition, neuropsychiatry, and neuroimaging: qualitative review and recommendations for testing,” Journal of Alzheimer’s Disease, vol. 42, no. 4, pp. 375-382, Jan. 2014.

[3] J. Gaugler, B. James, T. Johnson, A. Marin and J. Weuve, “2019 Alzheimer’s disease facts and figures,” Alzheimer’s & Dementia, vol. 15, no. 3, pp. 321-387, Mar. 2019.

[4] M. Tabaton, P. Odetti, S. Cammarata and R. Borghi, “Artificial neural networks identify the predictive values of risk factors on the conversion of amnestic mild cognitive impairment,” Journal of Alzheimer’s Disease, vol. 19, no. 3, pp. 1035-1040, Jan. 2010.

[5] J. B. Toledo, M. Bjerke, K. W. Chen and M. Rozycki, “Memory, executive, and multidomain subtle cognitive impairment: Clinical and biomarker findings,” Neurology, vol. 85, no. 2, pp. 144-153, Jul. 2015.

[6] N. Madusanka, H. K. Choi, J. H. So and B. K. Choi, “Alzheimer’s Disease Classification Based on Multi-feature Fusion,” Current Medical Imaging, vol. 15, no. 2, pp. 161-169, Feb. 2019.

[7] S. Gauthier, B. Reisberg, M. Zaudig, R. C. Petersen, K. Ritchie, K. Broich, et al., “Mild cognitive impairment,” The Lancet, vol. 367, no. 9518, pp. 1262-1270, Apr. 2006.

[8] Y. LeCun, Y. Bengio and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436-444, May 2015.

[9] A. Ortiz, J. Munilla, J. M. Gorriz, and J. Ramirez, “Ensembles of deep learning architectures for the early diagnosis of the Alzheimer’s disease,” International Journal of Neural Systems, vol. 26, no. 7, p. 1650025, Nov. 2016.

[10] H. C. Shin, H. R. Roth, M. Gao, “Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285-1298, Feb. 2016.

[11] K. H. Jin, M. T. McCann, E. Froustey, “Deep Convolutional Neural Network for Inverse Problems in Imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509-4522, Jun. 2017.

[12] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345-1359, Oct. 2009.

[13] M. Xin and Y. Wang, “Research on image classification model based on deep convolution neural network,” EURASIP Journal on Image and Video Processing, no. 1, p. 40, Dec. 2019.

[14] J. Fang, Y. Zhou, Y. Yu, and S. Du, “Fine-grained vehicle model recognition using a coarse-to-fine convolutional neural network architecture,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 7, pp. 1782-1792, Nov. 2016.

[15] J. Bouvrie, “Notes on convolutional neural networks,” In Practice, 2006.

[16] C. Szegedy, W. Liu, Y. Jia, and P. Sermanet, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.

[17] J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, and F. F. Li, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255, Jun. 2009.

[18] A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.

[19] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, Sep. 2014.

[20] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.


Deekshitha Prakash


Deekshitha Prakash is a researcher at the Inje University Medical Image Technology Laboratory (MITL). In 2018, she received her Bachelor of Engineering from the Department of Computer Science and Engineering at St. Joseph Engineering College, affiliated with Visvesvaraya Technological University (VTU). Her interests include computer graphics and medical image processing and analysis, and she is conducting research in these areas.

Nuwan Madusanka


Nuwan Madusanka received his BS from the Department of Computer Science & Technology at Uva Wellassa University, Sri Lanka, and his MS and PhD degrees from the Department of Computer Engineering at Inje University, Korea, in 2015 and 2019, respectively. In September 2019, he joined the Department of Computer Engineering at Inje University as a post-doctoral researcher. His research interests include image analysis and processing, computer graphics and computer vision.

Subrata Bhattacharjee


Subrata Bhattacharjee is a Research Assistant at the Medical Image Technology Laboratory (MITL), Inje University, Korea. He received his B.Sc. degree in Information Technology (IT) from the University of Derby, UK, in 2016. He is currently pursuing his master’s degree in the Department of Computer Engineering at Inje University. His research interests include image processing and analysis, computer graphics, and machine learning and deep learning for classification.

Hyeon-Gyun Park


Hyeon-Gyun Park is a researcher at the Inje University Medical Image Technology Laboratory (MITL). He received his Bachelor of Engineering from the Department of Computer Engineering at Inje University, Korea. He is currently pursuing a master's degree in computer science at Inje University. His interests include image processing and analysis and machine learning.

Cho-Hee Kim


Cho-Hee Kim is a researcher at the Inje University Medical Image Technology Laboratory (MITL). She received her Bachelor of Engineering from the Department of Computer Engineering at Inje University, Korea. She is currently pursuing a master's degree in digital anti-aging healthcare at Inje University. Her interests include the processing and analysis of medical images.

Heung-Kook Choi


Heung-Kook Choi is a professor of Computer Engineering at Inje University and operates the Medical Image Technology Laboratory (MITL). In 1988 and 1990, he received his Bachelor of Engineering and Master of Engineering from Linköping University in Sweden, and his Ph.D. from Uppsala University in 1996. He is interested in computer graphics, multimedia, image processing and analysis.