Journal of Multimedia Information System
Korea Multimedia Society
Section A

Medical Diagnosis Algorithm Based on Tongue Image on Mobile Device

Zibo Zhou1,2, Dongliang Peng1,2, Fumeng Gao2, Lu Leng1,2,*
1School of Software, Nanchang Hangkong University, Nanchang 330063, China
2Science and Technology College, Nanchang Hangkong University, Gongqingcheng 332020, China
*Corresponding Author : Lu Leng, Nanchang Hangkong University, 696 Fenghe South Avenue, Nanchang City, 330063, P.R. China, 0086-791-86453251,

© Copyright 2019 Korea Multimedia Society. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Apr 24, 2019; Accepted: May 09, 2019

Published Online: Jun 30, 2019


In traditional Chinese medicine (TCM), tongue images can be observed for medical diagnosis; however, TCM tongue diagnosis is influenced by the subjective factors of doctors, and the diagnosis results vary from person to person. Quantitative TCM tongue diagnosis can improve the accuracy of diagnosis and increase its application value. In this paper, digital image processing and pattern recognition technologies are employed on a mobile device to classify tongue images collected in different health states. First, through grayscale integral projection, the troughs of the projection curves are found to localize the tongue body. Then the tongue body image is converted from the RGB color space to the HSV color space, and the average H and S values are taken as the color features. Finally, the diagnosis results are obtained according to the relationship between the color features and physical symptoms.

Keywords: tongue diagnosis; traditional Chinese medicine; image processing; mobile device


Tongue diagnosis is one of the most widely used diagnostic methods in traditional Chinese medicine [1]. The benefit of a tongue diagnosis is that it is simple and straightforward: people who need a health check can quickly determine their pathology through a regular tongue diagnosis.

1.1 Tongue segmentation algorithm

At present, tongue segmentation algorithms mainly fall into the following categories: color model transformation, active contour models, the watershed algorithm, neural networks, and so on.

The most intuitive approach is to separate the tongue from the other parts of the image by color [2]. Ref. [3] observed the distribution characteristics of the H channel data of the image, used it as the key segmentation factor, and combined it with the I channel data to perform segmentation. The authors took the largest connected region in the segmentation result as the tongue and corrected the final result with a morphological algorithm. Under the technical conditions at that time, the color of the tongue could not be distinguished accurately, so the color of the middle part of the tongue was mechanically taken as the color of the tongue coating, and the color of the edge part was taken as the color of the tongue body. In addition, differences in hardware and methods led to different tongue color recognition results. Because the color model conversion method based on color difference is not ideal, researchers turned to the active contour model algorithm [4]. Ref. [5] converted the original image from the initial coordinate system to the polar coordinate system, performed boundary enhancement and boundary extraction, and binarized the result to obtain the initial boundary of the contour. The authors of [6] used the double-snake energy function algorithm for tongue segmentation, which improved the accuracy by nearly 10% compared with the ordinary snake algorithm. The authors of [7] used the C2G2FSnake algorithm for tongue segmentation, which increases the curve rate and reduces the complexity of the algorithm. The authors of [8] used the maximum inter-class variance method to binarize the original tongue image and then used mathematical morphology to modify the boundary of the binary image; combining the two techniques improves segmentation accuracy.
It can be seen that the active contour model [9] has attracted the attention of researchers because of the high accuracy of its segmentation results and its smooth contour curves.

The watershed algorithm usually suffers from over-segmentation, and researchers have proposed several improvements for this problem, of which marker control is a commonly used one. Ref. [10] proposed an image segmentation algorithm based on fast two-step marker control. Tongue segmentation based on color model conversion has also gradually matured. Ref. [11] first finds an initial object region by transforming and thresholding the image components in the HSI color space and applying morphological operations, and then clusters the RGB components of the initial object region to find the root of the tongue. Finally, the gap region between the tongue and the upper lip is used to remove false tongue regions such as the upper lip, and the tongue is extracted from the initial object region. The tongue segmentation algorithms are summarized in Table 1.

Table 1. Summary of tongue segmentation algorithms.
| Method | Category | Year |
|---|---|---|
| Snake-based approach | Active contour model | 2007 |
| HSI color | Color model transformation | 2008 |
| Double snake energy function | Active contour model | 2009 |
| Two-step marker-controlled | Watershed algorithm | 2012 |
| C2G2FSnake function | Active contour model | 2013 |
| Threshold and clustering | Color model transformation | 2017 |

2. Tongue feature extraction algorithm

In recent years, there have been few studies on tongue image feature extraction algorithms, and their research focuses on how to comprehensively and effectively acquire disease-related features.

Ref. [12] used the wavelet transform to extract the color and texture features of different parts of the tongue image, analyzed these feature data statistically, and finally classified healthy and diseased tongue images. The authors of [13] argue that most tongue feature classification methods do not take regional information into account. Their study therefore uses a color-texture segmentation algorithm to obtain a series of homogeneous regions, classifies these regions according to the Earth Mover's Distance, and finally performs the corresponding feature analysis. Ref. [14] studied different methods of obtaining color and texture feature information from the tongue image and used this information in an appendicitis diagnosis experiment, which achieved relatively high accuracy. Ref. [15] classified the shape of the tongue based on its geometric characteristics. The study corrected the skew of the tongue with three geometric criteria and classified the shape of the tongue with seven geometric features. To obtain better classification results, it used the analytic hierarchy process to weight the relevant factors and a fuzzy fusion framework to express the certainty and accuracy between these factors and the tongue category.

At the same time, in order to avoid the subjective and qualitative problems of traditional tongue diagnosis, several computer-aided diagnosis systems have been proposed [16]. For example, [17] proposed a Bayesian network-based system capable of identifying five different diseases with an accuracy of approximately 75%. [18] used the model established by Bayesian decision to map the tongue image features of the sample with the existing classification results and identified the color of the tongue, which played a certain role in the clinical tongue diagnosis. [19] proposed a hybrid image segmentation algorithm combining region-based methods with boundary-based automatic classification methods.

At present, the mobile phone penetration rate in some countries is almost 100%, and most people carry mobile phones with them. The built-in sensors of smartphones (such as magnetometers, accelerometers, and cameras) enable the development of new sensing systems that measure the status of phone users and their surrounding environment, so research on monitoring personal health through smartphone-based tongue diagnosis [20] is beginning to emerge. The concept of computerized tongue diagnosis is not new, but research on tongue diagnosis based on mobile phones is relatively rare. Much of the work in this area assumes that the tongue image comes from a well-controlled environment and can only be used by Chinese medicine practitioners. Ref. [21] developed an Android-based automatic tongue diagnosis application based on the Canny algorithm. However, the authors do not discuss how to calibrate the image color under varying lighting conditions.


1. Tongue image segmentation
1.1. Image graying

To diagnose the color of the tongue image, it is very important to extract the tongue region from the image. Most tongue segmentation algorithms first convert the image into a gray image, and the gray-scale integral projection algorithm adopted in this paper also requires this conversion before segmentation. Because a gray image represents each pixel by a single intensity value, regions can be segmented more easily. In this paper, the weighted average method is used to obtain the gray image from which the tongue is segmented. There are several common grayscale processing methods (R, G, and B represent the three primary colors, namely, red, green, and blue):

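Maximum method:

$$\mathrm{GRAY} = \max(R, G, B)$$

Component method:

$$\mathrm{GRAY} = R \ \text{or}\ G \ \text{or}\ B$$

Average method:

$$\mathrm{GRAY} = \frac{R + G + B}{3}$$

Weighted average method:

$$\mathrm{GRAY} = 0.299 \times R + 0.587 \times G + 0.114 \times B$$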
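
The four graying methods listed above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in this paper: the function name `to_gray`, the array layout (height x width x 3, channel order R, G, B), and the choice of the G channel for the component method are assumptions. The weighted average uses the widely used luma weights 0.299, 0.587, and 0.114, which sum to 1.

```python
import numpy as np

def to_gray(rgb, method="weighted"):
    """Convert an H x W x 3 RGB array (values 0-255) to a grayscale array.

    `method` names are hypothetical labels for the four methods in the text.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    if method == "max":
        return np.maximum(np.maximum(r, g), b)
    if method == "component":
        # Any single channel may be used; G is picked here as an assumption.
        return g
    if method == "average":
        return (r + g + b) / 3.0
    if method == "weighted":
        return 0.299 * r + 0.587 * g + 0.114 * b
    raise ValueError(f"unknown method: {method}")

# A single pure-red pixel converted with each method.
pixel = np.array([[[255, 0, 0]]])
for name in ("max", "component", "average", "weighted"):
    print(name, to_gray(pixel, name)[0, 0])
```

The weighted average keeps a white pixel at full intensity, which the other weights would not do if they did not sum to 1.
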
1.2. Grayscale integral projection

In a tongue image, the gray value changes greatly at the junction of two objects, so abrupt transitions of the gray value tend to occur there. According to the color distribution of the tongue image, the tongue and non-tongue regions have relatively obvious boundaries, and the tongue body forms a single block-shaped area. Given these characteristics, the gray integral projection algorithm can be used to determine the position of the tongue. Gray integral projection accumulates the gray values along a certain direction, divides the image into regions according to the resulting integral values, and selects the desired region according to a condition. In this paper, the gray level is integrated in the horizontal and vertical directions of the image to determine the position of the tongue.

$$GS(x) = \int_{y_1}^{y_2} \mathrm{GRAY}(x, y)\, dy,$$
$$GS(y) = \int_{x_1}^{x_2} \mathrm{GRAY}(x, y)\, dx.$$

GS(x) is the gray-scale integral projection onto the horizontal axis (the gray values at each x integrated over y), GS(y) is the projection onto the vertical axis (the gray values at each y integrated over x), and GRAY(x,y) is the gray value at coordinates (x,y) of the image, with x in [x1,x2] and y in [y1,y2]. After the gray level integral projections in the horizontal and vertical directions are obtained, their distribution curves are drawn. The curves show many peaks and troughs. Because the tongue and non-tongue regions of a tongue image have different brightness values, troughs are generated at the edges of the tongue, and the positions of these troughs can be used to determine the region of the tongue.
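
For a digital image the two integrals reduce to sums over rows and columns. The following is a minimal sketch under the assumption that the gray image is stored as a NumPy array indexed as `gray[y, x]`; the function name `integral_projections` is hypothetical.

```python
import numpy as np

def integral_projections(gray):
    """Return (GS_x, GS_y) for a 2-D gray image indexed as gray[y, x].

    GS_x[x] sums the gray values of column x over all rows y.
    GS_y[y] sums the gray values of row y over all columns x.
    """
    gray = np.asarray(gray, dtype=np.float64)
    gs_x = gray.sum(axis=0)  # one value per column
    gs_y = gray.sum(axis=1)  # one value per row
    return gs_x, gs_y

# Tiny 2 x 3 example image.
demo = np.array([[1, 2, 3],
                 [4, 5, 6]])
gs_x, gs_y = integral_projections(demo)
print(gs_x)  # column sums
print(gs_y)  # row sums
```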

After the gray integral projection values are obtained, the projection curve is fitted with a smooth line. As shown in Fig 1, the distribution and differences of the gray values of the image can be seen clearly. An appropriate interval is then set to find the minima of the gray integral; this paper requires the distance between two adjacent minimum points to be no less than 20 pixels.
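
One possible way to smooth a projection curve and pick troughs that are at least 20 pixels apart is sketched below. The paper does not specify the smoothing method or the selection rule, so the moving-average filter, its window size, and the "deepest trough first" rule are all assumptions made for illustration; `smooth` and `find_troughs` are hypothetical names.

```python
import numpy as np

def smooth(curve, window=9):
    """Moving-average smoothing with edge padding (window must be odd)."""
    curve = np.asarray(curve, dtype=np.float64)
    pad = window // 2
    padded = np.pad(curve, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def find_troughs(curve, min_distance=20):
    """Return indices of local minima that are >= min_distance apart.

    Candidates are interior points lower than their left neighbour and not
    higher than their right neighbour. Deeper troughs are kept first
    (an assumed tie-breaking rule).
    """
    curve = np.asarray(curve, dtype=np.float64)
    idx = np.arange(1, len(curve) - 1)
    is_min = (curve[idx] < curve[idx - 1]) & (curve[idx] <= curve[idx + 1])
    candidates = idx[is_min]
    order = candidates[np.argsort(curve[candidates], kind="stable")]
    kept = []
    for i in order:
        if all(abs(int(i) - j) >= min_distance for j in kept):
            kept.append(int(i))
    return sorted(kept)

# Synthetic projection curve with three dips; the dip at 35 is too close to 30.
curve = np.full(100, 10.0)
curve[30], curve[35], curve[70] = 1.0, 2.0, 0.0
print(find_troughs(curve, min_distance=20))
```

In practice `find_troughs` would be applied to `smooth(GS_x)` and `smooth(GS_y)`, and two of the returned positions in each direction would serve as the tongue boundaries.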

Fig. 1. Grayscale integral projection result. (a) source image, (b) Vertical gray level integration, (c) Horizontal gray level integration.
$$|\mathrm{MIN}(i) - \mathrm{MIN}(i+1)| \ge 20\ \mathrm{px},$$

where MIN(i) represents the position of the i-th minimum and px denotes the pixel unit. After the minimum points are obtained, two extreme points in the horizontal direction and two in the vertical direction are selected as the boundaries of the tongue to determine its position. The red "*" points in the figure indicate the trough positions of the horizontal or vertical integral projection.

2. Tongue coating color classification

2.1 Tongue classification
2.1.1 RGB and HSV color space

Color information is an important component in image processing. Many image processing algorithms, such as image segmentation and edge detection algorithms, rely on accurate color information. Many color spaces, such as the RGB, HSI, and HSV spaces, can be used to represent a color image accurately, and different color spaces suit different research purposes. For a practical image processing problem, choosing the right color space is especially important. This section introduces the RGB color space and the HSV color space.

The RGB color space is based on the three primary colors of human vision: red (R), green (G), and blue (B). Proper mixing of these three primaries can produce the perception of all colors in the spectrum. In the RGB color space, each image is composed of pixels, and the color of each pixel is composed of the three components R, G, and B, which together constitute the color feature of the image. In tongue images, the R, G, and B values of the various tongue colors do not differ much and do not show a clear regularity. To classify tongue images by tongue color more effectively, the image must be converted from the RGB space to a more suitable color space. This paper uses the HSV color space.

The HSV color space is based on human visual perception. In the HSV color space, the color itself is no longer represented by three components jointly but by the hue (H) component alone. The S component represents the depth of the color, called saturation. The S and H components play an important role in the color classification of tongue images. The brightness (V) component indicates how light or dark the color is, and its range is [0, 1]. These characteristics make the HSV color space more suitable than the RGB color space for classifying the color features of tongue images. This paper mainly uses the mean values of the H and S components of the HSV color space to determine the color classification thresholds.

2.1.2. RGB and HSV space conversion

Experiments and analysis of tongue images in different color spaces show that the HSV color space is more suitable for the color classification of tongue images. After the tongue region is segmented, the segmented tongue image is converted from the RGB space to the HSV space with the formula below. R, G, and B range over [0, 255]; max denotes the maximum of the three values R, G, and B, and min denotes their minimum.

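$$
H = \begin{cases}
60 \times \dfrac{G - B}{S V}, & S \neq 0 \text{ and } \max = R \\
60 \times \left(2 + \dfrac{B - R}{S V}\right), & S \neq 0 \text{ and } \max = G \\
60 \times \left(4 + \dfrac{R - G}{S V}\right), & S \neq 0 \text{ and } \max = B
\end{cases}
$$

$$\text{if } H < 0:\ H = H + 360, \qquad V = \max, \qquad S = \frac{\max - \min}{\max}$$

With these definitions, $S V = \max - \min$.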
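
The conversion and the two color features (mean H and mean S) can be sketched as follows. This is an illustrative sketch rather than the code used in this paper: it follows the standard RGB-to-HSV conversion with channels normalized to [0, 1], so that the denominator equals max − min, H is in degrees in [0, 360), and S and V are in [0, 1]. The function names `rgb_to_hsv` and `mean_h_s` are hypothetical, and the mean of H is a plain arithmetic mean, as the text describes.

```python
import numpy as np

def rgb_to_hsv(rgb):
    """Convert an array of RGB values (0-255, last axis = R, G, B) to H, S, V."""
    rgb = np.asarray(rgb, dtype=np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    delta = mx - mn
    safe = np.where(delta == 0, 1.0, delta)  # avoid division by zero
    h = np.zeros_like(mx)
    # Applied in this order so that ties resolve as R, then G, then B.
    h = np.where(mx == b, 60.0 * (4.0 + (r - g) / safe), h)
    h = np.where(mx == g, 60.0 * (2.0 + (b - r) / safe), h)
    h = np.where(mx == r, 60.0 * ((g - b) / safe), h)
    h = np.where(delta == 0, 0.0, h)         # hue is undefined for grays; use 0
    h = np.where(h < 0, h + 360.0, h)
    s = np.where(mx == 0, 0.0, delta / np.where(mx == 0, 1.0, mx))
    v = mx
    return h, s, v

def mean_h_s(rgb):
    """Mean hue (degrees) and mean saturation of a segmented tongue region."""
    h, s, _ = rgb_to_hsv(rgb)
    return float(h.mean()), float(s.mean())

# Two-pixel example region.
region = np.array([[[255, 0, 0], [200, 100, 100]]])
print(mean_h_s(region))
```
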
2.1.3. Threshold classification

Common tongue colors include pale white, red, purple, and black, and common tongue coating colors include yellow, white, and gray. Based on these color types, this paper divides the color detection of tongue images into five categories: light red tongue, white moss, yellow moss, red tongue, and purple black tongue. After experimental analysis of the collected tongue images, the classification thresholds of the five categories are determined in the HSV color space according to the mean values of H and S. Converting from the RGB color space to the HSV color space separates chromaticity (H), saturation (S), and brightness (V), so that color is represented by H alone and the correlation among the red, green, and blue components of the RGB color space is removed, which benefits the color classification of tongue images. In the HSV color space, the average values of H and S, denoted H̅ and S̅, are computed. The H̅ values of the different categories differ greatly and follow a certain regularity: H̅ increases in the order red tongue, yellow moss, purple black tongue, and then white moss and light red tongue. The H̅ values of white moss and light red tongue are basically the same and larger than those of the other categories, so S̅ is used to separate them: the S̅ value of the light red tongue is greater than 0.20, while that of white moss is less than 0.20. Based on these two observations, tongue images can be classified into the five categories above according to H̅ and S̅.

Tongue image samples obtained from the Internet and samples photographed by the authors are segmented, and the H̅ and S̅ values in the HSV color space are then calculated to determine the color classification thresholds shown in Table 2. The range of H̅ in Table 2 is [0, 255], the unit is degree, and the range of S̅ is [0, 1].

Table 2. Tongue segmentation threshold table.
| Tongue category | H̅ | S̅ |
|---|---|---|
| Red Tongue | <17 | 0.40~0.65 |
| Yellow Tongue | 17~135 | 0.25~0.45 |
| Purple Tongue | 135~240 | 0.10~0.35 |
| White Tongue | ≥240 | 0.07~0.20 |
| Light Red Tongue | ≥240 | 0.20~0.40 |
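
The threshold rule of Table 2 can be written as a small lookup. The sketch below is one possible reading of the table, not the authors' code: the category is chosen by the mean hue, the 0.20 saturation boundary separates the white and light red categories, and a second return value merely reports whether the mean saturation lies inside the tabulated range. Assigning a saturation of exactly 0.20 to the light red category is an assumption, as are the names `THRESHOLDS` and `classify_tongue`.

```python
# (category, h_low, h_high, s_low, s_high) for hue ranges below 240, from Table 2.
THRESHOLDS = [
    ("Red Tongue", 0.0, 17.0, 0.40, 0.65),
    ("Yellow Tongue", 17.0, 135.0, 0.25, 0.45),
    ("Purple Tongue", 135.0, 240.0, 0.10, 0.35),
]

def classify_tongue(h_mean, s_mean):
    """Return (category, s_in_tabulated_range) for mean hue and saturation."""
    for name, h_lo, h_hi, s_lo, s_hi in THRESHOLDS:
        if h_lo <= h_mean < h_hi:
            return name, s_lo <= s_mean <= s_hi
    # Mean hue >= 240: saturation decides between white and light red.
    if s_mean < 0.20:
        return "White Tongue", s_mean >= 0.07
    return "Light Red Tongue", s_mean <= 0.40

print(classify_tongue(10.0, 0.50))
print(classify_tongue(300.0, 0.10))
print(classify_tongue(300.0, 0.30))
```
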


The test images consist of already-collected tongue images that were downloaded and tongue images captured by the tongue diagnosis system running on the mobile device. All images are in JPG format, and the detection results are recorded. Good results were achieved in both tongue segmentation and tongue color diagnosis.

3.1. Grayscale image of tongue

Before performing the segmentation of the tongue, the tongue image should be processed to convert the image into a grayscale image. Fig 2 shows the grayscale result of the experimental tongue image.

Fig. 2. Tongue image grayscale result.
3.2. Tongue segmentation

Determining the position of the tongue with the gray level integral projection method places somewhat strict requirements on the input: the lighting during photographing must be good, and the collected tongue image should contain essentially no other objects. Under such conditions, a good segmentation effect can be obtained. As shown in Fig 3, the image inside the white rectangle is the position of the tongue determined by the gray integral projection algorithm. The algorithm locates the tongue accurately, but the white rectangle still contains some non-tongue area, which affects the subsequent color detection to a certain extent.

Fig. 3. Tongue segmentation result image.
3.3. Tongue image color diagnosis

After the position of the tongue is determined, the mean values of R, G, and B of the segmented tongue image are first computed in the RGB space in an attempt to classify the tongue color. For tongue images of the same category, the R, G, and B values do not converge to a certain interval and show no regularity, as shown in Table 3. In terms of the R mean, only the red tongue stands out from the other categories, while the G and B means do not show any regularity. In the RGB color space, it is basically impossible to find a suitable classification standard because of the high correlation among its color components.

Then, the segmented tongue image is converted into the HSV space, and the collected sample images are color-classified according to the H̅ and S̅ thresholds in Table 2. This detects the five categories (red tongue, yellow tongue, white tongue, purple black tongue, and light red tongue) much better, as shown in Table 4. Of the 48 tongue images tested, 41 were correctly diagnosed, a diagnostic rate of 85.4%.

Table 3. Different RGB image range distribution.
| Tongue category | R | G | B |
|---|---|---|---|
| Red Tongue | 140~220 | 60~90 | 60~85 |
| Yellow Tongue | 110~145 | 85~130 | 40~80 |
| Purple Tongue | 40~130 | 50~110 | 50~100 |
| White Tongue | 65~135 | 60~120 | 60~120 |
| Light Red Tongue | 90~140 | 40~110 | 35~130 |

Table 4. HSV color space recognition.
| Tongue category | Total number of detections | Correct identification number | Detection rate |
|---|---|---|---|
| Red Tongue | 10 | 9 | 90.00% |
| Yellow Tongue | 9 | 7 | 77.78% |
| Purple Tongue | 11 | 9 | 81.82% |
| White Tongue | 6 | 4 | 66.67% |
| Light Red Tongue | 12 | 12 | 100.00% |
| Total | 48 | 41 | 85.42% |
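
The detection rates in Table 4 follow directly from the counts. As a quick arithmetic check, the short sketch below recomputes them from the (total, correct) pairs listed in the table; the variable and function names are hypothetical.

```python
# (total detections, correct identifications) per category, copied from Table 4.
RESULTS = {
    "Red Tongue": (10, 9),
    "Yellow Tongue": (9, 7),
    "Purple Tongue": (11, 9),
    "White Tongue": (6, 4),
    "Light Red Tongue": (12, 12),
}

def detection_rate(total, correct):
    """Detection rate in percent."""
    return 100.0 * correct / total

for name, (total, correct) in RESULTS.items():
    print(f"{name}: {detection_rate(total, correct):.2f}%")

all_total = sum(t for t, _ in RESULTS.values())
all_correct = sum(c for _, c in RESULTS.values())
print(f"Total: {all_correct}/{all_total} = {detection_rate(all_total, all_correct):.2f}%")
```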


The experimental results show that accurate recognition of the tongue color depends on accurate determination of the tongue position. If the tongue image is not correctly segmented, judging the color with H̅ and S̅ in the HSV color space is of little significance. The gray-scale integral projection algorithm used in this paper does not determine the position of the tongue precisely: the four corners of the segmented rectangular tongue image still contain non-tongue content, which, although small in amount, affects the subsequent color detection. To improve segmentation accuracy, the tongue should be segmented precisely after its approximate position has been determined. In addition, the system should roughly check each image before diagnosis, so that images that do not contain a tongue are not diagnosed.

Classification by the H̅ and S̅ values in the HSV color space works well for the five categories of red tongue, yellow moss, white moss, purple black tongue, and light red tongue. However, this paper only divides the color into five categories; later research can add color categories on this basis and subdivide the existing ones. Because different diseases are reflected in different positions of the tongue, extracting image features from different parts of the tongue is a promising research direction. When the classification thresholds were determined, many sample images were downloaded from websites, and their collection environments differ, which has a considerable impact on the thresholds. The sample base was also too small. Samples collected in the same environment should be used to correct the color classification thresholds, and the sample size should be as large as possible. Accurate classification thresholds are the guarantee of reliable tongue diagnosis.


This work was supported partially by the National Natural Science Foundation of China (Grants No. 61866028, 61763033, 61662049, 61741312, 61881340421, 61663031, and 61866025), the Key Program Project of Research and Development (Jiangxi Provincial Department of Science and Technology) (20171ACE50024, 20161BBE50085), the Construction Project of Advantageous Science and Technology Innovation Team in Jiangxi Province (20165BCB19007), the Application Innovation Plan (Ministry of Public Security of P. R. China) (2017YYCXJXST048), and the Open Foundation of Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition (ET201680245, TX201604002), Innovation Foundation for Postgraduate (YC2018094, YC2017067), and “Triple-little” Extracurricular Academic Projects (2018ZD071, 2017YBRJ034).



[1] Zhang B, Wang X, You J, et al. "Tongue color analysis for medical application." Evidence-Based Complementary and Alternative Medicine, pp. 1-11, Mar. 2013.

[2] Wei C C, Wang C H, Huang S W. "Using threshold method to separate the edge, coating and body of tongue in automatic tongue diagnosis." The 6th International Conference on Networked Computing and Advanced Information Management, IEEE, pp. 653-656, Aug. 2010.

[3] Du J Q, Lu Y S, Zhu M F, Zhang K, Ding C H. "A novel algorithm of color tongue image segmentation based on HSI." 2008 International Conference on BioMedical Engineering and Informatics, IEEE, vol. 1, pp. 733-737, May 2008.

[4] Kass M, Witkin A, Terzopoulos D. "Snakes: Active contour models." International Journal of Computer Vision, vol. 1, no. 4, pp. 321-331, Jan. 1988.

[5] Zhang H, Zuo W, Wang K, et al. "A snake-based approach to automated segmentation of tongue image using polar edge detector." International Journal of Imaging Systems and Technology, vol. 16, no. 4, pp. 103-112, Feb. 2007.

[6] Zhai X, Lu H, Zhang L. "Application of image segmentation technique in tongue diagnosis." International Forum on Information Technology and Applications, IEEE, vol. 2, pp. 768-771, 2009.

[7] Miao J S, Li G Z, Li F. "C2G2FSnake: automatic tongue image segmentation utilizing prior knowledge." Science China Information Sciences, vol. 56, no. 9, pp. 1-14, Sep. 2013.

[8] Gao Q H, Gang J, Wang Y H, et al. "Tongue image segmentation based on two-dimensional maximum inter-class variance and mathematical morphology." Computer and Digital Engineering, vol. 45, no. 6, pp. 1200-1203, 2017.

[9] Kanawong R, Xu W, Xu D, et al. "An automatic tongue detection and segmentation framework for computer-aided tongue image analysis." International Journal of Functional Informatics and Personalised Medicine, vol. 4, no. 1, pp. 56-58, Nov. 2012.

[10] Han X, Fu Y, Zhang H. "A fast two-step marker-controlled watershed image segmentation method." IEEE International Conference on Mechatronics and Automation, pp. 1275-1380, Aug. 2012.

[11] Li Z, Yu Z, Liu W, et al. "Tongue image segmentation via thresholding and clustering." IEEE 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1-5, Oct. 2017.

[12] Dhanalakshmi M, Premchand P, Goverdhan A. "Applying linear wavelet transforms and statistical feature analysis for digital tongue image." Pattern Recognition Letters, vol. 16, no. 1, pp. 95-102, 2014.

[13] Wang Y G, Yang J, Zhou Y, et al. "Region partition and feature matching based color recognition of tongue image." Pattern Recognition Letters, vol. 28, no. 1, pp. 11-19, Jan. 2007.

[14] Pang B, Zhang D, Wang K. "Tongue image analysis for appendicitis diagnosis." Information Sciences, vol. 175, no. 3, pp. 160-176, Oct. 2005.

[15] Huang B, Wu J, Zhang D, et al. "Tongue shape classification by geometric features." Information Sciences, vol. 180, no. 2, Jan. 2010.

[16] Wang X, Zhang D. "A high quality color imaging system for computerized tongue image analysis." Expert Systems with Applications, vol. 40, no. 15, pp. 5854-5866, Nov. 2013.

[17] Zhang H Z, Wang K Q, Zhang D, et al. "Computer aided tongue diagnosis system." 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, pp. 6754-6757, Jan. 2006.

[18] Wang Y, Zhou Y, Yang J, et al. "An image analysis system for tongue diagnosis in traditional Chinese medicine." International Conference on Computational and Information Science, Springer Berlin Heidelberg, pp. 1181-1186, 2004.

[19] Kanawong R. "Computer-aided tongue image diagnosis and analysis." University of Missouri-Columbia, 2012.

[20] Hu M C, Cheng M H, Lan K C. "Color correction parameter estimation on the smartphone and its application to automatic tongue diagnosis." Journal of Medical Systems, vol. 40, no. 1, pp. 18, Jan. 2016.

[21] Zhang Q, Shang H L, Zhu J, et al. "A new tongue diagnosis application on Android platform." 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE, pp. 334-327, 2013.


Zibo Zhou


Zibo Zhou received his BS in Tongda College of Nanjing University of Posts and Telecommunications, Yangzhou, P. R. China in 2017. Currently he is pursuing his MS degree in School of Software, Nanchang Hangkong University, Nanchang, P. R. China.

His research interests include palmprint recognition, image processing, and deep learning.

Dongliang Peng


Dongliang Peng received his BS in Nanchang Hangkong University, Nanchang, P. R. China in 2017.

His research interests include palmprint recognition and image processing.

Fumeng Gao


Fumeng Gao received his BS from Xinyu University, Xinyu, P. R. China in 2016. Currently he is pursuing his MS degree in School of Software, Nanchang Hangkong University, Nanchang, P. R. China.

His research interests include palmprint recognition, image processing, and computer vision.

Lu Leng


Lu Leng received his Ph.D. degree from Southwest Jiaotong University, Chengdu, P. R. China, in 2012. He performed his post-doctoral research at Yonsei University, Seoul, South Korea, and Nanjing University of Aeronautics and Astronautics, Nanjing, P. R. China. He was a visiting scholar at West Virginia University, USA. Currently, he is an associate professor at Nanchang Hangkong University.

He has published more than 60 international journal and conference papers and been granted several scholarships and funding projects for his academic research. He is the reviewer of several international journals and conferences. His research interests include image processing, biometric template protection, and biometric recognition.

Dr. Leng is a member of the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computing Machinery (ACM), the China Society of Image and Graphics (CSIG), and the China Computer Federation (CCF).