As society develops, people pay increasing attention to their health, and preventing disease in daily life becomes correspondingly more important. This makes methods for continuously monitoring physiological information very valuable. Growing health awareness has driven rapid growth in wearable and non-wearable technologies that can measure a person's state of health at any moment. Vital signs are good indicators of an individual's health and current condition. Among them, heart rate can be considered the most basic and important, and it serves as a key physiological parameter for assessing the state of the human body.
Heart rate is the number of heartbeats per unit of time, usually expressed in beats per minute (bpm), and it reflects the present condition of the heart. Although heart rate varies with a person's fitness, the normal resting heart rate of a healthy adult ranges from 60 to 100 bpm. Monitoring heart rate also makes it possible to identify heart conditions. A slower-than-normal resting heart rate is called bradycardia (“slow heart rate”), whereas a faster-than-normal resting heart rate is called tachycardia (“fast heart rate”). If the resting heart rate is too high or too low, cardiovascular symptoms such as irregular heartbeat, dizziness, fainting, pain, or shortness of breath can appear.
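The resting ranges above can be expressed as a small classification rule. The following is a minimal sketch, assuming the 60 to 100 bpm limits quoted in the text for a healthy resting adult; the function name and labels are illustrative, and the rule is not a diagnostic tool.

```python
def classify_resting_heart_rate(bpm, low=60.0, high=100.0):
    """Label a resting heart rate using the 60-100 bpm adult range.

    The thresholds are the ones quoted in the text; the function name
    and the returned labels are illustrative assumptions.
    """
    if bpm <= 0:
        raise ValueError("heart rate must be positive")
    if bpm < low:
        return "bradycardia"
    if bpm > high:
        return "tachycardia"
    return "normal"


if __name__ == "__main__":
    for value in (52, 75, 118):
        print(value, classify_resting_heart_rate(value))
```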
Heart rate can be measured using photoplethysmography (PPG). PPG is a simple optical technique that monitors how much light is absorbed when tissue is illuminated; the absorption depends on blood circulation in the body. It is a low-cost, non-invasive technique that measures the blood pulse wave in the skin. It also provides valuable information about the cardiovascular system, which is essential for measuring heart rate.
PPG-based heart rate measurement methods fall into two categories: contact and non-contact. Contact measurement uses sensors or devices attached to the patient’s skin. However, in some situations the wiring of these sensors restricts the patient’s movement, and sometimes the sensors cannot be attached at all because of the condition of the patient’s body. One example is the neonatal intensive care unit (NICU), where the adhesives used to attach sensors to the skin of premature infants can cause pain and skin irritation. Given these limitations, a non-contact technique that estimates the PPG signal without attaching a device to the subject’s body can be a more suitable option.
A growing body of literature explores the possibility of measuring heart rate with a camera-based method known as non-contact PPG. The method relies on the subtle changes in skin color intensity caused by blood circulating in the subject’s body. These changes can be recorded with a standard RGB video camera, and heart rate can then be measured from them using signal processing methods. However, methods that measure heart rate from RGB camera video have some limitations:
■ While a video is being recorded, the subject inevitably makes slight involuntary movements.
■ The measured PPG signal is vulnerable to noise caused by the subject’s movement.
In this paper, we propose a non-contact computer vision method for monitoring heart rate from videos of a subject’s face. The method uses face detection to automatically extract the forehead and cheeks in every frame of the video; the color intensity values of the skin surface are then measured from the video data, and heart rate is computed with signal processing algorithms. First, the selected face region is split into its three color channels: red, green, and blue. The spatial average of the selected region is then taken to obtain the photoplethysmographic signal. Next, three different algorithms, namely Fast Fourier Transform (FFT), Independent Component Analysis (ICA), and Principal Component Analysis (PCA), are applied to the acquired photoplethysmographic signal to extract the heartbeat signal. A Butterworth bandpass filter is used to remove noise from the signal before the heart rate is measured. Finally, FFT is applied to the filtered signal, and the heart rate is estimated by spectrum analysis in the frequency domain.
The following sections present the related literature and the background needed for this project. They then describe the methodology of the proposed heart rate measurement method and present and analyze the experimental results obtained with its implementation.
II. RELATED WORK
Because non-contact pulse measurement can easily be integrated into a health monitoring platform, various studies have addressed heart rate measurement using image processing techniques. Non-intrusive heart rate measurement has recently become a popular topic for both commercial and academic purposes. Vital signs such as heart rate reveal a person's physiological status. Monitoring physiological parameters such as heart rate (HR), respiratory rate (RR), heart rate variability (HRV), blood pressure, and oxygen saturation is of great importance for assessing an individual’s health condition. The heart is the most important organ of the body, so estimating and monitoring heart rate is essential for managing cardiovascular emergencies and for treating chronic diseases. The first non-contact health monitoring method was explored in 1995. A non-contact technique for measuring vital signs such as heartbeat and respiration was then proposed in 1997; its authors developed a radar-based vital signs monitoring system that operated at a distance of 10 meters from the subject.
Ballistocardiography and PPG are the two approaches used to estimate heart rate from video of the human face. A critical review of digital camera-based methods for measuring heart rate from facial skin is also available. Its authors discuss the theory and principles of how heart rate is measured from face video, along with significant contributions to improving reliability and the remaining challenges.
The plethysmographic signal can be estimated remotely from video recorded with a standard RGB camera in ambient light. The three channels of the RGB color space (red, green, and blue) are used to detect the signal in video of exposed skin. Although the green channel carries the strongest plethysmographic signal, the red and blue channels also contain plethysmographic information. The signal can be detected at multiple locations on the body, but it is particularly informative on the face, especially the forehead. Although the plethysmographic signal may be detectable in the raw color channel data, it is mixed with other sources of color variation such as changes in ambient light or motion. Another study discusses remote video-based heart rate measurement and the challenges that motion artifacts and illumination changes on the skin pose for other advanced methods. Its authors briefly discuss how non-contact heart rate measurement is more comfortable and convenient for patients and will benefit medical applications in healthcare.
III. THEORETICAL BACKGROUND
PPG is an optical measurement method that can detect blood volume changes in microvascular tissue and estimate blood flow in the skin using infrared light. It is a non-invasive, low-cost technique for measuring cardiovascular activity. The PPG technique is widely applied in clinical devices, including commercially available ones.
A plethysmograph registers and quantifies variations in blood volume in the body, which depend on the heart pulse. PPG requires only a few optoelectronic components: a light source illuminates the skin tissue, and a photodetector measures the subtle variations in light intensity at the skin surface caused by blood perfusion. PPG is a simple, convenient, easy-to-set-up, and economical method for measuring heart rate. Plethysmography is the measurement of the volume of an organ, which depends on the fluctuating amount of blood it contains; it can therefore be used to detect pulse rate by continuously observing changes in blood flow. PPG has two basic modes: transmittance and reflectance. Within a device, an emitter shines light onto the skin and a sensor measures how much light returns to the device, as shown in Figure 1. Typically, the source emits infrared light and a phototransistor serves as the detector. When a fingertip is illuminated, three things happen: some of the light is transmitted, some is absorbed, and some is reflected back to the device. The reflected light intensity changes with the dilation and constriction of the capillaries, and thus with the blood volume in the fingertip, which varies with the heartbeat. In particular, “lower intensity of reflected light indicates the higher volume of blood and vice versa”. The PPG voltage signal is proportional to the amount of blood flowing through the blood vessels. The method cannot determine the absolute amount of blood, but it can detect subtle changes in that amount.
The PPG signal is the change in the light intensity of the skin over time. Its pulsatile part reflects the volumetric change of arterial blood associated with cardiac activity, whereas changes in venous blood volume (which modulate the PPG signal), the optical properties of the tissue, and subtle energy changes in the body are related to the direct component (DC). The pulsatile variation arises mostly from blood flow that varies in the arteries rather than in the veins.
The PPG signal is composed of two components, the DC and the alternating component (AC), as shown in Figure 2. The AC corresponds to the variation in blood volume synchronized with the heartbeat, and it is used to measure heart rate. The DC derives from the optical signal reflected or transmitted by the tissue; it is determined by the tissue structure and by the amount of blood in the veins and other vessels, and it changes slightly with respiration. The DC must therefore be removed before the AC is analyzed. A small portion of the AC also needs to be filtered out to obtain an accurate PPG signal. The fundamental frequency of the AC is used to determine the heart rate.
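The separation of the two components can be illustrated with a short sketch. It assumes, as a simplification, that the DC is constant over a short analysis window and can be estimated by the window mean; in practice the DC drifts slowly and a detrending or high-pass filter would be used instead. The function name and the synthetic signal are illustrative.

```python
import math
from statistics import fmean


def split_dc_ac(samples):
    """Split a short PPG window into a constant DC level and the AC residual.

    Assumption: the DC is approximated by the mean of the window.
    """
    dc = fmean(samples)
    ac = [s - dc for s in samples]
    return dc, ac


if __name__ == "__main__":
    fps = 30.0  # hypothetical sampling rate in frames per second
    # Synthetic window: DC level 100 plus a 1.2 Hz (72 bpm) pulsatile part.
    window = [100.0 + 2.0 * math.sin(2 * math.pi * 1.2 * i / fps)
              for i in range(150)]
    dc, ac = split_dc_ac(window)
    print(round(dc, 3), round(max(ac), 3))
```

After this step the AC oscillates around zero, so its dominant frequency can be analyzed without the large constant offset dominating the spectrum.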
Remote PPG is a technique for remotely measuring human cardiac activity by detecting small pulse-induced variations in the subject’s skin with a multi-wavelength RGB camera. It shares the same principle as PPG, but the measurement is performed at a distance. It measures variations in the reflected RGB light, that is, changes in brightness between specular reflection and diffuse reflection. In recent years, researchers have proposed several remote PPG methods for computing the pulse signal from video.
“Consider a light source illuminating a piece of human skin tissue containing pulsatile blood and a remote color camera recording this image, as illustrated” in Figure 3. When light falls on the skin surface, it produces a specular reflection and a diffuse reflection; the diffuse reflection, which is the light remaining after scattering and absorption in the skin tissue, varies with the change in blood volume. The observed intensity depends on the distances from the camera and the light source to the measurement point on the subject’s skin. Subtle color changes can be observed over time, depending on blood circulation, movement, and specular variation.
The proposed method takes the facial RGB color in videos of a subject as input and outputs the measured heart rate. The first step is to identify the subject’s face in the input frame and track it through every frame of the video. Temporal filtering is then applied to the selected region of interest (ROI) to isolate the frequency band of interest. Next, the cardiovascular pulse signal is extracted from the face video using component analysis algorithms; our method works with the FFT, PCA, and FastICA algorithms. FFT is then applied to find the spectral maximum of each component. The frequency of the spectral peak indicates the subject’s heart rate.
Before the component analysis, the video must be preprocessed. Preprocessing includes face detection, ROI tracking, data interpolation, noise removal, and temporal filtering. Detecting heart rate from facial video involves four main steps, which are discussed in more detail in the following sections. First, the face area must be identified in each frame, since it is the part of the video that contains the heart rate information. Second, the desired ROI is selected from the face bounding box. Third, the PPG signal is extracted from the variation over time of the color intensity in the skin ROI. Finally, the signal is processed with noise reduction methods and analyzed to determine the heart rate.
Face detection is an essential step in many applications, such as face authentication, recognition, tracking, and emotion recognition. The objective of a face detection algorithm is to determine whether a face appears in an image. Detecting faces is easy for humans but can be challenging for a computer, which has made it an active research topic over the past decades.
A face detection algorithm is used to detect the subject’s face in each video frame, the basic unit of a video. The heart rate monitoring system must process the image frames one by one, each at its specific time in the video. The facial region most important to the proposed algorithm then has to be tracked in every frame. The subject must maintain a relatively consistent pose throughout the video for the required calculations to be possible. The proposed method therefore needs a reliable face tracking method to detect the face accurately. For this reason, this implementation used the ‘CascadeObject-Detection’ method from the Open Source Computer Vision library (OpenCV), which is based on the Viola-Jones algorithm.
The ROI is selected from the video frames according to specific criteria, and the information in this region is then used for the computation. Using the face detection algorithm, the facial pixels are identified and separated from the background pixels so that a bounding box can be placed around the face. The ROI is obtained from the area inside the bounding box. The most suitable areas for observing variations in skin color intensity are the forehead and cheeks: these regions show the changes in more detail, which makes the variations easier to track. The dimensions of the ROI rectangle are defined relative to the face detection box, so its size varies with the distance between the subject and the webcam.
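Because the ROI is defined relative to the face bounding box, it can be computed as a set of fractions of that box. The sketch below illustrates the idea; the specific fractions for the forehead and cheek rectangles are hypothetical values chosen for illustration and are not taken from this paper.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int  # left edge in pixels
    y: int  # top edge in pixels
    w: int  # width in pixels
    h: int  # height in pixels


# Hypothetical relative placements (left, top, width, height) as
# fractions of the face box; these values are illustrative assumptions.
FOREHEAD = (0.30, 0.10, 0.40, 0.15)
CHEEKS = (0.20, 0.55, 0.60, 0.20)


def relative_roi(face: Box, placement) -> Box:
    """Scale a relative placement to pixel coordinates inside the face box."""
    fx, fy, fw, fh = placement
    return Box(
        x=face.x + round(fx * face.w),
        y=face.y + round(fy * face.h),
        w=round(fw * face.w),
        h=round(fh * face.h),
    )


if __name__ == "__main__":
    face = Box(x=100, y=50, w=200, h=200)
    print(relative_roi(face, FOREHEAD))
    print(relative_roi(face, CHEEKS))
```

Defining the ROI this way keeps it proportional to the detected face, so it grows or shrinks automatically as the subject moves closer to or farther from the webcam.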
From the ROI, the red, green, and blue channel values are extracted to obtain a signal for each color. Since every video frame consists of three color channels, every pixel has a 3x1 vector of color values, which makes it possible to convert the three channel values into the desired color signals. To do so, the three channels of the ROI are separated and the average pixel value is calculated along each row (y-line) of each channel. For the values of each y-line, the mean of the signal is subtracted from the signal and the standard deviation is computed. The higher color intensity values are then selected based on the standard deviation. The averages over all channels of the video form the red, green, and blue signals. A 3rd-order Butterworth bandpass filter covering the heart rate range from 40 bpm to 180 bpm is then applied to these signals for noise reduction. Every step of the proposed method is shown in Figure 4.
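Two of the quantities in this step can be sketched directly: the spatial average of each color channel over the ROI, and the conversion of the 40 to 180 bpm pass band into frequencies. This is a minimal illustration, assuming the ROI is given as rows of (r, g, b) tuples and that the video frame rate is known; the function names are illustrative and the filter itself is not implemented here.

```python
from statistics import fmean


def channel_means(roi):
    """Spatial average of each color channel over one ROI frame.

    roi: list of rows, each row a list of (r, g, b) pixel tuples
    (an assumed in-memory layout for illustration).
    """
    pixels = [px for row in roi for px in row]
    return tuple(fmean([px[c] for px in pixels]) for c in range(3))


def heart_rate_band_hz(low_bpm=40.0, high_bpm=180.0):
    """Convert a heart rate range in bpm to a frequency band in Hz."""
    return low_bpm / 60.0, high_bpm / 60.0


def normalized_band(fps, low_bpm=40.0, high_bpm=180.0):
    """Express the band as fractions of the Nyquist frequency (fps / 2)."""
    nyquist = fps / 2.0
    low_hz, high_hz = heart_rate_band_hz(low_bpm, high_bpm)
    if high_hz >= nyquist:
        raise ValueError("frame rate too low for the requested band")
    return low_hz / nyquist, high_hz / nyquist


if __name__ == "__main__":
    roi = [[(10, 20, 30), (30, 40, 50)],
           [(20, 30, 40), (40, 50, 60)]]
    print(channel_means(roi))
    print(heart_rate_band_hz())
    print(normalized_band(fps=30.0))
```

The 40 to 180 bpm range corresponds to roughly 0.67 Hz to 3 Hz. Cutoffs normalized by the Nyquist frequency are the form that filter-design routines such as `scipy.signal.butter` accept for a band-pass design when no sampling rate is passed.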
The most notable difference from the traditional method is the use of a pixel intensity changes method to extract the RGB signal. Whereas ICA and PCA remove noise from the time-series signal, the pixel intensity changes method tries to remove noise at the image level. As described in prior work, for each frame f the method takes each pixel and each channel of the ROI and calculates the intensity component IB(x,y,c,f), where c denotes one of the RGB channels. The intensity components of each row of pixels (y) of the ROI are averaged by Eq. (1):
Among the IB(y,f) values obtained, the highest 5% are selected for estimating the heart rate. Figure 5 illustrates the process of calculating IB(y,f) from an image frame.
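The row averaging and the selection of the highest 5% can be sketched as follows. This is an illustration under stated assumptions rather than the exact formulation: it assumes that the row value IB(y,f) is the mean over all columns x and channels c of the row, that the intensity component is the raw pixel value, and that the selected rows are combined by a plain mean. All function names are illustrative.

```python
from statistics import fmean


def row_intensities(roi):
    """One value per ROI row: the mean over columns and channels.

    roi: list of rows, each row a list of (r, g, b) pixel tuples.
    Assumption: averaging runs over both x and c.
    """
    return [fmean([value for px in row for value in px]) for row in roi]


def top_fraction(values, fraction=0.05):
    """Return the highest `fraction` of the values (at least one value)."""
    count = max(1, round(len(values) * fraction))
    return sorted(values, reverse=True)[:count]


def frame_intensity(roi, fraction=0.05):
    """Single per-frame sample: mean of the highest row intensities.

    Combining the selected rows with a mean is an assumption.
    """
    return fmean(top_fraction(row_intensities(roi), fraction))


if __name__ == "__main__":
    # Synthetic 40-row ROI in which row y has the gray level y.
    roi = [[(y, y, y)] * 8 for y in range(40)]
    print(top_fraction(row_intensities(roi)))
    print(frame_intensity(roi))
```

Evaluating `frame_intensity` on every frame yields one sample per frame, which forms the time series passed on to the later filtering and spectral steps.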
The photoplethysmographic signal is obtained from the selected regions in each video frame by averaging pixel values. The FastICA and PCA signal processing techniques are used to reduce noise, eliminate motion artifacts, and obtain a higher-quality signal. FFT is then used to compute the Discrete Fourier Transform (DFT) of the denoised signal; in other words, the DFT frequencies of the denoised signal are extracted, and the heart rate is finally estimated from the power spectrum. The results of the different processes can be seen in Figure 6.
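The final spectral step can be sketched in a few lines. For self-containment the example computes the DFT directly instead of calling an FFT routine (both give the same spectrum; the direct form is only slower), and it assumes that the peak search is restricted to the 40 to 180 bpm band. The function names and the synthetic signal are illustrative.

```python
import cmath
import math


def power_spectrum(signal, fps):
    """Return (frequencies in Hz, power) for the non-negative DFT bins."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        freqs.append(k * fps / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def estimate_bpm(signal, fps, low_bpm=40.0, high_bpm=180.0):
    """Heart rate as the frequency of the strongest in-band spectral peak."""
    freqs, power = power_spectrum(signal, fps)
    band = [(p, f) for f, p in zip(freqs, power)
            if low_bpm / 60.0 <= f <= high_bpm / 60.0]
    if not band:
        raise ValueError("signal too short to resolve the heart rate band")
    _, peak_hz = max(band)
    return peak_hz * 60.0


if __name__ == "__main__":
    fps = 30.0  # hypothetical frame rate
    # 10 s synthetic signal: 1.2 Hz pulse plus a stronger 0.2 Hz drift.
    signal = [math.sin(2 * math.pi * 1.2 * i / fps)
              + 3.0 * math.sin(2 * math.pi * 0.2 * i / fps)
              for i in range(300)]
    print(round(estimate_bpm(signal, fps), 2))
```

The spacing between DFT bins is the reciprocal of the recording length, so a 60-second recording resolves the heart rate in steps of 1 bpm.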
V. EXPERIMENT RESULT
In this experiment, a web camera recorded the video in an indoor environment, and a pulse oximeter recorded reference heart rate data to validate the findings.
Data were acquired from ten (10) participants (all male) aged 24 to 35 years. The subjects came from different regions of the world and had different skin colors. The recordings were made indoors under sufficient artificial light; the windows of the laboratory room were covered during data acquisition. The participants were informed about the purpose of the study and were asked to sit in front of a webcam placed 0.8 meters from their faces. All participants were asked to breathe normally and to keep a stable position facing the webcam while the video was recorded. During the face video recording, a pulse oximeter was attached to the participant’s left index finger, and a second camera simultaneously recorded the bpm reading of the pulse oximeter. The recording of the device readings was processed with optical character recognition (OCR) to generate a comma-separated values (CSV) file containing the heart rate in bpm for every second.
Each participant was recorded under the described conditions, yielding the participant’s face video and a CSV file with the ground truth, which makes it possible to determine the heart rate at any specific time in the video. Figure 8 shows an example of the data collection setup.
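Looking up the reference heart rate for a given moment of the video can be sketched as below. The layout of the CSV file is not specified in the text, so the column names `second` and `bpm` are hypothetical, as are the function names and the sample values.

```python
import csv
import io


def load_ground_truth(csv_text):
    """Parse per-second reference readings into {second: bpm}.

    Assumption: the file has a header row with the hypothetical
    columns 'second' and 'bpm'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["second"]): float(row["bpm"]) for row in reader}


def ground_truth_at(table, t_seconds):
    """Reference bpm for the one-second slot that contains time t."""
    return table[int(t_seconds)]


def mean_ground_truth(table, start, end):
    """Average reference bpm over the seconds start <= s < end."""
    values = [table[s] for s in range(start, end)]
    return sum(values) / len(values)


if __name__ == "__main__":
    sample = "second,bpm\n0,71\n1,72\n2,74\n"  # made-up readings
    table = load_ground_truth(sample)
    print(ground_truth_at(table, 1.6))
    print(round(mean_ground_truth(table, 0, 3), 2))
```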
The proposed method was applied to two regions of the subject’s face, the forehead and the cheeks, which carry the most important plethysmographic signal. Three algorithms were applied to the photoplethysmographic signal to extract the heart rate. In the experiments, 60-second face videos were used to evaluate the proposed method. The heart rate was extracted from the selected region using the change in color intensity. FFT was applied to the filtered photoplethysmographic signal to obtain the power spectrum, and the heart rate was estimated from the power spectrum in the frequency domain: the frequency at which the power spectrum reaches its maximum represents the heart rate. Table 1 lists the heart rate values measured with the proposed method. From this table, the accuracy of the three algorithms was calculated for each selected ROI. Equations 2 and 3 were used to calculate the accuracy:
where PE is the percentage error, GT is the ground truth value, and M is the measured value.
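A minimal sketch of the accuracy calculation follows. It assumes the common definitions PE = |GT − M| / GT × 100 and accuracy = 100 − PE; these forms are an assumption that is consistent with the symbols defined above, and the function names and sample values are illustrative.

```python
def percentage_error(ground_truth, measured):
    """Assumed form: PE = |GT - M| / GT * 100."""
    if ground_truth <= 0:
        raise ValueError("ground truth heart rate must be positive")
    return abs(ground_truth - measured) / ground_truth * 100.0


def accuracy(ground_truth, measured):
    """Assumed form: accuracy (%) = 100 - PE."""
    return 100.0 - percentage_error(ground_truth, measured)


if __name__ == "__main__":
    # Made-up values: reference 72 bpm, measured 66 bpm.
    print(round(percentage_error(72.0, 66.0), 2))
    print(round(accuracy(72.0, 66.0), 2))
```

Under these definitions, a measured value that matches the reference gives 100% accuracy, and overestimates and underestimates of the same size are penalized equally.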
The accuracy of the proposed method is summarized in Table 2. For all algorithms, the proposed method gave an average accuracy of more than 85%. The FastICA algorithm provided the best average accuracy of the three algorithms for both the forehead and cheek regions, exceeding 92% in both. The 10th subject showed the best accuracy of all, more than 99% in the forehead region with the FastICA algorithm. The 3rd subject showed the worst accuracy of all, less than 72% in the forehead region with the FFT algorithm.
| Algorithm | Forehead (accuracy in %) | Cheek (accuracy in %) |
Table 1 presents the results obtained with the proposed method. Discrepancies between the predictions and the ground truth can be observed. Possible causes include movement of the subjects, illumination problems, and other noise present in the environment. To address these issues, further filtering algorithms can be explored in the future to improve accuracy.
The heart rate was also extracted using the average color value of the ROI. The results are shown in Table 3, and the average accuracy of the three algorithms calculated from them is shown in Table 4. The average accuracy of this method is more than 84% for average color values, with a maximum of 93.39% for the FastICA algorithm in the cheek region.
| Algorithm | Forehead (accuracy in %) | Cheek (accuracy in %) |
Although the proposed method provided a higher accuracy for most combinations of algorithm and region, in the cheek region the traditional method based on the average color value gave a higher accuracy with the FastICA algorithm. Figures 9 and 10 compare the accuracy of the proposed and traditional methods on the forehead and cheek ROIs, respectively.
Non-contact heart rate estimation is becoming more popular thanks to its greater versatility compared with the contact-based measurement methods used in clinical environments. In this paper, the proposed non-contact method estimates heart rate from RGB face videos using color intensity values. The method is non-invasive, easier to use in some situations, and has a wider range of applications than the traditional method. It relies on a physiological signal, the PPG signal, which carries the essential heart rate information and from which the heart rate is extracted. The proposed non-contact, or remote, heart rate monitoring method is easy to implement, low-cost, and comfortable.
Even though the PPG method measures heart rate with high accuracy, there are individual time windows in which the measurement accuracy decreases. This inconsistency can be due to face movement, light changes, or other kinds of environmental noise.
In this paper, the use of pixel intensity changes is proposed to improve on the traditional method, which uses the average color value. The proposed method gives an accuracy higher than 85% when the three algorithms are averaged over the forehead and cheek regions. The FastICA technique gives the highest average accuracy of the three algorithms, at more than 92% in both regions. The proposed method based on color intensity obtained a higher accuracy in both regions with all algorithms, except for FastICA in the cheek region, which was the only case in which the traditional method outperformed it. Overall, the pixel intensity changes method obtained a 1.94% higher average accuracy than the traditional method.
As follow-up research, the method can be improved by filtering out environmental noise to increase accuracy. In addition, an interesting application of the proposed method would be in the field of human expression detection: a non-contact heart rate measurement method such as the one presented in this paper could help state-of-the-art human emotion predictors improve even further.