I. INTRODUCTION
In the Republic of Korea, the proportion of buildings over 30 years old is increasing nationwide every year (36.5% in 2017, 37.1% in 2018, and 37.8% in 2019), and the number of disasters and accidents caused by severe weather is also increasing. In particular, Busan has the highest proportion of old buildings in the country, at 54.3% of all buildings as of the end of December 2019, and its proportion of buildings over 20 years old is 75%, which is much higher than the national average [1-3] (Fig. 1).
To prevent risks caused by old facilities in advance, facilities must be maintained in an appropriate condition. Until now, safety inspections of aging buildings have been conducted primarily visually by workers, which has reduced the reliability of diagnostic results and increased potential risks. The use of drones for safety inspections of aging buildings has improved on-site work efficiency and worker safety.
Recently, drones have been equipped with sensors such as multispectral, RGB, and LiDAR sensors to acquire data on cracks, material loss, and similar defects on building exteriors, and safety diagnosis is carried out based on this information. However, while drone-collected data captures visible changes in aging buildings, such as cracks and losses, it remains insufficient for assessing and analyzing internal displacements such as subsidence and vibration.
Changes in building safety are not only influenced by external conditions but also by internal factors such as vibration, tilting, and subsidence, which are difficult for workers to detect visually. However, these factors are often overlooked in building safety diagnostic evaluations due to the limitation that they are difficult to detect immediately on-site.
Additionally, to confirm changes in the condition of buildings, it is necessary to collect data periodically over a certain period to manage the factors causing the changes. However, periodic data collection using drones has limitations in terms of the efficiency of building management and operation.
This paper proposes a method for identifying changes in building conditions by utilizing IoT sensor data, considering the limitations of existing diagnostic methods based on video data obtained from drones.
Over a seven-day period at a university-industry collaboration building, IoT sensor data were collected for vibration, tilt, and altitude change. To achieve precise and stable monitoring of building conditions, this study evaluates and analyzes IoT data that provide objective information on internal factors, identifies risk zones using the Isolation Forest algorithm and risk scores, and comprehensively analyzes the accumulated data on safety changes as a basis for developing a real-time diagnostic system.
II. THEORETICAL BACKGROUND
Various studies applying IoT to old buildings have been actively conducted in recent years. To use IoT as an automated management technology for state-managed facilities, IoT network-based remote facility diagnosis and automation, advanced network maintenance, and unmanned facility repair and reinforcement technologies were applied to establish a comprehensive IoT-based maintenance plan that turns a passive facility maintenance system into an active one [4].
In addition, an application technique has been proposed that tracks micro-displacement deformation using GNSS/USN/IoT and efficiently combines it with spatial information such as semi-permanent major facilities and design and construction drawings [5]. It has also been reported that facility inspection can be reduced in cost and improved in objectivity if an existing facility inspection video is used as a reference video and the inspection is based on automatically processed drone video [6].
On the other hand, a conceptual model of an S-LCC platform for efficient operation of facility systems that automatically analyzes and collects electricity consumption data using IoT was derived [7], and a method for analyzing and processing big data to perform real-time, always-on safety checks on bridges, which are road facilities, through an IoT platform and support rapid decision-making by facility managers was presented [8].
By fully automating the concrete crack detection framework through machine learning, it is possible to maintain facilities more effectively and efficiently than the existing expert-oriented facility maintenance [9], and to apply IoT devices using IoT crack meters to actual sites, it is necessary to establish an LTE communication network and develop a large-capacity battery in addition to its technical capabilities [10].
As ways to monitor safety changes in buildings, it has been proposed to monitor buildings by linking IoT sensors and platforms, building on work that embeds PCB-based wireless inductive-coupling corrosion potential sensors in reinforced concrete and soil to measure the degree of corrosion and support long-term monitoring [11], and to use a system and artificial intelligence algorithm that identifies and classifies six categories of data: building occupants, indoor environmental conditions, external environmental conditions, control systems and devices, equipment technology, and energy flow [12].
In addition, the possibility of using remote sensing systems as a tool to monitor asphalt road pavement conditions has been proposed [13], and studies have been conducted to detect safety changes in buildings using algorithms such as SVM and CNN models, with their parameter settings and accuracy reported [14-18].
III. IoT SENSOR BOARD FABRICATION AND PERFORMANCE EVALUATION
The IoT sensor board used in this paper was designed and manufactured by the author on a PCB to monitor the factors behind internal condition changes in the building. A total of 20 boards were manufactured. Because the boards are attached to the outside of the building, a battery and a solar panel were fitted to each sensor board, and a moisture-proof agent was sprayed on the case to prevent the penetration of rainwater.
To measure the magnitude of the building's vibration, its tilt, and its relative altitude, the IoT sensor board is equipped with a gyroscope, an altitude sensor, and a geomagnetic sensor, and uses an ESP32-WROOM-32 module as its MCU. It communicates using MQTT, an ISO-standard publish-subscribe messaging protocol that runs on top of TCP/IP, so that it can connect to remote locations with limited network bandwidth (Table 1).
To secure the objectivity of the IoT data acquired from buildings, the prototype was submitted to a public institution in Busan (***) for testing of angular range, angular accuracy, acceleration accuracy, and data precision, and the KOLAS test was completed in January 2024.
Accuracy and precision are commonly used as performance metrics for sensors. Accuracy is a measure of how close a sensor's results are to the actual value, and precision is a measure of how consistently a sensor reports the same result when measuring the same value multiple times.
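As a hedged illustration of this distinction, the sketch below quantifies accuracy as the mean offset of repeated readings from a known reference value and precision as the spread of those readings, also expressed as a percentage of full scale. The readings, the reference value, the 20-degree full-scale span, and the function name are hypothetical; they are not the KOLAS test data or test procedure.

```python
import statistics

def accuracy_and_precision(readings, true_value, full_scale):
    """Return (bias, spread, spread as % of full scale) for repeated readings."""
    # Accuracy: how far the average reading is from the reference value.
    bias = statistics.fmean(readings) - true_value
    # Precision: how tightly repeated readings cluster (population std. dev.).
    spread = statistics.pstdev(readings)
    return bias, spread, 100.0 * spread / full_scale

# Hypothetical repeated tilt readings (degrees) against a 5.00-degree reference.
readings = [5.01, 4.99, 5.00, 5.02, 4.98]
bias, spread, spread_pct_fs = accuracy_and_precision(
    readings, true_value=5.00, full_scale=20.0
)
print(f"bias={bias:.4f} deg, spread={spread:.4f} deg, {spread_pct_fs:.3f} %F.S")
```

A sensor can have a small spread and still be inaccurate if the bias is large, which is why both quantities are reported separately.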
Since the IoT sensor board produced in this paper aims to check the impact of internal condition changes in buildings, we adopted angle range, angle accuracy, acceleration accuracy, and data precision as performance indicators (Table 2, Fig. 2).
Performance | Unit | Test result |
---|---|---|
Angle range | Degree | ±10 degrees |
Angle accuracy | Degree | ±0.01 degree |
Acceleration accuracy | g | ±0.01 g |
Data precision | %F.S | ±1% |
The entire data analysis process, from data collection and preprocessing through anomaly detection to risk score calculation, is shown in Fig. 3.
IV. EXPERIMENTATION AND DATA COLLECTION
The Industry-Academic Cooperation Building of *** University in Busan was selected as the target building for the study. The building is used by many students, is suitable for studying the vibration, tilt, and settlement of the building itself, and allows data to be acquired easily. Its location is indicated by the blue square in Fig. 4.
The IoT sensors produced in this study were installed on the rooftop of the Industry-Academic Cooperation Center for approximately 7 days, from 2024.04.02 to 2024.04.08.
A total of 9,716,989 records were acquired during this period, which is far more real-world data than in previous studies that used thousands or tens of thousands of data points (Table 3).
Data acquisition | Result |
---|---|
Total number of data | 9,716,989 |
Start time | 2024-04-02 18:32:21 |
End time | 2024-04-08 23:59:59 |
A total of 20 sensor boards were installed at a certain distance from one another on the rooftop of the Industry-Academic Cooperation Center, and the installation location of each sensor board is shown in Fig. 5 and Table 4 as latitude and longitude coordinates. Each position value is acquired automatically from the sensor board.
To minimize the loss of original data, the gyroscope, barometric pressure, and altitude data generated by the IoT sensors were stored in real time on a local computer installed on the roof of the Industry-Academic Cooperation Center, connected over Ethernet and Wi-Fi networks.
Fig. 6 shows that the data acquired from the IoT sensor board is stored in MongoDB on the desktop PC in the experimental environment.
V. DATA ANALYSIS
The data acquired from the IoT sensors were exported from MongoDB to CSV format to reduce unnecessary conversion processes, and include gyroscope data and barometric pressure sensor data. The data are recorded at millisecond resolution, which makes it possible to detect instantaneous changes in the Industry-Academic Cooperation Center.
In Fig. 7, the timestamp column is expressed in milliseconds, and the sensordata.gyro[0], gyro[1], and gyro[2] columns represent the angular velocity about the x, y, and z axes, respectively, which is used to analyze small vibrations and slope changes of the structure.
The sensordata.atm[0] column represents the air pressure at the point where the sensor is located, and is used to analyze the elevation change and settlement of the structure.
The gyroscope data (x-axis, y-axis, and z-axis angular velocity) are used to analyze the vibration and tilt of the structure, and the air pressure and altitude data are used to check whether the structure is settling. The columns required for the study were selected from the entire CSV file and renamed to gyro_x, gyro_y, gyro_z, pressure, and DateTime for preprocessing and feature extraction.
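A minimal sketch of this column selection and renaming step, using only the Python standard library. The source column names follow those described for Fig. 7; the assumption that timestamp holds Unix epoch milliseconds, the use of UTC, the sample row, and the helper name load_rows are illustrative and are not taken from the code used in this study.

```python
import csv
import io
from datetime import datetime, timezone

# Exported column name -> name used in the analysis.
RENAME = {
    "sensordata.gyro[0]": "gyro_x",
    "sensordata.gyro[1]": "gyro_y",
    "sensordata.gyro[2]": "gyro_z",
    "sensordata.atm[0]": "pressure",
}

def load_rows(csv_text):
    """Keep only the required columns and rename them (hypothetical helper)."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {new: float(raw[old]) for old, new in RENAME.items()}
        # Assumption: timestamp is Unix epoch time in milliseconds.
        row["DateTime"] = datetime.fromtimestamp(
            int(raw["timestamp"]) / 1000.0, tz=timezone.utc
        )
        rows.append(row)
    return rows

# Hypothetical one-row export, for illustration only.
sample = (
    "timestamp,sensordata.gyro[0],sensordata.gyro[1],"
    "sensordata.gyro[2],sensordata.atm[0]\n"
    "1712082741000,0.01,-0.02,0.00,1013.2\n"
)
rows = load_rows(sample)
print(rows[0])
```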
In Fig. 8, gyro_magnitude indicates the magnitude of the vibration obtained by combining the gyroscope data of the x, y, and z axes; the larger the value, the stronger the overall vibration of the Industry-Academic Cooperation Center.
tilt_angle is a tilt angle that uses gyroscope x, y axis data and expresses the degree of abnormal tilt of the industry-academia collaboration center in degrees (°).
relative_altitude is the relative altitude (m) of the sensor location using barometric pressure data, which measures the subsidence phenomenon as a change in the relative altitude of the industry-academia collaboration center.
altitude_change is the change in relative altitude from the previous measurement, and a positive value indicates an increase in altitude, while a negative value indicates a decrease. In the case of subsidence, negative values are consistently displayed.
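The four derived features can be sketched as follows. The exact formulas are not given in the text, so this sketch assumes the Euclidean norm for gyro_magnitude, atan2 of the y and x components for tilt_angle (which yields values between -180 and 180 degrees), and the international barometric formula with a chosen reference pressure for relative_altitude. All function names are hypothetical.

```python
import math

def gyro_magnitude(gx, gy, gz):
    """Combined magnitude of the three gyroscope axes (assumed Euclidean norm)."""
    return math.sqrt(gx * gx + gy * gy + gz * gz)

def tilt_angle(gx, gy):
    """Angle in degrees from the x and y components (assumed atan2 form)."""
    return math.degrees(math.atan2(gy, gx))

def relative_altitude(pressure, reference_pressure):
    """Altitude in metres relative to the level where reference_pressure holds.

    Uses the international barometric formula; both pressures must be in
    the same unit (for example hPa).
    """
    return 44330.0 * (1.0 - (pressure / reference_pressure) ** (1.0 / 5.255))

def altitude_changes(altitudes):
    """Change from the previous measurement; the first entry has no predecessor."""
    if not altitudes:
        return []
    return [0.0] + [b - a for a, b in zip(altitudes, altitudes[1:])]

# Hypothetical pressure readings in hPa; the first reading is the reference.
pressures = [1013.25, 1013.20, 1013.30]
altitudes = [relative_altitude(p, pressures[0]) for p in pressures]
print(altitude_changes(altitudes))
```

Because barometric pressure also varies with weather and temperature, a change in relative_altitude computed this way is not by itself evidence of settlement.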
In this study, we use IoT sensor data to examine changes in the building over time, establish anomaly intervals, and analyze cumulative displacement outliers.
In general, the LSTM algorithm is mainly used for time-series prediction, in which the future is predicted from past data, and has the advantage of estimating future changes. This study, however, aims to examine the change status of the structure from the currently collected data, and therefore uses the Isolation Forest algorithm, an unsupervised anomaly detection model that can identify anomalies even from a small data volume.
The analysis environment is an Intel(R) Core(TM) i9-14900K 3.20 GHz CPU, 128 GB RAM, an NVIDIA RTX 4090, Windows 11 Pro, and Visual Studio Code, and the Python version used is 3.10.14.
From the entire dataset, each tree (iTree) is built on a randomly selected sub-sample of size ψ (psi), and at each node an attribute q is randomly selected from all attributes in the dataset to be used as the branching criterion.
For example, this study has three features: gyro_magnitude, tilt_angle, and altitude_change.
A branch point p is randomly selected between the minimum and maximum values of the selected attribute q. If the value of x for attribute q is less than p, the data point is placed in the left subtree TL (Tree Left); if it is equal to or larger than p, the data point is placed in the right subtree TR (Tree Right).
Through these split conditions, the data point x moves from the root of the iTree along the tree nodes until it becomes isolated, that is, until only one sample remains at the leaf node, or until the pre-defined height limit is reached. The number of edges traversed is called the path length and is denoted by h(x). The average value of h(x) across all trees is then calculated.
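The tree construction and path length described above can be sketched in plain Python. This is an illustrative re-implementation on synthetic data, not the code used in this study: it counts traversed edges only and omits the leaf-size adjustment that the original Isolation Forest formulation adds when the height limit stops a branch early. All names and the demo data are hypothetical.

```python
import math
import random

def build_itree(points, height_limit, rng, depth=0):
    """Recursively split points until isolation or the height limit."""
    if len(points) <= 1 or depth >= height_limit:
        return {"size": len(points)}                  # leaf node
    q = rng.randrange(len(points[0]))                 # random attribute q
    lo = min(pt[q] for pt in points)
    hi = max(pt[q] for pt in points)
    if lo == hi:
        return {"size": len(points)}                  # no split possible on q
    p = rng.uniform(lo, hi)                           # random branch point p
    left = [pt for pt in points if pt[q] < p]         # TL: value below p
    right = [pt for pt in points if pt[q] >= p]       # TR: value at or above p
    return {
        "q": q,
        "p": p,
        "left": build_itree(left, height_limit, rng, depth + 1),
        "right": build_itree(right, height_limit, rng, depth + 1),
    }

def path_length(x, node):
    """Number of edges from the root to the leaf that x falls into."""
    depth = 0
    while "size" not in node:
        node = node["left"] if x[node["q"]] < node["p"] else node["right"]
        depth += 1
    return depth

def mean_path_length(x, data, n_trees=100, psi=256, seed=0):
    """Average h(x) over n_trees iTrees, each built on a sub-sample of size psi."""
    rng = random.Random(seed)
    psi = min(psi, len(data))
    height_limit = math.ceil(math.log2(psi))
    total = 0
    for _ in range(n_trees):
        tree = build_itree(rng.sample(data, psi), height_limit, rng)
        total += path_length(x, tree)
    return total / n_trees

# Synthetic demo: a dense cluster, one central point, and one distant point.
demo_rng = random.Random(1)
cluster = [tuple(demo_rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(200)]
center = (0.0, 0.0, 0.0)
outlier = (10.0, 10.0, 10.0)
data = cluster + [center, outlier]
print(mean_path_length(outlier, data), mean_path_length(center, data))
```

In this demo the distant point is separated after only a few random splits, while the central point usually travels until the height limit, which is the property the outlier score builds on.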
Since the path length varies depending on the number of samples, raw path lengths cannot be compared directly. Therefore, the expected average path length of normal data, when all samples are selected with the same probability, must be approximated mathematically, and the path length is divided by this expected average path length.
The expected average path length obtained here is called the normalization constant c(ψ).
If data x is isolated within a short path in the iTree, it is likely to be an outlier. Therefore, for each data point x, the average path length is normalized by c(ψ) for the sub-sample size ψ to obtain the outlier score s(x, ψ), which lies in the range 0 to 1. The data points are then ranked by outlier score, and the most anomalous contamination % of points (those with the shortest average path lengths) are labeled as outliers. The contamination rate thus determines the threshold applied to the outlier score.
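A hedged sketch of this normalization, following the standard Isolation Forest formulation: c(ψ) = 2H(ψ-1) - 2(ψ-1)/ψ with H(i) approximated by ln(i) + 0.5772156649, and s(x, ψ) = 2^(-E[h(x)]/c(ψ)). In this convention a score near 1 indicates an outlier and a score near 0.5 or below indicates normal data. The function names and the labeling helper are hypothetical, and a library implementation may use a different sign convention for its scores.

```python
import math

EULER_MASCHERONI = 0.5772156649

def c(psi):
    """Normalization constant: expected average path length for psi samples."""
    if psi <= 1:
        return 0.0
    if psi == 2:
        return 1.0
    harmonic = math.log(psi - 1) + EULER_MASCHERONI   # H(psi - 1), approximated
    return 2.0 * harmonic - 2.0 * (psi - 1) / psi

def outlier_score(mean_path_length, psi):
    """s(x, psi): close to 1 for short paths, about 0.5 for average paths."""
    return 2.0 ** (-mean_path_length / c(psi))

def label_outliers(scores, contamination):
    """Flag the highest-scoring fraction of points as outliers.

    Ties at the threshold are all flagged, so slightly more than the
    requested fraction can be returned.
    """
    k = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [s >= threshold for s in scores]

print(round(c(256), 4))
print(outlier_score(2.0, 256), outlier_score(c(256), 256))
```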
The Isolation Forest algorithm is structured as a binary tree, similar to a regression tree. It randomly selects a sub-sample of size ψ (psi) to create a tree structure (iTree), which is split at random values at each node. The depth at which an object reaches a leaf node, that is, the point at which it is isolated, is used to calculate its outlier score.
A random sub-sample of ψ points is selected from the entire data set to create a single iTree. Next, at each node division, one of all attributes is randomly selected as the splitting variable q. A split value p is then sampled from a uniform distribution between the minimum and maximum values of the selected variable.
This process is repeated recursively until each subtree satisfies a stopping criterion: either a single sample remains or the maximum depth limit (height limit) is reached.
The number of trees (t) was set to 100 in this experiment. The variance decreases as the number of trees increases, and the ROC AUC value converges at around t = 100; because previous literature indicates that additional performance improvements are limited despite the increased computational cost, this parameter was set to 100.
The sub-sampling size (ψ) was set to 256 based on prior literature. This choice reflects the fact that outliers can easily become isolated even in small samples.
Contamination was set to 0.3% (0.003). This value was chosen to detect outliers caused by internal factors of the building while minimizing the influence of external factors such as construction work on the building, vehicle movement, and sensor problems. Out of the total 9,716,989 data points, the period from 6:00 to 18:00 on April 3, 2024 was used as the ground truth for pseudo labels, yielding precision ≈ 92.0%, recall ≈ 92.0%, and ROC AUC ≈ 93.0%, which indicates that the parameters were appropriately set.
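As a hedged illustration of how precision and recall can be computed against time-window pseudo labels, the sketch below marks whether a timestamp falls inside the 6:00 to 18:00 window on April 3, 2024 and compares boolean predictions with boolean pseudo labels. How the window maps onto the anomalous or normal class is not specified in the text, so the function accepts arbitrary pseudo labels; the function names and the toy inputs are hypothetical.

```python
from datetime import datetime

WINDOW_START = datetime(2024, 4, 3, 6, 0)
WINDOW_END = datetime(2024, 4, 3, 18, 0)

def in_window(ts):
    """True when ts lies in the pseudo-label window (end exclusive)."""
    return WINDOW_START <= ts < WINDOW_END

def precision_recall(predicted, pseudo_truth):
    """Precision and recall of boolean predictions against boolean pseudo labels."""
    pairs = list(zip(predicted, pseudo_truth))
    tp = sum(1 for p, t in pairs if p and t)          # flagged and labeled
    fp = sum(1 for p, t in pairs if p and not t)      # flagged, not labeled
    fn = sum(1 for p, t in pairs if not p and t)      # labeled, not flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example with five records.
predicted = [True, True, False, False, True]
pseudo_truth = [True, False, False, False, True]
print(precision_recall(predicted, pseudo_truth))
```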
The height limit was set to log2(ψ) based on previous literature, which approximates the average height of a binary tree built from ψ samples. If the height limit is too small, branching ends near the root node, which limits the distinction between normal and abnormal values. If it is large enough, normal values have longer paths and abnormal values have shorter paths, which improves outlier detection.
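The parameter choices above can be collected in one place, together with two quantities that follow from them by simple arithmetic: the height limit for ψ = 256 and the approximate number of records that a 0.3% contamination rate would flag if applied to all records at once. The constant names are hypothetical.

```python
import math

N_TREES = 100              # number of iTrees (t)
SUBSAMPLE_SIZE = 256       # sub-sampling size (psi)
CONTAMINATION = 0.003      # 0.3% of records labeled as outliers
TOTAL_RECORDS = 9_716_989  # records collected in this study (Table 3)

# Height limit: ceiling of log2(psi).
height_limit = math.ceil(math.log2(SUBSAMPLE_SIZE))

# Approximate number of records flagged at this contamination rate.
approx_flagged = round(CONTAMINATION * TOTAL_RECORDS)

print(height_limit, approx_flagged)
```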
A total of 28,405 anomalies were identified by analyzing the gyro_magnitude data, which represents the magnitude of the vibration. They are shown in Fig. 9, where the x-axis is time and the y-axis is the magnitude of the vibration; the blue line represents the magnitude of the vibration measured over time, and the red area represents the anomalies. From 18:00 on April 2 to 6:00 on April 3, high vibration was detected over almost the entire interval, and from 18:00 on April 3, the vibration increased again.
A total of 26,942 anomalies were identified by analyzing the tilt_angle data. In Fig. 10, the blue line is the tilt angle measured by the sensor (unit: degrees), and the red points indicate the tilt anomalies. Most of the anomalies are around -150 degrees (°), indicating that the Industry-Academic Cooperation Center moved abnormally in one direction relative to the sensor.
Next, the data from the sensors were integrated to determine the risk level of the building as a time series. By integrating the data on the magnitude of vibration (gyro_magnitude), tilt (tilt_angle), and relative altitude change (altitude_change), the risk score was calculated on an hourly basis.
The risk score combines multiple signals (sensor characteristics) into a single, comparable metric, and a moving average was used to smooth out short-term noise and identify trends in the continuous sensor signals.
The weights used in the risk score are applied to three features: the magnitude of the structure's vibration (gyro_magnitude), the change in tilt (tilt_angle), and the change in relative altitude (altitude_change).
The criteria for selecting the weights are based on the features of the IoT data shown in Fig. 8. In the Isolation Forest algorithm, each iTree randomly selects a sub-sample and randomly selects an attribute q from all attributes to be used as the branching criterion.
As shown in Table 5, the magnitude of the vibration and the tilt angle are each weighted 0.4 because they directly reflect the displacement of the building, and the relative elevation change is weighted 0.2 because it can be affected by environmental factors such as air pressure and temperature.
Upon examining the observed data, the gyro_magnitude and tilt_angle values showed high sensitivity, while the altitude_change values were relatively stable. To determine the weights for the three features, we derived five combinations that sum to 1.0 and conducted a preliminary review considering the sensitivity characteristics mentioned above. As a result, the weights were set to 0.4 for gyro_magnitude, 0.4 for tilt_angle, and 0.2 for altitude_change so that changes in all three features are reflected.
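For the purposes of this study, the risk score is defined as the weighted sum of the three features, using the weights in Table 5:
risk_score = 0.4 × gyro_magnitude + 0.4 × tilt_angle + 0.2 × altitude_change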
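A minimal sketch of the risk score calculation, combining the weighted sum with hourly aggregation and a moving average. The weights are those stated in the text. Whether the features are scaled before weighting, how values are aggregated within an hour, and the moving-average window length are not specified, so the plain weighted sum, the hourly mean, and the trailing window used here are assumptions, as are all function names.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean

WEIGHTS = {"gyro_magnitude": 0.4, "tilt_angle": 0.4, "altitude_change": 0.2}

def risk_score(features):
    """Weighted sum of the three features for one record."""
    return sum(weight * features[name] for name, weight in WEIGHTS.items())

def hourly_mean(timestamps, values):
    """Average the values within each clock hour (assumed aggregation)."""
    buckets = defaultdict(list)
    for ts, value in zip(timestamps, values):
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: fmean(vals) for hour, vals in sorted(buckets.items())}

def moving_average(values, window):
    """Trailing moving average; early points use the values available so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical single record.
example = {"gyro_magnitude": 1.0, "tilt_angle": 2.0, "altitude_change": -0.5}
print(risk_score(example))
```

Because the three features have different units and ranges, scaling them to a common range before weighting would change how much each one contributes; the sketch leaves that choice open.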
Fig. 11 visualizes the weighted score (magnitude of oscillations, change in slope, change in relative elevation) and the score per component as a time series evolution, with red dots (markers) indicating points where the score rises sharply from the previous point in time.
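One simple way to obtain such markers is to flag every point whose score exceeds the previous point by more than a chosen amount. The criterion actually used for Fig. 11 is not stated, so the fixed jump threshold and the function name below are assumptions.

```python
def sharp_rises(scores, min_jump):
    """Indices where the score rises by more than min_jump from the previous point."""
    return [
        i for i in range(1, len(scores))
        if scores[i] - scores[i - 1] > min_jump
    ]

# Hypothetical hourly risk scores.
hourly_scores = [1.0, 1.1, 2.0, 2.05, 3.5]
print(sharp_rises(hourly_scores, min_jump=0.5))
```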
The risk score is almost constant for most of the time intervals, indicating that the building’s condition change is affected by both vibration and tilt changes, and that height changes such as settlement are relatively small and constant.
However, the width of the score decreases significantly in the time interval from 06:00 to 18:00 on April 3, 2024, indicating that some outliers are missing and that the Isolation Forest algorithm is not able to reflect the trend of outliers over time.
The risk_score graph in Fig. 11, obtained with the Isolation Forest algorithm, reflects only whether each data point is judged to be an outlier by the tree-based structure, which limits its ability to analyze atypical outliers that depend on correlations between variables. Therefore, we applied the Extended Isolation Forest to check for complex outlier changes, using risk_score (the weighted sum of gyro_magnitude, tilt_angle, and altitude_change) as a single variable in the continuous data, and examined the results.
The Extended Isolation Forest flagged the points in time when one or more of vibration, tilt, or elevation change fluctuated rapidly, and showed how risk scores and outliers increased together under sustained vibration or tilt change.
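For reference, the main difference between the Extended Isolation Forest and the standard algorithm is the branching rule: instead of splitting on one attribute at a value p, each node splits along a randomly oriented hyperplane. The sketch below shows one such split on multi-dimensional points, following the commonly described formulation (random normal vector, random intercept inside the bounding box). It is illustrative only and is not the implementation used in this study; on a single variable such as risk_score this rule reduces to an ordinary threshold split. All names and data are hypothetical.

```python
import random

def extended_split(points, rng):
    """Split points with one random hyperplane (illustrative sketch)."""
    dims = len(points[0])
    # Random orientation: each component drawn from a standard normal.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    # Random intercept point inside the bounding box of the data.
    intercept = [
        rng.uniform(min(pt[j] for pt in points), max(pt[j] for pt in points))
        for j in range(dims)
    ]
    left, right = [], []
    for pt in points:
        # Signed position of the point relative to the hyperplane.
        side = sum((pt[j] - intercept[j]) * normal[j] for j in range(dims))
        (left if side <= 0 else right).append(pt)
    return normal, intercept, left, right

# Hypothetical three-feature points.
pts = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (-1.0, 0.0, 2.0)]
normal, intercept, left, right = extended_split(pts, random.Random(0))
print(len(left), len(right))
```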
The x-axis represents time over approximately 7 days from April 2 to April 9, 2024, and the y-axis represents the risk_score, that is, the change in state based on a weighted sum of the vibration magnitude, slope change, and relative elevation change. The blue line shows the change in risk_score over time, and the red dots indicate outliers detected by the Extended Isolation Forest (Fig. 12).
The Extended Isolation Forest was applied to improve the range and segmentation resolution of outlier detection, but the redundancy and noise problems among variables were not eliminated. The risk_score variable is composed of the weighted sum of gyro_magnitude, tilt_angle, and altitude_change, which tend to change simultaneously in time. We therefore used principal component analysis (PCA), which converts high-dimensional data into a low-dimensional space while retaining as much of the total variance as possible, to identify the main direction of variance of the data acquired from the sensors, and then applied the basic Isolation Forest to the result. This is a research method proposed by Liu et al. [18], and it was applied to the research environment of this study to confirm the results.
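The PCA step can be sketched as follows. To stay dependency-free, this sketch finds only the first principal component by power iteration on the covariance matrix, which is a simplification swapped in for a library PCA; the number of components retained in this study and whether features were standardized beforehand are not stated, so both are assumptions. The resulting projection would then be passed to an Isolation Forest. Names and demo data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, projections) for the main direction of variance."""
    n, dims = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    centered = [[row[j] - means[j] for j in range(dims)] for row in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Power iteration from an arbitrary non-zero start vector.
    v = [1.0 / (j + 1) for j in range(dims)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: no variance along the current vector
        v = [x / norm for x in w]
    projections = [sum(r[j] * v[j] for j in range(dims)) for r in centered]
    return v, projections

# Synthetic demo: two features move together, the third stays constant.
demo_rows = [(float(t), float(t), 0.0) for t in range(10)]
direction, projections = first_principal_component(demo_rows)
print(direction)
```

When features move together, as in this demo, a single component captures their shared variation, which is the redundancy the combined method is meant to remove before anomaly detection.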
The x-axis shows time, and the y-axis shows the value obtained by PCA dimensionality reduction of three variables: gyro_magnitude, tilt_angle, and altitude_change. The blue part is the full signal, while the red dots represent outliers detected by PCA and Isolation Forest. Starting on April 2, 2024, the sensor values show variability, with possible temporary building vibrations or tilt changes between April 5 and April 6. From April 8, the values were found to be stable.
Fig. 13 reflects the correlation among sensors, rather than a single variable, in tracking building condition changes and detecting anomalies based on IoT sensors. This shows that by applying PCA and Isolation Forest in combination, changes that the basic Isolation Forest could detect only to a limited extent could be identified.
In the performance comparison of the algorithms used in this paper, the combination of PCA and Isolation Forest showed an overall improvement but little change for altitude_change. Isolation Forest on its own performs well in anomaly detection, but it had difficulty distinguishing anomalies at specific points and missed some of them; the Extended Isolation Forest complemented these limitations and captured complex anomalies better.
The combination of PCA and Isolation Forest was shown to remove redundancy and noise among the three features (gyro_magnitude, tilt_angle, altitude_change), and showed good results based on the risk score.
VI. CONCLUSION
In this paper, we analyzed the state change of the structure based on the vibration, tilt, and relative elevation change data collected using IoT sensors for about 7 days, from April 2 to April 8, 2024, at the Industry-Academic Cooperation Center of Busan *** University. The total number of data collected was about 9.71 million, and based on this, we analyzed the factors that affect the state change inside the building.
The magnitudes of vibration and tilt changes showed a constant trend in most time intervals, but large changes were observed in certain time periods, while elevation changes were relatively small and stable. A weighted sum of the magnitude of vibration, tilt, and relative elevation changes was used to calculate a risk score, and the score was visualized as a time series to check the association between changes in the risk score and physical events.
In addition to the isolation forest algorithm, the extended isolation forest algorithm was used to analyze the IoT sensor data, and it was shown to effectively separate the data distribution and precisely detect outliers in buildings.
In addition, the combination of PCA and isolation forest was shown to detect some undetected time intervals based on the correlation between variables.
This study confirmed that the accumulated data of IoT can be utilized to build a system that diagnoses structural condition changes and defect factors in real time.
Changes in data on internal factors of these structures are difficult to identify in the static state of the structure, and it is expected that it can be utilized for integrated safety management by supplementing diagnostic methods based on video data of the structure.
The results of this study suggest the possibility of building a data-based risk zone identification and real-time detection system. Given that certain abnormalities were concentrated in some time periods within the collected data, it is expected that IoT big data for public facilities or structures used by a large number of people can be built in the future, and that potential risks that occur under certain conditions or accumulate through internal factors can be detected through long-term observation, contributing to the safety management of society and citizens.
This study also has the following limitations, which need to be addressed through continued research.
This study was based on data collected from IoT sensors over seven days at a specific building. In producing the IoT sensor boards, we focused on ensuring stability and sustainability. However, during data collection, changes unrelated to actual physical changes inside the building may have occurred because of insufficient battery power, unstable network connections, or sensor malfunctions, and may have been detected as ordinary noise.
To remove such unrelated sensor noise from the datasets and to ensure data reliability, additional pre-filtering algorithms are needed.
External events near the target building, such as construction work, vehicle movement, and weather changes (e.g., strong winds), can introduce errors into the IoT data. It is necessary to collect data on such environmental interference events as secondary data and to post-process outliers that occurred during those time periods.
In this study, the method was tested on one particular building in terms of location, structure, and environment. Although the framework of this study can be applied to other types of buildings, researchers need to consider the operational differences of IoT sensors across building types, improve the sensors accordingly, and account for differences in data distribution and characteristics.