Introduction

Real-time monitoring of seismic activity is crucial for early warning systems, helping to safeguard life and property in high-risk areas by detecting hazardous earthquakes1, volcanic eruptions2 and potential breaches of international law3. Seismic signals from natural earthquakes tend to have low frequencies, explosion events exhibit a broader frequency range, and vehicular movements present resonant signatures. These spectral characteristics can usually be analyzed effectively only after sufficiently long seismic signals have been recorded. Identifying these unique sources is crucial in geophysics, not only for earthquake monitoring but also for applications such as detecting illegal blasting or monitoring traffic-induced ground vibrations. With the rise of the Internet of Things (IoT)4, there is an increasing demand to deploy numerous monitoring nodes driven by artificial intelligence (AI) in areas of concern, targeting low cost and low latency.

The detection of seismic events has been extensively studied5,6,7,8,9,10, and many machine learning models perform well for seismic analysis. However, most struggle to meet cost-efficiency demands because they require large computational resources and a significant number of parameters. Popular methods fall into two main categories: classical machine learning and deep learning. Classical machine learning models, such as E3WS1 and 2S-ML-EIOS11, rely heavily on manually defined features, are sensitive to noisy seismic data, and require substantial real-time computing operations. Specifically, XGBoost, a key decision-tree-based component of both E3WS and 2S-ML-EIOS, relies heavily on multi-threaded CPUs for parallel computing. On the other hand, deep learning models based on neural networks offer resilience to noise in seismic waveforms but often have a large number of parameters, requiring significant computational resources. Some deep learning models use high-order features, such as spectral images extracted from waveforms, as input during inference5,12,13,14,15,16, while others use processed multi-component waveforms directly17,18.

One prevalent challenge for these deep learning models is their substantial number of parameters. For example, the Vision Transformer has approximately 24 million parameters19. Another challenge lies in achieving real-time performance, as most open-source solutions require long input waveforms (tens of seconds to minutes) after a seismic event, making them more suitable for reviewing historical data than for real-time monitoring13,16,20. While these models may perform well in situations where all information is known beforehand, real-time monitoring missions face the challenge of an unknowable future. Consequently, making timely judgments based on just a few seconds of data more closely aligns with the demands of such scenarios.

Large deep learning models have attracted significant interest for various tasks, including computer vision21,22 and language understanding23,24. These large models consist of millions to billions of parameters and require high-performance computers equipped with expensive GPUs10,20,25. Many models consume considerable computational resources (tens of gigabytes memory) and electric energy during both training and operation. Consequently, they thrive on high-performance personal computers and cloud-computing platforms that demand kilowatt to megawatt power26.

In contrast, battery-powered devices have received less attention globally, yet billions of low-power (milliwatts to watts) devices such as microcontroller units (MCUs) serve various aspects of our lives, including smart agriculture27, industry28, environment29, healthcare30, and consumer electronics31. Local machine-learning operations on IoT devices offer advantages in latency, power consumption, and privacy compared to big data analysis on cloud platforms. However, many IoT devices, especially microcontrollers, have significantly limited memory (measured in kilobytes) and CPU processing capabilities (in megahertz). As a result, these devices are frequently used as both real-time data collectors and preliminary analysis platforms while also serving as intermediary nodes for transmitting control signals to actuators. The demand for computational resources and the large number of parameters in conventional deep learning models substantially limit AI applications to current IoT devices.

Here, we demonstrate the potential of tiny deep learning on MCUs for real-time seismic source discrimination. Our model discriminates among ambient noise, quarry blasts (i.e., explosions), and natural earthquakes using just the vertical component of seconds-long raw waveforms. We first propose an evolutionary deep learning method that constructs a population of lightweight neural networks balancing prediction error, model size, and inference time. The final evolved model, named MCU-Quake, contains only 2693 parameters, approximately 0.09% of the parameters in the Vision Transformer model19. We evaluate MCU-Quake on various datasets alongside edge-computing-friendly alternatives, including the cutting-edge E3WS1 (a classical machine learning model), LEQNet (a deep learning variant of EQTransformer18), and the traditional short-time average/long-time average (STA/LTA) detection method32. MCU-Quake achieves accuracies comparable to large models while demonstrating significant advantages for real-time monitoring on MCUs; notably, LEQNet's inference energy consumption exceeds that of MCU-Quake by over 2700 times. We propose that the integration of AI into IoT devices (AIoT), leveraging compact AI models, could greatly enhance early warning systems for sensitive events. This approach is particularly promising given its potential for low cost, minimal time delay, and reduced energy consumption.

Results

Evolutionary deep learning progress

Given a period of raw 1-dimensional (1D) signals, our proposed method delivers a population of neural networks that have evolved over eight generations. Although machine learning accuracy was an important focus, multi-objective constraints were also emphasized. The detailed results in Fig. 1 show the neural networks evolved using a 7 s period dataset. These signal-encoding networks evolved through minimizing fitness values (Fig. 1a, b), which consist of three decision factors: model size, prediction error, and inference time (Fig. 1c). As indicated by the red scatter in Fig. 1c, the desired solution, known as the Pareto front, contains a group of neural networks that achieve a balance among the three decision factors.

Fig. 1: Evolutionary deep learning progress.

a, b Fitness statistics of neural network individuals for (a) the entire population and (b) the hall-of-fame population (a group of 16 individuals with the lowest fitness values) in the current generation. c 3D visualization of the individuals in a multi-objective space defined by model inference time, number of parameters (size), and prediction error. A lower value is better. d Architecture diagram of the final model (individual) used in this study.

Initially, there was a population of 256 individual neural networks with unique architectures. Architecture parameters, such as the number of different 1D convolution kernels, feedforward layers, corresponding layer sizes, pooling size, and stride step, were all randomly initialized. Correspondingly, the population fitness distribution shape (generation 0 in Fig. 1a) indicates that individual network performance is mainly concentrated around the median. With evolutionary updates of the selected networks' architectures, new individuals displayed increasingly smaller fitness values in subsequent generations. As shown in Fig. 1a, with increasing generations, the wider bottom section of the violin plot represents a higher proportion of individual networks with better signal discrimination abilities. Upon closer examination (Fig. 1b), the distribution of the 16 individual networks with the lowest fitness values in each generation shows that an increasing number of superior networks are generated during the evolution process. Ultimately, a group of neural networks whose performance was not dominated by any other individual (red scatter and surface in Fig. 1c) was mapped into a multi-objective space defined by the aforementioned three decision factors.

In the Pareto-optimal set, an individual born in the fifth generation with an ID of 20 survived to the end. Unexpectedly, this individual uses only one numerical value to differentiate signals from noise, quarry blasts, and natural earthquakes. We name this network MCU-Quake. The architecture of MCU-Quake is shown in Fig. 1d (open-source code available). MCU-Quake is composed of a 1D-convolution block that transposes raw signals into latent codes, and a feedforward network unit that encodes these latent codes as a critical embedding value for specific events. With a memory footprint of approximately 10 KB for its parameters, MCU-Quake can be deployed on most microcontrollers. For brevity, we discuss only MCU-Quake in detail in this study.

Benchmarks on multiple datasets for edge-computing friendly models

Figure 2 summarizes inference metrics for edge-computing-friendly models across multiple datasets. MCU-Quake demonstrates its capability not only to identify signal sources on the UUSS dataset but also to accurately detect natural earthquake arrivals on both the UUSS and STEAD datasets. In contrast, other models such as E3WS, LEQNet, and the STA/LTA method are limited to detecting seismic arrivals.

Fig. 2: Confusion matrices of edge-computing friendly models/methods on various datasets.

The subtitles denote the model/method name, dataset name, and number of waveform components used. The values adjacent to the rows and columns represent calculated inference metrics. A greater value is better. UUSS: University of Utah Seismograph Stations. STEAD: STanford EArthquake Dataset. The UUSS dataset is composed of events including ambient noise (NO), quarry blasts (QB), and natural local earthquakes (LE). Our proposed model, MCU-Quake, demonstrates the ability to detect seismic arrivals and discriminate signal sources, as illustrated in subplots (a) and (b). In contrast, other models only detect seismic arrivals. Subplots (c)–(f) show MCU-Quake's detection performance in distinguishing between noise and seismic signals. The performance metrics of STA/LTA, E3WS, and LEQNet on the UUSS data are presented in subplots (g)–(i), respectively. Similarly, the metrics for these three models on the STEAD data are displayed in subplots (j)–(l), respectively.

Notably, MCU-Quake is the sole deep learning model specifically designed for inference using single-channel (vertical-component) seismic waveforms. For seismic source discrimination on the UUSS dataset (Fig. 2a), MCU-Quake achieves precision scores of 87.34% and 84.18% for quarry blasts and natural earthquakes, respectively, with recall scores of 87.49% and 84.74%. In terms of seismic detection, MCU-Quake attains precision and recall scores of 99.37% and 98.96% on the UUSS dataset (Fig. 2a, d) and 98.02% and 96.44% on the STEAD dataset (Fig. 2c). Conversely, the classic STA/LTA method is known to be heavily affected by noise and the preset threshold, resulting in best recall scores of 81.14% and 86.85% on the UUSS dataset (Fig. 2g) and the STEAD dataset (Fig. 2j), respectively.

Using three channels of raw seismic waveforms, MCU-Quake exhibits nearly identical performance to that obtained using only the vertical component on the UUSS dataset (Fig. 2a, b). For seismic detection, MCU-Quake's performance on the UUSS dataset (part of which was used for training) remains comparable to that observed on the global STEAD dataset. However, E3WS, originally trained on carefully processed waveform features, achieves a low precision score of 81.8% for ambient noise on the UUSS dataset (Fig. 2h) and 92.5% for natural earthquakes on the STEAD dataset (Fig. 2k). In comparison, LEQNet, a lightweight variant of EQTransformer, shows performance similar to MCU-Quake across both datasets, aside from a slightly lower precision score for ambient noise on the UUSS dataset (95.39%, Fig. 2i).

Overall, MCU-Quake achieves seismic detection metrics akin to those of cutting-edge machine learning models (F1-score and accuracy in Table 1). However, MCU-Quake's computational resource requirements are significantly lower than those of other models. The scrolling input signal for MCU-Quake spans a mere 7 s of single-component raw data, compared with the 60 s of highly filtered three-component waveforms required by LEQNet and the 140 manually defined attributes dynamically calculated from 10 s of three-component waveforms for E3WS. Furthermore, MCU-Quake has only 2693 neural-network parameters, approximately 6.8% of those in LEQNet. Consequently, MCU-Quake's memory usage on a 32-bit system is as low as ~10 KB. This substantially reduces hardware costs for MCU-Quake implementation, typically around US$6 for many microcontrollers compared with the roughly US$600 cost of the GPUs required by traditional neural-network models.

Table 1 A summary of major differences among models

Energy efficiency is evaluated for the aforementioned models and methods. All models can be deployed on single-board computers, which generally possess high-performance CPUs and gigabytes of memory. However, only the classic STA/LTA method and MCU-Quake are compatible with microcontrollers featuring limited CPU clock frequencies (megahertz) and memory (kilobytes). As shown in Fig. 3a, MCU-Quake demonstrates computing efficiency comparable to that of the classic STA/LTA method. On the selected single-board computer (Raspberry Pi 4B, left panel in Fig. 3b), the energy consumption for a single input is 334.702 mJ for LEQNet versus 0.121 mJ for MCU-Quake (based on the Gaussian density function). The corresponding inference times are 350.867 ms and 0.224 ms, respectively. The device's idle power is approximately 2.3 W, more than ten times that of the selected microcontrollers (right panel in Fig. 3b). Despite the low CPU clock frequency of the microcontroller (Arduino Nano 33) at only 64 MHz (Table 2), MCU-Quake executes inference within approximately 128 ms. As illustrated in Fig. 3b, with computing resources limited to the scale of microcontrollers, only the classic STA/LTA method and MCU-Quake can facilitate real-time seismic signal detection.

Fig. 3: Computational resource consumption of different models.

a Time efficiency (Eff.), memory usage (Mem.), and key inference metrics on a single-board computer (Raspberry Pi 4B). ACC: accuracy, F1: F1-score, TPR: true positive rate, PPV: positive predictive value. The values are normalized; greater is better. b Energy consumption of models/methods deployed on the single-board computer (left panel) and two microcontrollers (right panel, ESP32-S3). The shaded area represents the delayed response in a real-time scenario, re-scaled for illustration using the energy consumption of a single task (blue bars) of MCU-Quake deployed on the microcontroller. K: inference method based on the kernel density equation, G: inference method based on the Gaussian density function.

Table 2 Measurement of computational resource requirements for models/methods deployed on edge-computing devices

Learned knowledge for discriminating signal sources

The MCU-Quake model learns three numerical values, −5.01, 1.96, and 1.01, from events monitored in the western USA (the UUSS dataset), representing typical ambient noise, quarry blasts (explosions), and natural earthquakes, respectively (Fig. 4a). MCU-Quake generates an embedding value for each input signal it receives. The three typical values are the modes of the corresponding distributions of the UUSS training dataset's embeddings. The source of a given period of signals can be determined by calculating the likelihood of the input embedding using the density functions shown as solid curves in Fig. 4a. Consequently, MCU-Quake discriminates signals by comparing the distance between the input embedding value and these three typical values.
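This likelihood-based rule is straightforward to reproduce. Below is a minimal sketch assuming Gaussian densities centred on the three learned modes; the standard deviations are illustrative placeholders rather than values fitted to the UUSS embeddings.

```python
from scipy.stats import norm

# Modes taken from the text; the scales are illustrative placeholders.
SOURCES = {
    "ambient noise":      norm(loc=-5.01, scale=1.0),
    "quarry blast":       norm(loc=1.96,  scale=1.0),
    "natural earthquake": norm(loc=1.01,  scale=1.0),
}

def classify(embedding: float) -> str:
    """Assign the source whose density gives the input embedding
    the highest likelihood."""
    return max(SOURCES, key=lambda s: SOURCES[s].pdf(embedding))

print(classify(-4.2))   # -> 'ambient noise'
```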

Fig. 4: Knowledge representation of signal sources.

The knowledge representation of signal sources is displayed in (a). Extracted embeddings of the STEAD dataset using MCU-Quake are displayed in (b). Extracted embeddings of the Russia–Ukraine war dataset using MCU-Quake are displayed in (c). The solid curves in (a)–(c) represent the probability densities of learned knowledge representations for signal sources from the UUSS training dataset. The histogram bars depict the extracted embeddings generated by MCU-Quake using the UUSS training dataset, STEAD test dataset, and Russia–Ukraine war events, respectively. The discriminative capacity of MCU-Quake as a function of the signal reception interval on the UUSS test dataset using (d) only vertical-component and (e) three-component waveforms. f A demonstration of real-time inference on a microcontroller. The solid curves colored green, grey, and orange represent prediction scores for ambient noise, quarry blast, and natural earthquake, respectively.

Interestingly, even though MCU-Quake has not been trained on distinct global earthquake events (the STEAD dataset; also see Fig. 5b), all the embeddings of ambient noise and most earthquake signals from the STEAD dataset fall within the learned western-USA distributions (histogram bars in Fig. 4b). On the other hand, events from the Russia–Ukraine war in Europe split into explosion and earthquake distributions (Fig. 4c), with some believed to be false negatives for explosions (see detailed discussion below). The extracted embeddings demonstrate MCU-Quake's generalization ability and provide evidence for its learned knowledge in discriminating signal sources.

Fig. 5: Review of Russia–Ukraine war events.

Daily count of (a) typical explosions and (b) speculative earthquakes identified by MCU-Quake. The maps in (c)–(f) depict event locations, represented as colored scatter points, plotted against date. White triangles indicate the positions of seismic array sensors. In subplots (e) and (f), the shaded areas delineate regions under occupation by Russian troops.

To investigate MCU-Quake's discriminative capacity on the UUSS test dataset, we display embedding distributions for input signals containing increasing coverage of seismic waveforms (Fig. 4d, e). For signals with 0.5 s of seismic waveforms, MCU-Quake cannot distinguish key signal sources, as shown by the overlapping distributions obtained with vertical-component data (Fig. 4d) and the stacked scatter points obtained with three-component data (Fig. 4e). Signal sources become distinguishable for input signals containing at least 2 s of seismic waveforms. As expected, the best discrimination performance is achieved with input signals containing 7 s of seismic waveforms.

During real-time inference, MCU-Quake can detect seismic arrivals within seconds and estimate signal sources using dynamically scrolling input windows (Fig. 4f). In Fig. 4f, the red shaded areas to the left and right denote noise and the continuation of a weak earthquake signal, respectively. MCU-Quake recognizes the signal source as the ground-truth earthquake with high confidence. This demonstrates that, even some time after the seismic arrival, MCU-Quake is able to accurately identify the signal source as an earthquake rather than ambient noise.

Discussion

Interpretability of the model

Using only seconds of the vertical component of raw signal waveforms is an outstanding feature of MCU-Quake. Though seemingly short, this signal interval holds intrinsic significance. Linville et al. (2019) identified the areas in waveform spectrograms that most influence average predictions (Fig. 4 in Linville et al.5). Their findings reveal that the vertical component alone is adequate for discriminating between quarry blasts and local earthquakes, using a time window of under 10 s. Using models hundreds of times larger than MCU-Quake, they achieved an accuracy of 84.7 ± 0.7% with a time window of 9 s. On the other hand, Lara et al. (2023) analyzed the most important features for E3WS over window durations from 3 to 46 s (Fig. 10 in Lara et al.1). In the initial 3 s post-seismic arrival, four out of five manually defined features were derived from the vertical component. Although these features were evaluated for magnitude estimation, they underscore the importance of the vertical component. The MCU-Quake embeddings extracted from the two horizontal components (E and N) further demonstrate the superiority of vertical-component data (see Supplementary Fig. 1).

The interpretability and robustness of model inference are paramount in safety-critical applications such as real-time event surveying and decision making. Unfortunately, most machine learning models operate as black boxes, merely producing values between 0 and 1 with no discernible meaning. The prevalent approach is to establish empirical thresholds for classification. For example, LEQNet and EQTransformer set their respective thresholds at 0.5, 0.3, and 0.3 to classify seismic events, P-waves, and S-waves. While E3WS gains some interpretability from its 140 waveform-derived input features, its performance depends strongly on the trigger threshold. For seismic detection, E3WS attained peak performance (99.9%) using well-processed waveforms at a trigger threshold of 0.211. In this study, however, E3WS achieves commendable performance (90.61%) with raw waveforms at a threshold of 0.8. On the other hand, MCU-Quake employs a contrastive learning approach33,34 to maximize differences between signal sources. By mapping all input signals into distributions (Fig. 4a), MCU-Quake can estimate the likelihood of a signal source using probability density functions. As seen in Fig. 4a, overlapping distributions between explosion embeddings and earthquake embeddings signify indistinguishable seismic events. Such occurrences are common, particularly for low signal-to-noise-ratio events.

Review of Russia–Ukraine war using 1-component seismic array data

MCU-Quake offers a means to review typical seismic events in the field. In Fig. 5, we review the Russia–Ukraine war using seismic array data published by Dando et al.3. Dando et al. (2023) automatically detected and located 1282 explosions from 24 February to 3 November 2022 in a region of northern Ukraine (Fig. 5c–f). However, as noted by Dando et al. (2023), there are numerous false positives due to the low signal-to-noise ratio of events, and some explosions cannot be observed in the waveforms. The local magnitude of most events is only approximately 0.2. Using MCU-Quake, we extract waveform embeddings and map the results in Fig. 4c. Our model detects 1276 events as seismic occurrences, with 371 events identified as typical explosions in northern Ukraine, despite being trained only on quarry-blast events in the western USA. Dando et al. (2023) reported 64 explosions on 7 March, the day of highest activity, and a final heavy bombardment on 31 March. In comparison, MCU-Quake identified 15 typical explosions on 7 March and one explosion event with a local magnitude of 0.68 on 31 March. Although MCU-Quake appears to miss many of the explosions claimed by Dando et al. (2023), this discrepancy may be attributed to substantial differences in coupling and energy propagation between surface and underground explosions. Despite these data limitations, MCU-Quake provides real-time results aligning with the war events over several months.

Fig. 6: Dataset specification.

a Geographical distribution of seismic events from the UUSS dataset (green), STEAD dataset (gradient color), and Russia–Ukraine war events (cyan). b Statistical analysis of dataset features. c–e display raw waveforms used as inputs for MCU-Quake, E3WS, and other commonly employed deep learning models in seismic signal detection. Time intervals devoid of shading denote segments used as seismic input for these models.

Tiny deep learning for edge computing devices

In traditional deep learning applications, variants of long short-term memory (LSTM) units for recurrent neural networks (RNNs) are commonly used for sequence data such as language, audio35, and seismic signals5. However, LSTMs and RNNs demand substantial computational and memory resources due to their operational complexity and numerous internal variable states36,37. Although some attempts have been made to alleviate computational complexity38, the cost of memory-related operations, including attention mechanisms, increases significantly with model size. Reports indicate that GPT-3 models trained on large graphics-processing-unit (GPU) clusters take several months to complete approximately \(3\times 10^{23}\) floating-point operations39. In contrast, even classical image classification models containing millions of parameters consume hundreds of megabytes of memory40.

Tiny machine learning has emerged as a promising research topic. Our evolutionary deep learning method is a form of neural architecture search (NAS). Although NAS typically causes considerable CO2 emissions from training large numbers of large models41, all the tiny models in this study were trained quickly using only CPUs. Conventional neural networks require high memory bandwidth for large data blocks, necessitating expensive specialized hardware such as GPUs or TPUs for large models. However, our tiny machine learning system's data communication overhead on CPUs was significantly smaller than it would be on GPU devices. In contrast to conventional NAS methods that primarily emphasize accuracy metrics, our approach achieves multiple objectives through evolutionary optimization and the application of interpretable constraints. Despite a vast search space exceeding 10 billion possibilities, CPU-based parallel computing offers an affordable solution for our evolutionary deep-learning mission. Our evolutionary computing was conducted on a cloud platform equipped with multiple CPUs. The mission evaluated approximately 2000 individuals in parallel, with each individual taking an average training time of approximately 1.4 min.

The compact design and minimal hardware requirements of MCU-Quake facilitate efficient deployment on low-power, cost-effective MCUs. In contrast, other research has developed hardware-exclusive models on field-programmable gate array (FPGA) devices42. FPGA-based systems are typically far more expensive than their MCU counterparts, often costing tens of times more. MCU-Quake, which is universally compatible across various hardware systems, can provide real-time similarity scores with uncertainties between input signals and various known sources, as demonstrated in Supplementary Movie 1. As summarized in Table 2, an Arm Cortex-M4 device operating at a CPU frequency of 64 MHz conducts inference within approximately 128 ms. The ESP32-S3 device, priced at around US$6, reduces the inference time to approximately 68 ms. On the single-board computer (Raspberry Pi 4 Model B), MCU-Quake is capable of real-time inference in 0.2 ms. Although the single-board computer exhibits significantly superior computing performance compared to MCUs, these edge computing devices possess considerably fewer resources and lower costs than conventional personal computers. Of particular note, the power consumption measurements outlined in Table 2 range from approximately 0.1 to 2.3 W, indicating that all devices can be solar-powered as artificial-intelligence sensors in field deployments.

The proposed method is inspired by the Fourier transformation of time-series signals. While continuous signals in the time domain can be decomposed into sinusoids of specific frequencies in the frequency domain (Supplementary Fig. 2), the main frequencies of natural signals are unknown and complex (Supplementary Fig. 2b). Some studies have used convolutional neural networks (CNNs) to extract key features from spectrum images of time signals5,12,13,14,15,16. Conventional 2D convolution operations are computationally costly, with a complexity of \(\mathrm{O}(N^{2}n^{2})\) for each kernel per layer, where \(N\times N\) is the input sample size and \(n\times n\) is the convolution kernel size. Researchers have integrated Fourier transformation algorithms into neural networks to process features in the frequency domain with a complexity of \(\mathrm{O}(N^{2}\log_{2}N)\) to \(\mathrm{O}(N^{2}\log_{2}n)\)43,44. However, the efficiency gains are negligible when \(N\gg n\), as is the case in CNNs44. More importantly, the computational resources available on MCUs cannot support such intermediate computations for real-time missions. Consequently, we propose a feature extraction method with a complexity of \(\mathrm{O}(Nn)\) that merges key information along the time and depth directions of the feature space (Fig. 7a). Our method increases inference accuracy by approximately 11% on the test dataset compared with conventional convolution operations on time-series data.
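For a sense of scale (illustrative numbers of our own, not measurements from this study), consider a 7 s trace sampled at 100 Hz, giving \(N=700\), and a kernel width of \(n=9\). A per-kernel 2D convolution over an \(N\times N\) spectrogram then costs on the order of

$$N^{2}n^{2}=700^{2}\times 9^{2}\approx 4\times 10^{7}$$

operations, whereas the proposed time-domain extraction costs only \(Nn=700\times 9=6300\) operations per kernel, a reduction of roughly four orders of magnitude.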

Fig. 7: An evolutionary deep learning solution.

a A scalable model architecture prototype designed for embedding raw signals. b A schematic representation of our evolutionary modelling, grounded in contrastive learning techniques and cloud-edge collaborative computing paradigms.

MCU-Quake only extracts embedding values from a fixed period of data, without considering long-term time-dependent and spatial relations in signals. Although our simple machine learning operations show remarkable discrimination performance, more elaborate methods that incorporate temporal and spatial relations between different sensor stations are expected to contribute significantly. The current version of MCU-Quake can only detect seismic arrivals and discriminate signal sources. However, seismic events follow an extremely skewed distribution: the proportion of time occupied by seismic signals over a long period normally does not exceed ~1%45. Consequently, MCU-Quake is particularly suitable as a long-running trigger model that occasionally invokes large models in a cascading fashion (sketched below), achieving maximum capability at optimum energy efficiency46.
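One possible realization of such a cascade is sketched below; `stream`, `tiny_model`, `large_model`, and the trigger threshold are hypothetical placeholders rather than part of MCU-Quake's released interface.

```python
def monitor(stream, tiny_model, large_model, threshold=0.9):
    """Cascaded inference: the tiny model screens every scrolling window,
    and the expensive large model runs only on likely events (~1% of time)."""
    for window in stream:                  # successive 7 s scrolling windows
        score = tiny_model(window)         # cheap, always-on inference
        if score >= threshold:             # window likely contains an event
            yield large_model(window)      # detailed analysis, rarely invoked
```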

The ability to run MCU-Quake on low-cost hardware is a notable achievement, but we believe its lightweight nature also enables broader applicability. Notably, our model can be effectively deployed on older computational devices still in use in some seismic monitoring setups, thereby expanding its impact and utility. This extended compatibility is particularly significant for existing seismic stations with legacy hardware that cannot support more resource-intensive models. By adapting MCU-Quake to these systems, we can enhance their capabilities without requiring substantial upgrades or replacements. Furthermore, integrating our model with other sensor technologies, such as GPS and environmental sensors, could provide a more comprehensive monitoring system. With its high accuracy, efficiency, and versatility, we believe MCU-Quake is an attractive solution for researchers and practitioners across various fields related to earth and environment monitoring.

Methods

Dataset

Three raw waveform datasets are employed: the localized University of Utah Seismograph Stations (UUSS) dataset for modeling and testing, the global STanford EArthquake Dataset (STEAD) for generalization testing, and the Russia–Ukraine war dataset for a case study. For comparison with other published models requiring three-component seismic waveforms, 13,497, 76,374, and 1335 waveforms meeting the same conditions were randomly selected to generate the UUSS, STEAD, and Russia–Ukraine war datasets, respectively. Figure 6a displays the distributions of all these events on a world map. Figure 6b shows detailed statistics of the three-component datasets used for benchmarking, indicating different characteristics between the UUSS and STEAD datasets. Time intervals specific to each model's input waveforms are used for fair comparison (Fig. 6c–e). Shaded areas in Fig. 6c–e represent signal data unavailable to the specific model. All datasets have balanced quantities of ambient noise and seismic waveforms.

Our study uses only the vertical component of raw waveforms for modeling. We collect 83,442 manually reviewed seismic records labelled as quarry blasts or local earthquakes by UUSS analysts18. The vertical-channel waveforms are downloaded from the Incorporated Research Institutions for Seismology (IRIS) using web services. Our deep-learning dataset is derived from waveform data preserved in their original format, as raw records with no instrument-response correction or frequency filtering applied. The sole preprocessing step is detrending the raw waveforms.
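This download-and-detrend step can be reproduced with standard tools; the sketch below uses ObsPy's FDSN client and built-in detrending (the network/station codes and time window are placeholders for illustration).

```python
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

client = Client("IRIS")                        # IRIS FDSN web services
t0 = UTCDateTime("2020-01-01T00:00:00")        # placeholder event time
st = client.get_waveforms(network="UU", station="CTU", location="*",
                          channel="*HZ",       # vertical channels only
                          starttime=t0, endtime=t0 + 60)
st.detrend("linear")                           # the sole preprocessing step
```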

Our method aims to learn a representation that maximizes differences among various signal types. Therefore, we construct a vertical-component UUSS dataset by randomly selecting waveform combinations from the 83,442 downloaded records. Each sample contains four waveform periods: reference, positive, negative, and noise waveforms (refer to Fig. 7b and Supplementary Fig. 3). The reference waveform is sliced from a randomly chosen record starting at the initial P-wave arrival. The positive waveform is sliced from another randomly selected record sharing the same label as the reference waveform. Conversely, the negative waveform is sourced from a distinct, differently labeled record. The noise waveform consists of the ambient data preceding the first P-wave arrival. Each data point is normalized by the maximum value within a 9 s window following the first P-wave arrival and resampled at 100 Hz. This process yields training and validation datasets containing 65,065 and 7,165 samples, respectively. Note that our training dataset does not overlap with the test data used for the published models.
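A hedged sketch of this sample construction is given below; the slicing details, function names, and edge-case handling are our own illustration, not the exact pipeline (it assumes at least a full window of pre-arrival data).

```python
import numpy as np
from scipy.signal import resample

def make_sample(ref, pos, neg, p_ref, p_pos, p_neg, fs, win_s=7, norm_s=9):
    """Build one (reference, positive, negative, noise) training sample.
    ref/pos/neg are raw vertical-component traces with P arrivals at
    sample indices p_ref/p_pos/p_neg; fs is the native sampling rate."""
    def cut(x, p):
        w = x[p : p + int(win_s * fs)].astype(float)
        peak = np.max(np.abs(x[p : p + int(norm_s * fs)]))   # 9 s normalization
        return resample(w / peak, win_s * 100)               # resample to 100 Hz
    noise = ref[p_ref - int(win_s * fs) : p_ref].astype(float)  # pre-arrival data
    noise = resample(noise / (np.max(np.abs(noise)) + 1e-12), win_s * 100)
    return cut(ref, p_ref), cut(pos, p_pos), cut(neg, p_neg), noise
```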

Evolutionary deep learning solution

We design a lightweight model template applicable to MCUs (Fig. 7a). The template consists of a convolution block that extracts and merges features from a 1D signal, and an embedding unit that transforms those features into a 1D embedding code. The convolution block contains five 1D convolution kernels of variable sizes M, N, O, P, and Q. First, each kernel convolves with the input signal along the time dimension to generate a feature map, as illustrated by the colorful codes in the convolution block in Fig. 7a. Each feature map has a shape determined by the convolution output length (time direction) and the number of kernels (depth direction). Subsequently, the two 1D tensors of mean values of each feature map, taken along the time and depth directions, are concatenated into one 1D tensor. To reduce memory usage and the number of parameters in the following embedding unit, a global max-pooling operation is applied to the list of concatenated 1D tensors, resulting in a 1D array of latent codes. Finally, the latent codes are input into the embedding unit, which consists of layers of fully connected neural networks, to generate a 1D embedding of the input signal.
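The template can be expressed compactly in TensorFlow/Keras. The sketch below follows the description above, but the filter count, pooling size, and layer widths are illustrative stand-ins for the chromosome-determined values.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers

def build_template(input_len=700, kernel_sizes=(1, 3, 5, 7, 9),
                   filters=8, stride=2, pool=4, fc=(32, 16, 1)):
    """A sketch of the scalable template in Fig. 7a (hyperparameters assumed)."""
    x_in = layers.Input(shape=(input_len, 1))
    branches = []
    for k in kernel_sizes:
        fmap = layers.Conv1D(filters, k, strides=stride, activation="relu")(x_in)
        t_mean = layers.Lambda(lambda t: tf.reduce_mean(t, axis=2))(fmap)  # over depth
        d_mean = layers.Lambda(lambda t: tf.reduce_mean(t, axis=1))(fmap)  # over time
        branches.append(layers.Concatenate(axis=1)([t_mean, d_mean]))
    x = layers.Concatenate(axis=1)(branches)         # merged latent codes
    x = layers.Reshape((-1, 1))(x)
    x = layers.MaxPooling1D(pool_size=pool)(x)       # shrink embedding-unit input
    x = layers.Flatten()(x)
    for units in fc[:-1]:
        x = layers.Dense(units, activation="relu")(x)
    return Model(x_in, layers.Dense(fc[-1])(x))      # 1D embedding output
```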

In summary, 11 variables are determined by the evolutionary optimization algorithm: the convolution stride, the existence of each convolution kernel (\(1\times 1\), \(1\times 3\), \(1\times 5\), \(1\times 7\), \(1\times 9\)), the pooling size, the pooling stride, and the number of parameters in each of the three fully connected layers. If the evolutionary algorithm assigns a negative value to a variable, the element represented by that variable is absent from the model. The 11 variables comprise a chromosome that determines the neural network architecture (an individual embedding model).
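A possible decoding of such a chromosome is shown below; the gene order follows the list above, while the function name and container format are our own illustration.

```python
def decode(chromosome):
    """Map an 11-gene chromosome to architecture settings; genes with
    negative values disable the corresponding element."""
    stride, k1, k3, k5, k7, k9, pool_size, pool_stride, fc1, fc2, fc3 = chromosome
    kernels = [w for w, g in zip((1, 3, 5, 7, 9), (k1, k3, k5, k7, k9)) if g > 0]
    fc = [n for n in (fc1, fc2, fc3) if n > 0]       # absent layers are skipped
    return dict(stride=stride, kernels=kernels,
                pool_size=pool_size, pool_stride=pool_stride, fc=fc)
```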

As displayed in Fig. 7b, the cloud-computing platform first generates a population of individual models using different chromosomes. Each model is trained and evaluated using a contrastive learning method. Specifically, a training sample consisting of four signal periods (reference, positive, negative, and noise) is input into an individual model, yielding four corresponding embeddings. The individual models are optimized to maximize the distance between the embeddings of different events using Eq. (1):

$$\begin{aligned}L_{M}&=L_{1}+L_{2}\\ L_{1}&=\max\left(D_{RP}-D_{RN}+\alpha,0\right)\\ L_{2}&=\max\left(D_{RP}-\beta\cdot D_{RA}+\alpha,0\right)\end{aligned}$$
(1)

where \(L_{M}\) is the model loss for updating network parameters; \(D_{RP}\), \(D_{RN}\), and \(D_{RA}\) are the distances between the embedding tensor of the reference waveform and the embedding tensors of the positive waveform, the negative waveform, and the ambient noise, respectively; \(\alpha\) is a margin in the distance space; and \(\beta\) is a hyperparameter that scales the distance between the two embedding tensors expected to have the maximum distance. We use the 1D Wasserstein distance as the distance metric. Mathematically, the Wasserstein metric measures how the probability distribution \(x \sim P\) (source) moves to \(y \sim Q\) (target). In the 1D case, there is an optimal transport route such that the Wasserstein distance can be written simply as \(W_{p}(P,Q)=\left(\sum_{i=1}^{n} d(x_{i}^{*},y_{i}^{*})^{p}\right)^{1/p}\); we choose \(d(x_{i}^{*},y_{i}^{*})=|x_{i}^{*}-y_{i}^{*}|\) and \(p=1\) in the calculation, where \(x_{i}^{*}\) and \(y_{i}^{*}\) are the sorted embedding codes and \(n\) is the number of codes in an embedding.
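With these definitions, the loss is compact to implement; a NumPy sketch follows (the \(\alpha\) and \(\beta\) values are illustrative, not the trained hyperparameters).

```python
import numpy as np

def wasserstein_1d(x, y):
    """1D Wasserstein distance with p = 1: sum of absolute differences
    between sorted embedding codes, as in the formula above."""
    return np.sum(np.abs(np.sort(x) - np.sort(y)))

def model_loss(e_ref, e_pos, e_neg, e_noise, alpha=1.0, beta=0.5):
    """Eq. (1): a triplet term plus a noise-contrast term."""
    d_rp = wasserstein_1d(e_ref, e_pos)
    d_rn = wasserstein_1d(e_ref, e_neg)
    d_ra = wasserstein_1d(e_ref, e_noise)
    return max(d_rp - d_rn + alpha, 0.0) + max(d_rp - beta * d_ra + alpha, 0.0)
```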

Subsequently, each individual is assigned a fitness value defined as:

$$F_{M}=\left(1-V_{acc}\right)+\gamma\cdot\frac{N}{N_{s}}+\delta\cdot e^{T_{b}/N_{t}}$$
(2)

where \(F_{M}\) is the model fitness, \(V_{acc}\) is the model accuracy on the validation dataset, \(N\) is the number of model parameters, \(N_{s}\) is a norm for \(N\), \(T_{b}\) is the model inference time for a batch of samples, \(N_{t}\) is a norm for \(T_{b}\), and \(\gamma\) and \(\delta\) are weights.
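Eq. (2) translates directly to code; in the sketch below the norms and weights are placeholder values, since the trained settings are not restated here.

```python
import math

def fitness(v_acc, n_params, t_batch, n_s=2693.0, n_t=1.0, gamma=1.0, delta=1.0):
    """Eq. (2): penalize error, parameter count, and batch inference time."""
    return (1.0 - v_acc) + gamma * (n_params / n_s) + delta * math.exp(t_batch / n_t)
```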

The algorithm sorts individuals by fitness value and selects some of them to create a new generation via crossover and mutation operations. Here, we set an initial population of 256 individuals, a maximum of 8 generations, a crossover probability of 90%, and a mutation probability of 80%. For each generation, the 16 individuals with the lowest fitness values (the hall-of-fame population) are retained for the next generation. The evolutionary optimization is executed in parallel on a cloud-computing platform equipped with multiple CPU cores. The pseudo-code of the evolutionary process is summarized in Supplementary Table 1. The evolutionary program is implemented using Python, DEAP47 and TensorFlow48. More information about training and inference is provided in the open-source code.
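A minimal DEAP sketch with these settings is shown below. The gene range, genetic operators, and the hypothetical `train_and_score` evaluator are assumptions; note also that DEAP's `eaSimple` only tracks the hall of fame without re-injecting it, so the elitism described above would require a custom generational loop.

```python
import random
from deap import algorithms, base, creator, tools

creator.create("FitnessMin", base.Fitness, weights=(-1.0,))   # minimize Eq. (2)
creator.create("Individual", list, fitness=creator.FitnessMin)

toolbox = base.Toolbox()
toolbox.register("gene", random.randint, -8, 64)              # assumed gene range
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.gene, n=11)                          # 11-gene chromosome
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", lambda ind: (train_and_score(ind),))  # hypothetical
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutUniformInt, low=-8, up=64, indpb=0.2)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=256)
hof = tools.HallOfFame(16)                                    # 16 best individuals
pop, _ = algorithms.eaSimple(pop, toolbox, cxpb=0.9, mutpb=0.8,
                             ngen=8, halloffame=hof)
```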

Real-time inference on microcontrollers

Two microcontrollers are selected to demonstrate real-time signal source discrimination. As displayed in Fig. 8, each device is connected to a thin-film-transistor (TFT) display. Every second, a signal period of up to a certain number of seconds is input to the model deployed on the microcontroller. The microcontroller first extracts the signal embedding and then calculates the likelihood of the embedding's source using the probability density functions of the known sources. The inference results are displayed synchronously on the TFT screen. The time elapsed for each inference is sent to a monitor through the microcontroller's serial port. A movie is attached as additional supporting information (see Supplementary Movie 1).
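This loop can be mirrored on a host machine for illustration. In the sketch below, `embed` stands in for the deployed model and `classify` for the density-based rule sketched earlier; the buffer mechanics follow the description above, while the names are our own.

```python
import numpy as np

def realtime_loop(sample_stream, embed, classify, fs=100, win_s=7):
    """Every second, fs new samples scroll into a win_s-second buffer,
    which is embedded and scored against the known-source densities."""
    buf = np.zeros(win_s * fs, dtype=np.float32)
    for chunk in sample_stream:                  # one second of new samples
        buf = np.concatenate([buf[fs:], chunk])  # slide the window by 1 s
        yield classify(embed(buf))               # likelihood-based source label
```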

Fig. 8: Wiring diagram for visual demonstration of real-time inference on microcontrollers.

An identically trained deep learning model is deployed on (a) an ESP32-S3 board and (b) an Arduino Nano 33 BLE board. The thin-film-transistor (TFT) screen pins, detailed in two columns, provide wiring guidance.