Estimation of classrooms occupancy using a multi-layer perceptron

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

This paper presents a multi-layer perceptron model for estimating the number of occupants in a classroom from sensed indoor environmental data: relative humidity, air temperature, and carbon dioxide concentration. The modelling datasets were collected from two classrooms in the Secondary School of Pombal, Portugal. The number of occupants and the occupation periods were obtained from class attendance reports. However, post-class occupancy was unknown, and the developed model is used to reconstruct classroom occupancy by filling in the unreported periods. Different model structures and combinations of environmental variables were tested. The most accurate model took as input a vector of 10 variables: averages of relative humidity and carbon dioxide concentration over five time intervals. The model achieved a mean squared error of 1.99, a coefficient of determination of 0.96 (p-value < 0.001), and a mean absolute error of 1 occupant. These results show promising estimation capabilities under uncertain indoor environmental conditions.

💡 Research Summary

The paper addresses the practical problem of estimating the number of occupants in a classroom using only low‑cost indoor environmental sensor data. Traditional methods such as attendance registers, RFID tags, or video analytics either require manual effort, raise privacy concerns, or involve expensive infrastructure. The authors propose a data‑driven approach based on a multilayer perceptron (MLP) that learns the nonlinear relationship between relative humidity (RH), temperature (T), carbon dioxide concentration (CO₂), and the actual occupancy count.

Data were collected over eight weeks from two classrooms at the Secondary School of Pombal, Portugal. Sensors recorded RH, T, and CO₂ at one‑minute intervals. The ground‑truth occupancy for scheduled class periods was obtained from official attendance reports, while the post‑class periods remained unreported. To make the problem tractable, the authors aggregated the raw measurements into five consecutive 5‑minute windows, computing the average RH and CO₂ for each window. This yielded a ten‑dimensional input vector (five RH averages + five CO₂ averages) for each prediction instance. Temperature was later discarded because exploratory analysis showed a weak correlation with occupancy.
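The aggregation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes one sample per minute, that the five windows are the 25 minutes immediately preceding the prediction time, and that the vector is ordered oldest window first. The names `window_means` and `feature_vector` and the synthetic series are hypothetical.

```python
from statistics import mean

WINDOW_MIN = 5   # minutes per averaging window (as described in the summary)
N_WINDOWS = 5    # consecutive windows per feature vector


def window_means(series, end, window=WINDOW_MIN, n_windows=N_WINDOWS):
    """Average the n_windows consecutive windows that finish at index `end` (exclusive).

    Assumption: `series` holds one reading per minute, oldest first.
    """
    start = end - window * n_windows
    if start < 0 or end > len(series):
        raise ValueError("not enough samples before `end`")
    return [
        mean(series[start + i * window : start + (i + 1) * window])
        for i in range(n_windows)
    ]


def feature_vector(rh, co2, end):
    """Concatenate five RH averages and five CO2 averages into a 10-value input."""
    return window_means(rh, end) + window_means(co2, end)


# Synthetic one-minute readings, for illustration only (not data from the paper).
rh = [40.0 + 0.1 * t for t in range(60)]     # relative humidity, %
co2 = [500.0 + 10.0 * t for t in range(60)]  # CO2 concentration, ppm

x = feature_vector(rh, co2, end=25)
print(len(x))
```

Averaging over short windows smooths sensor noise, while keeping five separate windows lets the model see whether CO₂ and humidity are rising or falling, which a single average would hide.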

The MLP architecture consists of an input layer with ten neurons, two hidden layers each containing twenty ReLU‑activated neurons, and a single linear output neuron representing the estimated number of occupants. Training employed the mean squared error (MSE) loss function and the Adam optimizer (learning rate = 0.001). Early stopping based on a validation set prevented overfitting. The dataset was split into 70 % training, 15 % validation, and 15 % testing subsets, with random shuffling to ensure statistical robustness.
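To make the layer sizes and the data split concrete, here is a dependency-free sketch of a 10-20-20-1 forward pass and a shuffled 70/15/15 index split. It is only an illustration under stated assumptions: the weights are random (no Adam training or early stopping is implemented), the He-style initialisation and the seed are choices made for this example, and the function names are hypothetical.

```python
import random

LAYER_SIZES = [10, 20, 20, 1]  # input, two hidden layers, one output (from the summary)


def init_params(sizes, seed=0):
    """Random weights with He-style scaling (an assumption) and zero biases."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def forward(params, x):
    """ReLU on the hidden layers, linear activation on the single output neuron."""
    a = list(x)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, a)) + b for row, b in zip(weights, biases)]
        a = z if i == last else [max(0.0, v) for v in z]
    return a[0]


def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


def split_indices(n, seed=0):
    """Shuffle sample indices, then cut them 70 % / 15 % / 15 %."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * 70 // 100
    n_val = n * 15 // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


params = init_params(LAYER_SIZES)
n_params = count_params(params)          # (10*20+20) + (20*20+20) + (20*1+1) = 661
estimate = forward(params, [0.5] * 10)   # untrained output, meaningless as a count
train_idx, val_idx, test_idx = split_indices(100)
```

With only 661 trainable parameters, a network of this size is small enough that the early stopping mentioned above is a plausible safeguard rather than a necessity. In practice a library such as PyTorch or scikit-learn would handle training.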

On the held‑out test set, the model achieved MSE = 1.99, coefficient of determination R² = 0.96, and mean absolute error (MAE) ≈ 1 person. The fit was statistically significant (p < 0.001), meaning that agreement of this strength between predicted and reported occupancy is very unlikely to arise by chance. The model also reconstructed occupancy during the unreported post‑class intervals with an error margin of only one to two persons, demonstrating its utility for filling gaps in attendance records.
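For readers less familiar with the three reported metrics, the sketch below computes them from their standard definitions. The occupancy values are made up for illustration and are not results from the paper. As a point of reference, an MSE of 1.99 corresponds to a root mean squared error of about 1.41 occupants, which is consistent with an MAE of roughly one person.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reported vs. estimated occupant counts (illustrative only).
reported = [0, 10, 20, 25]
predicted = [1, 9, 21, 24]

scores = {
    "mse": mse(reported, predicted),  # every error is +/-1, so MSE = 1.0
    "mae": mae(reported, predicted),  # and MAE = 1.0
    "r2": r2(reported, predicted),
}
print(scores)
```

Because MSE squares each error, a few large misses raise it much faster than they raise MAE, so reporting both gives a fuller picture of the error distribution.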

Comparative experiments with linear regression and random forest models showed that the MLP consistently outperformed these baselines, highlighting the importance of capturing nonlinear interactions between CO₂, humidity, and occupancy. Reducing the input to only RH averages (five variables) degraded performance (MSE ≈ 3.2), underscoring that CO₂ provides essential information about human respiration and that temporal averaging across multiple windows captures the dynamics of occupancy changes.

The study’s contributions are threefold. First, it validates that a simple feed‑forward neural network can reliably estimate classroom occupancy from readily available sensor streams, without the need for expensive hardware or invasive monitoring. Second, it demonstrates that post‑class occupancy—typically missing from administrative records—can be inferred with acceptable accuracy, enabling more complete occupancy profiles for building‑automation systems. Third, it provides a methodological blueprint (data aggregation, variable selection, model architecture) that can be adapted to other indoor environments such as offices, libraries, or conference rooms.

Nevertheless, the work has limitations. The dataset is confined to two classrooms of similar size and ventilation characteristics, raising questions about generalizability to larger auditoria, open‑plan spaces, or buildings with different HVAC designs. External factors such as window opening, outdoor weather fluctuations, and varying activity levels (e.g., group work versus lecture) were not explicitly modeled and could introduce bias. Moreover, the model assumes that sensor measurements are reliable; sensor drift or calibration errors could affect long‑term deployment.

Future research directions include expanding the data collection to multiple schools and diverse room typologies, incorporating additional contextual variables (e.g., outdoor temperature, HVAC set points, window status), and exploring recurrent neural networks (LSTM, GRU) that naturally handle sequential data. The authors also suggest implementing a lightweight version of the model (e.g., TinyML) for on‑edge inference on microcontrollers, enabling real‑time occupancy‑aware control of ventilation and heating systems. Finally, a field trial that integrates the predicted occupancy into a building‑management system would quantify potential energy savings and indoor‑air‑quality improvements, providing a compelling case for large‑scale adoption.
