Computer Science Applications Papers & Publications

Comparative analysis of machine learning classification models in predicting cardiovascular disease

For a long time, cardiovascular diseases have been the leading cause of death worldwide. Machine learning has found significant usage in the medical field as it can find patterns in data. Classification models can help cardiologists diagnose heart diseases accurately and minimize misdiagnosis. In this paper, we explored a dataset related to heart disease and compared the accuracy of 43 machine learning classification models. The dataset for this research was downloaded from Kaggle; it contained 1190 observations, 11 features (age, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiogram results, maximum heart rate achieved, exercise-induced angina, oldpeak, the slope of the peak exercise ST segment) and a binary target variable (no heart disease or observed cardiovascular disease). For data exploration, preprocessing, training, testing, and predictor importance analysis, we used MATLAB R2004a software and the Classification Learner app included in this software. Before training machine learning classification models, we divided the dataset into a training set (90% of observations) and a test set (10% of observations). To prevent overfitting during the training of classification models, 10-fold cross-validation was used. The results showed that the best accuracy was reached with an optimized ensemble classification model (validation accuracy: 0.9262 and test accuracy: 0.9580). After calculating the permutation importance of each feature, we observed that the most important feature among all 11 features was the slope of the peak exercise ST segment.

Ladislav Végh
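
The evaluation protocol in the abstract above (a 90%/10% train/test split followed by 10-fold cross-validation on the training set) can be sketched in a few lines. This is only an illustration in plain Python, not the authors' MATLAB workflow; the function names and the random seed are assumptions made for the example.

```python
import random

def train_test_split_indices(n, test_fraction=0.10, seed=0):
    """Shuffle the indices 0..n-1 and hold out a fraction as the test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]  # (train, test)

def k_fold_indices(train_idx, k=10):
    """Yield (fit, validation) index lists; each training index is validated once."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]

# 1190 observations, as in the abstract (hypothetical indices, no real data).
train_idx, test_idx = train_test_split_indices(1190)
folds = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(test_idx), len(folds))  # 1071 119 10
```

The test indices never enter the cross-validation loop, which is what lets the test accuracy act as an independent check on the validation accuracy.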

Evaluating optimizable machine learning models for anemia type prediction from complete blood count data

This paper compares different optimizable machine learning classification models to predict eight types of anemia from complete blood count (CBC) data. For the research, we used a publicly available Kaggle dataset containing 1281 observations, 14 predictors, and the diagnosis as the categorical target variable with nine categories (eight types of anemia and the healthy category). First, we examined the dataset and observed the histograms of some of the predictors. We compared the predictor values of observations with no anemia to those of observations where any anemia was diagnosed. Next, we used MATLAB R2024a to train and test nine optimizable machine learning classification models. These models were Ensemble, Tree, SVM, Efficient Linear, Neural Network, Kernel, KNN, Naïve Bayes, and Discriminant. Bayesian optimization was used to optimize the hyperparameters of all these models. We used 90% of observations for training and 10% of observations for testing. During the training, 10-fold cross-validation was used to prevent overfitting. The results showed the best accuracy was reached with the Ensemble classification model using the bag ensemble method (validation accuracy: 99.22%, test accuracy: 100%). Finally, we inspected our best classification model in more detail. We calculated the permutation feature importance to determine the contribution of each predictor to the final model. The results showed 6–7 important predictors, while the most important feature was the amount of hemoglobin.

Ladislav Végh
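
Permutation feature importance, mentioned in the abstract above, measures how much a model's accuracy drops when one predictor's values are shuffled. The sketch below is a minimal plain-Python illustration under stated assumptions: the toy data, the threshold "model", and all function names are hypothetical and are not taken from the paper or from MATLAB.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 decides the label, feature 1 is noise.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [int(row[0] >= 10) for row in X]
model = lambda row: int(row[0] >= 10)  # stand-in for a trained classifier

importances = permutation_importance(model, X, y)
print(importances)
```

A feature the model ignores gets an importance of exactly zero, while shuffling the decisive feature lowers accuracy, which mirrors how a dominant predictor stands out in such an analysis.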

Two-stage RFID approach for localizing objects in smart homes based on gradient boosted decision trees with under- and over-sampling

Developing automated systems with a reasonable cost for long-term care for elders is a promising research direction. Such smart systems are based on realizing activities of daily living (ADLs) to enable aging in place while preserving the quality of life of all inhabitants in smart homes. One of the research directions is based on localizing items used by elders to monitor their activities with fine-grained details of the progress. In this paper, we shed light on this issue by presenting an approach for localizing items in smart homes. The presented method is based on applying machine learning algorithms to Radio Frequency IDentification (RFID) tag readings. Our approach achieves the required task through two stages. The first stage detects in which room the selected object is located. Then, the second one determines the exact position of the selected object inside the detected room. Additionally, we present an efficient approach based on gradient boosted decision trees for detecting the location of the selected object in a real-world smart home. Moreover, we employ some techniques of over- and under-sampling with data clustering for improving the performance of the presented techniques. Many experiments are conducted in this work to evaluate the performance of the presented approach for localizing objects in a real smart home. The results of the experiments have shown that our approach provides remarkable performance.

Shadi Abudalfa

Improving machine learning classification models for anaemia type prediction by oversampling imbalanced complete blood count data with SMOTE-based algorithms

Computer-assisted disease diagnosis is cost-effective and time-saving, increasing accuracy and reducing the need for an additional workforce in medical decision-making. In our prior research, we trained, tested, and compared the accuracies of nine optimizable classification models to diagnose and predict eight anaemia types from Complete Blood Count (CBC) data. This study aimed to improve these classification models by oversampling the original imbalanced dataset with four algorithms related to the Synthetic Minority Over-sampling Technique (SMOTE). The results showed that the validation accuracy increased from 99.22% (Ensemble model) to 99.57% (Tree model), and most importantly, the False Discovery Rate (FDR) for the anaemia type with the highest FDR decreased from 23.1% to 1.5%.

Ladislav Végh
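
Two ideas from the abstract above can be illustrated briefly: SMOTE-style oversampling, which creates synthetic minority samples by interpolating between a sample and one of its nearest minority neighbours, and the False Discovery Rate, FDR = FP / (TP + FP). The sketch below shows only the basic interpolation idea, not any of the four SMOTE-related algorithms the study used; the sample points, counts, and function names are hypothetical.

```python
import random

def smote_like_samples(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points on segments between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # position along the segment from x to nb
        synthetic.append([a + u * (b - a) for a, b in zip(x, nb)])
    return synthetic

def false_discovery_rate(tp, fp):
    """Share of positive predictions for a class that are wrong."""
    return fp / (tp + fp)

# Hypothetical minority-class points (two features each).
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [3.0, 3.5]]
new_points = smote_like_samples(minority, n_new=5)
print(len(new_points))               # 5
print(false_discovery_rate(40, 10))  # 0.2
```

Because every synthetic point lies between two real minority samples, oversampling adds training examples for a rare class without leaving the region that class already occupies.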

Arabic text formality modification: a review and future research directions

Formality transfer seeks to adjust text formality without altering its core meaning, which carries substantial implications across diverse domains like machine translation, dialogue systems, and social media content creation. This study provides an extensive overview of formality transfer specifically within Arabic text, an emerging domain within natural language processing. Particularly, we carried out a comprehensive review of literature on text formality transfer, focusing on studies published between July 2010 and April 2024. Our focus lies in treating formality transfer in Arabic as akin to a machine translation task, presenting synthesized insights. Despite advancements in formality transfer for English and other languages, Arabic’s distinct linguistic features present unique challenges and opportunities. Our investigation uncovers several research gaps necessitating future exploration, emphasizing persistent limitations. Moreover, we delve into text formality transfer as a promising avenue for forthcoming research initiatives in the realm of Arabic text processing.

Shadi Abudalfa

Tracking students' progress in introductory C programming courses through Moodle tests with randomized questions

Assessing students' progress in introductory programming courses is crucial for identifying learning gaps and improving teaching methods. This study evaluates the effectiveness of Moodle-based tests with randomized questions in monitoring student progress in C programming courses at J. Selye University during the 2023/24 academic year. A series of ten tests was administered across two courses, covering essential programming topics such as data types, variables, conditional statements, loops, two- and three-dimensional arrays, recursion, and sorting algorithms. The results revealed significant variations in student performance, with recursion and the pretest/posttest loops presenting the greatest challenges. The correlation analysis of test scores showed strong relationships among related topics, confirming the structured progression of the curriculum. These findings suggest that Moodle-based assessments offer valuable insights into students' learning trajectories, enabling educators to adapt their instructional strategies accordingly. Such insights can help optimize introductory programming curricula, enhancing student engagement and understanding.

Ladislav Végh
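
A correlation analysis of test scores, as described in the abstract above, can be sketched with the Pearson coefficient. The abstract does not state which coefficient the study used, so Pearson is an assumption here, and the scores below are hypothetical values invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical scores of five students on two related tests.
loops_scores = [55, 70, 80, 90, 65]
arrays_scores = [50, 72, 78, 95, 60]
r = pearson(loops_scores, arrays_scores)
print(round(r, 3))
```

A coefficient near 1 for two tests on related topics would be consistent with the strong relationships the abstract reports.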

An efficient sarcasm detection in audio using parameter-reduced depthwise CNN

In this study, we implement a lightweight CNN for sarcasm detection using audio input. To achieve this goal, we propose the DepthFire block, a lightweight version of the traditional depthwise convolution layer that focuses on reduced memory. Unlike the traditional depthwise convolution layer, which aims to reduce the memory requirements of the entire architecture, our solution takes a targeted approach that reduces the memory requirements of the depthwise convolution layer itself through parameter reduction. We evaluated the energy consumption and performance of our proposed solution against other existing solutions and across different activations, pooling functions, and datasets. We further tested the applicability of the solution on 2D input. Our solution obtained an 82.98 percent model size reduction compared to MobileNetV2 and 58.94 percent compared to MobileNetV3 Small, with an energy reduction of 56.48 percent on the CIFAR-10 dataset.

Jiby Mariya Jose
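
For context on the parameter savings discussed in the abstract above, the sketch below compares the weight count of a standard convolution with that of a conventional depthwise separable convolution (a depthwise filter per input channel followed by a 1x1 pointwise convolution). It does not model the DepthFire block itself, whose structure the abstract does not specify; the layer sizes are hypothetical and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k layer plus a 1 x 1 pointwise layer (no bias)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
reduction = 1 - sep / std
print(std, sep, f"{reduction:.1%}")  # 73728 8768 88.1%
```

The ratio of the two counts is 1/c_out + 1/k^2, so most of the remaining weights sit in the pointwise layer once the kernel is small.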

Energy-reduced bio-inspired 1D-CNN for audio emotion recognition

This paper proposes EPyNet, a deep learning architecture designed for energy-reduced audio emotion recognition. In the domain of audio-based emotion recognition, where discerning emotional cues from audio input is crucial, the integration of artificial intelligence techniques has sparked a transformative shift in accuracy and performance. Deep learning, renowned for its ability to decipher intricate patterns, spearheads this evolution. However, the energy efficiency of deep learning models, particularly in resource-constrained environments, remains a pressing concern. Convolutional operations serve as the cornerstone of deep learning systems. However, their extensive computational demands lead to energy-inefficient computations, making them ill-suited for deployment in scenarios with limited resources. Addressing these challenges, researchers came up with one-dimensional convolutional neural network (1D CNN) array convolutions, offering an alternative to traditional two-dimensional CNNs with reduced resource requirements. Although this array-based operation reduced the resource requirement, its impact on energy consumption was not studied. To bridge this gap, we introduce EPyNet, a deep learning architecture crafted for energy efficiency with a particular emphasis on neuron reduction. Focusing on the task of audio emotion recognition, we evaluate EPyNet on five public audio corpora: RAVDESS, TESS, EMO-DB, CREMA-D, and SAVEE. We propose three versions of EPyNet, a lightweight neural network designed for efficient emotion recognition, each optimized for different trade-offs between accuracy and energy efficiency. Experimental results demonstrated that the 0.06M EPyNet reduced energy consumed by 76.5% while improving accuracy by 5% on RAVDESS, 25% on TESS, and 9.75% on SAVEE. The 0.2M and 0.9M models reduced energy consumed by 64.9% and 70.3%, respectively.
Additionally, we compared our proposed 0.06M system with the MobileNet models on the CIFAR-10 dataset and achieved significant improvements. The proposed system reduces energy by 86.2% and memory by 95.7% compared to MobileNet, with a slightly lower accuracy of 0.8%. Compared to MobileNetV2, it improves accuracy by 99.2% and reduces memory by 93.8%. When compared to MobileNetV3, it achieves 57.2% energy reduction, 85.1% memory reduction, and a 24.9% accuracy improvement. We further test the scalability and robustness of the proposed solution on different data dimensions and frameworks.

Jiby Mariya Jose

A lightweight deep learning framework using resource-efficient batch normalization for sarcasm detection

Communication is not always direct; it often involves nuanced elements like humor, irony, and sarcasm. This study introduces a novel two-level approach for sarcasm detection, leveraging convolutional neural networks (CNNs). CNNs are crucial for many deep learning applications, yet their deployment on IoT devices is challenged by resource constraints and the need for low latency, particularly in on-device training. Traditional methods of deploying large CNN models on these devices often lead to suboptimal performance and increased energy consumption. To address this, our paper proposes an energy-efficient CNN design by optimising batch normalisation operations. Batch normalisation is vital for deep learning, aiding in faster convergence and stabilising gradient flow, but there has been limited research on creating energy-efficient and lightweight CNNs with optimised batch normalisation. This study proposes a 3R (reduce, reuse, recycle) optimisation technique for batch normalisation. This technique introduces an energy-efficient CNN architecture. We investigate the use of batch normalisation optimisation to streamline memory usage and computational complexity, aiming to uphold or improve model performance on CPU-based systems. Additionally, we evaluate its effectiveness across diverse datasets, focusing on energy efficiency and adaptability in different settings. Furthermore, we analyse how batch normalisation influences the performance and effectiveness of activation functions and pooling layers in neural network designs. Our results highlight batch normalisation's ability to enhance computational efficiency, particularly on devices with limited resources.

Jiby Mariya Jose

Optimizing neural network energy efficiency through low-rank factorisation and PDE-driven dense layers

As deep learning models continue to grow in complexity, the computational and energy demands associated with their training and deployment are becoming increasingly significant, particularly for convolutional neural networks (CNNs) deployed on CPU-bound and resource-limited devices. Fully connected (FC) layers, while vital, are energy-intensive, accounting for 85.7% of a network’s parameters but contributing only 1% of the computations. This research proposes a novel approach to optimising these layers for greater energy efficiency by integrating low-rank factorisation with differential partial differential equations (PDEs). The introduction of the LowRankDense layer, which combines low-rank matrix factorisation with a differential PDE solver, aims to reduce both the parameter count and energy consumption of FC layers. Experiments conducted on the MNIST, Fashion MNIST, and CIFAR-10 datasets demonstrate the effectiveness of this approach, yielding promising results in terms of reduced energy usage and maintaining comparable performance, thereby enhancing the practicality and sustainability of CNNs for widespread use in environments with limited computational resources.

Jiby Mariya Jose
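
The parameter saving from low-rank factorisation, mentioned in the abstract above, follows from replacing an m x n weight matrix W with the product of an m x r matrix U and an r x n matrix V, which stores r(m + n) weights instead of mn. The sketch below illustrates only this factorisation step in plain Python; it does not model the PDE component or the LowRankDense layer itself, and the matrices and layer sizes are hypothetical.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def dense_params(m, n):
    return m * n

def low_rank_params(m, n, r):
    return r * (m + n)

# Hypothetical rank-2 factors of a 3 x 4 weight matrix.
U = [[1, 0], [0, 1], [1, 1]]      # 3 x 2
V = [[1, 2, 0, 1], [0, 1, 1, 0]]  # 2 x 4
W = matmul(U, V)                  # full 3 x 4 matrix
x = [1, 2, 3, 4]

full_output = matvec(W, x)
factored_output = matvec(U, matvec(V, x))  # same result, never forms W
print(full_output, factored_output)        # [9, 5, 14] [9, 5, 14]

# Hypothetical FC layer: 1024 outputs, 512 inputs, rank 32.
print(dense_params(1024, 512), low_rank_params(1024, 512, 32))  # 524288 49152
```

The factored form pays off whenever r is well below mn / (m + n); in the hypothetical layer above it keeps under a tenth of the original weights.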

Enhancing thermal comfort and indoor air quality through energy optimization with neural network

Indoor thermal comfort and air quality are essential for occupant well-being, yet they must be balanced against energy consumption in buildings. Achieving this balance presents a significant challenge, as indoor environments are dynamic and energy demands fluctuate. By modifying ventilation rates in response to real-time data, demand-controlled ventilation systems can reduce energy consumption and enhance indoor comfort and air quality. However, optimizing these systems with advanced predictive models remains a complex task. To address this challenge, this publication proposes a Dual-Stream Multi-Dependency Graph Neural Network (DMGNN)-based energy-efficient ventilation management technique that maximizes indoor air quality and thermal comfort. The suggested method seeks to enhance thermal comfort and air quality by optimizing heating, ventilation, and air conditioning (HVAC) operations while reducing energy consumption. Initially, data are collected from an Indoor Air Quality Monitoring Dataset. The DMGNN is employed to capture the complex dependencies between environmental factors such as temperature, humidity, and CO2 concentrations, considering both temporal and spatial relationships. Implementing the proposed system and evaluating it through simulations in various building environments demonstrates notable improvements in thermal comfort, indoor air quality, and energy economy. The suggested system’s performance is contrasted with that of other current methods, showing superior energy efficiency and optimization of both indoor air quality and occupant comfort. This study presents an innovative, scalable framework for smart building management, promoting sustainable energy solutions.

Dr H Shaheen
