Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms
Article
Open access
English
In contemporary society, depression has emerged as a prominent mental disorder that is growing exponentially and exerts a substantial influence on premature mortality. Numerous studies have applied machine learning methods to predict signs of depression, but only a limited number have treated the severity level as a multiclass variable. In addition, data are rarely distributed equally across all classes in real-world populations, so class imbalance across multiple classes is a substantial challenge in this domain. This research therefore emphasizes the importance of addressing class imbalance in the multiclass setting. We introduce a new approach, Feature Group Partitioning (FGP), in the data preprocessing phase, which effectively reduces the dimensionality of the features to a minimum. For class balancing, the study used two synthetic oversampling techniques: Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN). The dataset was collected from university students by administering the Burn Depression Checklist (BDC). For modelling, we implemented heterogeneous ensemble learning (stacking), homogeneous ensemble learning (bagging), and five distinct supervised machine learning algorithms. Overfitting was addressed by evaluating accuracy on the training, validation, and testing datasets. The effectiveness of the prediction models was assessed using balanced accuracy, sensitivity, specificity, precision, and F1-score. A comprehensive analysis demonstrates the difference between the Conventional Depression Screening (CDS) and FGP approaches. The results show that the stacking classifier for FGP with SMOTE yields the highest balanced accuracy, at 92.81%.
The empirical evidence demonstrates that the FGP approach, when combined with SMOTE, produces better performance in predicting the severity of depression. Most importantly, the reduction in training time achieved by the FGP approach for all of the classifiers is a significant achievement of this research.
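The two ideas the abstract leans on most, synthetic oversampling and balanced accuracy, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `smote_like_oversample` and `balanced_accuracy`, the toy feature vectors, and the severity labels are all assumptions made for illustration. The oversampler is a simplified SMOTE-style interpolation between same-class neighbours, not the full SMOTE or ADASYN algorithm.

```python
import math
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Simplified SMOTE-style balancing (illustrative, not the paper's code).

    Each synthetic point lies on the segment between a sample and one of
    its k nearest neighbours from the same class, until every class has as
    many samples as the largest class.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out = [list(row) for row in X]
    y_out = list(y)
    for label, n in counts.items():
        members = [list(row) for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # no neighbour to interpolate with; class left as is
        for _ in range(target - n):
            i = rng.randrange(len(members))
            base = members[i]
            # k nearest same-class neighbours of the base point
            neighbours = sorted(
                (m for j, m in enumerate(members) if j != i),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (sensitivity) over the classes in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy data: hypothetical 2-D features and severity labels.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
     [5.0, 5.0], [6.0, 5.0], [9.0, 9.0], [9.0, 8.0]]
y = ["none"] * 4 + ["mild"] * 2 + ["severe"] * 2
X_bal, y_bal = smote_like_oversample(X, y)

y_true = ["none", "none", "none", "none", "mild", "mild", "severe"]
y_pred = ["none", "none", "none", "mild", "mild", "none", "severe"]
score = balanced_accuracy(y_true, y_pred)
```

Balanced accuracy averages recall per class, so a classifier cannot score well simply by favouring the majority class. This is presumably why a study focused on multiclass imbalance reports it rather than plain accuracy.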
Shaha, Tumpa Rani; Begum, Momotaz; Uddin, Jia; Yélamos Torres, Vanessa (vanessa.yelamos@funiber.org); Alemany Iturriaga, Josep (josep.alemany@uneatlantico.es); Ashraf, Imran and Samad, Md. Abdus
(2024)
Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms.
BMC Medical Research Methodology, 24 (1).
ISSN 1471-2288
Full text: s12874-024-02249-8.pdf (2MB), available under a Creative Commons Attribution license.
Document Type: | Article |
---|---|
Keywords: | Machine learning; Depression prediction; Class balancing; Oversampling; SMOTE; ADASYN; Stratified cross validation; Burn depression checklist; Feature group partitioning |
Subject classification: | Subjects > Engineering |
Divisions: | Universidad Europea del Atlántico > Research > Articles and books; Universidad Internacional Iberoamericana México > Research > Scientific Production; Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production; Universidad de La Romana > Research > Scientific Production |
Deposited: | 17 Jun 2024 23:30 |
Last Modified: | 17 Jun 2024 23:30 |
URI: | https://repositorio.uneatlantico.es/id/eprint/12751 |
Enzymatic treatment shapes in vitro digestion pattern of phenolic compounds in mulberry juice
The health benefits of mulberry fruit are closely associated with its phenolic compounds. However, the effects of enzymatic treatments on the digestion patterns of these compounds in mulberry juice remain largely unknown. This study investigated the impact of pectinase (PE), pectin lyase (PL), and cellulase (CE) on the release of phenolic compounds in whole mulberry juice. The digestion patterns were further evaluated using an in vitro simulated digestion model. The results revealed that PE significantly increased chlorogenic acid content by 77.8 %, PL enhanced cyanidin-3-O-glucoside by 20.5 %, and CE boosted quercetin by 44.5 %. Following in vitro digestion, the phenolic compound levels decreased differently depending on the treatment, while cyanidin-3-O-rutinoside content increased across all groups. In conclusion, the selected enzymes effectively promoted the release of phenolic compounds in mulberry juice. However, during gastrointestinal digestion, the degradation of phenolic compounds surpassed their enhanced release, with effects varying based on the compound's structure.
Peihuan Luo, Jian Ai, Qiongyao Wang, Yihang Lou, Zhiwei Liao, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Elwira Sieniawska, Weibin Bai, Lingmin Tian
What works in financial education? Experimental evidence on program impact
Financial education is increasingly essential for safeguarding both individual and corporate well-being. This study systematically reviews global financial education experiments using a dual-method framework that integrates a deep learning classifier with advanced multivariate statistical techniques. Our analysis indicates that while short-term improvements in financial literacy are common, such gains tend to diminish over time without ongoing reinforcement. Moreover, the limited impact of digital innovations and monetary incentives suggests that successful financial education depends on more than simply deploying technological solutions or extrinsic rewards. Overall, this review provides valuable insights into the evolving landscape of financial education in a dynamic economic context and underscores the need for sustainable strategies that secure lasting improvements in financial literacy.
Gonzalo Llamosas García, Cristina Mazas Pérez-Oleaga (cristina.mazas@uneatlantico.es)
Epigallocatechin gallate (EGCG) is the most abundant polyphenol in tea. Because green tea, white tea, oolong tea, and black tea are fermented to different degrees, the polyphenol composition of their water extracts differs, which affects their health value. This study revealed that the content of EGCG decreases as the degree of fermentation increases. In tea with a high fermentation degree, EGCG was stably present in an ammoniated form, a nitrogen-containing EGCG derivative (N-EGCG). The content of N-EGCG in tea was negatively correlated with the content of EGCG. Furthermore, the content of L-serine and L-threonine in tea was positively and negatively correlated with N-EGCG and EGCG levels, respectively, suggesting that they may participate in the formation of N-EGCG as nitrogen sources. This study proposes a new fermentation-induced polyphenol-amino acid synergistic mechanism, which provides a theoretical basis for the study of the biotransformation reaction mechanism of tea polyphenols.
Yuxuan Zhao, Jingyimei Liang, Wanning Ma, Mohamed A. Farag, Chunlin Li, Jianbo Xiao
Econometric analysis has long been integral to measuring sustainable environmental quality, with panel data methods, such as fixed and random effects models, becoming the focal point of modern research. Initially, such methods were used to simply investigate environmental issues, but recent years have seen a shift toward the study of random effects models, focusing on hypothesis testing and policy debates. However, several important aspects of the Hausman test have not been sufficiently investigated in the literature. This study seeks to evaluate the utility of the Hausman test using a real dataset from tourism and globalization, exploring their effects on sustainable environmental quality. Additionally, the study examines key factors contributing to environmental issues including economic growth and energy consumption, as critical explanatory variables. By investigating the relationship between tourism, globalization, economic growth, and energy use, the research focuses on the top 10 most visited economies: France, the USA, Spain, China, Turkey, Italy, Mexico, Germany, Thailand, and the UK. Using panel data and the cross-sectional random effects model for the period of 1998 to 2024, the study produces reliable estimations of these relationships. The empirical findings suggest that while the Hausman test favors the fixed effect model, the real-world characteristics of these countries point to the random effect model, highlighting the negative impact of economic growth, energy consumption, and globalization on sustainable environmental quality. It is also suggested that socio-environmental factors should be considered for each destination for sustainable environmental quality.
Saba Nourin, Ismat Nasim, Hafiz Muhammad Raza ur Rehman, Elisabeth Caro Montero (elizabeth.caro@uneatlantico.es), Mirtha Silvana Garat de Marin (silvana.marin@uneatlantico.es), Nagwan Abdel Samee, Imran Ashraf
A systematic review of deep learning methods for community detection in social networks
Introduction: The rapid expansion of generated data through social networks has introduced significant challenges, which underscores the need for advanced methods to analyze and interpret these complex systems. Deep learning has emerged as an effective approach, offering robust capabilities to process large datasets, and uncover intricate relationships and patterns. Methods: In this systematic literature review, we explore research conducted over the past decade, focusing on the use of deep learning techniques for community detection in social networks. A total of 19 studies were carefully selected from reputable databases, including the ACM Library, Springer Link, Scopus, Science Direct, and IEEE Xplore. This review investigates the employed methodologies, evaluates their effectiveness, and discusses the challenges identified in these works. Results: Our review shows that models like graph neural networks (GNNs), autoencoders, and convolutional neural networks (CNNs) are some of the most commonly used approaches for community detection. It also examines the variety of social networks, datasets, evaluation metrics, and employed frameworks in these studies. Discussion: However, the analysis highlights several challenges, such as scalability, understanding how the models work (interpretability), and the need for solutions that can adapt to different types of networks. These issues stand out as important areas that need further attention and deeper research. This review provides meaningful insights for researchers working in social network analysis. It offers a detailed summary of recent developments, showcases the most impactful deep learning methods, and identifies key challenges that remain to be explored.
Mohamed El-Moussaoui, Mohamed Hanine, Ali Kartit, Mónica Gracia Villar (monica.gracia@uneatlantico.es), Helena Garay (helena.garay@uneatlantico.es), Isabel de la Torre Díez