Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification
Article
Subjects > Engineering
Europe University of Atlantic > Research > Articles and books
Fundación Universitaria Internacional de Colombia > Research > Scientific Production
Ibero-american International University > Research > Scientific Production
Ibero-american International University > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Scientific Production
Abierto
Inglés
Classification is a commonly used technique in data mining and is applied in various fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are mainly two ways to approach multiclass classification, one is to expand the binary classifier into a multiclass classifier through various strategies and the other is to divide the multiclass classification problem into multiple binary problems (binarization). Two popular approaches for binarization are One vs One (OvO) and One vs All (OvA). It is simpler to aggregate the outputs of all binary classifiers as the number of classifiers decreases. However, it causes an imbalance of positive and negative sample numbers, which affects the classification effect of each binary classifier. In this article, we contribute to the field of ensemble learning and multi-class classification by proposing a new method called Ensemble Partition Sampling (EPS). This article presents a new approach to multiclass classification using an "Ensemble Partition Sampling" method within the "one-vs-all" (OvA) framework. The primary goal of this method is to tackle the problem of data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. The study found that Ensemble Partition Sampling (EPS) is the most effective method for imbalanced and multiclass imbalanced classification, outperforming other methods including OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. The study used CART, Random Forest, and SVM as classifiers, and the results consistently showed that EPS outperformed all other algorithms. The findings suggest that EPS is a highly effective method for improving classification performance in imbalanced and multiclass imbalanced datasets.
metadata
Jabir, Brahim and Díez, Isabel De la Torre and Bautista Thompson, Ernesto and Ramírez-Vargas, Debora L. and Kuc Castilla, Ángel Gabriel
mail
UNSPECIFIED
(2023)
Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification.
IEEE Access.
p. 1.
ISSN 2169-3536
Abstract
Classification is a commonly used technique in data mining and is applied in various fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are mainly two ways to approach multiclass classification, one is to expand the binary classifier into a multiclass classifier through various strategies and the other is to divide the multiclass classification problem into multiple binary problems (binarization). Two popular approaches for binarization are One vs One (OvO) and One vs All (OvA). It is simpler to aggregate the outputs of all binary classifiers as the number of classifiers decreases. However, it causes an imbalance of positive and negative sample numbers, which affects the classification effect of each binary classifier. In this article, we contribute to the field of ensemble learning and multi-class classification by proposing a new method called Ensemble Partition Sampling (EPS). This article presents a new approach to multiclass classification using an "Ensemble Partition Sampling" method within the "one-vs-all" (OvA) framework. The primary goal of this method is to tackle the problem of data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. The study found that Ensemble Partition Sampling (EPS) is the most effective method for imbalanced and multiclass imbalanced classification, outperforming other methods including OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. The study used CART, Random Forest, and SVM as classifiers, and the results consistently showed that EPS outperformed all other algorithms. The findings suggest that EPS is a highly effective method for improving classification performance in imbalanced and multiclass imbalanced datasets.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Ensemble Partition Sampling (EPS); One vs One (OvO); One vs All (OvA); Multi-Class Classification; Imbalanced learning; multiclass imbalanced classification |
Subjects: | Subjects > Engineering |
Divisions: | Europe University of Atlantic > Research > Articles and books Fundación Universitaria Internacional de Colombia > Research > Scientific Production Ibero-american International University > Research > Scientific Production Ibero-american International University > Research > Scientific Production Universidad Internacional do Cuanza > Research > Scientific Production |
Date Deposited: | 09 May 2023 23:30 |
Last Modified: | 21 Oct 2024 23:31 |
URI: | https://repositorio.uneatlantico.es/id/eprint/7028 |
Actions (login required)
View Item |
<a href="/10290/1/Influence%20of%20E-learning%20training%20on%20the%20acquisition%20of%20competences%20in%20basketball%20coaches%20in%20Cantabria.pdf" class="ep_document_link"><img class="ep_doc_icon" alt="[img]" src="/10290/1.hassmallThumbnailVersion/Influence%20of%20E-learning%20training%20on%20the%20acquisition%20of%20competences%20in%20basketball%20coaches%20in%20Cantabria.pdf" border="0"/></a>
en
open
The main aim of this study was to analyse the influence of e-learning training on the acquisition of competences in basketball coaches in Cantabria. The current landscape of basketball coach training shows an increasing demand for innovative training models and emerging pedagogies, including e-learning-based methodologies. The study sample consisted of fifty students from these courses, all above 16 years of age (36 males, 14 females). Among them, 16% resided outside the autonomous community of Cantabria, 10% resided more than 50 km from the city of Santander, 36% between 10 and 50 km, 14% less than 10 km, and 24% resided within Santander city. Data were collected through a Google Forms survey distributed by the Cantabrian Basketball Federation to training course students. Participation was voluntary and anonymous. The survey, consisting of 56 questions, was validated by two sports and health doctors and two senior basketball coaches. The collected data were processed and analysed using Microsoft® Excel version 16.74, and the results were expressed in percentages. The analysis revealed that 24.60% of the students trained through the e-learning methodology considered themselves fully qualified as basketball coaches, contrasting with 10.98% of those trained via traditional face-to-face methodology. The results of the study provide insights into important characteristics that can be adjusted and improved within the investigated educational process. Moreover, the study concludes that e-learning training effectively qualifies basketball coaches in Cantabria.
Josep Alemany Iturriaga mail josep.alemany@uneatlantico.es, Álvaro Velarde-Sotres mail alvaro.velarde@uneatlantico.es, Javier Jorge mail , Kamil Giglio mail ,
Alemany Iturriaga
en
close
Technological firms invest in R&D looking for innovative solutions but assuming high costs and great (technological) uncertainty regarding final results and returns. Additionally, they face other problems related to R&D management. This empirical study tries to determine which of the factors favour or constrain the decision of these firms to engage in R&D. The analysis uses financial data of 14,619 ICT listed companies of 22 countries from 2003 to 2018. Additionally, macroeconomic data specific for the countries and the sector were used. For the analysis of dynamic panel data, a System-GMM method is used. Among the findings, we highlight that cash flow, contrary to the known theoretical models and empirical evidences, negatively impacts on R&D investment. Debt is neither the right source for R&D funding, as the effect is also negative. This suggests that ICT companies are forced to manage their R&D activities differently, relying more on other funding sources, taking advantage of growth opportunities and benefiting from a favourable macroeconomic environment in terms of growth and increased business sector spending on R&D. These results are similar in both sub-sectors and in all countries, both bank- and market based. The exception is firms with few growth opportunities and little debt.
Inna Alexeeva-Alexeev mail inna.alexeeva@uneatlantico.es, Cristina Mazas Pérez-Oleag mail cristina.mazas@uneatlantico.es,
Alexeeva-Alexeev
<a class="ep_document_link" href="/15198/1/nutrients-16-03859.pdf"><img class="ep_doc_icon" alt="[img]" src="/15198/1.hassmallThumbnailVersion/nutrients-16-03859.pdf" border="0"/></a>
en
open
Carotenoids Intake and Cardiovascular Prevention: A Systematic Review
Background: Cardiovascular diseases (CVDs) encompass a variety of conditions that affect the heart and blood vessels. Carotenoids, a group of fat-soluble organic pigments synthesized by plants, fungi, algae, and some bacteria, may have a beneficial effect in reducing cardiovascular disease (CVD) risk. This study aims to examine and synthesize current research on the relationship between carotenoids and CVDs. Methods: A systematic review was conducted using MEDLINE and the Cochrane Library to identify relevant studies on the efficacy of carotenoid supplementation for CVD prevention. Interventional analytical studies (randomized and non-randomized clinical trials) published in English from January 2011 to February 2024 were included. Results: A total of 38 studies were included in the qualitative analysis. Of these, 17 epidemiological studies assessed the relationship between carotenoids and CVDs, 9 examined the effect of carotenoid supplementation, and 12 evaluated dietary interventions. Conclusions: Elevated serum carotenoid levels are associated with reduced CVD risk factors and inflammatory markers. Increasing the consumption of carotenoid-rich foods appears to be more effective than supplementation, though the specific effects of individual carotenoids on CVD risk remain uncertain.
Sandra Sumalla Cano mail sandra.sumalla@uneatlantico.es, Imanol Eguren García mail imanol.eguren@uneatlantico.es, Álvaro Lasarte García mail , Thomas Prola mail thomas.prola@uneatlantico.es, Raquel Martínez Díaz mail raquel.martinez@uneatlantico.es, Iñaki Elío Pascual mail inaki.elio@uneatlantico.es,
Sumalla Cano
en
close
Uterine leiomyomas are the most common benign, monoclonal, gynaecological tumors in a woman’s uterus, while leiomyosarcoma is a rare but aggressive condition caused by the malignant transformation of the myometrium. To overcome the common obstacles related to the methods usually used to study these pathologies, we aimed to devise three-dimensional models of myometrium, uterine leiomyoma and leiomyosarcoma cell lines, using two different types of biocompatible scaffolds. Specifically, we exploited the agarose gel matrix in common 6-well plates and the alginate matrix using Bioprinting INKREDIBLE + (CELLINK), a pneumatic extruded base equipped with a system with double printheads, and a UV printer LED curing system. Both methods allowed the development of 3D spheroids of all three cell types, that were also suitable for morphological investigations. We showed that all cell types embedded in both agarose and alginate formed spheroids in their growth medium. The spheroids successfully proliferated and self-organized into complex structures, developing a sustainable system that emulated the condition of the tissues through the accumulation of extracellular matrix. These models could be useful for a better understanding of pathophysiology, etiopathogenesis, and testing new methods or molecules from a preventive and therapeutic point of view.
Pamela Pellegrino mail , Stefania Greco mail , Abel Duménigo Gonzàlez mail , Francesca Giampieri mail francesca.giampieri@uneatlantico.es, Stefano Raffaele Giannubilo mail , Giovanni Delli Carpini mail , Franco Capocasa mail , Bruno Mezzetti mail , Maurizio Battino mail maurizio.battino@uneatlantico.es, Andrea Ciavattini mail , Pasquapina Ciarmela mail ,
Pellegrino
<a href="/15333/1/nutrients-16-03907.pdf" class="ep_document_link"><img class="ep_doc_icon" alt="[img]" src="/15333/1.hassmallThumbnailVersion/nutrients-16-03907.pdf" border="0"/></a>
en
open
Background/Objectives: The diet quality of younger individuals is decreasing globally, with alarming trends also in the Mediterranean region. The aim of this study was to assess diet quality and adequacy in relation to country-specific dietary recommendations for children and adolescents living in the Mediterranean area. Methods: A cross-sectional survey was conducted of 2011 parents of the target population participating in the DELICIOUS EU-PRIMA project. Dietary data and cross-references with food-based recommendations and the application of the youth healthy eating index (YHEI) was assessed through 24 h recalls and food frequency questionnaires. Results: Adherence to recommendations on plant-based foods was low (less than ∼20%), including fruit and vegetables adequacy in all countries, legume adequacy in all countries except for Italy, and cereal adequacy in all countries except for Portugal. For animal products and dietary fats, the adequacy in relation to the national food-based dietary recommendations was slightly better (∼40% on average) in most countries, although the Eastern countries reported worse rates. Higher scores on the YHEI predicted adequacy in relation to vegetables (except Egypt), fruit (except Lebanon), cereals (except Spain), and legumes (except Spain) in most countries. Younger children (p < 0.005) reporting having 8–10 h adequate sleep duration (p < 0.001), <2 h/day screen time (p < 0.001), and a medium/high physical activity level (p < 0.001) displayed a better diet quality. Moreover, older respondents (p < 0.001) with a medium/high educational level (p = 0.001) and living with a partner (p = 0.003) reported that their children had a better diet quality. Conclusions: Plant-based food groups, including fruit, vegetables, legumes, and even (whole-grain) cereals are underrepresented in the diets of Mediterranean children and adolescents. Moreover, the adequate consumption of other important dietary components, such as milk and dairy products, is rather disregarded, leading to substantially suboptimal diets and poor adequacy in relation to dietary guidelines.
Francesca Giampieri mail francesca.giampieri@uneatlantico.es, Alice Rosi mail , Francesca Scazzina mail , Evelyn Frias-Toral mail , Osama Abdelkarim mail , Mohamed Aly mail , Raynier Zambrano-Villacres mail , Juancho Pons mail , Laura Vázquez-Araújo mail , Sandra Sumalla Cano mail sandra.sumalla@uneatlantico.es, Iñaki Elío Pascual mail inaki.elio@uneatlantico.es, Lorenzo Monasta mail , Ana Mata mail , María Isabel Pardo mail , Pablo Busó mail , Giuseppe Grosso mail ,
Giampieri