A Systematic Literature Review on Identifying Patterns Using Unsupervised Clustering Algorithms: A Data Mining Perspective
Article
Open
English
Data mining is an analytical approach that helps solve many problems by extracting previously unknown, interesting, nontrivial, and potentially valuable information from massive datasets. Clustering in data mining segments data items/points into meaningful groups by placing together the items that are near each other according to certain statistics. This paper covers various elements of clustering, such as algorithmic methodologies, applications, clustering assessment measures, and researcher-proposed enhancements and their impact on data mining, to provide a thorough grasp of clustering algorithms, their applications, and the advances achieved in the existing literature. The study includes a literature search for papers published between 1995 and 2023, including conference and journal publications. It begins by outlining fundamental clustering techniques along with algorithm improvements, emphasizing their advantages and limitations in comparison to other clustering algorithms. It then investigates the evaluation measures for clustering algorithms, with an emphasis on metrics used to gauge clustering quality, such as the F-measure and the Rand Index. It also addresses numerous methodologies proposed to increase the convergence speed, resilience, and accuracy of clustering, such as initialization procedures, distance measures, and optimization strategies. The work concludes by emphasizing clustering as an active research area driven by the need to identify significant patterns and structures in data, enhance knowledge acquisition, and improve decision making across different domains.
This study aims to contribute to the broader knowledge base of data mining practitioners and researchers, facilitating informed decision making and fostering advancements in the field through a thorough analysis of algorithmic enhancements, clustering assessment metrics, and optimization strategies.
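The abstract names the Rand Index and the F-measure as clustering quality metrics. As a rough illustration only, the sketch below computes both from pair counts in plain Python. The function names (`pair_counts`, `rand_index`, `pairwise_f_measure`) and the toy labelings are hypothetical and not taken from the paper, and the pair-counting form of the F-measure is an assumption: several definitions of the clustering F-measure exist, and this record does not say which one the review uses.

```python
from itertools import combinations


def pair_counts(truth, pred):
    """Classify every pair of items by whether each labeling groups it together.

    Returns (tp, fp, fn, tn):
      tp: pair is together in both labelings
      fp: pair is together in pred only
      fn: pair is together in truth only
      tn: pair is separated in both labelings
    """
    if len(truth) != len(pred):
        raise ValueError("labelings must have the same length")
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(truth, pred):
    """Fraction of item pairs on which the two labelings agree."""
    tp, fp, fn, tn = pair_counts(truth, pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def pairwise_f_measure(truth, pred, beta=1.0):
    """Pair-counting F-measure (assumed variant; other definitions exist)."""
    tp, fp, fn, _ = pair_counts(truth, pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data: six items, reference classes vs. cluster labels.
    truth = [0, 0, 0, 1, 1, 1]
    pred = [0, 0, 1, 1, 2, 2]
    print(pair_counts(truth, pred))
    print(rand_index(truth, pred))
    print(pairwise_f_measure(truth, pred))
```

Both functions compare label assignments only through item pairs, so they do not depend on how cluster identifiers are numbered. The loop over all pairs is quadratic in the number of items, which is acceptable for a small illustration but not for large datasets.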
Authors: Chaudhry, Mahnoor and Shafi, Imran and Mahnoor, Mahnoor and Ramírez-Vargas, Debora L. and Bautista Thompson, Ernesto and Ashraf, Imran
Author emails: UNSPECIFIED, UNSPECIFIED, UNSPECIFIED, debora.ramirez@unini.edu.mx, ernesto.bautista@unini.edu.mx, UNSPECIFIED
(2023) A Systematic Literature Review on Identifying Patterns Using Unsupervised Clustering Algorithms: A Data Mining Perspective. Symmetry, 15 (9). p. 1679. ISSN 2073-8994
Text: symmetry-15-01679-v2.pdf (2MB), available under License Creative Commons Attribution.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | clustering; distance measures; data mining; evolution measures; symmetry |
Subjects: | Subjects > Engineering |
Divisions: | Europe University of Atlantic > Research > Articles and books; Ibero-american International University > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production; University of La Romana > Research > Scientific Production |
Date Deposited: | 05 Sep 2023 23:30 |
Last Modified: | 02 Jan 2024 23:30 |
URI: | https://repositorio.uneatlantico.es/id/eprint/8657 |
Influence of E-learning training on the acquisition of competences in basketball coaches in Cantabria
The main aim of this study was to analyse the influence of e-learning training on the acquisition of competences in basketball coaches in Cantabria. The current landscape of basketball coach training shows an increasing demand for innovative training models and emerging pedagogies, including e-learning-based methodologies. The study sample consisted of fifty students from these courses, all above 16 years of age (36 males, 14 females). Among them, 16% resided outside the autonomous community of Cantabria, 10% resided more than 50 km from the city of Santander, 36% between 10 and 50 km, 14% less than 10 km, and 24% resided within Santander city. Data were collected through a Google Forms survey distributed by the Cantabrian Basketball Federation to training course students. Participation was voluntary and anonymous. The survey, consisting of 56 questions, was validated by two sports and health doctors and two senior basketball coaches. The collected data were processed and analysed using Microsoft® Excel version 16.74, and the results were expressed in percentages. The analysis revealed that 24.60% of the students trained through the e-learning methodology considered themselves fully qualified as basketball coaches, contrasting with 10.98% of those trained via traditional face-to-face methodology. The results of the study provide insights into important characteristics that can be adjusted and improved within the investigated educational process. Moreover, the study concludes that e-learning training effectively qualifies basketball coaches in Cantabria.
Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es), Javier Jorge, Kamil Giglio
Technological firms invest in R&D in search of innovative solutions, accepting high costs and great (technological) uncertainty regarding final results and returns. Additionally, they face other problems related to R&D management. This empirical study tries to determine which factors favour or constrain the decision of these firms to engage in R&D. The analysis uses financial data of 14,619 listed ICT companies from 22 countries from 2003 to 2018. Additionally, macroeconomic data specific to the countries and the sector were used. For the analysis of dynamic panel data, a System-GMM method is used. Among the findings, we highlight that cash flow, contrary to the known theoretical models and empirical evidence, negatively affects R&D investment. Debt is not the right source for R&D funding either, as its effect is also negative. This suggests that ICT companies are forced to manage their R&D activities differently, relying more on other funding sources, taking advantage of growth opportunities and benefiting from a favourable macroeconomic environment in terms of growth and increased business sector spending on R&D. These results are similar in both sub-sectors and in all countries, both bank- and market-based. The exception is firms with few growth opportunities and little debt.
Inna Alexeeva-Alexeev (inna.alexeeva@uneatlantico.es), Cristina Mazas Pérez-Oleag (cristina.mazas@uneatlantico.es)
Carotenoids Intake and Cardiovascular Prevention: A Systematic Review
Background: Cardiovascular diseases (CVDs) encompass a variety of conditions that affect the heart and blood vessels. Carotenoids, a group of fat-soluble organic pigments synthesized by plants, fungi, algae, and some bacteria, may have a beneficial effect in reducing cardiovascular disease (CVD) risk. This study aims to examine and synthesize current research on the relationship between carotenoids and CVDs. Methods: A systematic review was conducted using MEDLINE and the Cochrane Library to identify relevant studies on the efficacy of carotenoid supplementation for CVD prevention. Interventional analytical studies (randomized and non-randomized clinical trials) published in English from January 2011 to February 2024 were included. Results: A total of 38 studies were included in the qualitative analysis. Of these, 17 epidemiological studies assessed the relationship between carotenoids and CVDs, 9 examined the effect of carotenoid supplementation, and 12 evaluated dietary interventions. Conclusions: Elevated serum carotenoid levels are associated with reduced CVD risk factors and inflammatory markers. Increasing the consumption of carotenoid-rich foods appears to be more effective than supplementation, though the specific effects of individual carotenoids on CVD risk remain uncertain.
Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Imanol Eguren García (imanol.eguren@uneatlantico.es), Álvaro Lasarte García, Thomas Prola (thomas.prola@uneatlantico.es), Raquel Martínez Díaz (raquel.martinez@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es)
Uterine leiomyomas are the most common benign, monoclonal, gynaecological tumors in a woman’s uterus, while leiomyosarcoma is a rare but aggressive condition caused by the malignant transformation of the myometrium. To overcome the common obstacles related to the methods usually used to study these pathologies, we aimed to devise three-dimensional models of myometrium, uterine leiomyoma and leiomyosarcoma cell lines, using two different types of biocompatible scaffolds. Specifically, we used an agarose gel matrix in common 6-well plates and an alginate matrix printed with the INKREDIBLE + bioprinter (CELLINK), a pneumatic extrusion-based system equipped with double printheads and a UV LED curing system. Both methods allowed the development of 3D spheroids of all three cell types, which were also suitable for morphological investigations. We showed that all cell types embedded in both agarose and alginate formed spheroids in their growth medium. The spheroids successfully proliferated and self-organized into complex structures, developing a sustainable system that emulated the condition of the tissues through the accumulation of extracellular matrix. These models could be useful for a better understanding of pathophysiology and etiopathogenesis, and for testing new methods or molecules from a preventive and therapeutic point of view.
Pamela Pellegrino, Stefania Greco, Abel Duménigo Gonzàlez, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Stefano Raffaele Giannubilo, Giovanni Delli Carpini, Franco Capocasa, Bruno Mezzetti, Maurizio Battino (maurizio.battino@uneatlantico.es), Andrea Ciavattini, Pasquapina Ciarmela
Background/Objectives: The diet quality of younger individuals is decreasing globally, with alarming trends also in the Mediterranean region. The aim of this study was to assess diet quality and adequacy in relation to country-specific dietary recommendations for children and adolescents living in the Mediterranean area. Methods: A cross-sectional survey was conducted of 2011 parents of the target population participating in the DELICIOUS EU-PRIMA project. Dietary data were assessed through 24 h recalls and food frequency questionnaires, cross-referenced with food-based recommendations, and scored with the youth healthy eating index (YHEI). Results: Adherence to recommendations on plant-based foods was low (less than ∼20%), including fruit and vegetable adequacy in all countries, legume adequacy in all countries except for Italy, and cereal adequacy in all countries except for Portugal. For animal products and dietary fats, the adequacy in relation to the national food-based dietary recommendations was slightly better (∼40% on average) in most countries, although the Eastern countries reported worse rates. Higher scores on the YHEI predicted adequacy in relation to vegetables (except Egypt), fruit (except Lebanon), cereals (except Spain), and legumes (except Spain) in most countries. Younger children (p < 0.005) reporting 8–10 h of adequate sleep (p < 0.001), <2 h/day of screen time (p < 0.001), and a medium/high physical activity level (p < 0.001) displayed a better diet quality. Moreover, older respondents (p < 0.001) with a medium/high educational level (p = 0.001) and living with a partner (p = 0.003) reported that their children had a better diet quality. Conclusions: Plant-based food groups, including fruit, vegetables, legumes, and even (whole-grain) cereals, are underrepresented in the diets of Mediterranean children and adolescents. Moreover, the adequate consumption of other important dietary components, such as milk and dairy products, is rather disregarded, leading to substantially suboptimal diets and poor adequacy in relation to dietary guidelines.
Francesca Giampieri (francesca.giampieri@uneatlantico.es), Alice Rosi, Francesca Scazzina, Evelyn Frias-Toral, Osama Abdelkarim, Mohamed Aly, Raynier Zambrano-Villacres, Juancho Pons, Laura Vázquez-Araújo, Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es), Lorenzo Monasta, Ana Mata, María Isabel Pardo, Pablo Busó, Giuseppe Grosso