Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble
Article
Open
English
Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including predicting early psychological disorders and stress in individuals or the general public. A major challenge is that existing studies do not predict the intensity of depression in social media texts but only perform binary classification of depression; moreover, noisy data makes it difficult to identify true depression in social media text. This study begins by collecting relevant tweets and generating a corpus of 210,000 public tweets using Twitter public application programming interfaces (APIs). A strategy is devised to retain only depression-related tweets by creating a list of relevant hashtags, which reduces noise in the corpus. Furthermore, an algorithm is developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. Next, a FastText-based model is applied and fine-tuned with different preprocessing techniques and hyperparameter tuning to produce the tuned model, which significantly improves depression classification performance over the baselines, reaching an 84% F1 score and 90% accuracy. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost performance by combining FastText with several other classifiers and assigning weights to the individual models according to their individual performance. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 score of 89% and an accuracy of 93%, which are 5 and 3 percentage points higher than FastText alone, respectively. The proposed model better captures the contextual features of the relatively small sample class and aids the early prediction of depression intensity from tweets.
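The weighted soft voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which classifiers join FastText in the ensemble or how exactly the weights are derived, so everything below is an assumption for illustration: the model names `model_b` and `model_c`, their scores, the probability vectors, and the rule that weights are proportional to each model's validation F1 score. Only the 0.84 F1 for FastText comes from the abstract.

```python
from typing import Dict, List, Sequence

# The three intensity classes named in the abstract.
CLASSES = ["Mild", "Moderate", "Severe"]


def normalize_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model validation scores into weights that sum to 1.

    Assumption: weights are proportional to each model's score.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one model needs a positive score")
    return {name: score / total for name, score in scores.items()}


def weighted_soft_vote(
    probas: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weighted average of the models' class-probability vectors."""
    combined = [0.0] * len(CLASSES)
    for name, proba in probas.items():
        if len(proba) != len(CLASSES):
            raise ValueError(f"{name} must output {len(CLASSES)} probabilities")
        for i, p in enumerate(proba):
            combined[i] += weights[name] * p
    return combined


def predict(probas: Dict[str, Sequence[float]], weights: Dict[str, float]) -> str:
    """Pick the class with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return CLASSES[best]


# Hypothetical validation F1 scores; only FastText's 0.84 is from the abstract.
weights = normalize_weights({"fasttext": 0.84, "model_b": 0.63, "model_c": 0.63})

# Hypothetical class probabilities for one tweet: [Mild, Moderate, Severe].
probas = {
    "fasttext": [0.1, 0.2, 0.7],
    "model_b": [0.5, 0.3, 0.2],
    "model_c": [0.5, 0.4, 0.1],
}

print(predict(probas, weights))
```

In this made-up example, equal weights would pick ‘Mild’, while the performance-based weights let the stronger FastText model tip the decision to ‘Severe’. That is the intended effect of weighting a soft vote by individual model performance.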
Rizwan, Muhammad and Mushtaq, Muhammad Faheem and Rafiq, Maryam and Mehmood, Arif and Diez, Isabel de la Torre and Gracia Villar, Mónica (monica.gracia@uneatlantico.es) and Garay, Helena (helena.garay@uneatlantico.es) and Ashraf, Imran
(2024)
Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble.
Computers, Materials & Continua, 78 (2).
pp. 2047-2066.
ISSN 1546-2226
Text: TSP_CMC_37347.pdf (861kB), available under License Creative Commons Attribution.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Depression classification; deep learning; FastText; machine learning |
Subjects: | Subjects > Engineering; Subjects > Psychology |
Divisions: | Europe University of Atlantic > Research > Articles and books; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Ibero-american International University > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production |
Date Deposited: | 14 Mar 2024 23:30 |
Last Modified: | 14 Mar 2024 23:30 |
URI: | https://repositorio.uneatlantico.es/id/eprint/11264 |
Influence of E-learning training on the acquisition of competences in basketball coaches in Cantabria
English
Open
The main aim of this study was to analyse the influence of e-learning training on the acquisition of competences in basketball coaches in Cantabria. The current landscape of basketball coach training shows an increasing demand for innovative training models and emerging pedagogies, including e-learning-based methodologies. The study sample consisted of fifty students from these courses, all above 16 years of age (36 males, 14 females). Among them, 16% resided outside the autonomous community of Cantabria, 10% resided more than 50 km from the city of Santander, 36% between 10 and 50 km, 14% less than 10 km, and 24% resided within Santander city. Data were collected through a Google Forms survey distributed by the Cantabrian Basketball Federation to training course students. Participation was voluntary and anonymous. The survey, consisting of 56 questions, was validated by two sports and health doctors and two senior basketball coaches. The collected data were processed and analysed using Microsoft® Excel version 16.74, and the results were expressed in percentages. The analysis revealed that 24.60% of the students trained through the e-learning methodology considered themselves fully qualified as basketball coaches, contrasting with 10.98% of those trained via traditional face-to-face methodology. The results of the study provide insights into important characteristics that can be adjusted and improved within the investigated educational process. Moreover, the study concludes that e-learning training effectively qualifies basketball coaches in Cantabria.
Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es), Javier Jorge, Kamil Giglio
English
Closed
Technological firms invest in R&D looking for innovative solutions while assuming high costs and great (technological) uncertainty regarding final results and returns. Additionally, they face other problems related to R&D management. This empirical study tries to determine which factors favour or constrain the decision of these firms to engage in R&D. The analysis uses financial data of 14,619 listed ICT companies from 22 countries from 2003 to 2018. Additionally, macroeconomic data specific to the countries and the sector were used. For the analysis of dynamic panel data, a System-GMM method is used. Among the findings, we highlight that cash flow, contrary to the known theoretical models and empirical evidence, negatively impacts R&D investment. Nor is debt the right source for R&D funding, as its effect is also negative. This suggests that ICT companies are forced to manage their R&D activities differently, relying more on other funding sources, taking advantage of growth opportunities and benefiting from a favourable macroeconomic environment in terms of growth and increased business sector spending on R&D. These results are similar in both sub-sectors and in all countries, both bank-based and market-based. The exception is firms with few growth opportunities and little debt.
Inna Alexeeva-Alexeev (inna.alexeeva@uneatlantico.es), Cristina Mazas Pérez-Oleag (cristina.mazas@uneatlantico.es)
English
Open
Carotenoids Intake and Cardiovascular Prevention: A Systematic Review
Background: Cardiovascular diseases (CVDs) encompass a variety of conditions that affect the heart and blood vessels. Carotenoids, a group of fat-soluble organic pigments synthesized by plants, fungi, algae, and some bacteria, may have a beneficial effect in reducing cardiovascular disease (CVD) risk. This study aims to examine and synthesize current research on the relationship between carotenoids and CVDs. Methods: A systematic review was conducted using MEDLINE and the Cochrane Library to identify relevant studies on the efficacy of carotenoid supplementation for CVD prevention. Interventional analytical studies (randomized and non-randomized clinical trials) published in English from January 2011 to February 2024 were included. Results: A total of 38 studies were included in the qualitative analysis. Of these, 17 epidemiological studies assessed the relationship between carotenoids and CVDs, 9 examined the effect of carotenoid supplementation, and 12 evaluated dietary interventions. Conclusions: Elevated serum carotenoid levels are associated with reduced CVD risk factors and inflammatory markers. Increasing the consumption of carotenoid-rich foods appears to be more effective than supplementation, though the specific effects of individual carotenoids on CVD risk remain uncertain.
Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Imanol Eguren García (imanol.eguren@uneatlantico.es), Álvaro Lasarte García, Thomas Prola (thomas.prola@uneatlantico.es), Raquel Martínez Díaz (raquel.martinez@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es)
English
Closed
Uterine leiomyomas are the most common benign, monoclonal, gynaecological tumors in a woman’s uterus, while leiomyosarcoma is a rare but aggressive condition caused by the malignant transformation of the myometrium. To overcome the common obstacles related to the methods usually used to study these pathologies, we aimed to devise three-dimensional models of myometrium, uterine leiomyoma and leiomyosarcoma cell lines, using two different types of biocompatible scaffolds. Specifically, we exploited the agarose gel matrix in common 6-well plates and the alginate matrix using Bioprinting INKREDIBLE + (CELLINK), a pneumatic extruded base equipped with a system with double printheads, and a UV printer LED curing system. Both methods allowed the development of 3D spheroids of all three cell types, that were also suitable for morphological investigations. We showed that all cell types embedded in both agarose and alginate formed spheroids in their growth medium. The spheroids successfully proliferated and self-organized into complex structures, developing a sustainable system that emulated the condition of the tissues through the accumulation of extracellular matrix. These models could be useful for a better understanding of pathophysiology, etiopathogenesis, and testing new methods or molecules from a preventive and therapeutic point of view.
Pamela Pellegrino, Stefania Greco, Abel Duménigo Gonzàlez, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Stefano Raffaele Giannubilo, Giovanni Delli Carpini, Franco Capocasa, Bruno Mezzetti, Maurizio Battino (maurizio.battino@uneatlantico.es), Andrea Ciavattini, Pasquapina Ciarmela
English
Open
Background/Objectives: The diet quality of younger individuals is decreasing globally, with alarming trends also in the Mediterranean region. The aim of this study was to assess diet quality and adequacy in relation to country-specific dietary recommendations for children and adolescents living in the Mediterranean area. Methods: A cross-sectional survey was conducted of 2011 parents of the target population participating in the DELICIOUS EU-PRIMA project. Dietary data, cross-references with food-based recommendations, and the application of the youth healthy eating index (YHEI) were assessed through 24 h recalls and food frequency questionnaires. Results: Adherence to recommendations on plant-based foods was low (less than ∼20%), including fruit and vegetable adequacy in all countries, legume adequacy in all countries except for Italy, and cereal adequacy in all countries except for Portugal. For animal products and dietary fats, the adequacy in relation to the national food-based dietary recommendations was slightly better (∼40% on average) in most countries, although the Eastern countries reported worse rates. Higher scores on the YHEI predicted adequacy in relation to vegetables (except Egypt), fruit (except Lebanon), cereals (except Spain), and legumes (except Spain) in most countries. Younger children (p < 0.005) reporting an adequate sleep duration of 8–10 h (p < 0.001), <2 h/day of screen time (p < 0.001), and a medium/high physical activity level (p < 0.001) displayed better diet quality. Moreover, older respondents (p < 0.001) with a medium/high educational level (p = 0.001) and living with a partner (p = 0.003) reported that their children had better diet quality. Conclusions: Plant-based food groups, including fruit, vegetables, legumes, and even (whole-grain) cereals, are underrepresented in the diets of Mediterranean children and adolescents. Moreover, the adequate consumption of other important dietary components, such as milk and dairy products, is rather disregarded, leading to substantially suboptimal diets and poor adequacy in relation to dietary guidelines.
Francesca Giampieri (francesca.giampieri@uneatlantico.es), Alice Rosi, Francesca Scazzina, Evelyn Frias-Toral, Osama Abdelkarim, Mohamed Aly, Raynier Zambrano-Villacres, Juancho Pons, Laura Vázquez-Araújo, Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es), Lorenzo Monasta, Ana Mata, María Isabel Pardo, Pablo Busó, Giuseppe Grosso