Contextual Urdu Lemmatization Using Recurrent Neural Network Models
Article
Open access
English
In natural language processing, machine translation is a rapidly developing research area that helps humans communicate more effectively by bridging the linguistic gap. In machine translation, normalization and morphological analysis are the first and perhaps the most important modules for information retrieval (IR). To build a morphological analyzer, or to complete the normalization process, it is important to extract the correct root from different word forms. Stemming and lemmatization are the techniques commonly used to find the correct root words in a language. A few studies on IR systems for the Urdu language have shown that lemmatization is more effective than stemming because of the infixes found in Urdu words. However, lemmatization techniques for resource-scarce languages such as Urdu are not very common. This paper presents a lemmatization algorithm for the Urdu language based on recurrent neural network models. The proposed models are trained and tested on two datasets, namely, the Urdu Monolingual Corpus (UMC) and the Universal Dependencies Corpus of Urdu (UDU). The Word2Vec model and edit trees are used to generate semantic and syntactic embeddings. Bidirectional long short-term memory (BiLSTM), bidirectional gated recurrent unit (BiGRU), bidirectional gated recurrent neural network (BiGRNN), and attention-free encoder–decoder (AFED) models are trained under defined hyperparameters. Experimental results show that the attention-free encoder–decoder model achieves an accuracy, precision, recall, and F-score of 0.96, 0.95, 0.95, and 0.95, respectively, and outperforms existing models.
Authors: Hafeez, Rabab; Anwar, Muhammad Waqas; Jamal, Muhammad Hasan; Fatima, Tayyaba; Martínez Espinosa, Julio César (ulio.martinez@unini.edu.mx); Dzul López, Luis Alonso (luis.dzul@uneatlantico.es); Bautista Thompson, Ernesto (ernesto.bautista@unini.edu.mx) and Ashraf, Imran
(2023) Contextual Urdu Lemmatization Using Recurrent Neural Network Models. Mathematics, 11 (2). p. 435. ISSN 2227-7390
Text: mathematics-11-00435.pdf (1MB). Available under License Creative Commons Attribution.
| Document Type: | Article |
|---|---|
| Keywords: | neural networks; natural language processing; inflectional morphology; derivational morphology; MSC: 68T50 |
| Subject classification: | Subjects > Engineering |
| Divisions: | Universidad Europea del Atlántico > Research > Articles and books; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Universidad Internacional Iberoamericana México > Research > Scientific Production; Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production |
| Deposited: | 01 Feb 2023 23:30 |
| Last Modified: | 21 Oct 2024 23:30 |
| URI: | https://repositorio.uneatlantico.es/id/eprint/5660 |
