A deep learning approach for Named Entity Recognition in Urdu language
Open
English
Named Entity Recognition (NER) is a natural language processing task that has been widely explored for different languages over the past decade but remains under-researched for Urdu because of the language's rich morphology and complexity. Existing state-of-the-art studies on Urdu NER use various deep learning approaches with automatic feature selection based on word embeddings. This paper presents a deep learning approach for Urdu NER that harnesses FastText and Floret word embeddings to capture the contextual information of words by considering their surrounding context for improved feature extraction. Pre-trained FastText and Floret word embeddings, which are publicly available for Urdu, are used to generate feature vectors for four benchmark Urdu datasets. These features are then used as input to train various combinations of Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), Conditional Random Field (CRF), and deep learning models. The results show that the proposed approach significantly outperforms existing state-of-the-art studies on Urdu NER, achieving an F-score of up to 0.98 when using BiLSTM+GRU with Floret embeddings. Error analysis shows a low classification error rate, ranging from 1.24% to 3.63% across the datasets, which indicates the robustness of the proposed approach.
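The abstract reports results as an F-score and a classification error rate. As a rough illustration of how such figures can be derived from predicted and gold tag sequences, the sketch below computes micro-averaged precision, recall, and F1 over entity tokens, plus the overall token error rate. It is a minimal sketch under stated assumptions: token-level scoring, the `"O"` label for non-entity tokens, the function name `token_level_scores`, and the toy tags are all hypothetical and are not taken from the paper, which may score at the entity level or average differently.

```python
def token_level_scores(gold, pred, outside="O"):
    """Return (precision, recall, f1, error_rate) for two aligned tag lists.

    Assumption: scoring is per token and micro-averaged over all entity
    classes; `outside` is the hypothetical label for non-entity tokens.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("tag sequences must not be empty")

    tp = fp = fn = wrong = 0
    for g, p in zip(gold, pred):
        if g != p:
            wrong += 1              # any mismatch counts toward the error rate
        if p != outside and p == g:
            tp += 1                 # entity token predicted with the right class
        else:
            if p != outside:
                fp += 1             # predicted an entity class that is not correct
            if g != outside:
                fn += 1             # gold entity token missed or mislabeled

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    error_rate = wrong / len(gold)
    return precision, recall, f1, error_rate


# Toy example with made-up tags (not data from the paper).
gold_tags = ["PER", "O", "LOC", "O", "ORG"]
pred_tags = ["PER", "O", "O", "O", "LOC"]
precision, recall, f1, error_rate = token_level_scores(gold_tags, pred_tags)
```

Under this convention a mislabeled entity token (gold `ORG`, predicted `LOC`) counts as both a false positive and a false negative, which is one common choice among several.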
Khan, Hikmat Ullah; Anam, Rimsha; Anwar, Muhammad Waqas; Jamal, Muhammad Hasan; Bajwa, Usama Ijaz; Diez, Isabel de la Torre; Silva Alvarado, Eduardo René (eduardo.silva@funiber.org); Soriano Flores, Emmanuel (emmanuel.soriano@uneatlantico.es) and Ashraf, Imran (2024). A deep learning approach for Named Entity Recognition in Urdu language. PLOS ONE, 19 (3), e0300725. ISSN 1932-6203
Text: journal.pone.0300725.pdf, available under a Creative Commons Attribution license (1MB)
Document Type: | Article |
---|---|
Subject Classification: | Subjects > Engineering |
Divisions: | Universidad Europea del Atlántico > Research > Articles and books; Universidad Internacional Iberoamericana México > Research > Scientific Production; Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production |
Deposited: | 30 May 2024 20:51 |
Last Modified: | 09 Dec 2024 23:30 |
URI: | https://repositorio.uneatlantico.es/id/eprint/12369 |
Enzymatic treatment shapes in vitro digestion pattern of phenolic compounds in mulberry juice
The health benefits of mulberry fruit are closely associated with its phenolic compounds. However, the effects of enzymatic treatments on the digestion patterns of these compounds in mulberry juice remain largely unknown. This study investigated the impact of pectinase (PE), pectin lyase (PL), and cellulase (CE) on the release of phenolic compounds in whole mulberry juice. The digestion patterns were further evaluated using an in vitro simulated digestion model. The results revealed that PE significantly increased chlorogenic acid content by 77.8 %, PL enhanced cyanidin-3-O-glucoside by 20.5 %, and CE boosted quercetin by 44.5 %. Following in vitro digestion, the phenolic compound levels decreased differently depending on the treatment, while cyanidin-3-O-rutinoside content increased across all groups. In conclusion, the selected enzymes effectively promoted the release of phenolic compounds in mulberry juice. However, during gastrointestinal digestion, the degradation of phenolic compounds surpassed their enhanced release, with effects varying based on the compound's structure.
Peihuan Luo, Jian Ai, Qiongyao Wang, Yihang Lou, Zhiwei Liao, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Elwira Sieniawska, Weibin Bai, Lingmin Tian
What works in financial education? Experimental evidence on program impact
Financial education is increasingly essential for safeguarding both individual and corporate well-being. This study systematically reviews global financial education experiments using a dual-method framework that integrates a deep learning classifier with advanced multivariate statistical techniques. Our analysis indicates that while short-term improvements in financial literacy are common, such gains tend to diminish over time without ongoing reinforcement. Moreover, the limited impact of digital innovations and monetary incentives suggests that successful financial education depends on more than simply deploying technological solutions or extrinsic rewards. Overall, this review provides valuable insights into the evolving landscape of financial education in a dynamic economic context and underscores the need for sustainable strategies that secure lasting improvements in financial literacy.
Gonzalo Llamosas García, Cristina Mazas Pérez-Oleaga (cristina.mazas@uneatlantico.es)
Epigallocatechin gallate (EGCG) is the most abundant polyphenol in tea. Because the degree of fermentation varies, the polyphenol composition of water extracts of green, white, oolong, and black tea differs, which affects their health value. This study revealed that EGCG content decreases as the degree of fermentation increases. In highly fermented tea, EGCG was stably present in an ammoniated form, a nitrogen-containing EGCG derivative (N-EGCG). The N-EGCG content of tea was negatively correlated with its EGCG content. Furthermore, the content of L-serine and L-threonine in tea was positively and negatively correlated with N-EGCG and EGCG levels, respectively, suggesting that these amino acids may participate in the formation of N-EGCG as nitrogen sources. This study proposes a new fermentation-induced polyphenol-amino acid synergistic mechanism, which provides a theoretical basis for studying the biotransformation of tea polyphenols.
Yuxuan Zhao, Jingyimei Liang, Wanning Ma, Mohamed A. Farag, Chunlin Li, Jianbo Xiao
Ultra Wideband radar-based gait analysis for gender classification using artificial intelligence
Gender classification plays a vital role in various applications, particularly in security and healthcare. While several biometric methods such as facial recognition, voice analysis, activity monitoring, and gait recognition are commonly used, their accuracy and reliability often suffer from challenges such as body part occlusion, high computational costs, and recognition errors. This study investigates gender classification using gait data captured by Ultra-Wideband radar, offering a non-intrusive and occlusion-resilient alternative to traditional biometric methods. A dataset comprising 163 participants was collected, and the radar signals underwent preprocessing, including clutter suppression and peak detection, to isolate meaningful gait cycles. Spectral features extracted from these cycles were transformed using a novel integration of Feedforward Artificial Neural Networks and Random Forests, enhancing discriminative power. Among the models evaluated, the Random Forest classifier demonstrated superior performance, achieving 94.68% accuracy and a cross-validation score of 0.93. The study highlights the effectiveness of Ultra-Wideband radar and the proposed transformation framework in advancing robust gender classification.
Adil Ali Saleem, Hafeez Ur Rehman Siddiqui, Muhammad Amjad Raza, Sandra Dudley, Julio César Martínez Espinosa (ulio.martinez@unini.edu.mx), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Isabel de la Torre Díez
Econometric analysis has long been integral to measuring sustainable environmental quality, with panel data methods, such as fixed and random effects models, becoming the focal point of modern research. Initially, such methods were used to simply investigate environmental issues, but recent years have seen a shift toward the study of random effects models, focusing on hypothesis testing and policy debates. However, several important aspects of the Hausman test have not been sufficiently investigated in the literature. This study seeks to evaluate the utility of the Hausman test using a real dataset from tourism and globalization, exploring their effects on sustainable environmental quality. Additionally, the study examines key factors contributing to environmental issues including economic growth and energy consumption, as critical explanatory variables. By investigating the relationship between tourism, globalization, economic growth, and energy use, the research focuses on the top 10 most visited economies: France, the USA, Spain, China, Turkey, Italy, Mexico, Germany, Thailand, and the UK. Using panel data and the cross-sectional random effects model for the period of 1998 to 2024, the study produces reliable estimations of these relationships. The empirical findings suggest that while the Hausman test favors the fixed effect model, the real-world characteristics of these countries point to the random effect model, highlighting the negative impact of economic growth, energy consumption, and globalization on sustainable environmental quality. It is also suggested that socio-environmental factors should be considered for each destination for sustainable environmental quality.
Saba Nourin, Ismat Nasim, Hafiz Muhammad Raza ur Rehman, Elisabeth Caro Montero (elizabeth.caro@uneatlantico.es), Mirtha Silvana Garat de Marin (silvana.marin@uneatlantico.es), Nagwan Abdel Samee, Imran Ashraf