A deep learning approach for Named Entity Recognition in Urdu language
Access: Open
Language: English
Named Entity Recognition (NER) is a natural language processing task that has been widely explored for many languages over the past decade but remains under-researched for Urdu because of the language's rich morphology and complexity. Existing state-of-the-art studies on Urdu NER use various deep learning approaches with automatic feature selection based on word embeddings. This paper presents a deep learning approach for Urdu NER that uses FastText and Floret word embeddings to capture the context surrounding each word for improved feature extraction. Publicly available pre-trained FastText and Floret embeddings for Urdu are used to generate feature vectors for four benchmark Urdu datasets. These features are then used to train various combinations of Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), and Conditional Random Field (CRF) models. The proposed approach significantly outperforms existing state-of-the-art studies on Urdu NER, achieving an F-score of up to 0.98 with BiLSTM+GRU and Floret embeddings. Error analysis shows a low classification error rate, ranging from 1.24% to 3.63% across the datasets, which indicates the robustness of the approach.
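To make the reported metrics concrete, the following minimal sketch computes a micro-averaged F-score and a token-level classification error rate from gold and predicted tag sequences. The abstract does not state the paper's exact evaluation protocol (token-level or entity-level, averaging scheme), so the token-level scoring, the function name `token_scores`, and the example tags are all assumptions made for illustration.

```python
def token_scores(gold, pred, outside="O"):
    """Return (precision, recall, f1, error_rate) for two equal-length tag lists.

    Assumption: scoring is token-level and micro-averaged over entity tags;
    the paper's actual protocol may differ.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = errors = 0
    for g, p in zip(gold, pred):
        if g != p:
            errors += 1          # any mismatch counts toward the error rate
        if p != outside and p == g:
            tp += 1              # entity token predicted with the right type
        else:
            if p != outside:
                fp += 1          # predicted an entity that is wrong or absent
            if g != outside:
                fn += 1          # missed (or mistyped) a gold entity token
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    error_rate = errors / len(gold) if gold else 0.0
    return precision, recall, f1, error_rate


# Hypothetical five-token example with person, location, and organisation tags.
gold_tags = ["PER", "O", "LOC", "O", "ORG"]
pred_tags = ["PER", "O", "O", "LOC", "ORG"]
precision, recall, f1, error_rate = token_scores(gold_tags, pred_tags)
```

In this toy example two of the five tokens are mislabelled, so the error rate is far higher than the 1.24% to 3.63% range reported in the abstract; the point is only to show how the two quantities relate.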
Khan, Hikmat Ullah and Anam, Rimsha and Anwar, Muhammad Waqas and Jamal, Muhammad Hasan and Bajwa, Usama Ijaz and Diez, Isabel de la Torre and Silva Alvarado, Eduardo René and Soriano Flores, Emmanuel and Ashraf, Imran (2024) A deep learning approach for Named Entity Recognition in Urdu language. PLOS ONE, 19 (3). e0300725. ISSN 1932-6203
Contact: Silva Alvarado, Eduardo René (eduardo.silva@funiber.org); Soriano Flores, Emmanuel (emmanuel.soriano@uneatlantico.es)
Text: journal.pone.0300725.pdf. Available under License Creative Commons Attribution.
| Field | Value |
|---|---|
| Item Type | Article |
| Subjects | Subjects > Engineering |
| Divisions | Europe University of Atlantic > Research > Articles and books; Ibero-american International University > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production |
| Date Deposited | 30 May 2024 20:51 |
| Last Modified | 09 Dec 2024 23:30 |
| URI | https://repositorio.uneatlantico.es/id/eprint/12369 |
Enzymatic treatment shapes in vitro digestion pattern of phenolic compounds in mulberry juice
The health benefits of mulberry fruit are closely associated with its phenolic compounds. However, the effects of enzymatic treatments on the digestion patterns of these compounds in mulberry juice remain largely unknown. This study investigated the impact of pectinase (PE), pectin lyase (PL), and cellulase (CE) on the release of phenolic compounds in whole mulberry juice. The digestion patterns were further evaluated using an in vitro simulated digestion model. The results revealed that PE significantly increased chlorogenic acid content by 77.8 %, PL enhanced cyanidin-3-O-glucoside by 20.5 %, and CE boosted quercetin by 44.5 %. Following in vitro digestion, the phenolic compound levels decreased differently depending on the treatment, while cyanidin-3-O-rutinoside content increased across all groups. In conclusion, the selected enzymes effectively promoted the release of phenolic compounds in mulberry juice. However, during gastrointestinal digestion, the degradation of phenolic compounds surpassed their enhanced release, with effects varying based on the compound's structure.
Peihuan Luo, Jian Ai, Qiongyao Wang, Yihang Lou, Zhiwei Liao, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Elwira Sieniawska, Weibin Bai, Lingmin Tian
A novel machine learning-based proposal for early prediction of endometriosis disease
Background: Endometriosis is one of the causes of female infertility, with some studies estimating its prevalence at around 10 % of reproductive-age women worldwide and between 30 and 50 % in symptomatic women. However, its diagnosis is complex and often delayed, highlighting the need for more accessible and accurate diagnostic methods. The difficulty lies in its diverse etiology and the variability of symptoms among those affected. Methods: This study proposes a predictive model based on supervised machine learning for the early identification of endometriosis, providing support for decision-making by healthcare professionals. For this purpose, an anonymised dataset of 5,143 female patients diagnosed with endometriosis at the private fertility clinic Inebir was used. The model integrates clinical records and genetic analysis through supervised machine learning algorithms, focusing on clinical variables and pathogenic and potentially pathogenic genetic variants. Results: The developed predictive model achieves high accuracy in identifying the presence of endometriosis, highlighting the importance of combining clinical and genetic data in diagnosis. The integration of this data into the DELFOS platform, a clinical decision support system, demonstrates the utility of machine learning in improving the diagnosis of endometriosis. Conclusions: The findings underscore the potential of clinical and genetic factors in the early diagnosis of endometriosis using supervised machine learning algorithms. This study contributes to the classification of clinical variables that influence endometriosis, offering a valuable tool for clinicians in making therapeutic and management decisions for their female patients.
Elena Enamorado-Díaz, Leticia Morales-Trujillo, Julián-Alberto García-García, Ana Teresa Marcos Rodríguez (anateresa.marcos@uneatlantico.es), José Manuel Navarro-Pando (jose.navarro@uneatlantico.es), María-José Escalona-Cuaresma
Background: The aging process leads to negative changes in various bodily systems, including the neuromuscular system. Strength training is considered the best strategy to counteract these neuromuscular changes, preventing sarcopenia and frailty in older adults. Objective: To compare the effects of strength training with elastic resistance and with free weights on the muscle strength of the knee extensors and flexors and on functional performance in older adults. Methods: This was a randomised clinical study. Thirty-one participants of both sexes were randomly allocated to two groups: Training Group Free Weight (TGFW, n = 15) and Training Group with Elastic Resistance (TGER, n = 16). Two individuals were excluded, so twenty-nine individuals were evaluated before and after an eight-week training protocol performed three times a week. The training load was determined using a 10-repetition maximum (10 RM) protocol. Results: No significant differences in functional performance or peak muscle strength were found in either the intra-group or the inter-group comparisons. In the intra-group comparisons (pre- and post-strength training), both groups significantly increased the training load (10 RM) for the knee extensors (TGFW p = 0.0002; TGER p = 0.0001) and the knee flexors (TGFW p = 0.006; TGER p = 0.0001). Conclusion: Both training protocols were similarly effective in increasing the training load, as measured by the 10 RM test, for knee extension and flexion.
Rafaela Zanin Ferreira, Antonio Felipe Souza Gomes, Marco Antonio Ferreira Baldim, Ricardo Silva Alves, Leonardo César Carvalho, Adriano Prado Simão
In the rapidly evolving landscape of artificial intelligence (AI) and the Internet of Things (IoT), the significance of device diagnostics and prognostics is paramount for guaranteeing the dependable operation and upkeep of intricate systems. The capacity to precisely diagnose and preemptively predict potential failures holds the potential to considerably amplify maintenance efficiency, diminish downtime, and optimize resource allocation. The wealth of information offered by telemetry data gathered from IoT devices presents an opportunity for diagnostics and prognostics applications. However, extracting valuable insights and making well-timed decisions from this extensive data reservoir remains a formidable challenge. This study proposes a novel AI-driven framework that integrates forward chaining and backward chaining algorithms to analyze telemetry data from IoT devices. The proposed methodology utilizes rule-based inference to detect real-time anomalies and predict potential future failures, providing a dual-layered approach for diagnostics and prognostics. The results show that the diagnostics engine using forward chaining detects real-time issues like “High Temperature” and “Low Pressure,” while the prognostics engine with backward chaining predicts potential future occurrences of these issues, enabling proactive prevention measures. The experimental results demonstrate that adopting this approach could offer valuable assistance to authorities and stakeholders. Accurate early diagnosis and prediction of potential failures have the capability to greatly improve maintenance efficiency, minimize downtime, and optimize cost.
Muhammad Shoaib Farooq, Rizwan Pervez Mir, Atif Alvi, Kilian Tutusaus (kilian.tutusaus@uneatlantico.es), Eduardo García Villena (eduardo.garcia@uneatlantico.es), Fadwa Alrowais, Hanen Karamti, Imran Ashraf
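The two inference strategies named in the abstract above can be illustrated with a small rule-based sketch. Forward chaining starts from observed facts and derives every conclusion the rules allow; backward chaining starts from a goal and checks whether the facts can support it. Only the issue names "High Temperature" and "Low Pressure" come from the abstract. The thresholds, field names, rules, and derived conclusions below are hypothetical, and how the paper maps backward chaining onto failure prediction is not described in the abstract.

```python
# Hypothetical rules: (set of premises, conclusion).
RULES = [
    ({"High Temperature"}, "Overheating Risk"),
    ({"Low Pressure"}, "Leak Suspected"),
    ({"Overheating Risk", "Leak Suspected"}, "Shutdown Recommended"),
]


def base_facts(reading):
    """Turn one telemetry reading into symbolic facts (assumed thresholds)."""
    facts = set()
    if reading["temperature_c"] > 80:   # assumed limit, not from the paper
        facts.add("High Temperature")
    if reading["pressure_kpa"] < 90:    # assumed limit, not from the paper
        facts.add("Low Pressure")
    return facts


def forward_chain(facts, rules):
    """Data-driven inference: apply rules until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Goal-driven inference: can `goal` be supported by the known facts?"""
    if goal in facts:
        return True
    if goal in _seen:                   # guard against circular rules
        return False
    seen = _seen | {goal}
    return any(
        conclusion == goal
        and all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
    )


reading = {"temperature_c": 95, "pressure_kpa": 70}
observed = base_facts(reading)
diagnosis = forward_chain(observed, RULES)
shutdown_supported = backward_chain("Shutdown Recommended", observed, RULES)
```

Keeping the rules as plain data means the same rule set can serve both directions of inference, which is one simple way to realise the dual-layered design the abstract describes.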
Efficient CNN architecture with image sensing and algorithmic channeling for dataset harmonization
The process of image formulation uses semantic analysis to extract influential vectors from image components. The proposed approach integrates DenseNet with ResNet-50, VGG-19, and GoogLeNet using a bonding process that establishes algorithmic channeling between these models. The goal is to produce compact, efficient image feature vectors that can be processed in parallel, whether the input is color or grayscale, and that work across different datasets and semantic categories. Image patching techniques with corner straddling and isolated responses help detect peaks and junctions, while anisotropic noise is addressed through curvature-based computations and auto-correlation calculations. An integrated channeled algorithm processes the refined features by uniting local-global features with primitive-parameterized features and regioned feature vectors. K-nearest neighbor indexing methods are then used to analyze and retrieve images from the harmonized signature collection. Extensive experiments are performed on benchmark datasets including Caltech-101, Cifar-10, Caltech-256, Cifar-100, Corel-10000, 17-Flowers, COIL-100, FTVL Tropical Fruits, Corel-1000, and Zubud. According to the authors, the method delivers optimal channeling accuracy together with robust dataset harmonization performance.
Khadija Kanwal, Khawaja Tehseen Ahmad, Aiza Shabir, Li Jing, Helena Garay (helena.garay@uneatlantico.es), Luis Eduardo Prado González (uis.prado@uneatlantico.es), Hanen Karamti, Imran Ashraf
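The retrieval step mentioned in the abstract above, K-nearest neighbor lookup over stored image signatures, can be sketched in a few lines. This is a brute-force illustration only: the Euclidean metric, the function name `knn_retrieve`, and the two-dimensional toy signatures are assumptions, and the paper's actual feature vectors and indexing scheme are not specified in the abstract.

```python
import math


def knn_retrieve(query, index, k=3):
    """Return the k (distance, image_id) pairs nearest to `query`.

    `index` maps an image id to its feature vector. Euclidean distance is an
    assumption; the abstract does not name the metric.
    """
    scored = sorted(
        (math.dist(query, vector), image_id)
        for image_id, vector in index.items()
    )
    return scored[:k]


# Hypothetical signature collection with tiny 2-D vectors for readability.
signatures = {
    "img_a": [0.0, 0.0],
    "img_b": [1.0, 0.0],
    "img_c": [5.0, 5.0],
}
nearest = knn_retrieve([0.9, 0.0], signatures, k=2)
nearest_ids = [image_id for _, image_id in nearest]
```

A linear scan like this is adequate for a small collection; for datasets the size of those listed in the abstract, an approximate or tree-based index would normally replace the `sorted` call.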