Spring 2023 Student Projects (S23)
In Silico Identification of Key Genes and Pathways Associated with Bipolar Disorder Using GWAS
Abstract
Bipolar disorder (BD) is a chronic, recurrent disorder that affects more than 1% of the global population. Symptoms most commonly begin around age 20, and early onset is associated with a worse prognosis. BD is a leading cause of disability in young people because it can lead to cognitive and functional impairment and to increased mortality, particularly from suicide and cardiovascular disease.
Our analysis drew on genome-wide association studies (GWAS) of BD patients from the Psychiatric Genomics Consortium (PGC) and the GWAS Catalog. The analysis mapped 118 genomic risk loci and 539 genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses gave a deeper understanding of the biological processes and key pathways related to BD. A comprehensive protein-protein interaction (PPI) network was then constructed, revealing 16 central hub genes and two notable modules.
Using the Comparative Toxicogenomics Database (CTD), we performed in silico validation of the hub genes. Functional enrichment analysis highlighted the roles of these key genes in biological processes such as antigen processing and presentation and regulation of T-cell mediated immunity. We also identified 762 microRNAs and 28 transcription factors that target these hub genes, further supporting their significance in BD.
Through a thorough bioinformatics analysis, we gained insights into the underlying mechanisms of BD, identified potential biomarkers for clinical treatment, and uncovered drug targets. These findings enhance our understanding of BD and show potential for improving diagnosis and treatment methods in the future.
Prepared by: Student هبه فكرت محمد
Supervised by: Dr. لمى يوسف
Deep learning in clinical epigenetics: shedding new light on pathological processes of Alzheimer's disease in the perspective of therapeutic approaches
Abstract
This thesis discusses Alzheimer’s disease (AD), its impact, and its pathogenesis, and examines DNA methylation, the most prominent epigenetic mechanism in the disease. Several analytic methods exist for detecting DNA methylation at cytosine-guanine dinucleotides (CpGs), but none can capture all loci, and they are limited by computational demands and high cost. Artificial intelligence can help here: the results of analytic methods such as epigenome-wide association studies (EWAS) and whole-genome bisulfite sequencing can be used to train and test models that predict, for instance, previously undetected loci in the genome. It must be noted, however, that artificial intelligence in epigenetics is very new, and only a handful of studies have applied it to identify loci carrying epigenetic tags in Alzheimer’s disease. Moreover, no deep learning models targeted at identifying methylated CpGs in an AD context have been reported so far. Our reference study, EWAS plus, for example, uses data from large resources at supercomputer scale to predict new methylated CpG loci related to the disease.
Our goal was to take inspiration from this reference study and try a new kind of model, deep learning, in the hope of coming closer to new therapeutic approaches. We used whole-genome bisulfite sequencing data and EWAS data to train two models: Random Forest Regressor, a machine learning model, and Keras Regressor, a deep learning model. Both were applied to predict previously undetected methylated CpGs on chromosome 19, which contains the most important risk gene in Alzheimer’s disease: APOE.
The models predicted four CpGs on chromosome 19 whose methylation correlates more strongly with Alzheimer’s disease than that of the rest.
However, many technical and computational limitations arose when applying the models, leading to low performance. Applying a deep learning model in this epigenetic context nevertheless remains promising, given the higher efficacy of deep learning compared with machine learning in general.
It is therefore important that studies such as this one have access to broader resources so that the models and datasets can reach their full potential, leading to higher precision and to steps closer to Alzheimer’s disease therapy.
Prepared by: Student شهد نبيل نحال
Supervised by: Dr. رؤوف حمدان
In silico analysis of Continuous glucose monitoring (CGM) results in diabetes mellitus patients; and Automatic Event Detection Using Neural Networks
Abstract
Background: Diabetes mellitus (DM) is a chronic metabolic disorder that results in abnormal blood glucose regulation. People with diabetes are prone to devastating long-term complications, including cardiovascular disease, neuropathy, retinopathy, renal failure, and even death. Keeping blood glucose near normal levels and normalizing patients’ HbA1c lowers the frequency of macrovascular and microvascular complications. Blood glucose monitoring therefore plays a vital role in diabetes care, especially continuous glucose monitoring (CGM), which tracks interstitial glucose in real time. However, the huge amount of data obtained from CGM sensors calls for more efficient ways to analyze it, which is where artificial intelligence and deep learning models can help interpret the results. In particular, recurrent neural networks (RNNs) based on Long Short-Term Memory (LSTM) networks, which are designed for time-sequence prediction problems, have enabled researchers to propose specialized models that predict future blood glucose values from a patient’s existing data.
Aim: The main purpose of this study is to use artificial intelligence to better analyze patients’ CGM data. A further aim is to use a deep learning model based on an LSTM neural network to predict future trends in patients’ data and help prevent hyperglycemia and hypoglycemia episodes, in order to improve the patient’s treatment plan and quality of life.
Materials and Methods: This study used the Shanghai_T1DM and Shanghai_T2DM datasets. The data were collected through the Diabetes Data Registry and Individualized Lifestyle Intervention (DiaDRIL), initiated in 2019 at Shanghai East Hospital and Shanghai Fourth People’s Hospital, both affiliated to Tongji University. The datasets contain 3 to 14 days of CGM data for 12 patients with T1DM and 100 patients with T2DM, respectively. Some patients have multiple periods of CGM recordings. The CGM data were analyzed using artificial intelligence to characterize each dataset. We also calculated the autocorrelation function (ACF) and the percentage of time above range (TAR), below range (TBR), and in range (TIR) for patients in both datasets. We then mapped the data onto risk scores and used an RNN-based neural network to predict future blood glucose values.
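As an illustration of the range metrics and the ACF mentioned above, the following is a minimal sketch rather than the study's own code. The abstract does not state its glucose thresholds, so the 70 and 180 mg/dL limits used here are an assumption (they are a commonly used target range). The function names and the sample readings are hypothetical, and readings are assumed to be evenly spaced so that a share of readings equals a share of time.

```python
def time_in_ranges(glucose_mg_dl, low=70.0, high=180.0):
    """Return (TBR, TIR, TAR) as percentages of all readings.

    The low/high thresholds are assumed defaults, not values from the study.
    Readings are assumed to be evenly spaced in time.
    """
    n = len(glucose_mg_dl)
    if n == 0:
        raise ValueError("no CGM readings supplied")
    below = sum(1 for g in glucose_mg_dl if g < low)
    above = sum(1 for g in glucose_mg_dl if g > high)
    in_range = n - below - above
    return tuple(100.0 * count / n for count in (below, in_range, above))


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Hypothetical readings in mg/dL, for illustration only.
readings = [65, 82, 110, 145, 190, 210, 130, 95]
tbr, tir, tar = time_in_ranges(readings)
acf_lag1 = autocorrelation(readings, 1)
```

Here one of the eight readings is below 70 mg/dL and two are above 180 mg/dL, so the three percentages sum to 100.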
Results: After applying the model to both Shanghai_T1DM and Shanghai_T2DM, we evaluated its performance using the root mean square error (RMSE). The LSTM model achieved an RMSE of 9.78 mg/dL on the T1DM patients’ data and 4.40 mg/dL on the T2DM patients’ data. Overall, the models demonstrated high prediction accuracy, as shown by the low RMSE values, with better performance in T2DM than in T1DM. We also assessed the clinical safety of the glucose predictions using the Clarke Error Grid (CEG). In the T1DM data, most predictions fell in zones A or B, which are either accurate or clinically benign, and very few were inaccurate or potentially clinically harmful. In the T2DM data, most predictions were in zone A, which is clinically accurate, and the rest were in zone B, which is clinically benign.
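For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between observed and predicted values, so it is expressed in the same unit as the glucose readings (mg/dL). The sketch below shows the calculation on made-up numbers; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical glucose values in mg/dL, for illustration only.
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [103.0, 116.0, 140.0, 165.0]
error = rmse(observed, predicted)  # errors of -3, 4, 0 and -5 mg/dL
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which is one reason a clinical check such as the Clarke Error Grid is reported alongside it.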
Conclusion: In this study, we show that our LSTM model was able to predict glucose values accurately and safely. Applying our prediction models to individuals with both type 1 and type 2 diabetes showed encouraging results, with high precision in the predictions. The prediction model could therefore be used to improve closed-loop insulin delivery systems by overcoming sensor delay, and longer prediction intervals may be used to safely bridge periods of sensor malfunction. In addition, analyzing CGM data in T2DM and accurately predicting a patient’s glucose at different intervals can help greatly in improving drug choices based on trends in the data. Future research could include meals and delivered insulin doses in the model in order to computationally determine the optimal insulin dose independently of the patient’s input.
Prepared by: Student سراء محمود عبد الوهاب
Supervised by: Dr. رؤوف حمدان
Broad Neutralization Effects of Monoclonal Antibodies Targeting the Stem Helix of MERS-CoV: A Computational Study using AutoDock Vina, HADDOCK and PyMOL Analysis
Abstract
With the emergence of SARS-CoV-2 variants of concern (VOCs) and other zoonotic coronaviruses with pandemic potential, research efforts have focused on vaccines and antibodies targeting the most conserved regions of the spike protein. Middle East Respiratory Syndrome Coronavirus (MERS-CoV) continues to pose a significant global health threat. Across the coronavirus family, the receptor binding domain is poorly conserved, so therapeutics that target the receptor binding function have low potential as a pan-coronavirus solution. An alternative, relatively conserved target on the coronavirus spike is the stem helix in the S2 region, which harbors neutralizing epitopes and is therefore of interest for generating vaccines effective against pan-beta-coronaviruses.
Monoclonal antibodies (mAbs) with broad neutralization capabilities against HCoVs offer a promising avenue for treatment, as there is currently no approved vaccine or treatment against MERS-CoV. This thesis uses computational methods, notably AutoDock Vina and HADDOCK, to explore the neutralizing effects of broadly neutralizing antibodies (bnAbs) targeting the stem helix of MERS-CoV and SARS-CoV-2. Through method optimization and validation against experimental data, the study aims to efficiently identify potential drug candidates among bnAbs. This approach promises to reduce resource expenditure and streamline subsequent clinical investigations, potentially accelerating targeted therapy development against MERS-CoV while minimizing research costs.
The work builds on the comprehensive study by Zhou et al., which isolated a substantial panel of β-CoV stem-helix bnAbs and whose structural analyses revealed the molecular basis of their broad reactivity. That study determined crystal structures of four bnAbs (CC25.106, CC95.108, CC68.109, and CC99.103) in complex with beta-coronavirus spike stem-helix peptides at resolutions ranging from 1.9 to 2.9 Å. Using molecular docking simulations with AutoDock Vina and HADDOCK2.4, this investigation aims to predict the binding modes and affinities of five bnAbs (CC25.106, CC95.108, CC99.103, CC9.113, CC25.36) against the stem helix epitopes of both viruses. It also explores the dynamic behavior and conformational changes of these complexes through molecular dynamics simulations.
The analysis uses PyMOL visualization to interpret binding modes, emphasizing the residue interactions that govern the binding specificity, affinity, and stability of bnAb-stem helix complexes. Combining the computational outcomes with experimental data and existing literature aims to enhance the reliability and relevance of the findings. By elucidating the molecular mechanisms governing bnAb interactions with conserved MERS-CoV epitopes, this study seeks to contribute to the development of broad-spectrum antiviral strategies targeting coronaviruses. Assessed across both viruses, the five bnAbs show comparable neutralization potency against SARS-CoV-2 and heightened efficacy against replication-competent MERS-CoV. Notably, while CC25.106 displayed superior performance in combating beta-coronavirus disease, CC9.113 emerged as a promising therapeutic candidate due to its favorable binding characteristics. Despite inherent limitations, this study underscores the potential of CC9.113 for therapeutic development against coronaviruses and advocates further exploration across a broader spectrum of bnAbs to streamline future therapeutic initiatives.
Prepared by: Student شذى منجد الفريجات
Supervised by: Dr. باسم عصفور
PGx ExploreEZ: A Web-Based User-Friendly Tool for Exploration of Pharmacogenomics Reference Resources
Abstract
One size does not fit all: a medication given with the same dosing regimen may not be effective or safe for all patients with the same disease, owing to various factors. One key factor is genetic variation between individuals.
Pharmacogenomics (PGx), a rapidly evolving field within precision medicine, has the potential to revolutionize healthcare by tailoring treatments based on an individual's genetic makeup, which can optimize treatment outcomes and minimize the risks of adverse reactions.
However, several barriers hinder the successful implementation of pharmacogenomics in clinical practice. A major challenge is the lack of knowledge among healthcare providers, researchers, and other targeted communities.
To address this barrier, we present “PGx ExploreEZ”, a web-based, user-friendly tool for exploring pharmacogenomics reference resources, developed using the R Shiny package. Supplied with manually collected and curated data about gene-drug associations, clinical recommendations, and other relevant data from pharmacogenomics reference resources, this tool serves as a gateway for exploring these resources easily.
With its user-friendly, interactive interface, “PGx ExploreEZ” enables healthcare professionals, researchers, and interested users to easily access and explore essential information in pharmacogenomics.
“PGx ExploreEZ” aims to simplify the process of accessing valuable insights in the field of pharmacogenomics in order to bridge the knowledge gap and pave the way for the implementation of pharmacogenomics into clinical practice.
Prepared by: Student رشا عبد القادر حمامه
Supervised by: Dr. لمى يوسف
In-silico analysis of culture media miRNA as potential non-invasive biomarker for embryo selection in IVF cycles
Abstract
Background: In vitro fertilization (IVF) is widely used to overcome numerous reproductive challenges, but implantation failure and early pregnancy loss are common and reduce IVF success rates. Biological markers of embryo viability still need optimization and require invasive biopsies. Less invasive methods are therefore needed for selecting the embryos with the highest implantation potential, especially when only one embryo is to be transferred back to the uterus.
miRNAs have been detected in spent culture medium (SCM), with unique expression profiles associated with embryonic developmental and chromosomal status, sexual dimorphism, reproductive competence after transfer to the uterus, fertilization method, day-6 versus day-5 blastocysts, and trophectoderm (TE) morphology grades. This indicates that miRNAs should be explored further for non-invasive embryo selection.
Methods: In this study, the raw count dataset of Wang S. 2021, available in the GEO database, was analyzed in silico to find miRNAs differentially expressed between the non-pregnant and pregnant groups on day 3 and day 5 of in vitro embryo development, using the DESeq2 tool in the RStudio graphical user interface. The genes that the resulting DEmiRNAs interact with were then identified using the miRDB and TargetScan tools, and a functional enrichment analysis was applied using the DAVID and Metascape tools. The SRplot website was used to produce additional plots throughout the study. Finally, the results were interpreted by integrating all the information produced and comparing the current results with those of previous studies.
Results: Significant DESeq2 results for miRNAs differentially expressed in day 5 embryo culture media (CM) according to pregnancy outcome included 11 novel DEmiRNAs and 5 known DEmiRNAs (hsa-miR-629-5p, hsa-miR-30a-3p, hsa-miR-99a-5p, miR-199a-3p > miR-199b-3p, hsa-miR-199a-5p), while on day 3 there were 14 differentially expressed miRNAs, all novel. The known differentially expressed miRNAs were pooled together with their original raw counts for comparison. Day 5 samples showed better separation between the outcome-labeled clusters (non-pregnant, pregnant). Of these pooled DEmiRNAs, two (hsa-miR-99a-5p and hsa-miR-30a-3p) showed the clearest and most consistent pattern in day 5 SBCM.
Functional enrichment analysis of hsa-miR-99a-5p indicated its association with biological processes including embryonic morphogenesis and signaling pathways regulating pluripotency of stem cells. miR-30a-3p was associated with embryo development ending in birth or egg hatching.
Conclusion: miRNAs differentially expressed in day 5 embryo culture media according to pregnancy outcome included 11 novel and 5 known DEmiRNAs (hsa-miR-629-5p, hsa-miR-30a-3p, hsa-miR-99a-5p, miR-199a-3p > miR-199b-3p, hsa-miR-199a-5p).
Prepared by: Student علياء خالد الديري
Supervised by: Dr. مجد الجمالي
Prediction of Malignant Tumors In Breast Cancer using Artificial Intelligence Tools (Machine Learning)
Abstract:
Breast cancer is one of the most common diseases in women worldwide. Many studies have been conducted to predict the prognosis of breast cancer, but most of these analyses were performed using basic statistical methods. This study therefore aims to use machine learning techniques to build models with high accuracy and sensitivity for detecting breast cancer malignancy based on many variables, so that the patient's treatment protocol can be adjusted quickly and mortality reduced as much as possible.
We used a dataset from Kaggle after processing and visualizing it. The final dataset consisted of 569 samples, 21 inputs, and one output (malignant or benign tumor).
Our study showed that all machine learning algorithms achieved an accuracy greater than 99% under the first approach (testing set = 25%), where the decision tree, logistic regression, and random forest ranked first with an accuracy of 100%, followed by the rest of the algorithms at 99.3%.
We also found that the accuracy of many algorithms decreased slightly under the second approach (testing set = 40%), to 99.56%. Moreover, when hyperparameters were optimized, the accuracy of the SVM increased from 99.56% to 100%. The performance of this classifier can be described as balanced.
In conclusion, this study underscores the importance of selecting appropriate classification algorithms for predicting breast cancer patient outcomes. These findings contribute to the field of breast cancer prognosis and provide insights for improving personalized treatment strategies.
Prepared by: Student رغد رفاعي عبد العزيز
Supervised by: Dr. ينال القدسي
QSAR and 3D-QSAR Principles and applications in Drug Design (antineoplastic drugs)
Abstract:
QSAR and 3D-QSAR techniques marked a huge milestone in drug design development, especially in antineoplastic drugs.
QSAR models use molecular descriptors to predict the relationship between chemical structure and biological activity, which aids in designing and developing potent compounds.
Classical QSAR models have limitations in drug design. 3D-QSAR methods were therefore developed to provide more accurate results for drug-target interactions.
Applying these methods has advanced drug design and led to antineoplastic drugs with improved efficacy and reduced side effects and toxicity.
In conclusion, QSAR and 3D-QSAR play a vital role in the development of more effective and more selective drugs.
Prepared by: Student آيه احمد حسان المصري
Supervised by: Dr. خنساء حسين
Investigating the correlation between Colorectal cancer mutational profile and the associated microbiota on Tumor and matched normal healthy tissue; A computational analysis
Abstract:
Colorectal cancer is a prevalent and deadly malignancy with a significant global burden. It arises from the accumulation of genetic and epigenetic changes that transform normal colonic epithelial cells into adenocarcinomas. The microbiome plays a crucial role in CRC development. Bacterial biomarkers have prognostic value and hold potential for CRC detection and clinical outcome prediction.
The human gut microbiota is a vibrant ecosystem teeming with bacteria, viruses, fungi, and archaea, residing in a harmonious relationship with the host. It profoundly influences various aspects of human health, playing a crucial role in maintaining gut homeostasis, immune function, and metabolism.
In recent years, the association between colorectal cancer (CRC) and the microbiota has gained significant attention. Emerging evidence suggests that dysbiosis, a disruption in the composition of the gut microbiota, may contribute to the initiation and progression of CRC. Studies have revealed distinct alterations in gut microbiota composition and diversity in individuals with CRC compared to healthy controls. These alterations include shifts in microbial taxa, decreased microbial diversity, and changes in microbial metabolites. Specific bacterial species, such as Fusobacterium nucleatum, Bacteroides fragilis, and certain Enterococcus and Escherichia coli strains, have been implicated in CRC pathogenesis because of their capacity to promote inflammation, produce genotoxins, or modulate the tumor microenvironment.
In this study, we used 16S rRNA data from 60 samples belonging to 30 patients, taken from tumor tissue and matched normal healthy tissue. The data were characterized using Linux shell commands, Bash, and R in RStudio with various microbiome processing packages and tools. We used the DADA2 package for an amplicon sequence variant (ASV) based approach. DADA2 (Divisive Amplicon Denoising Algorithm 2) is a widely used pipeline for analysing amplicon sequencing data. It identifies and quantifies microbial communities through error estimation, denoising, and chimera detection. Its denoising algorithm is particularly effective at handling error-prone amplicon sequencing data and can significantly improve the accuracy of microbial community analysis.
The final output of the DADA2 package is the taxonomy table for the data, which is then passed to other packages for further manipulation, filtering, and downstream analysis.
After further statistical analysis with various measures commonly used in microbiome studies, we compared the microbiome composition between tumor and matched healthy tissue in patients with CRC. Our findings align with previous studies highlighting the dominance of the Firmicutes and Bacteroidetes phyla in the gut microbiome. While overall diversity may not be affected, the presence of a tumor may influence the abundance of specific rare taxa. Differential abundance analysis identified the genus Ruminococcus, within the Firmicutes phylum, as significantly enriched in cancer tissues. This finding is intriguing, considering the potential role of Ruminococcus species in promoting tumor growth and pro-inflammatory responses.
Prepared by: Student زينه حسام الجندلي
Supervised by: Dr. مجد الجمالي
Studying the cases of heart and arteries nutrition for cardiac patients in the Syrian community using bioinformatics tools
Abstract:
Cardiovascular diseases (CVD) are among the leading causes of death worldwide. Although habits such as smoking, as well as comorbidities, are recognized risk factors for developing CVD, poor eating habits should also be taken into consideration.
Bioinformatics tools provide powerful computational methods for analyzing CVD data. The aim of this study was therefore to develop machine learning models that predict the different CVDs, to support the right decisions in patients' treatment protocols.
The dataset was collected from Al-WATANI hospital in Sweida and included patient demographics, comorbidities, and dietary data. It was split into training (60%) and test (40%) sets for model building and evaluation.
Our study included 183 patients: 111 with hypertension, 33 with infarction, 20 with congestive heart failure, and 19 with arrhythmia. The accuracy of the algorithms varied. The support vector machine achieved the lowest accuracy, 71.62%, which rose markedly to 91.74% when balanced weights were applied. Decision tree and random forest (tree depth = 5) both achieved an accuracy of 85.14%; increasing the tree depth to 10, 15, or 20 raised the accuracy to 87.84%, where it remained steady.
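The abstract does not say how the balanced weights were computed. A common choice, and the one assumed in this sketch, is the heuristic n_samples / (n_classes × class_count), which is what scikit-learn applies for `class_weight="balanced"`. The function name is hypothetical, and the calculation uses the full-cohort counts from the abstract for illustration, whereas in practice the weights would be derived from the training split only.

```python
def balanced_class_weights(class_counts):
    """Weight each class by n_samples / (n_classes * count).

    Assumed heuristic (not confirmed by the abstract): rare classes get
    weights above 1 and common classes get weights below 1.
    """
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Patient counts per diagnosis as reported in the abstract (183 in total).
counts = {
    "hypertension": 111,
    "infarction": 33,
    "congestive heart failure": 20,
    "arrhythmia": 19,
}
weights = balanced_class_weights(counts)
```

With 111 of 183 patients in one class, an unweighted classifier can score well by favoring hypertension. Weighting makes each misclassified arrhythmia case cost several times as much as a misclassified hypertension case, which is consistent with the large change in SVM accuracy reported above.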
These models demonstrated high accuracy and reliability, making them valuable tools for clinical decision-making.
Prepared by: Student بسمه أسعد العشعوش
Supervised by: Dr. ينال القدسي
Fall 2022 Student Projects (F22)
In silico detection for Beta Thalassemia via bioinformatics and expert systems
Aim of the study: The study had three aims. The first was to provide a guide for choosing the most efficient way to design a new specific primer, by applying web services to SNPs from the HbVar database in order to understand the relationship between phenotype and genotype in the clinical setting, investigate the effects of SNP mutations in the HBB exons, and give a guideline for functional studies and prenatal diagnosis as a basis for future studies. The second was to find alternative therapeutic molecules, made from natural inducers with fewer side effects than traditional medications for beta thalassemia, by using molecular docking to identify the ligands that bind specific receptor binding sites and the most favorable ligand among them. The third was to create a fuzzy inference system to predict the severity of thalassemia.
Conclusion: Single nucleotide polymorphisms (SNPs) have been proposed as the next generation of markers for identifying loci associated with complex diseases and their treatment. Low-cost genotyping tools are essential for effective personalized medicine. In silico analyses such as AS-PCR methods are quick, effective, and inexpensive strategies that require only minimal instruments found in most laboratories, so they can be developed for large-scale implementation in clinical laboratories. We hope to identify the mechanisms responsible for fetal hemoglobin control, since reactivation of fetal hemoglobin can provide major therapeutic benefits to people affected by β-hemoglobinopathies.
Prepared by: Student رند محمد حمزه خياطه
Supervised by: Dr. ياسر خضرا
Simulation of adolescent idiopathic scoliosis using the Synthea tool
The aim of this research is to review scientific publications related to the history of the disease, diagnostic markers, and treatment options. Additionally, the research seeks to create a model that simulates the disease using the Synthea tool, generating realistic but synthetic patient data. Furthermore, a field visit was conducted to the specialized unit for conservative treatment of scoliosis at Ibn Al-Nafees Hospital in Damascus, and demographic data of patients in the Syrian Arab Republic were obtained based on United Nations statistics for 2023.
The results demonstrate that using the Synthea tool to simulate adolescent idiopathic scoliosis can provide a bioinformatics model that supports clinical and therapeutic decision-making for scoliosis specialists. This can potentially be used in the future to build dedicated databases for adolescent idiopathic scoliosis and for educational and informational purposes for specialists at various academic levels. Moreover, it can be used to develop software applications that support diagnostic and therapeutic decision-making, thus enhancing and improving healthcare outcomes and positively impacting patient health.
Prepared by: Student ريم موسى قبه
Supervised by: Dr. ينال أحمد القدسي
Co-supervised by: Dr. داوود رزق الله قره كولله
Prediction of Heart Disease Using Artificial Intelligence
In this study, we proposed an efficient and accurate model for early prediction of cardiovascular disease based on 13 features that physicians rely on for diagnosis, such as age, gender, chest pain type, blood pressure, cholesterol, blood glucose, ECG readings, and other investigations. The model is based on machine learning techniques and artificial neural networks and uses three datasets: two associated with the University of California (Cleveland, Statlog) and a heart disease prediction dataset from Kaggle. The model was run on three platforms (SPSS, WEKA, Python) and developed using classification algorithms including support vector machine, logistic regression, artificial neural network, k-nearest neighbor, Naïve Bayes, decision tree, random forest, and XGBoost.
Prepared by: Student رهام علي نوفل
Supervised by: Dr. عبد القادر عبَادي
Investigating the role of aberrant alternative splicing in Cholangiocarcinoma via Integrated Bioinformatic
Aim of the study: This study was conducted to examine the impact of alternative splicing on cholangiocarcinoma. It aimed to analyze the differences in splicing patterns between tumoral and normal samples using bioinformatics-based alternative splicing detection tools. Furthermore, the study aimed to discover new biomarkers and investigate genetic modification techniques.
The results improve our understanding of the association between AS events and CHOL and might be a starting point for further research to confirm the importance of the splicing events studied in this research in CHOL and to identify new prognosis biomarkers.
Prepared by: Student ديما حسين سويد
Supervised by: Dr. ياسر خضرا
Identification of Key Genes and Pathways Associated with Post-Traumatic Stress Disorder
Post-Traumatic Stress Disorder (PTSD) is a complex multifactorial mental health condition characterized by a range of symptoms, including intrusive thoughts, nightmares, hypervigilance, and emotional distress, significantly affecting an individual's quality of life. While PTSD is relatively common, with a prevalence ranging from 6% to 10%, it varies depending on the population and specific traumatic events experienced. However, the exact pathogenesis of PTSD remains unclear, and accurate diagnosis can be challenging due to the possibility of inaccuracies in reporting symptoms. Furthermore, preventive therapies for PTSD development are limited. This study aimed to bridge these gaps by conducting an integrative bioinformatics analysis that explores the molecular mechanisms, identifies diagnostic markers, and discovers therapeutic targets for PTSD.
Prepared by: Student لين سمير خوري
Supervised by: Dr. مجد الجمالي
Predicting Breast Cancer Prognosis Using Machine Learning
The aim of this research is to find algorithms with high accuracy and sensitivity that can predict breast cancer prognosis and the cause of death in the study sample from many variables, so that the patient's treatment protocol can be adjusted quickly and mortality reduced as much as possible.
Materials and Methods: This study used the METABRIC database, which contains targeted sequencing data of 1904 primary breast cancer samples, to predict breast cancer outcomes. Clinical and genetic attributes, such as age at diagnosis, type of surgery, chemotherapy, gene expression levels, and mutation data, among others, were analyzed using SPSS Statistics 25.0 and Python libraries. The dataset was split into training and test sets for model development and evaluation. Data preprocessing techniques were applied, and Python libraries were used for data manipulation and analysis.
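The split and the two evaluation measures named in the aim (accuracy and sensitivity) can be illustrated with a minimal standard-library sketch. It is not the project's code: the 80/20 split ratio, the seed, the function names, and the integer sample identifiers are all assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity) for binary labels.

    Sensitivity is TP / (TP + FN): the share of true positives
    (for example, deaths) that the model actually flags.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Hypothetical identifiers standing in for the 1904 METABRIC samples.
samples = list(range(1904))
train, test = train_test_split(samples, test_fraction=0.2, seed=42)
print(len(train), len(test))

# Toy labels (assumed): 1 = event, 0 = no event.
print(accuracy_and_sensitivity([1, 1, 0, 0], [1, 0, 0, 0]))
```

Reporting sensitivity alongside accuracy matters when one outcome is rare, since a model can reach high accuracy while missing most positive cases.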
Conclusions: The study underscores the importance of selecting appropriate classification algorithms for predicting breast cancer patient outcomes. The Decision Tree and Random Forest algorithms offer promising results, while Logistic Regression may not be the most effective choice. These findings contribute to the field of breast cancer prognosis and provide insights for improving personalized treatment strategies. Future research can focus on exploring additional algorithms and incorporating more comprehensive datasets to further enhance predictive accuracy.
Prepared by: Student أريانا يونس يونس
Supervised by: Dr. مجد الجمالي
In-Silico Identification of Single Nucleotide Polymorphisms (SNPs) Associated with Alzheimer’s Disease
Aim of the study: In this project, we aim to identify genes contributing to the neuroinflammation present in Alzheimer’s disease (AD), identify SNPs located in these genes, and test whether they are in linkage disequilibrium (LD) and could therefore be inherited together as a haplotype. We also aim to identify a haplotype that may be frequent in populations and associated with AD.
Methods: We used in silico approaches to identify SNPs related to TNF-α and other immune factors that affect Alzheimer’s disease, whether directly or indirectly. We used various databases to search for genes that regulate inflammation through cytokines, interleukins, and immune cells.
Results: We identified 5 genes on chromosome 6 that are linked to inflammation, TNF-α, and Treg cells. We tested SNP pairs for linkage using LDpair and found 49 of the 84 pairs tested to be in LD. SNPs in high LD were selected and tested with LDhap to generate possible haplotypes. One haplotype containing 4 significant SNPs, with a frequency of 1.72%, was selected; this could be close to the AD frequency in the elderly population (above 65 years; 10.7%).
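For readers unfamiliar with pairwise LD, the standard statistics D, D' and r² can be computed from the four haplotype frequencies of two biallelic SNPs. The sketch below shows the textbook formulas only; it does not reproduce LDpair's implementation, and the function name and example frequencies are hypothetical.

```python
def ld_statistics(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, D_prime, r2) for two biallelic SNPs.

    The arguments are the frequencies of the four haplotypes formed
    by alleles A/a at the first SNP and B/b at the second.
    """
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = f_AB + f_Ab          # frequency of allele A
    p_B = f_AB + f_aB          # frequency of allele B
    d = f_AB - p_A * p_B       # deviation from independence

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")

    # D' rescales |D| by the largest value it could take
    # given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max

    r2 = d * d / denom
    return d, d_prime, r2


# Hypothetical frequencies, not taken from the study.
print(ld_statistics(0.4, 0.1, 0.1, 0.4))
```

A pair with all four haplotypes at the frequencies expected from the allele frequencies alone gives D = 0 (linkage equilibrium); values of D' and r² near 1 indicate that the alleles tend to be inherited together.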
Prepared by: Student علياء عمار صالح
Supervised by: Dr. مجد الجمالي
-
Student Projects, Spring 2022 _ S22
Verifying and Demonstrating the Presence of Genes Harboring Point Mutations of Predictive and Diagnostic Value in Ovarian Cancer Using Bioinformatics Tools
Identification of Hub Genes and Key Pathways Associated with Human Papillomavirus Status in Cervical Squamous Cell Carcinoma Based on Gene Expression Profiling via Integrated Bioinformatics
Using integrated bioinformatics to screen differentially expressed genes (DEGs) associated with the two HPV statuses (HPV-positive and HPV-negative) in Cervical Squamous Cell Carcinoma could reveal valuable information about the pathogenic mechanism underlying tumor progression. Moreover, identifying significant DEGs, enriching their biological functions and key pathways, and visualizing the network of DEGs and hub genes will provide more accurate and reliable biomarkers and therapeutic targets for early diagnosis, individualized prevention measures, and improved therapeutic efficacy.
In this study, a series of analyses of HPV status was conducted with the R2 software on cervical squamous cell carcinoma data from the TCGA database, to screen for and identify prognostic biomarkers related to differentially expressed genes. The up- and downregulated DEGs were then classified into three groups (biological processes, molecular functions, and cellular components) according to Gene Ontology (GO) terms, and KEGG pathway enrichment analysis was conducted using the DAVID website.
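The idea behind GO and KEGG enrichment can be sketched as an over-representation test: given a gene list, how surprising is its overlap with a term's annotated genes? The sketch below uses a plain one-sided hypergeometric tail. It is an illustration only; DAVID reports a modified Fisher's exact statistic (the EASE score), so its values would differ, and the function name and all gene counts here are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of background genes
    annotated:  background genes annotated to the term
    selected:   size of the gene list (for example, the DEGs)
    overlap:    genes in the list that are annotated to the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 background genes, 150 annotated to a term,
# 300 DEGs, 12 of which carry the annotation.
print(enrichment_p_value(20000, 150, 300, 12))
```

In practice one p-value is computed per term and the results are corrected for multiple testing before any term is called enriched.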
Prepared by: Student مرح عماد مسعود
Supervised by: Dr. لمى يوسف
Dr. مجد الجمالي
Designing a Virtual Multi-Epitope Vaccine against Vibrio cholerae Using Immunoinformatics
Prepared by: Student رفا شكيب صالح
Supervised by: Dr. عبد القادر عبّادي
Predicting COVID-19 Deaths Using Machine Learning
Coronavirus disease 2019 (COVID-19) is a highly contagious viral disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and it has had a disastrous impact on populations around the world. Studies have shown that machine learning (ML) is one of the most important lines of research for understanding and fighting COVID-19.
The aim of this study is to develop models that predict the causes of mortality in patients with COVID-19 infection, in order to support timely and effective clinical decisions on COVID-19 treatment.
Prepared by: Student روزاليا ايليا معماري
Supervised by: Dr. رشا مسعود
Dr. زكريا الزلق
In Silico Analysis of Cancer Antigens of Non-Small Cell Lung Cancer (NSCLC) for Dendritic Cell-Based Immune-Gene Therapy Application
In this project, we aim to design a new structural model containing putative antigenic epitopes. Immune stimulation is considered one of the most important mechanisms in tumor treatment, yet tumor cells can escape from the immune system. It may therefore be advantageous to use T cell epitopes of different tumor antigens simultaneously: using multiple antigenic epitopes instead of a single antigen guards against the specific antigen being lost or mutated. Combining multiple antigenic epitopes in a single structural model would also cover a wide range of major histocompatibility complex polymorphisms.
Prepared by: Student هند زهير اللو
Supervised by: Dr. مجد الجمالي
Dr. لمى يوسف