Machine Learning and Artificial Intelligence in Bioinformatics

  1. Host population structure is a key determinant of pathogen and infectious disease transmission patterns. Pathogen phylogenetic trees are useful tools to reveal the population structure underlying an epidemic. ...

    Authors: Hassan W. Kayondo, Alfred Ssekagiri, Grace Nabakooza, Nicholas Bbosa, Deogratius Ssemwanga, Pontiano Kaleebu, Samuel Mwalili, John M. Mango, Andrew J. Leigh Brown, Roberto A. Saenz, Ronald Galiwango and John M. Kitayimbwa
    Citation: BMC Bioinformatics 2021 22:546
  2. Correctly classifying the subtypes of cancer is of great significance for the in-depth study of cancer pathogenesis and the realization of personalized treatment for cancer patients. In recent years, classific...

    Authors: Lianxin Zhong, Qingfang Meng, Yuehui Chen, Lei Du and Peng Wu
    Citation: BMC Bioinformatics 2021 22:475
  3. Accurate identification of Transcriptional Regulator binding locations is essential for analysis of genomic regions, including Cis Regulatory Elements. The customary NGS approaches, predominantly ChIP-Seq, can...

    Authors: Quentin Ferré, Jeanne Chèneby, Denis Puthier, Cécile Capponi and Benoît Ballester
    Citation: BMC Bioinformatics 2021 22:460
  4. We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual’s ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds o...

    Authors: Yong Wang, Shiya Song, Joshua G. Schraiber, Alisa Sedghifar, Jake K. Byrnes, David A. Turissini, Eurie L. Hong, Catherine A. Ball and Keith Noto
    Citation: BMC Bioinformatics 2021 22:459
  5. This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph.

    Authors: Ewan Carr, Mathieu Carrière, Bertrand Michel, Frédéric Chazal and Raquel Iniesta
    Citation: BMC Bioinformatics 2021 22:449
  6. One of the major challenges in precision medicine is accurate prediction of individual patient’s response to drugs. A great number of computational methods have been developed to predict compounds activity usi...

    Authors: Zhaorui Zuo, Penglei Wang, Xiaowei Chen, Li Tian, Hui Ge and Dahong Qian
    Citation: BMC Bioinformatics 2021 22:434
  7. Modern Next Generation- and Third Generation- Sequencing methods such as Illumina and PacBio Circular Consensus Sequencing platforms provide accurate sequencing data. Parallel developments in Deep Learning hav...

    Authors: Anand Ramachandran, Steven S. Lumetta, Eric W. Klee and Deming Chen
    Citation: BMC Bioinformatics 2021 22:404
  8. Autism spectrum disorders (ASD) imply a spectrum of symptoms rather than a single phenotype. ASD could affect brain connectivity at different degree based on the severity of the symptom. Given their excellent ...

    Authors: Jinlong Hu, Lijie Cao, Tenghui Li, Shoubin Dong and Ping Li
    Citation: BMC Bioinformatics 2021 22:379
  9. Plant pathogens cause billions of dollars of crop loss every year and are a major threat to global food security. Effector proteins are the tools such pathogens use to infect the cell, predicting effectors de ...

    Authors: Ruth Kristianingsih and Dan MacLean
    Citation: BMC Bioinformatics 2021 22:372
  10. The topology of metabolic networks is both well-studied and remarkably well-conserved across many species. The regulation of these networks, however, is much more poorly characterized, though it is known to be...

    Authors: Justin Y. Lee, Britney Nguyen, Carlos Orosco and Mark P. Styczynski
    Citation: BMC Bioinformatics 2021 22:365
  11. Localization of messenger RNAs (mRNAs) plays a crucial role in the growth and development of cells. Particularly, it plays a major role in regulating spatio-temporal gene expression. The in situ hybridization ...

    Authors: Prabina Kumar Meher, Anil Rai and Atmakuri Ramakrishna Rao
    Citation: BMC Bioinformatics 2021 22:342
  12. Epigenetic modifications, including CG methylation (a major form of DNA methylation) and histone modifications, interact with each other to shape their genomic distribution patterns. However, the entire pictur...

    Authors: Wan Kin Au Yeung, Osamu Maruyama and Hiroyuki Sasaki
    Citation: BMC Bioinformatics 2021 22:341
  13. Approximate Bayesian Computation (ABC) has become a key tool for calibrating the parameters of discrete stochastic biochemical models. For higher dimensional models and data, its performance is strongly depend...

    Authors: Richard M. Jiang, Fredrik Wrede, Prashant Singh, Andreas Hellander and Linda R. Petzold
    Citation: BMC Bioinformatics 2021 22:339
  14. MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally via base-pairing with complementary sequences on messenger RNAs (mRNAs). Due to the technical challenges involv...

    Authors: Gilad Ben Or and Isana Veksler-Lublinsky
    Citation: BMC Bioinformatics 2021 22:264
  15. Pseudogenes are non-functional copies of protein coding genes that typically follow a different molecular evolutionary path as compared to functional genes. The inclusion of pseudogene sequences in DNA barcodi...

    Authors: T. M. Porter and M. Hajibabaei
    Citation: BMC Bioinformatics 2021 22:256
  16. Motivated by the size and availability of cell line drug sensitivity data, researchers have been developing machine learning (ML) models for predicting drug response to advance cancer treatment. As drug sensit...

    Authors: Alexander Partin, Thomas Brettin, Yvonne A. Evrard, Yitan Zhu, Hyunseung Yoo, Fangfang Xia, Songhao Jiang, Austin Clyde, Maulik Shukla, Michael Fonstein, James H. Doroshow and Rick L. Stevens
    Citation: BMC Bioinformatics 2021 22:252
  17. The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during the training where the sample size is commonly large. In this paper, we consider...

    Authors: Milad Mostavi, Yu-Chiao Chiu, Yidong Chen and Yufei Huang
    Citation: BMC Bioinformatics 2021 22:244
  18. Current methods in machine learning provide approaches for solving challenging, multiple constraint design problems. While deep learning and related neural networking methods have state-of-the-art performance,...

    Authors: Kyle Boone, Cate Wisdom, Kyle Camarda, Paulette Spencer and Candan Tamerler
    Citation: BMC Bioinformatics 2021 22:239
  19. Genes implicated in tumorigenesis often exhibit diverse sets of genomic variants in the tumor cohorts within which they are frequently mutated. For many genes, neither the transcriptomic effects of these varia...

    Authors: Michal R. Grzadkowski, Hannah D. Holly, Julia Somers and Emek Demir
    Citation: BMC Bioinformatics 2021 22:233
  20. Epitope prediction is a useful approach in cancer immunology and immunotherapy. Many computational methods, including machine learning and network analysis, have been developed quickly for such purposes. Howev...

    Authors: Xiaoyun Yang, Liyuan Zhao, Fang Wei and Jing Li
    Citation: BMC Bioinformatics 2021 22:231
  21. The identification of gene–gene and gene–environment interactions in genome-wide association studies is challenging due to the unknown nature of the interactions and the overwhelmingly large number of possible...

    Authors: Pål V. Johnsen, Signe Riemer-Sørensen, Andrew Thomas DeWan, Megan E. Cahill and Mette Langaas
    Citation: BMC Bioinformatics 2021 22:230
  22. The Cox proportional hazards model is commonly used to predict hazard ratio, which is the risk or probability of occurrence of an event of interest. However, the Cox proportional hazard model cannot directly g...

    Authors: Eu-Tteum Baek, Hyung Jeong Yang, Soo Hyung Kim, Guee Sang Lee, In-Jae Oh, Sae-Ryung Kang and Jung-Joon Min
    Citation: BMC Bioinformatics 2021 22:192
  23. The genomics data analysis has been widely used to study disease genes and drug targets. However, the existence of missing values in genomics datasets poses a significant problem, which severely hinders the us...

    Authors: Xinshan Zhu, Jiayu Wang, Biao Sun, Chao Ren, Ting Yang and Jie Ding
    Citation: BMC Bioinformatics 2021 22:188
  24. Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and simi...

    Authors: Khushnood Abbas, Alireza Abbasi, Shi Dong, Ling Niu, Laihang Yu, Bolun Chen, Shi-Min Cai and Qambar Hasan
    Citation: BMC Bioinformatics 2021 22:187
  25. Microsatellite instability (MSI) is a common genomic alteration in colorectal cancer, endometrial carcinoma, and other solid tumors. MSI is characterized by a high degree of polymorphism in microsatellite leng...

    Authors: Tao Zhou, Libin Chen, Jing Guo, Mengmeng Zhang, Yanrui Zhang, Shanbo Cao, Feng Lou and Haijun Wang
    Citation: BMC Bioinformatics 2021 22:185
  26. The interactions of proteins are determined by their sequences and affect the regulation of the cell cycle, signal transduction and metabolism, which is of extraordinary significance to modern proteomics resea...

    Authors: Yang Wang, Zhanchao Li, Yanfei Zhang, Yingjun Ma, Qixing Huang, Xingyu Chen, Zong Dai and Xiaoyong Zou
    Citation: BMC Bioinformatics 2021 22:184
  27. Identifying lncRNA-disease associations not only helps to better comprehend the underlying mechanisms of various human diseases at the lncRNA level but also speeds up the identification of potential biomarkers...

    Authors: Rong Zhu, Yong Wang, Jin-Xing Liu and Ling-Yun Dai
    Citation: BMC Bioinformatics 2021 22:175
  28. Supervised learning from high-throughput sequencing data presents many challenges. For one, the curse of dimensionality often leads to overfitting as well as issues with scalability. This can bring about inacc...

    Authors: Trevor S. Frisby, Shawn J. Baker, Guillaume Marçais, Quang Minh Hoang, Carl Kingsford and Christopher J. Langmead
    Citation: BMC Bioinformatics 2021 22:174
  29. To address the need for easy and reliable species classification in plant genetic resources collections, we assessed the potential of five classifiers (Random Forest, Neighbour-Joining, 1-Nearest Neighbour, a ...

    Authors: Artur van Bemmelen van der Plaat, Rob van Treuren and Theo J. L. van Hintum
    Citation: BMC Bioinformatics 2021 22:173
  30. Recent studies have confirmed that N7-methylguanosine (m7G) modification plays an important role in regulating various biological processes and has associations with multiple diseases. Wet-lab experiments are cos...

    Authors: Jiani Ma, Lin Zhang, Jin Chen, Bowen Song, Chenxuan Zang and Hui Liu
    Citation: BMC Bioinformatics 2021 22:152
  31. Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. ...

    Authors: Kevin De Angeli, Shang Gao, Mohammed Alawad, Hong-Jun Yoon, Noah Schaefferkoetter, Xiao-Cheng Wu, Eric B. Durbin, Jennifer Doherty, Antoinette Stroup, Linda Coyle, Lynne Penberthy and Georgia Tourassi
    Citation: BMC Bioinformatics 2021 22:113
  32. Manual microscopic examination of Leishman/Giemsa stained thin and thick blood smear is still the “gold standard” for malaria diagnosis. One of the drawbacks of this method is that its accuracy, consistency, a...

    Authors: Fetulhak Abdurahman, Kinde Anlay Fante and Mohammed Aliy
    Citation: BMC Bioinformatics 2021 22:112
  33. Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more imp...

    Authors: Mateusz Garbulowski, Klev Diamanti, Karolina Smolińska, Nicholas Baltzer, Patricia Stoll, Susanne Bornelöv, Aleksander Øhrn, Lars Feuk and Jan Komorowski
    Citation: BMC Bioinformatics 2021 22:110
  34. The accumulation of various multi-omics data and computational approaches for data integration can accelerate the development of precision medicine. However, the algorithm development for multi-omics data inte...

    Authors: Yuqi Wen, Xinyu Song, Bowei Yan, Xiaoxi Yang, Lianlian Wu, Dongjin Leng, Song He and Xiaochen Bo
    Citation: BMC Bioinformatics 2021 22:97
  35. Microbes perform a fundamental economic, social, and environmental role in our society. Metagenomics makes it possible to investigate microbes in their natural environments (the complex communities) and their ...

    Authors: Raíssa Silva, Kleber Padovani, Fabiana Góes and Ronnie Alves
    Citation: BMC Bioinformatics 2021 22:87
  36. The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated to multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting thes...

    Authors: Camilo Broc, Therese Truong and Benoit Liquet
    Citation: BMC Bioinformatics 2021 22:86
  37. In the last decade, Genome-wide Association studies (GWASs) have contributed to decoding the human genome by uncovering many genetic variations associated with various diseases. Many follow-up investigations i...

    Authors: Haohan Wang, Fen Pei, Michael M. Vanyukov, Ivet Bahar, Wei Wu and Eric P. Xing
    Citation: BMC Bioinformatics 2021 22:50