Sunday, December 8, 2019

Uses of Bioinformatics in Biotechnology: Samples for Students

Question: Prepare a report that illustrates some of the uses of bioinformatics in the biotechnology and/or research sector.

Answer:

Introduction to the general field of bioinformatics
Bioinformatics tools are important both in fundamental research on evolutionary theory and in practical work such as protein design. They are used in biotechnology and in many other areas of biological research. The approaches and algorithms used in such studies include structure and sequence alignment, secondary structure prediction, protein classification and the tracking of protein expression through the cell cycle (Felix et al., 2005). This essay discusses the uses of bioinformatics in biotechnology, the biological sciences and medical research. It examines the general field of bioinformatics, the types of data involved and the applications of bioinformatics in the scientific process.

Rana (2012) argues that genome sequencing and X-ray structure analysis have delivered enormous numbers of protein structures and sequences to the scientific community. This information can be used effectively in biological and medical research, provided it is interpreted appropriately (p. 10). Two types of computational technique can be used to analyze such data: full-atom molecular dynamics simulations and the bioinformatics approach (Rana, 2012, p. 11). Bioinformatics is a field of the biological sciences that involves statistical analysis of protein structures and sequences. It also aids in annotating genomes, understanding their function and predicting structures. These analyses are possible only when the protein sequence information is available. Bioinformatics has brought a major revolution in the biological sciences, with powerful tools that provide vast amounts of information.
These tools are among the most complex and powerful currently available in the biological sciences. Molecular dynamics and molecular modeling simulations are used to study the folding and function of proteins (Rana, 2012, p. 12). According to the National Institutes of Health, bioinformatics covers the research, development and application of computational tools to expand the use of medical, behavioral and biological data, including tools to acquire, store, organize and interpret that information. Bioinformatics has been used in the Human Genome Project, which attracted much interest from researchers and facilitated the analysis of large amounts of biological data. The need for such analysis has grown with advances in molecular biology techniques (Kumar, 2015, p. 2).

Rana (2012) further notes that bioinformatics has led to important discoveries in drugs, medicine and plant sciences, and that it has helped pharmaceutical companies save money and time and manage large biological data sets. Its aims include organizing data so that researchers can access information easily, developing data analysis tools, and interpreting results in a meaningful way (p. 14). Research areas in bioinformatics include genomics, proteomics and computer-aided drug design. They also include biological databases, biological data mining, microarray informatics, molecular phylogenetics (the study of organisms at the molecular level to gather information on their phylogenetic relationships) and agro-informatics (agricultural informatics, which deals with plant research) (Rana, 2012, pp. 13-18).

Types of Data in Bioinformatics
Kraulis (2001) emphasizes the growing volume and availability of biological data, a phenomenon that has necessitated the creation of databases whose sole purpose is to collect data and organize it in a form that is meaningful and easy to interpret (par. 1). Databases are classified into different types to maintain order within the scientific process, improve access to information and reduce duplication. To retrieve data efficiently, a researcher must first know what information is needed and then seek it from the appropriate database (Kavitha, 2012).

Databases are classified according to the data they hold. The first type of data is biomolecular sequences (proteins and nucleic acids), held for example in EMBL, DDBJ, GenBank, PIR and Swiss-Prot. The second is biomolecular structures, held for example in PDB. The third is bibliographies or scientific literature, with examples such as Scopus and PubMed; these are search engines, some free and others requiring a subscription. Further types include gene expression profiles, genetic disorders and whole genome sequences (Kavitha, 2012).

Data sources are categorized into primary, secondary, composite and integrated databases. Primary databases hold molecular data in its original form. Examples are GenBank for nucleic acid sequences, the Protein Data Bank (PDB) for molecular structures, and PIR (Protein Information Resource) and SWISS-PROT for protein sequences. They contain data such as gene sequences from mRNA or genomic DNA, genome sequences, chromosome sequences, annotated entries and partial or complete entries (Wellcome Genome Campus, 2017). Secondary databases hold information derived from the analysis of primary data, which is often more useful and relevant to a particular question. Their information is structured to meet specific, clearly stated requirements.
Sequence-based secondary databases include UniGene and the Eukaryotic Promoter Database. SCOP (Structural Classification of Proteins) describes the evolutionary and structural relationships between proteins of known structure, while CATH (Class, Architecture, Topology, Homology) provides a hierarchical classification of protein structures (Wellcome Genome Campus, 2017).

Composite databases are collections of secondary data. They are easier to use because they let the user reach all relevant information from one source instead of connecting to multiple resources. The NCBI (National Center for Biotechnology Information) resource is one of the best-known composite databases. It includes many primary and secondary databases, such as PubMed, GenBank and OMIM. NCBI is free to use online and gives access to gene sequences across phyla and species, including gene alleles and mutations, gene sequences, protein sequences and genome pathways (Lesk, 2008).

Finally, integrated databases hold data from different but related organisms. They are important for studying genomic and evolutionary relationships between organisms. Such investigations matter in phylogenetics because they help identify plant genes that control traits of economic value. For example, integrated Arabidopsis thaliana databases provide genome and transcriptome sequence data that link this model organism to Brassica species of economic value (Lesk, 2008).

Other notable databases include SGN (Sol Genomics Network), which covers organisms such as potato, tomato, eggplant and petunia. LegumeBase covers Glycine max and Lotus japonicus. BeanGenes covers Vigna and Phaseolus species. Gramene covers rice, maize, barley, wheat, oats and foxtail.
The Plant Transcript Assemblies database covers several plant species. AphidBase covers several aphid species, and SYSTOMONAS supports biotechnology and infection research on pseudomonads. Human Ageing Genomic Resources (HAGR) covers the genetics and biology of aging in humans, and FlyMine covers Anopheles and Drosophila genomics (Seung et al., 2006).

Several databases can be merged on the basis of an organism's taxonomic identity, and such mergers produce integrated databases. Work on the genomes and transcriptomes of many species is now under way, and it has produced further organism-specific databases. They include the Chlamydomonas Center for green algae, Medicago.org for Medicago truncatula, SoyBase for soybean, Oryzabase for Oryza species (rice), FlyBase for Drosophila and OMIM for genetic disorders. They collect data obtained with the various techniques used to study plant systems, including linkage maps, microarray data, and transcriptome and genome sequencing (Seung et al., 2006).

Most of these databases are accessed through websites that organize the data so that a user can easily reach it online. The same data can also be downloaded in various formats, including text, sequence data and protein structure. OMIM and PubMed provide text; GenBank provides DNA sequence data and UniProt provides protein sequence data; CATH, SCOP and PDB provide protein structures (Lesk, 2008).

Applications of bioinformatics

Vaccine discovery
The availability of genomic data, computing resources and technology, together with advances in immunogenetics and a better understanding of the immune process, has driven vaccine research (Shanju and Sangeetha, 2013).
Reverse vaccinology and the rational design of vaccines are expected to shape future vaccine development, and both methods have been used to study peptide vaccines. A viral genome is scanned for the protein antigen that elicits an immune response, and that antigen is then synthesized as a peptide vaccine. The approach is used in developing vaccines against viruses such as coronavirus and influenza (Smith, 2003). Gregory (2010) states that recent advances in technology and bioinformatics enable a computer-based approach to vaccine development. Over the years, peptide vaccines have shown promise in humans. Advances in proteomics have also given rise to vaccinomics and reverse vaccinology as new techniques for developing vaccines (p. 510).

Advances in technological and scientific tools have produced stronger inhibitors through structure-based design, for example the AIDS drugs Viracept and Agenerase and the influenza drug Relenza (Nandy and Subhash, 2014). American Biopharmaceutical Companies (2013) states that peptide vaccines show good promise against tertiary cancers and other diseases, and that cancer patients respond well in terms of improved immunity (p. 1). Interest in peptide vaccines is rising rapidly. Designing them, determining the target proteins and identifying the protein sequences involved all require bioinformatics. Reliable results depend on sound molecular-level data for every virus used in the vaccine, and on a reliable analysis technique to identify the protein sequences of interest for vaccine development (Danylo et al., 2011).

Nandy and Subhash (2014) investigated the use of bioinformatics in designing a peptide vaccine against human coronavirus.
This virus (HCoV) infects the upper respiratory tract, and early in the century it led to a SARS outbreak (p. 4). The HCoV protein sequences from 56 strains were submitted to the VaxiJen 2.0 server, and the protein with the highest antigenicity index was selected for analysis. T-cell epitopes were then predicted. Five peptides were selected using the NetCTL 1.2 server, which predicts CTL (cytotoxic T lymphocyte) epitopes in protein sequences (p. 5). The epitope identified had the amino acid sequence KSSTGFVYF and interacted with several MHC class I alleles with high affinity. Its conservancy was determined with the IEDB server and its allergenicity with the AllerHunter tool; the epitope had a conservancy of 64.29% and a low allergenicity score. The selected peptide underwent molecular docking analysis against HLA-B*15:01 and showed good binding (Nandy and Subhash, 2014, p. 6).

The Kolaskar and Tongaonkar antigenicity prediction method was used to search for B-cell epitopes. It identified seven regions with high antigenicity scores, which were reduced to three after solvent accessibility was determined with the IEDB Analysis Resource. A further analysis with a linear B-cell epitope server concluded that the peptide GPSSQPY could induce an immune response as a B-cell epitope. The vaccine could therefore be formulated using the peptide information now available (Nandy and Subhash, 2014, p. 6).

Pathogenesis and bioinformatics
Pathogenesis is the study of the biological mechanisms that produce a disease state in the body. It also describes the origin and development of a disease and whether the disease is acute, chronic or recurrent. The mechanisms of pathogenesis determine the course of the disease, and the disease can be prevented if its underlying causes are controlled.
Bioinformatics can be used to establish pathological links between diseases and their causes. Once a cause is determined, the disease can be controlled by examining its molecular pathology signatures (Zhumar and Mallik, 2003, p. 47). Pancreatic cancer is frequently lethal and is difficult to diagnose in its early stages. A bioinformatics approach can be used to analyze its pathogenesis by identifying causal genes, which might help prevent the disease. Bioinformatics can also be used to investigate disease mechanisms and to identify new and existing disease targets, and so assist in therapy (Zhao et al., 2014).

The following outlines the use of bioinformatics in studying pathogenesis. The data set GSE16515, with 16 normal samples and 36 tumor samples, is available from the GEO database. GEO stores and freely distributes microarray, next-generation sequencing and other high-throughput genomic data sets covering a wide range of biological subjects and diseases. Robust Multichip Averaging and the LIMMA package are used to screen for differentially expressed genes (DEGs). Gene ontology and pathway enrichment analyses are then conducted on the DEGs, followed by construction of a protein-protein interaction (PPI) network with STRING and Cytoscape. ClusterONE is used for module analysis, and text mining based on the DEGs is carried out in PubMed (Zhao et al., 2014). In total, 93 downregulated and 274 upregulated genes are identified as DEGs, and they are significantly enriched in the extracellular region and in ECM-receptor pathways. No modules were found in the downregulated PPI network, while five were found in the upregulated network. The downregulated genes included INS, FGF and LAPP, while the upregulated genes included MET, MIA and CEA.
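The DEG screening step described above can be sketched in a few lines of Python. This is a simplified stand-in and not the LIMMA workflow used by Zhao et al. (2014): it applies a plain Welch t-statistic and a log2 fold-change cut-off to a toy data set, whereas LIMMA fits linear models with moderated statistics in R. The gene names, expression values, thresholds and function names below are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # No spread in either group: treat equal means as "no evidence".
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def screen_degs(expr, min_abs_log2fc=1.0, min_abs_t=4.0):
    """Split genes into up- and downregulated lists.

    expr maps gene name -> (tumor_values, normal_values), where the values
    are assumed to be log2-scale, already normalized expression levels.
    The cut-offs are illustrative assumptions, not values from the study.
    """
    up, down = [], []
    for gene, (tumor, normal) in expr.items():
        log2fc = mean(tumor) - mean(normal)  # difference of log2 means
        t = welch_t(tumor, normal)
        if abs(log2fc) >= min_abs_log2fc and abs(t) >= min_abs_t:
            (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Hypothetical log2 expression values for three made-up genes.
toy = {
    "GENE_A": ([9.1, 9.4, 8.9, 9.3], [6.0, 6.2, 5.9, 6.1]),  # higher in tumor
    "GENE_B": ([4.0, 4.2, 3.9, 4.1], [7.0, 7.3, 6.8, 7.1]),  # lower in tumor
    "GENE_C": ([5.0, 5.2, 4.9, 5.1], [5.1, 5.0, 5.2, 4.9]),  # unchanged
}

up, down = screen_degs(toy)
print("upregulated:", up)
print("downregulated:", down)
```

With this toy input, GENE_A is reported as upregulated, GENE_B as downregulated and GENE_C is filtered out. A real analysis would also correct for multiple testing across thousands of genes, which this sketch omits.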
CAMS had the highest number of inferences in the text mining analysis. The findings indicate that the up- and downregulated genes play an important role in the development of pancreatic cancer and are potential new targets for therapy (Zhao et al., 2014).

Bioinformatics in medicine
Bioinformatics has influenced the medical field by helping to diagnose diseases and by giving physicians information they can use to develop therapeutic strategies. According to Bala (2014), bioinformatics can be used to diagnose clinical conditions. For example, a patient might present to a physician with a genetic form of hemophilia. The patient might be unsure of the symptoms and have only a clue from the family history of an earlier occurrence of the disease. The following outlines how the physician would use bioinformatics to diagnose the disease.

The physician uses the Web to consult the OMIM database, which provides information on genetic disorders. A search on hemophilia returns many related diseases, such as von Willebrand disease. The search also yields an important finding relevant to the patient: the disease involves a low level of antihemophilic globulin (factor VIII). When factor VIII is searched in the protein sequence database, the result is a match for an incomplete DNA sequence encoding factor VIII, together with the corresponding protein sequence. In this way the gene is linked to its protein and DNA sequences (Bal, 2005, p. 121). It is also linked to a reference set in the MEDLINE database, which contains an earlier research article explaining the association of hemophilia with factor VIII.
Detailed information from the Protein Information Resource and SWISS-PROT databases is reached through the protein sequence link. A link from SWISS-PROT to the Protein Data Bank provides information on the crystal structure of the protein (Bal, 2005, p. 121). Gene and nucleotide sequences, together with records of gene irregularities, can then be obtained by following a DNA sequence link to the GenBank database. The physician can consult many more databases for information on the disease and analyze it, which helps in making a diagnosis and planning therapy (Bal, 2005, p. 121).

Bioinformatics has become a very important tool for present-day scientists. Biological data is growing rapidly, so it must be collected, stored, managed and analyzed in ways that let researchers access it easily and add new entries. Bioinformatics is especially important in drug discovery, biotechnology and medical science. This essay has illustrated specific uses of bioinformatics in designing vaccines and in analyzing the pathogenesis of pancreatic cancer. It has also shown the use of bioinformatics in diagnosing diseases and planning therapy.

References
American Biopharmaceutical Companies. (2013). Medicine in Development. Retrieved on 17th August 2017 from www.pharma.org
Bal, H. (2005). Bioinformatics Principles and Applications. India: McGraw-Hill, 119-32.
Bala, M. P. (2014). Applications of bioinformatics. Retrieved on 17th August 2017 from www.biotecharticles.com
Danylo, S., Fransisco, D., Ashko, K. (2011). Innovative bioinformatics approach for developing peptide-based vaccines against hypervariable viruses. Australia Society of Immunology, 89, 81-89. Retrieved on 17th August 2017 from https://www.nature.com.isb
Felix, A., Barry, T., Annuray, S. (2005). Bioinformatics and Sequence Alignment. San Francisco: LulleySchulten Group.
Gregory, A. (2010). Vaccinomics and bioinformatics: Accelerating for the next golden age of vaccinology. Vaccine, 28, 3509-3510. Retrieved on 16th August 2017 from www.elsevier.com/locate/vaccine
Kavitha, R. (2012). Databases in bioinformatics. Mumbai: SRM University Press.
Kraulis, P. (2001). Databases in bioinformatics. Retrieved on 18th August 2017 from avatar.se/lectures/strbio2001/databases/index.html
Kumar, R. (2015). Role of bioinformatics in various aspects of biological research: A mini review. Research Journal of Biology, 3(2), 1-20.
Lesk, A. (2008). Introduction to Bioinformatics (3rd ed.). New Jersey: Wiley Blackwell.
Nandy, A., Subhash, C. (2016). A brief overview of computer-assisted approaches in rational design of peptide vaccines. International Journal of Molecular Sciences, 17(666), 1-111. doi: 10.3390/ijms17050666
Ran, S. (2012). Bioinformatics: Tools and Applications (3rd ed.). Dehradun: Charu Printers.
Seung, Y., Julie, D., Dong, A. (2006). Bioinformatics and its applications in plant biology. Annual Review of Plant Biology, 1-29. doi: 10.1146/annurev.arplant.56.032604.144103
Shanju, S., Sangeetha, K. (2013). Current trends in cancer vaccines: A bioinformatics perspective. Asian Pacific Journal of Cancer Prevention, 14, 1-29. doi: https://dx.doi.org/10.7314/APJCP.2013.147.4041
Smith, D. J. (2003). Applications of bioinformatics to influenza surveillance and vaccine strain selection, 21(16), 17580 61. Retrieved on 16th August 2017 from https://www.ncbi.nlm.nih.gov/m/pubmed/126090
Wellcome Genome Campus. (2017). Bioinformatics for the relational databases: Primary and secondary databases. Retrieved on 17th August 2016 from www.ebi.ac.uk/training/online/course/bioinformatics
Zhao, L., Zhang, T., Zhuang, L., Yan, B., Wang, R. F., Liu, B. (2014). Uncovering the pathogenesis and identifying the novel targets of pancreatic cancer using bioinformatics approach. International Journal of Molecular and Cellular Biology, 41(7), 4697-4704.
Zhumar, G., Mallik, B. (2003). Principles and applications of bioinformatics. London: Oxford University Press.
