Volume 10, Issue 23 (Spring 1398 / 2019), pages 117-132



Dehghanzadeh H, Mirhoseini S Z, Ghaderi-Zefrehei M, Tavakoli H, Esmaeilkhaniyan S. (2019). Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information. Research On Animal Production (rap). 10(23), 117-132. doi:10.29252/rap.10.23.117
URL: http://rap.sanru.ac.ir/article-1-790-fa.html


Guilan Agricultural and Natural Resources Research and Education Center
Abstract:
Information theory is a branch of mathematics. It has been used in genetic and bioinformatic analyses and can also be applied to the analysis of biological structures and sequences. In this study, the DNA sequences of genes affecting milk production in dairy cattle, together with their exons, were extracted, and the entropy parameter of orders one through four was calculated for each gene and for the exons of each gene. To measure the similarity between genes, the mutual information between genes was used. The results were clustered with seven common clustering methods. Because these methods produced multiple, differing results, the AdaBoost algorithm was used to increase accuracy and to aggregate them. Finally, to confirm the AdaBoost results and to predict gene function and the relationships among genes, the results were examined and compared against the genes' genomic annotations on the GeneMANIA website. The aggregated clustering produced by AdaBoost, which itself resembles a kind of gene tree, showed that the proposed method for clustering a set of genes gives a biologically reasonable answer, because it agreed with the genomic annotations of these genes in GeneMANIA. The proposed method for building a gene tree is believed to be competitive with other DNA-sequence-based methods for clustering a set of genes, and it could therefore also be used to group genes of other species.
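The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes that "entropy of order k" means the Shannon entropy of the overlapping k-mer distribution of a sequence, and that mutual information is estimated from symbols at matching positions after truncating both sequences to the shorter length. Both choices are assumptions made for illustration; the abstract does not state the estimators the authors used. The function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def block_entropy(seq: str, k: int) -> float:
    """Shannon entropy (bits) of the overlapping k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(a: str, b: str) -> float:
    """Mutual information (bits) between symbols at matching positions.

    Assumption: the sequences are paired position by position after
    truncating to the shorter length; the paper may pair them differently.
    """
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    joint = Counter(zip(a, b))      # joint symbol counts
    pa, pb = Counter(a), Counter(b)  # marginal symbol counts
    mi = 0.0
    for (x, y), c in joint.items():
        pxy = c / n
        # p(x,y) / (p(x) p(y)) written in terms of raw counts
        mi += pxy * math.log2(c * n / (pa[x] * pb[y]))
    return mi


# Toy sequences for illustration only, not real gene data.
gene_a = "ACGTACGTACGT"
gene_b = "ACGTTGCAACGT"

entropies = [block_entropy(gene_a, k) for k in range(1, 5)]  # orders 1..4
similarity = mutual_information(gene_a, gene_b)
```

A pairwise matrix of such mutual-information values could then be passed to any standard clustering routine; which seven methods the authors applied is not specified in the abstract.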
 
Type of study: Research | Subject: Poultry genetics and breeding
Received: 1396/7/14 | Final revision: 1398/3/4 | Accepted: 1397/6/31 | Published: 1398/3/1 (Solar Hijri dates)


This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.


© 2024 CC BY-NC 4.0 | Research On Animal Production
