Clustering of a Number of Genes Affecting Milk Production Using Information Theory and Mutual Information

Dehghanzadeh, Houshang; Mirhoseini, Seyed Ziaeddin; Ghaderi-Zefrehei, Mostafa; Tavakoli, Hassan; Esmaeilkhaniyan, Saeid

doi:10.29252/rap.10.23.117

Volume 10, Issue 23 (Spring 1398 SH / 2019), pages 117-132



Dehghanzadeh H, Mirhoseini S Z, Ghaderi-Zefrehei M, Tavakoli H, Esmaeilkhaniyan S. (2019). Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information. Res Anim Prod. 10(23), 117-132. doi:10.29252/rap.10.23.117
URL: http://rap.sanru.ac.ir/article-1-790-fa.html



Houshang Dehghanzadeh*¹, Seyed Ziaeddin Mirhoseini², Mostafa Ghaderi-Zefrehei³, Hassan Tavakoli², Saeid Esmaeilkhaniyan⁴

1- Guilan Agricultural and Natural Resources Research and Education Center
2- University of Guilan
3- Yasouj University
4- Animal Science Research Institute of Iran

Abstract:

Information theory is a branch of mathematics. It has been applied in genetic and bioinformatic analyses and can also be used to analyze biological structures and sequences. In this study, after extracting the DNA sequences of the genes and exons affecting milk production in dairy cattle, the entropy parameter of orders one through four was calculated for each gene and for the exons of each gene. Mutual information between genes was used to measure the similarity among them. The results were clustered with seven common clustering methods. Because this produced several sets of results, the AdaBoost algorithm was used to increase accuracy and to aggregate them. Finally, to validate the AdaBoost results and to predict the functions of the genes and the relationships among them, the results were examined and compared against the genomic annotation of the genes on the GeneMANIA website. The aggregation of the clustering results by AdaBoost, which itself resembles a kind of gene tree, showed that the proposed method for clustering a set of genes gives a biologically reasonable answer, because it agreed with the genomic annotation of those genes in GeneMANIA. The authors believe that the proposed method for building a gene tree can compete with other DNA-sequence-based methods for clustering a set of genes, and that it can therefore also be used to group the genes of other species.
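To make the two quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "entropy of order k" means the Shannon entropy of overlapping k-mers, and that mutual information is estimated from position-wise symbol pairs of two equal-length sequences; the abstract does not specify either estimator. The function names `block_entropy` and `mutual_information` and the toy sequence are hypothetical.

```python
from collections import Counter
from math import log2


def block_entropy(seq: str, order: int) -> float:
    """Shannon entropy (in bits) of the overlapping k-mers of length `order`.

    Assumption: this is one common reading of "entropy of order k";
    the source does not state which estimator was used.
    """
    kmers = [seq[i:i + order] for i in range(len(seq) - order + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(x: str, y: str) -> float:
    """I(X;Y) in bits, estimated from position-wise symbol pairs.

    Assumption: both sequences have the same length; how sequences of
    different lengths were compared in the study is not stated.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


# Hypothetical toy sequence, not data from the study.
toy = "ACGTACGTACGT"
entropies = {k: block_entropy(toy, k) for k in range(1, 5)}
```

A pairwise matrix of such mutual-information values could then serve as the similarity input to a standard clustering routine, which is the role the abstract assigns to it.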

Keywords: Entropy, Mutual information, Information theory, Gene clustering, Dairy cattle


Type of study: Research | Article subject: Poultry genetics and breeding
Received: 1396/7/14 | Accepted: 1397/6/31 (Solar Hijri calendar)



Republication information
	This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
