<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0052)http://vip.6to23.com/yunyan8/shuhai/wenjian/2000.htm -->
<HTML><HEAD><TITLE>2000 NetEase Cup China Undergraduate Mathematical Contest in Modeling: Problems</TITLE>
<META content="text/html; charset=gb2312" http-equiv=Content-Type>
<META content="MSHTML 5.00.3315.2870" name=GENERATOR>
<META content=FrontPage.Editor.Document name=ProgId>
<META content="ricepapr 111" name="Microsoft Theme"></HEAD>
<BODY aLink=#000099 background=2000.files/ricebk.jpg bgColor=#cccc99
link=#6666cc text=#000000 vLink=#336633><!--mstheme--><FONT face=宋体>
<P align=center class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-ALIGN: center; mso-line-height-rule: exactly"><B
style="mso-bidi-font-weight: normal"><SPAN lang=EN-US
style="FONT-FAMILY: 宋体; FONT-SIZE: 14pt; mso-bidi-font-size: 10.0pt; mso-hansi-font-family: 'Times New Roman'">2000 NetEase Cup China Undergraduate Mathematical Contest in Modeling: Problems<O:P>
</O:P></SPAN></B></P>
<P class=MsoNormal style="LINE-HEIGHT: 18pt; mso-line-height-rule: exactly"><B
style="mso-bidi-font-weight: normal"><SPAN lang=EN-US
style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-bidi-font-size: 10.0pt; mso-hansi-font-family: 'Times New Roman'">Problem A<SPAN
style="mso-spacerun: yes"> </SPAN>DNA Sequence Classification<SPAN
style="mso-spacerun: yes"> </SPAN><O:P></O:P></SPAN></B></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
lang=EN-US
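style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">In June 2000 the working draft of the complete DNA sequence was finished under the Human Genome Project, and the precise complete sequence is expected to be finished in 2001. Humanity will then possess a "book of heaven" that records all the information about its own birth, aging, illness, death, heredity and evolution. This book, written by nature, is a sequence about 3 billion characters long formed by the four characters A, T, C, G arranged in a certain order, with neither sentence breaks nor punctuation marks. Apart from the fact that these four characters denote the four kinds of bases, people know very little about its "content" and can hardly read it. Deciphering this book, which holds more information than any other in the world, is one of the most important tasks of the twenty-first century. Toward this goal, studying what structure the complete DNA sequence has, and what regularities are hidden in the seemingly random sequence formed by these four characters, is the foundation for reading the book and one of the most important topics in bioinformatics.<O:P>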
</O:P></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24.1pt; mso-line-height-rule: exactly"><SPAN
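style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">Although humanity knows very little about this "book of heaven", some regularities and structures have been found in <SPAN
lang=EN-US>DNA sequences. For example, the complete sequence contains segments that encode proteins, built from the 64 different 3-character strings that these 4 characters can form, most of which encode the 20 amino acids that make up proteins. As another example, segments that do not encode proteins are particularly rich in A and T, so studying the structure of DNA sequences by using the abundance of certain bases as a feature has also produced some results. In addition, statistical methods have shown that certain segments of the sequence are correlated with one another, and so on. These findings lead people to believe that DNA sequences contain both local and global structure, and that fully uncovering this structure is very meaningful for understanding the complete DNA sequence. At present the most common idea in this research is to omit some details of the sequence, highlight its features, and then represent it as a suitable mathematical object. This approach, known as coarse-graining and modeling, often helps in studying regularities and structure.<O:P>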
</O:P></SPAN></SPAN></P>
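<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">The two kinds of features mentioned above, base abundance and 3-character strings, can be computed along the following lines. This is only a minimal sketch in Python: the function names are illustrative assumptions rather than part of the problem statement, and the triplets are counted with overlap, which is a simplification that ignores any reading frame.</SPAN></P>

```python
from collections import Counter

BASES = "atcg"


def base_frequencies(seq):
    """Return the relative frequency of a, t, c and g in seq (case-insensitive)."""
    counts = Counter(ch for ch in seq.lower() if ch in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains none of the characters a, t, c, g")
    return {base: counts[base] / total for base in BASES}


def at_fraction(seq):
    """Fraction of the bases in seq that are a or t."""
    freq = base_frequencies(seq)
    return freq["a"] + freq["t"]


def kmer_counts(seq, k=3):
    """Count overlapping length-k substrings; k=3 covers the 64 possible triplets."""
    s = seq.lower()
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


if __name__ == "__main__":
    # Opening characters of sequence 1 (class A) and sequence 11 (class B).
    examples = {"1": "aggcacggaaaaacgggaataacgg", "11": "gttagatttaacgttttttatggaa"}
    for name, seq in examples.items():
        print(name, round(at_fraction(seq), 3), kmer_counts(seq).most_common(3))
```

<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">Either output can serve as a coarse-grained representation of a sequence: a 4-component frequency vector, or a vector of up to 64 triplet counts.</SPAN></P>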
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">As an attempt to study the structure of <SPAN
lang=EN-US>DNA sequences, we pose the following problem of classifying a set of sequences:<O:P> </O:P></SPAN></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; mso-line-height-rule: exactly"><SPAN lang=EN-US
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'"><SPAN
style="mso-spacerun: yes">
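</SPAN>1) Below are 20 artificially constructed sequences of known class (see below); sequences 1-10
belong to class A and sequences 11-20 to class B. Extract features from them, construct a classification method, and use these labeled sequences to judge whether your method is good enough. Then, using a method you consider satisfactory, classify another 20 unlabeled artificial sequences (numbered 21-40) and report the result by listing the sequence numbers, in ascending order, under each class (omit any sequence that cannot be classified):<O:P>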
</O:P></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; mso-line-height-rule: exactly"><SPAN lang=EN-US
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'"><SPAN
style="mso-spacerun: yes"> </SPAN>Class A <U><SPAN
style="mso-spacerun: yes"> </SPAN></U><SPAN
style="mso-spacerun: yes"> </SPAN>;<SPAN style="mso-spacerun: yes">
</SPAN>Class B <U><SPAN
style="mso-spacerun: yes"> </SPAN></U>.<O:P>
</O:P></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; mso-line-height-rule: exactly"><SPAN lang=EN-US
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'"><SPAN
style="mso-spacerun: yes">
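</SPAN>Describe your method in detail and provide the computer program. If you make partial use of an existing classification method, state the name of that method precisely.<O:P> </O:P></SPAN></P>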
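<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">As one possible reading of task 1), the sketch below represents each sequence by its base frequencies, classifies with the standard nearest-centroid method, and uses leave-one-out testing on the labeled sequences to judge the method. It is an illustration under these assumptions and not a reference solution; all function names are hypothetical, and sequences are assumed to be non-empty strings over a, t, c, g.</SPAN></P>

```python
from collections import Counter
import math

BASES = "atcg"


def features(seq):
    """Map a sequence to its vector of a, t, c, g relative frequencies."""
    counts = Counter(ch for ch in seq.lower() if ch in BASES)
    total = sum(counts.values())
    return [counts[b] / total for b in BASES]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(seq, training):
    """Return the label whose class centroid is nearest (Euclidean) to seq.

    training maps a label such as "A" to the list of sequences in that class.
    """
    x = features(seq)
    centroids = {
        label: centroid([features(s) for s in seqs])
        for label, seqs in training.items()
    }
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def leave_one_out_accuracy(training):
    """Fraction of labeled sequences classified correctly when held out in turn."""
    correct = 0
    total = 0
    for label, seqs in training.items():
        for i, held_out in enumerate(seqs):
            rest = dict(training)
            rest[label] = seqs[:i] + seqs[i + 1:]
            # A class left with no examples cannot be recovered; count it as a miss.
            if rest[label] and classify(held_out, rest) == label:
                correct += 1
            total += 1
    return correct / total
```

<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">In use, training would hold sequences 1-10 under "A" and 11-20 under "B", and classify would then be applied to each of sequences 21-40. The sketch always assigns a class; a rule for leaving a sequence unclassified would have to be added separately.</SPAN></P>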
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 21pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">These <SPAN
lang=EN-US>40 sequences are also available for download from the web pages at the addresses below, in the data file labeled Art-model-data:<O:P>
</O:P></SPAN></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 21pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">NetEase website: <SPAN
lang=EN-US><A href="http://www.163.com/"><SPAN
style="FONT-FAMILY: 'Times New Roman'">www.163.com</SPAN></A><SPAN
style="mso-spacerun: yes"> </SPAN>Education channel<SPAN
style="mso-spacerun: yes"> </SPAN>Online problems;<O:P> </O:P></SPAN></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 21pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">Education network: <SPAN
lang=EN-US><SPAN style="mso-spacerun: yes"> </SPAN><A
href="http://www.cbi.pku.edu.cn/"><SPAN
style="FONT-FAMILY: 'Times New Roman'">www.cbi.pku.edu.cn</SPAN></A><SPAN
style="mso-spacerun: yes"> </SPAN>News<SPAN
style="mso-spacerun: yes"> </SPAN>mcm2000<O:P> </O:P></SPAN></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 21pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">Education network: <SPAN
lang=EN-US><SPAN style="mso-spacerun: yes">
</SPAN>www.csiam.edu.cn/mcm<O:P> </O:P></SPAN></SPAN></P>
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
lang=EN-US
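style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">2) The data file Nat-model-data
at the same addresses contains 182 natural DNA sequences, all of them fairly long. Classify them with your classification method and report the results in the same way as in 1).<O:P> </O:P></SPAN></P>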
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><B
style="mso-bidi-font-weight: normal"><SPAN
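style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">Hint</SPAN></B><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">: The criterion for judging a classification method is its classification accuracy. There are many ways to construct a classification method. For example, one can extract certain features of the sequences and give them a mathematical representation, such as elements of a geometric space or a vector space, and then choose or construct a classification method suited to that representation. As another example, one can construct a probabilistic or statistical model and then classify with statistical methods.<SPAN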
lang=EN-US><O:P> </O:P></SPAN></SPAN></P>
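<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">The second route in the hint, a probabilistic model followed by statistical classification, might look like the sketch below. It assumes a first-order Markov chain over the four bases for each class, estimated with additive smoothing, and assigns a sequence to the class whose chain gives it the higher log-likelihood. The choice of model, the pseudocount value and the function names are all assumptions made for illustration.</SPAN></P>

```python
from collections import Counter
import math

BASES = "atcg"


def train_markov(seqs, pseudocount=1.0):
    """Estimate log P(next base | current base) with additive smoothing."""
    pair_counts = Counter()
    for seq in seqs:
        s = seq.lower()
        pair_counts.update(zip(s, s[1:]))
    log_p = {}
    for a in BASES:
        row_total = sum(pair_counts[(a, b)] for b in BASES) + pseudocount * len(BASES)
        for b in BASES:
            log_p[(a, b)] = math.log((pair_counts[(a, b)] + pseudocount) / row_total)
    return log_p


def log_likelihood(seq, log_p):
    """Sum of log transition probabilities along seq.

    The first base and any pair containing a character outside a, t, c, g
    are ignored.
    """
    s = seq.lower()
    return sum(log_p[pair] for pair in zip(s, s[1:]) if pair in log_p)


def classify(seq, models):
    """Return the label of the model under which seq is most likely."""
    return max(models, key=lambda label: log_likelihood(seq, models[label]))
```

<P class=MsoNormal
style="LINE-HEIGHT: 18pt; TEXT-INDENT: 24pt; mso-line-height-rule: exactly"><SPAN
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'">Here models would be a dictionary such as {"A": train_markov(class_a_seqs), "B": train_markov(class_b_seqs)}, where class_a_seqs and class_b_seqs stand for the labeled sequences. The accuracy on held-out labeled sequences remains the yardstick for comparing this model with a feature-vector method.</SPAN></P>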
<P class=MsoNormal
style="LINE-HEIGHT: 18pt; mso-line-height-rule: exactly"><SPAN lang=EN-US
style="FONT-FAMILY: 宋体; mso-hansi-font-family: 'Times New Roman'"> </SPAN><SPAN
lang=EN-US>Art-model-data</SPAN><SPAN lang=EN-US
style="mso-fareast-font-family: 楷体_GB2312"><O:P> </O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">1.aggcacggaaaaacgggaataacggaggaggacttggcacggcattacacggaggacgaggtaaaggaggcttgtctacggccggaagtgaagggggatatgaccgcttgg<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">2.cggaggacaaacgggatggcggtattggaggtggcggactgttcggggaattattcggtttaaacgggacaaggaaggcggctggaacaaccggacggtggcagcaaagga<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">3.gggacggatacggattctggccacggacggaaaggaggacacggcggacatacacggcggcaacggacggaacggaggaaggagggcggcaatcggtacggaggcggcgga<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">4.atggataacggaaacaaaccagacaaacttcggtagaaatacagaagcttagatgcatatgttttttaaataaaatttgtattattatggtatcataaaaaaaggttgcga<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">5.cggctggcggacaacggactggcggattccaaaaacggaggaggcggacggaggctacaccaccgtttcggcggaaaggcggagggctggcaggaggctcattacggggag<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">6.atggaaaattttcggaaaggcggcaggcaggaggcaaaggcggaaaggaaggaaacggcggatatttcggaagtggatattaggagggcggaataaaggaacggcggcaca<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">7.atgggattattgaatggcggaggaagatccggaataaaatatggcggaaagaacttgttttcggaaatggaaaaaggactaggaatcggcggcaggaaggatatggaggcg<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">8.atggccgatcggcttaggctggaaggaacaaataggcggaattaaggaaggcgttctcgcttttcgacaaggaggcggaccataggaggcggattaggaacggttatgagg<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">9.atggcggaaaaaggaaatgtttggcatcggcgggctccggcaactggaggttcggccatggaggcgaaaatcgtgggcggcggcagcgctggccggagtttgaggagcgcg<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">10.tggccgcggaggggcccgtcgggcgcggatttctacaagggcttcctgttaaggaggtggcatccaggcgtcgcacgctcggcgcggcaggaggcacgcgggaaaaaacg<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt"> <O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">11.gttagatttaacgttttttatggaatttatggaattataaatttaaaaatttatattttttaggtaagtaatccaacgtttttattactttttaaaattaaatatttatt<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">12.gtttaattactttatcatttaatttaggttttaattttaaatttaatttaggtaagatgaatttggttttttttaaggtagttatttaattatcgttaaggaaagttaaa<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">13.gtattacaggcagaccttatttaggttattattattatttggattttttttttttttttttttaagttaaccgaattattttctttaaagacgttacttaatgtcaatgc<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">14.gttagtcttttttagattaaattattagattatgcagtttttttacataagaaaatttttttttcggagttcatattctaatctgtctttattaaatcttagagatatta<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">15.gtattatatttttttatttttattattttagaatataatttgaggtatgtgtttaaaaaaaatttttttttttttttttttttttttttttttaaaatttataaatttaa<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">16.gttatttttaaatttaattttaattttaaaatacaaaatttttactttctaaaattggtctctggatcgataatgtaaacttattgaatctatagaattacattattgat<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">17.gtatgtctatttcacggaagaatgcaccactatatgatttgaaattatctatggctaaaaaccctcagtaaaatcaatccctaaacccttaaaaaacggcggcctatccc<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US
style="LETTER-SPACING: -1pt; mso-fareast-font-family: 楷体_GB2312; mso-font-kerning: 5.0pt">18.gttaattatttattccttacgggcaattaattatttattacggttttatttacaattttttttttttgtcctatagagaaattacttacaaaacgttattttacatactt<O:P>
</O:P></SPAN></P>
<P align=left class=MsoPlainText style="TEXT-ALIGN: left"><SPAN lang=EN-US