
http:^^www.cs.cornell.edu^info^people^jhsu^project^


MIME-Version: 1.0
Server: CERN/3.0
Date: Sunday, 01-Dec-96 20:03:47 GMT
Content-Type: text/html
Content-Length: 3083
Last-Modified: Friday, 27-Sep-96 04:01:17 GMT

<HTML><HEAD><TITLE>Jerry Hsu's MEng Project</TITLE><BODY>
<TABLE>
<TR><TD width=25%></TD><TD width=75%><H2>Jerry Hsu's MEng Project</H2></TD></TR>
<TR><TD width=25%></TD><TD width=75%>
<H3>Purpose</H3>
<P>Investigate training a neural net to process a digitized sound data stream and determine the time indices that correspond to the beginning of a spoken word.</P>
<H3>Background</H3>
<P>One part of the process of subtitling a film (adding words to translate a piece into a different language) is known as timing. Timing consists of a person or group of people listening to the soundtrack and marking the starting and ending times of sentences. These times are then used by a computer, along with a translation, to overlay text on the film.
<BR>
There are a couple of methods of timing. One method is to listen to the soundtrack and press a key whenever one hears the start of a sentence, marking the time on a computer (known as the spacebar method). This method is common among hobbyists due to its minimal equipment requirements. It has drawbacks, though. It can be a fairly accurate method of timing, but only with a large amount of practice. The most experienced timers who use this method average around 3:1, that is, they spend three times the running time of the actual film. So for a two-hour film, they would need to spend about six hours doing timing.
<BR>
A second method is to digitize the soundtrack and then step through it in discrete intervals (1/10 second or 1/30 second). This method is slower than the spacebar method, with a ratio of about 10:1. However, it has advantages: the skill requirement is lower, the end accuracy is higher, and the method is highly parallel. Because the information is stored digitally, it can be divided among multiple people. So a group of three less skilled people using this method can roughly match the 3:1 ratio of a more skilled timer.
<BR>
With the second method, the amount of sound a person needs to listen to at each step is less than a second. I theorize that all the data the human needs to make this decision is present in the data stream. Thus it should be possible for a computer to simulate the decision making by analyzing the same data.</P>
<h3>Project</H3>
<P>The goal of this project is to determine how accurately a neural net can simulate a human in recognizing the start of speech. As a means of comparison, I'd also analyze the accuracy of a dumb algorithm. This method measures the relative difference in intensity between sound segments and marks the start of a word when the intensity goes over a threshold. It is classically fooled by loud sound effects and background music. It can also be fooled depending on whether a sentence begins with a hard or soft consonant. I hypothesize that a neural net should be able to account for these two problems.</P>
</TD></TR>
</TABLE>
<HR>
[<A HREF="http://www.cs.cornell.edu/Info/People/jhsu/">Back to top</A>]
<HR>
<ADDRESS>Maintained by <A HREF="http://www.fdemocracy.org/~jhsu/personal/">Jerry Hsu</A> - <A HREF="mailto:jh32@cornell.edu">jh32@cornell.edu</A></ADDRESS>
</BODY></HTML>
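
The "dumb algorithm" described in the Project section can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the function names `rms` and `detect_onsets`, the use of RMS as the intensity measure, the 1/10 second segment length (borrowed from the stepping interval mentioned above), and the ratio threshold of 4.0 are all hypothetical choices.

```python
import math


def rms(segment):
    """Root-mean-square intensity of one segment of samples (assumed measure)."""
    return math.sqrt(sum(s * s for s in segment) / len(segment))


def detect_onsets(samples, sample_rate, segment_seconds=0.1,
                  ratio_threshold=4.0, floor=1e-6):
    """Return start times (in seconds) of segments whose intensity exceeds
    ratio_threshold times the intensity of the preceding segment.

    All parameter names and default values are illustrative assumptions.
    """
    seg_len = max(1, round(sample_rate * segment_seconds))
    onsets = []
    prev_level = None
    # Walk the stream in fixed-size, non-overlapping segments.
    for start in range(0, len(samples) - seg_len + 1, seg_len):
        level = rms(samples[start:start + seg_len])
        # floor avoids dividing by zero when the previous segment is silent.
        if prev_level is not None and level / max(prev_level, floor) > ratio_threshold:
            onsets.append(start / sample_rate)
        prev_level = level
    return onsets


# Synthetic stream: 0.2 s of quiet, 0.2 s of loud sound, 0.1 s of quiet.
quiet, loud = [0.01] * 20, [0.5] * 20
print(detect_onsets(quiet + loud + quiet[:10], sample_rate=100))
```

Because the sketch reacts to any sudden jump in intensity, a loud sound effect or a swell in background music would be marked just like the start of a word, which is the weakness the text attributes to this approach.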
