<html>
<title>The Stages of Replication</title>
<body bgcolor="#FFFFFF" text="#000000" link="#0000AA" alink="#0000FF" vlink="#000044">

<h2 align=center>The Stages of Replication</h2>

This document examines the details of the commands directly involved in
replication.

<h3>1. Allocation of Offspring Memory</h3>

<p>
The very first instruction in most heads-based organisms is "<tt>h-alloc</tt>",
which allocates space for an offspring to be placed into.  If you look at
the file <tt>source/cpu/hardware_util.cc</tt>, you will see that this
instruction is associated with the method
<font color="#880000">cHardwareCPU</font>::<font color="#008800">Inst_MaxAlloc</font>().
What this means is that the organism will automatically allocate as much
space as it possibly can, without having to first calculate its needs.  When
the organism is finished copying itself and divides off its child, any excess
allocated memory will automatically be discarded.  It appears in the
code as follows:

<pre>
  void <font color="#880000">cHardwareCPU</font>::<font color="#008800">Inst_MaxAlloc</font>()   <font color="#886600">// Allocate maximum additional memory</font>
  {
    const <font color="#880000">int</font> <font color="#000088">cur_size</font> = <font color="#008800">GetMemory</font>().<font color="#008800">GetSize</font>();
    const <font color="#880000">int</font> <font color="#000088">alloc_size</font> = <font color="#008800">Min</font>((<font color="#880000">int</font>) (<font color="#880000">cConfig</font>::<font color="#008800">GetChildSizeRange</font>() * <font color="#000088">cur_size</font>),
                               MAX_CREATURE_SIZE - <font color="#000088">cur_size</font>);
    if( <font color="#008800">Allocate_Main</font>(<font color="#000088">alloc_size</font>) )  <font color="#008800">Register</font>(REG_AX) = <font color="#000088">cur_size</font>;
  }
</pre>

<p>
This method will determine the maximum amount of extra space that an organism
is allowed to allocate, and then run the Allocate_Main()
function, passing in that amount.  Allocate_Main is a very long method that
mostly just checks to make sure that everything going on is legal, and then
initializes the new memory that was allocated as per the genesis file: random,
default instruction, or left as it was in the previous organism that used it
(for "necrophilia").

<h3>2. Initial Self-Analysis</h3>

<p>
Most of the initial self-analysis done on Avida organisms is with the
<tt>search</tt> instruction or one of its variants.  In the heads-based
instruction set, we call this "<tt>h-search</tt>" and associate it with the
method <font color="#880000">cHardwareCPU</font>::<font color="#008800">Inst_HeadSearch</font>().

<p>
The search-type instructions read in the template (series of nops) that
follows them, determine the complement template, and find that complement
elsewhere in the genome.  They then set the registers BX and CX to be the
distance to the found template and the size of that template, respectively.
Finally, the flow control head is placed at the end of the template found, in
order to reference it later on.  Obviously this last step only occurs
in the heads-based search.

<p>
The first search instruction executed by a heads organism is typically used
to locate the end of its genome.  The search will place the flow-head at
the end of the genome, which the organism will use to move the write head to
this point as well.
This is done with the "<tt>mov-head</tt>" instruction,
implemented as follows:

<pre>
  void <font color="#880000">cHardwareCPU</font>::<font color="#008800">Inst_MoveHead</font>()
  {
    const <font color="#880000">int</font> <font color="#000088">head_used</font> = <font color="#008800">FindModifiedHead</font>(HEAD_IP);
    <font color="#008800">GetHead</font>(<font color="#000088">head_used</font>).<font color="#008800">Set</font>(<font color="#008800">GetHead</font>(HEAD_FLOW));
    if (<font color="#000088">head_used</font> == HEAD_IP) <font color="#000088">advance_ip</font> = false;
  }
</pre>

<p>
If the <tt>mov-head</tt> instruction is followed by a <tt>nop-C</tt> it will
move the write head to the flow head, and ideally be ready to start copying
itself into the newly allocated space for the offspring.

<br><br><br><br>
<h3>3. The Copy Loop</h3>

<p>
The copy loop is the heart of any organism.  It consists of a setup,
a copy segment to copy one or more instructions, a test segment to determine
if the loop has finished, and a "jump" type instruction to move back to
copy the next line.

<p>
In a hand-written organism, the setup is an <tt>h-search</tt> command with
no template to direct its behavior.  The default when this instruction does not
have a template is to just drop the flow-head at the very next instruction,
which is what it is used for here -- it places the flow head at the beginning
of the portion of code that will actually be looped through copying each line.

<p>
The copy segment is typically just a single h-copy
instruction, which will read in an instruction from the location of the
read-head, and write it out to the location of the write-head.  It will then
advance both heads to the next positions in the genomes.
Take a look at
the source code for the method
<font color="#880000">cHardwareCPU</font>::<font color="#008800">Inst_HeadCopy</font>(),
which you will find in your handout for the file <tt>hardware_cpu.cc</tt>.

<p>
The first thing that happens in this method is that the variables
<font color="#000088">read_head</font>,
<font color="#000088">write_head</font>, and
<font color="#000088">cpu_stats</font> are set up as references to the
appropriate objects in the hardware and organism (that is, any modifications
to these references change the actual objects, not just a local copy of them).
This is so that we have easy-to-use local variables for the objects that
we are working with.  The read_head and write_head are then adjusted to make
sure they are in a legal position on the genome (if, for example, the last
instruction changed the organism's size, the heads might no longer be
pointing to memory that still exists).

<p>
Next, the instruction at the read head is recorded in the variable
<font color="#000088">read_inst</font>, and we test to see if this should
be mutated to some other value.  If a mutation does occur, we change the
read_inst variable to a random value, increment the mutation count, and mark
flags at the instruction position of the write head to denote the mutation.
After we determine what instruction was read (be it a correct reading or not),
we call the <font color="#008800">ReadHead</font>() method, which is simply
used to keep track of the most recent template copied.  This template is used
to help detect the end of the organism, which we shall discuss in a moment.

<p>
Finally, we record in the statistics that another copy command was executed
in this organism, finish the write by placing this instruction at the position
of the write head (and setting its flag as being a copied instruction), and
then advance both heads to their next positions.

<p>
After an organism executes one of these copies it has to test to see if it
is done copying itself.
The heads-based organisms will typically do this
with the aid of the <tt>if-label</tt> instruction, which tests to see if
the most recent label copied is the complement of the one that follows it.
If so, it will execute the next instruction (often a divide); otherwise it
will skip that next instruction and execute a <tt>mov-head</tt> that will
jump the instruction pointer back to the flow head that was placed at the
beginning of the copy loop.  It will continue this copy-test-jump cycle
until all the lines have been copied.

<p>
A common adaptation is "unrolling the loop".
In the hand-written version discussed above, three instructions must be
executed to copy each instruction: <tt>h-copy</tt>, <tt>if-label</tt>,
and <tt>mov-head</tt>.  But what if a second
<tt>h-copy</tt> command were inserted after the first?  Now the program
would be one line longer, so it would have more to copy, but each time
through the loop would now copy two instructions while executing four --
that means that on average only two instructions need be executed to copy
one.  A *huge* savings.  The main drawback to the organism is that its
length will need to be a multiple of two, or else the test to see if it is
finished won't occur at the proper time.  This loop unrolling becomes less
and less beneficial each time the organism does it, so it won't go completely
out of control, but I do think it is a bit too easy in the current version of
the code.

<br><br><br><br><br>
<h3>4. Dividing off the Child</h3>

<p>
When an organism finishes copying itself, it needs to divide off its
child using a divide command.  In the heads-based instruction set, this is
the <tt>h-divide</tt> command, which calls the
<font color="#880000">cHardwareCPU</font>::<font color="#008800">Inst_HeadDivide</font>()
method, found in your <tt>hardware_cpu.cc</tt> handout.

<p>
This method will use the read head to determine the starting location of the
offspring, and the write head to determine its end.
This is logical because
these are the locations that the heads should be in right after the copy loop
has finished.  Everything after the write head is cut off and discarded as
"extra_lines".  This information is passed into the Divide_Main method,
which does the bulk of the work for divide (and is called by all of the
various divide instructions in all of the sets).

<p>
The <font color="#880000">cHardwareCPU</font>::<font color="#008800">Divide_Main</font>()
method is therefore what we are most interested in.  It begins by calculating
the size of the child that would result from the divide point and the
extra_line count that were passed into it, and runs
<font color="#008800">Divide_CheckViable</font>() to make sure that all of
these values are legal (that is, that both parent and child are reasonable
sizes for organisms, and reasonable sizes in relationship to each other -- for
definitions of reasonable as found in the <tt>genesis</tt> file).  If any of
them are not legal, the method returns false.

<p>
From this point on, we know the divide is legal, so we just need to process
it.  We create a variable called <font color="#000088">child_genome</font>,
which we use to construct the child genome.  We use a reference to a cGenome
object inside of the organism so that this child genome is attached to its
parent organism and will be easily accessible from other places where it will
be needed later.  We're not going to be doing all of the work on it right in
this method.  We initialize the child genome to the section of the parent's
genome that was created for it.  We then run
<font color="#008800">Resize</font>() on the parent genome to get
rid of all of this extra space (both child and extra lines).

<p>
The <font color="#008800">Divide_DoMutations</font>() method will test for and
(if needed) process any divide mutations that may occur.  There are many of
them, so this method is quite long.
It is followed by
<font color="#008800">Divide_TestFitnessMeasures</font>(), which will run
the offspring through a test CPU for special options that may be set in the
genesis file (such as mutation reversions).  Obviously this is very processor
intensive since it would occur with every birth, so tests are only performed
if required.  Both of these methods are left to the reader to step through
themselves.

<p>
Next, the method <font color="#008800">Divide_SetupChild</font>() is
executed on the organism to transform the child genome we have constructed
into a full-fledged child organism (adding in references to the environment,
its interface to the population, its parent's phenotypic characteristics,
etc.).  Basically, it gets the child organism ready to be inserted into the
population, but does not do so yet.  This method is located in
<tt>source/main/organism.cc</tt> if you wish to look at it for yourself.

<p>
If we are using extra costs associated with the first time instructions are
used, those costs are reset now that a divide has occurred, and must be paid
for again on the next divide cycle.

<p>
After a divide, we mark that we no longer have a mal (Memory ALlocation)
active.  If the parent is reset (i.e., we have two offspring, not a parent
and child) we need to make sure not to advance the IP of the parent.  The
reset parent has its IP placed at the beginning of its genome, and we want
to leave it there to execute the very first instruction.

<p>
Finally, we tell the organism to activate the divide and do something with
the child.  It gives the child to the population (or the test CPU, as the case
may be) to be dealt with, and resets the parent if we're splitting into two
offspring.

<p>
We will examine the population in a future class, where I'll go into more
detail as to how the child organism is placed.  By default, the placing of
an organism in the population can cause the removal of an organism that is
already there, and this is the only way that an organism can be killed off.

<br><br><br>
<h3>5. 
Other Bits</h3>

<p>
In the description of this life-cycle, one issue that I have not discussed is
where in the life cycle these organisms perform their computations.  In truth,
there isn't a fixed time, other than that it must be before the divide occurs,
since merit is recalculated on a divide.  In practice the computation will
typically be placed right before the copy loop, but there are plenty of
exceptions.

<p>
Ideally, in the longer term, an organism's life will be composed of much
more than just replication and computations -- they will have to interact with
each other and have more interactions with the environment.  In a
multi-threaded model, organisms will be doing many activities at the same
time.
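
<p>
As a closing aside, the loop-unrolling trade-off from section 3 can be checked
with a little arithmetic.  The sketch below is not Avida code; the function
names <tt>ExecutedPerCopied</tt> and <tt>LengthFitsLoop</tt> are hypothetical,
and it assumes a copy loop made of some number of <tt>h-copy</tt> instructions
followed by exactly one <tt>if-label</tt> and one <tt>mov-head</tt>.

```cpp
#include <cassert>

// Hypothetical helper (not part of Avida): average number of instructions
// executed per instruction copied, assuming each pass through the loop runs
// `copies` h-copy instructions plus one if-label and one mov-head.
double ExecutedPerCopied(int copies)
{
  assert(copies > 0);
  const int loop_overhead = 2;  // if-label + mov-head
  return static_cast<double>(copies + loop_overhead) / copies;
}

// Hypothetical helper: the end-of-genome test runs only once per pass, so
// (as section 3 notes for two copies) the genome length must be a multiple
// of the number of h-copy instructions for the test to land at the right time.
bool LengthFitsLoop(int genome_length, int copies)
{
  assert(copies > 0);
  return genome_length % copies == 0;
}
```

<p>
Under these assumptions, one <tt>h-copy</tt> per pass costs three executed
instructions per copied instruction and two per pass costs two, matching the
figures in section 3.  Each further <tt>h-copy</tt> saves less than the one
before it, which is why the benefit of unrolling tapers off.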