------------------------
Handling TCP connections
------------------------

   Establishing a TCP connection requires the well-known three-way
handshake packet-sending sequence. Depending on network traffic
and several other issues, significant delay can occur at this
phase.

   Dillo handles the connection with a non-blocking socket scheme.
Basically, a socket file descriptor of AF_INET type is requested
and set to non-blocking I/O. When the DNS server has resolved the
name, the socket connection process begins by calling connect(2),
which returns immediately with an EINPROGRESS error. (We use the
Unix convention of identifying the manual section where the
concept is described, in this case section 2, system calls.)

   After the connection reaches the EINPROGRESS ``state,'' the
socket waits in the background until the connection succeeds (or
fails). When that happens, a callback function is invoked to
perform the following steps: set the I/O engine to send the query
and expect its answer (both in the background).

   The advantage of this scheme is that every required step is
done quickly without blocking the browser. Finally, the socket
will generate a signal whenever I/O is possible.

----------------
Handling queries
----------------

   In the case of an HTTP URL, queries typically translate into a
short transmission (the HTTP query) and a lengthy retrieval
process. Queries are not always short though, especially when
submitting forms (all the form data is attached within the
query), and also when requesting CGI programs.

   Regardless of query length, query sending is handled in the
background. The thread that was initiated at TCP connecting time
has all the transmission framework already set up; at this point,
packet sending is just a matter of waiting for the write signal
(G_IO_OUT) to come and then sending the data. When the socket
gets ready for transmission, the data is sent using
g_io_channel_write.

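   The connection scheme described under "Handling TCP
connections" can be sketched with plain POSIX calls. This is only
an illustration under stated assumptions, not Dillo's code: the
helper names start_connect and finish_connect are hypothetical,
the address is assumed to be an already resolved IPv4 string, and
poll(2) stands in for the G_IO_OUT watch that the main loop would
normally provide.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create an AF_INET socket, switch it to
 * non-blocking I/O and start the handshake.
 * Returns the descriptor, or -1 on an immediate failure. */
static int start_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd, flags;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    /* A return of 0 means "connected at once"; EINPROGRESS means the
     * handshake continues in the background. Anything else is fatal. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0 &&
        errno != EINPROGRESS) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait until the socket becomes writable, which
 * marks the end of the handshake, then fetch its outcome.
 * Returns 0 on success, an errno value on failure, or ETIMEDOUT if
 * nothing happened within timeout_ms. */
static int finish_connect(int fd, int timeout_ms)
{
    struct pollfd p;
    int err = 0, n;
    socklen_t len = sizeof err;

    p.fd = fd;
    p.events = POLLOUT;
    p.revents = 0;
    n = poll(&p, 1, timeout_ms);
    if (n < 0)
        return errno;
    if (n == 0)
        return ETIMEDOUT;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;
    return err;
}
```

   In a callback-driven design the body of finish_connect would
run inside the write-ready callback instead of waiting in poll(2);
the SO_ERROR check is what tells success from failure once the
descriptor is reported writable.
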
--------------
Receiving data
--------------

   Although conceptually similar to sending queries, retrieving
data is very different, as the data received can easily exceed
the size of the query by many orders of magnitude (for example
when downloading images or files). This is one of the main
sources of latency: the retrieval can take several seconds or
even minutes when downloading large files.

   The data retrieving process for a single file, which began by
setting up the expecting framework at TCP connecting time, simply
waits for the read signal (G_IO_IN). When it happens, the
low-level I/O engine gets called, the data is read into
pre-allocated buffers and the appropriate call-backs are
performed. Technically, whenever a G_IO_IN event is generated,
data is received from the socket file descriptor using the
g_io_channel_read function. This iterative process finishes upon
EOF (or on an error condition).

----------------------
Closing the connection
----------------------

   Closing a TCP connection requires four data segments, not an
impressive amount but twice the round trip time, which can be
substantial. When data retrieval finishes, socket closing is
triggered. This is nothing more than a g_io_channel_close call on
the socket's file descriptor. This process was originally
designed to split the four segment close into two partial closes,
one when query sending is done and the other when all data is in.
This scheme is not currently used because the write close also
stops the reading part.


The low-level I/O engine
------------------------

   Dillo I/O is carried out in the background. This is achieved
by using low level file descriptors and signals. Anytime a file
descriptor shows activity, a signal is raised and the signal
handler takes care of the I/O.

   The low-level I/O engine ("I/O engine" from here on) was
designed as an internal abstraction layer for background file
descriptor activity. It is intended to be used by the cache
module only; higher level routines should ask the cache for its
URLs.
Every operation that is meant to be carried out in the background
should be handled by the I/O engine. In the case of TCP sockets,
they are created and submitted to the I/O engine for any further
processing.

   The submitting process (client) must fill a request structure
and let the I/O engine handle the file descriptor activity, until
it receives a call-back for finally processing the data. This is
better understood by examining the request structure:

   typedef struct {
      gint Key;              /* Primary Key (for klist) */
      gint Op;               /* IORead | IOWrite | IOWrites */
      gint FD;               /* Current File Descriptor */
      gint Flags;            /* Flag array */
      glong Status;          /* Number of bytes read, or -errno code */

      void *Buf;             /* Buffer place */
      size_t BufSize;        /* Buffer length */
      void *BufStart;        /* PRIVATE: only used inside IO.c! */

      void *ExtData;         /* External data reference (not used by IO.c) */
      void *Info;            /* CCC Info structure for this IO */
      GIOChannel *GioCh;     /* IO channel */
   } IOData_t;

   To request an I/O operation, this structure must be filled and
passed to the I/O engine.

      'Op', 'Buf' and 'BufSize' MUST be provided.
      'ExtData' MAY be provided.
      'Status', 'FD' and 'GioCh' are set by I/O engine internal
      routines.

   When there is new data in the file descriptor, 'IO_callback'
gets called (by glib). Only after the I/O engine finishes
processing the data are the upper layers notified.


The I/O engine transfer buffer
------------------------------

   The 'Buf' and 'BufSize' fields of the request structure
provide the transfer buffer for each operation. This buffer must
be set by the client (to increase performance by avoiding copying
data).

   On reads, the client specifies the amount and where to place
the retrieved data; on writes, it specifies the amount and source
of the data segment that is to be sent. Although this scheme
increases complexity, it has proven very fast and powerful.

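   As a rough illustration of how a client might fill a read
request, and how an engine can read straight into the client's
buffer without copying, consider the sketch below. It uses a
simplified stand-in for IOData_t with plain C types (no glib).
The names MiniReq, mini_read_request and mini_io_read are
hypothetical and are not part of Dillo; only the roles of the
fields follow the request structure. For simplicity the sketch
lets the client pass the descriptor in directly.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Simplified, hypothetical stand-in for IOData_t (not Dillo's type). */
enum { MINI_IO_READ, MINI_IO_WRITE };

typedef struct {
    int    Op;       /* MINI_IO_READ or MINI_IO_WRITE */
    int    FD;       /* file descriptor to operate on */
    long   Status;   /* number of bytes read, or -errno code */
    void  *Buf;      /* client-owned transfer buffer */
    size_t BufSize;  /* room available in Buf */
} MiniReq;

/* Client side: fill the fields a read request needs. */
static void mini_read_request(MiniReq *req, int fd, void *buf, size_t size)
{
    req->Op = MINI_IO_READ;
    req->FD = fd;
    req->Status = 0;
    req->Buf = buf;
    req->BufSize = size;
}

/* Engine side: one read placed directly in the client's buffer.
 * Status becomes the byte count (0 on EOF) or -errno on failure. */
static void mini_io_read(MiniReq *req)
{
    ssize_t n;

    do {
        n = read(req->FD, req->Buf, req->BufSize);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted */

    req->Status = (n < 0) ? -(long)errno : (long)n;
}
```

   Because the engine writes into memory owned by the client, the
client can inspect 'Status' in its call-back and use the data in
place.
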
   For instance, when the size of a document is known in advance,
a buffer for all the data can be allocated at once, eliminating
the need for multiple memory reallocations. Even more, if the
size is known and the data transfer is taking the form of
multiple small chunks of data, the client only needs to update
'Buf' and 'BufSize' to point to the next byte in its large
preallocated reception buffer (by adding the chunk size to
'Buf'). On the other hand, if the size of the transfer isn't
known in advance, the reception buffer can remain untouched until
the connection closes, but the client must then accomplish the
usual buffer copying and reallocation.

   The I/O engine also lets the client specify a full length
transfer buffer when sending data. It doesn't matter (from the
client's point of view) if the data fits in a single packet or
not; it's the I/O engine's job to divide it into smaller chunks
if needed and to perform the operation accordingly.

------------------------------------------
Handling multiple simultaneous connections
------------------------------------------

   The previous sections describe the internal work for a single
connection; the I/O engine handles several of them in parallel.
This is the normal downloading behavior of a web page. Normally,
after retrieving the main document (HTML code), several
references to other files (typically images) and sometimes even
to other sites (mostly advertising today) are found inside the
page. In order to parse and complete the page rendering, those
other documents must be fetched and displayed, so it is not
uncommon to have multiple downloading connections (every one
requiring the whole fetching process) happening at the same time.

   Even though socket activity can reach a hectic pace, the
browser never blocks. Note also that the I/O engine is the one
that directs the execution flow of the program by triggering a
call-back chain whenever a file descriptor operation succeeds or
fails.

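   The idea of one engine driving many descriptors through
call-backs can be sketched as follows. This is a minimal
illustration, not Dillo's implementation: poll(2) stands in for
the glib main loop, and the names Watch, dispatch_once and
count_bytes are hypothetical.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WATCHES 16

/* Hypothetical per-descriptor record: a call-back and its data. */
typedef struct {
    int fd;
    void (*on_readable)(int fd, void *data);
    void *data;
} Watch;

/* Poll every watched descriptor once and run the call-back of each
 * one that reports input, hang-up or error.
 * Returns the number of call-backs run, or -1 on failure. */
static int dispatch_once(const Watch *w, size_t n, int timeout_ms)
{
    struct pollfd pfd[MAX_WATCHES];
    size_t i;
    int fired = 0;

    if (n > MAX_WATCHES)
        return -1;
    for (i = 0; i < n; i++) {
        pfd[i].fd = w[i].fd;
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }
    if (poll(pfd, (nfds_t)n, timeout_ms) < 0)
        return -1;
    for (i = 0; i < n; i++) {
        if (pfd[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            w[i].on_readable(w[i].fd, w[i].data);
            fired++;
        }
    }
    return fired;
}

/* Example call-back: perform a single read and return right away,
 * adding the number of bytes received to the long behind 'data'. */
static void count_bytes(int fd, void *data)
{
    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);

    if (n > 0)
        *(long *)data += (long)n;
}
```

   Each call-back does one short read and returns, so a slow
connection never delays the others; the loop simply moves on to
the next ready descriptor.
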
   A key point for this multiple call-back chained I/O engine is
that every single function in the chain must be guaranteed to
return quickly. Otherwise, the whole system blocks until it
returns.

-----------
Conclusions
-----------

   Dillo is currently in alpha tests. It already shows impressive
performance, and its interactive ``feel'' is much better than
that of other web browsers.

   The modular structure of Dillo, and its reliance on GTK+, allow
it to be very small. Not every feature of HTML-4.0 has been
implemented yet, but no significant problems are foreseen in
doing this.

   The fact that Dillo's central I/O engine is written using
advanced features of POSIX and TCP/IP networking makes its
performance possible, but on the other hand this also means that
only a fraction of the interested hackers are able to work on it.

   A simple code base is critical when trying to attract hackers
to work on a project like this one. Using the GTK+ framework
helped both in creating the graphical user interface and in
handling the concurrency inside the browser. By having threads
communicate through pipes, the need for explicit synchronization
is almost completely eliminated, and with it most of the
complexity of concurrent programming disappears.

   A clean, strictly applied layering approach based on clear
abstractions is vital in every programming project. A good,
supportive framework is of much help here.