This is the base of a paper I wrote with Horst.
It provides a good introduction to Dillo's internals.
(Highly recommended if you plan to patch or develop in Dillo)

--Jcid

-----------------------------------------------------
Parallel network programming of the Dillo web browser
-----------------------------------------------------

 Jorge Arellano-Cid <jcid@inf.utfsm.cl>
 Horst H. von Brand <vonbrand@inf.utfsm.cl>


--------
Abstract
--------

   Network programs face several delay sources when sending or
retrieving data. This is particularly problematic in programs
which interact directly with the user, most notably web browsers.
We present a hybrid approach using threads communicating through
pipes and signal-driven I/O, which allows a non-blocking main
thread and overlapping waiting times.


------------
Introduction
------------

   The Dillo project didn't start from scratch; it grew mainly
out of the code base of gzilla (a light web browser written by
Raph Levien). As the project progressed, the whole source was
standardized, and the networking engine was replaced with a new,
faster design. The source code is currently in alpha test, and is
available at <http://dillo.sourceforge.net> under the GNU General
Public License.

   This paper covers basic design aspects of the hybrid approach
that the Dillo web browser uses to solve several latency
problems. After introducing the main delay sources, the main
points of the hybrid design will be addressed.


-------------
Delay sources
-------------

   Network programs face several delay sources while sending or
retrieving data. In the particular case of a web browser, they
are found in:

  DNS querying:
    The time required to resolve a name.
  Initiating the TCP connection:
    The three-way handshake of the TCP protocol.
  Sending the query:
    The time spent uploading queries to the remote server.
  Retrieving data:
    The time spent waiting for and receiving the query answer.
  Closing the TCP connection:
    The four-packet closing sequence of the TCP protocol.

   In a WAN context, every single item of this list has an
associated delay that is non-deterministic and often measured in
seconds. If we add several connections per browsed page (each one
requiring at least the last four steps), the total latency can be
considerable.


-----------------------------------
The traditional (blocking) approach
-----------------------------------

   The main problems with the blocking approach are:

     When issuing an operation that can't be completed
     immediately, the process is put to sleep waiting for
     completion, and the program doesn't do any other
     processing in the meantime.

     When waiting for a specific socket operation to complete,
     packets that belong to other connections may be arriving,
     and have to wait for service.

     Web browsers handle many small transactions; if waiting
     times are not overlapped, the latency perceived by the
     user can be very annoying.

     If the user interface is just put to sleep during network
     operations, the program becomes unresponsive, confusing
     and perhaps alarming the user.

     Not overlapping waiting times and processing makes
     graphical rendering (which is arguably the central function
     of a browser) unnecessarily slow.


---------------------
Dillo's hybrid design
---------------------

   Dillo uses threads and signal-driven I/O extensively to
overlap waiting times and computation. Handling the user
interface in a thread that never blocks gives a good interactive
``feel.'' The use of GTK+, a sophisticated widget framework for
graphical user interfaces, helped very much to accomplish this
goal. The whole interface, rendering and I/O engine was built
upon its facilities.

   The design is said to be ``hybrid'' because it uses threads
for DNS querying and reading local files, and signal-driven I/O
for TCP connections.
The threaded DNS scheme is potentially concurrent (this depends
on underlying hardware), while the I/O handling (both local
files and remote connections) is definitely parallel.

   To simplify the structure of the browser, local files are
encapsulated into HTTP streams and presented to the rest of the
browser as such, in exactly the same way a remote connection is
handled. To create this illusion, a thread is launched. This
thread opens a pipe to the browser, synthesizes an appropriate
HTTP header, and sends it together with the file to the browser
proper. In this way, all the browser sees is a handle; the data
on it can come from a remote connection or from a local file.

   Handling a remote connection is more complex. In this case,
the browser asks the cache manager for the URL. The name in the
URL has to be resolved through the DNS engine, a TCP socket
connection must be established, the HTTP request has to be sent,
and finally the result retrieved. Each of the steps mentioned
could give rise to errors, which have to be handled and somehow
communicated to the rest of the program. For performance reasons,
it is critical that responses are cached locally, so the remote
connection doesn't directly hand over the data to the browser;
the response is passed to the cache manager, which then relays it
to the rest of the browser. The DNS engine caches DNS responses,
and answers requests either from the cache or by querying the
DNS. Querying is done in a separate thread, so that the rest of
the browser isn't blocked by long waits here.

   The activities mentioned do not happen strictly in the order
stated above. It is even possible that several URLs are being
handled at the same time, in order to overlap waiting and
downloading. The functions called directly from the user
interface have to return quickly to maintain interactive
response.
Sometimes they return connection handlers that haven't been
completely set up yet. As stated, I/O is signal-driven: when
one of the descriptors is ready for data transfer (reading or
writing), it wakes up the I/O engine.

   Data transfer between threads inside the browser is handled by
pipes; shared memory is little used. This almost obviates the
need for explicit synchronization, which is one of the main areas
of complexity and bugs in concurrent programs. Dillo handles its
threads in such a way that its developers can think of it as
running on a single thread of control. This is accomplished by
making the DNS engine call-backs happen within the main thread,
and by isolating file loading with pipes.

   Using threads in this way has three big advantages:

     The browser doesn't block when one of its child threads
     blocks. In particular, the user interface is responsive
     even while resolving a name or downloading a file.

     Developers don't need to deal with complex concurrency
     concerns. Concurrency is hard to handle, and few developers
     are adept at it. This gives access to a much larger pool of
     potential developers, something which can be critical
     in an open-source development project.

     By making the code mostly sequential, debugging the code
     with traditional tools like gdb is possible. Debugging
     parallel programs is very hard, and appropriate tools are
     hard to come by.

   Because of simplicity and portability concerns, DNS querying
is done in a separate thread. The standard C library doesn't
provide a function for making DNS queries that doesn't block. The
alternative is to implement a new, custom DNS querying function
that doesn't block. This is certainly a complex task; integrating
the blocking query into the thread structure of the program is
much simpler.

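   A minimal sketch of this idea, written in Python rather than
Dillo's C and assuming a POSIX system: a blocking resolver call
runs in a worker thread and hands its answer to the main thread
through a pipe, so no shared state needs locking. The names
start_lookup and poll_lookup are hypothetical, and the pipe
hand-off is only an illustration; it is not claimed to be how
Dillo's DNS engine reports its results.

```python
import os
import select
import socket
import threading


def start_lookup(hostname, resolver=socket.gethostbyname):
    """Start a blocking name lookup in a worker thread.

    Returns the read end of a pipe. The worker writes the address
    (or an empty line on failure) to the pipe and closes its end.
    `resolver` is a hypothetical hook so the lookup can be replaced.
    """
    read_fd, write_fd = os.pipe()

    def worker():
        try:
            address = resolver(hostname)   # may block for seconds
        except OSError:
            address = ""                   # empty answer signals failure
        os.write(write_fd, address.encode("ascii") + b"\n")
        os.close(write_fd)

    threading.Thread(target=worker, daemon=True).start()
    return read_fd


def poll_lookup(read_fd, timeout=0.0):
    """Return the answer if it is ready, else None.

    Waits at most `timeout` seconds, so a caller using the default
    of zero never blocks.
    """
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return None
    data = os.read(read_fd, 256)
    os.close(read_fd)
    return data.decode("ascii").strip()
```

   The main thread can call poll_lookup() between other tasks;
only the worker thread ever waits on the resolver.
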
   Using a thread and a pipe to read a local file adds a
buffering step to the process (and a certain latency), but it has
a couple of significant advantages:

     By handling local files in the same way as remote
     connections, a significant amount of code is reused.

     A preprocessing step of the file data can be added easily,
     if needed. In fact, the file is encapsulated into an HTTP
     data stream.


-----------
DNS queries
-----------

   Dillo handles DNS queries with threads, letting a child thread
wait until the DNS server answers the request. When the answer
arrives, a call-back function is called, and the program resumes
what it was doing at DNS-request time. The interesting thing is
that the call-back happens in the main thread, while the child
thread simply exits when done. This is implemented through a
server-channel design.


The server channel
------------------

   There is one thread for each channel, and each channel can
have multiple clients. When the program requests an IP address,
the server first looks for a cached match. If it hits, the client
call-back is invoked immediately. If not, the client is put
into a queue, a thread is spawned to query the DNS, and a GTK+
idle client is set to poll the channel 5 times per second for
completion. When the query finally succeeds, every client of
that channel is serviced.

   This scheme allows all further processing to continue on
the same thread it began on: the main thread.
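
   The channel scheme described above can be sketched as follows.
This is Python, not Dillo's C, and it is only an illustration
under stated assumptions: DnsServer, resolve and poll are
hypothetical names, one channel per host name is assumed, and a
plain poll() call stands in for the GTK+ idle client.

```python
import socket
import threading


class DnsServer:
    """Hypothetical sketch: cache, per-host channels, client queues."""

    def __init__(self, resolver=socket.gethostbyname):
        self.resolver = resolver
        self.cache = {}      # hostname -> address (main thread only)
        self.channels = {}   # hostname -> pending channel state

    def resolve(self, hostname, callback):
        """Request an address; callback(hostname, address) is run
        in the thread that calls resolve() or poll()."""
        if hostname in self.cache:            # cache hit: answer at once
            callback(hostname, self.cache[hostname])
            return
        channel = self.channels.get(hostname)
        if channel is None:                   # first client opens a channel
            channel = {"clients": [], "done": threading.Event(),
                       "address": None}
            self.channels[hostname] = channel
            threading.Thread(target=self._query,
                             args=(hostname, channel),
                             daemon=True).start()
        channel["clients"].append(callback)   # queue the client

    def _query(self, hostname, channel):
        """Worker thread: do the blocking lookup, publish it, exit."""
        try:
            channel["address"] = self.resolver(hostname)
        except OSError:
            channel["address"] = None         # None signals failure
        channel["done"].set()

    def poll(self):
        """Call periodically from the main loop to service every
        channel whose query has finished."""
        for hostname in list(self.channels):
            channel = self.channels[hostname]
            if not channel["done"].is_set():
                continue                      # still waiting on the DNS
            del self.channels[hostname]
            address = channel["address"]
            if address is not None:
                self.cache[hostname] = address
            for callback in channel["clients"]:
                callback(hostname, address)
```

   In a real main loop, poll() would be driven by a timer or idle
handler about five times per second; the worker thread never
invokes a client call-back itself.
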