One approach would be to have a central controller (i.e. a receptionist), who knows which clients are waiting and which servers are free, but I don't really think the complexity is justified at this stage.

Imagine if the clients sat so that they could see which doctor had their door open and was ready to accept a new patient. The first client who sees that then gets up to go through that door. There is a possibility of a race when two patients head for the door at the same time, but we just need to make sure that only one of them wins, and that the other returns to her seat and keeps looking rather than getting stuck. Ideally this will be built on top of some mechanism that does not rely on polling.

I had wondered whether it would work to use refused TCP connections to indicate that a server's door is closed, but I think that is no good. It seems that at least on Linux, and probably on other platforms, you cannot set the TCP SYN backlog down to zero for a socket. The kernel will still accept new connections on behalf of the process if it is listening, even if it has asked for no backlog and is not accepting them yet; netstat shows these connections as established even though the server never accepted them. It looks like the only way to reliably have the server turn away connections is to either close its listening socket when it's too busy, or drop connections. This would work OK, but it forces the client into retrying, which is inefficient and ugly.

Suppose clients connect and then wait for a prompt from the server before they begin to send. For multiple servers the client would keep opening connections to new machines until it got an invitation to send a job. This requires a change to the protocol, but it can be made backward compatible if necessary, though perhaps that's not necessary. This would have the advantage of working over either TCP or SSH. The main problem is that the client will potentially need to open connections to many machines before it can proceed. We almost certainly need to do this with nonblocking IO, but that should be reasonably portable. Local compilation needs to be handled by lockfiles or some similar mechanism.

So in pseudocode this will be something like (see the C sketch at the end of this section):

  looking_fds = []
  while not accepted:
      select() on looking_fds:
          if any have failed, remove them
          if any have sent an invitation:
              close all others
              use the accepted connection
      open a new connection

I'm not sure if connections should be opened in random order or in the order they're listed. Clients are almost certainly not going to be accepted in the order in which they arrive. If the client sends its job early then it doesn't hurt anybody else. I suppose it could open a lot of connections, but that sort of fairness issue is not really something that distcc needs to handle. (Just block the user if they misbehave.)

We can't use select() to check for the ability to run a process locally. Perhaps the select() needs to time out, and we can then, say, check the load average.
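As a rough illustration of the connect-and-wait-for-an-invitation idea, here is a minimal C sketch. It is not distcc code: the addresses, the use of port 3632 (distcc's usual TCP port) and the single "invitation" byte are assumptions made up for the example. It opens nonblocking connections to a few candidate servers and then select()s until one of them speaks first:

  /*
   * Sketch only: race several volunteer servers and take the first one
   * that sends an "invitation" byte.  The addresses, the port and the
   * one-byte invitation are invented for the example.
   */
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  static int open_nonblocking(const char *addr, int port)
  {
      struct sockaddr_in sa;
      int fd = socket(AF_INET, SOCK_STREAM, 0);

      if (fd == -1)
          return -1;
      fcntl(fd, F_SETFL, O_NONBLOCK);
      memset(&sa, 0, sizeof sa);
      sa.sin_family = AF_INET;
      sa.sin_port = htons(port);
      inet_pton(AF_INET, addr, &sa.sin_addr);
      /* EINPROGRESS is the normal result for a nonblocking connect. */
      if (connect(fd, (struct sockaddr *) &sa, sizeof sa) == -1
          && errno != EINPROGRESS) {
          close(fd);
          return -1;
      }
      return fd;
  }

  int main(void)
  {
      /* Hypothetical volunteers; a real client would use its host list. */
      static const char *const hosts[] = { "10.0.0.11", "10.0.0.12", "10.0.0.13" };
      const int nhosts = (int) (sizeof hosts / sizeof hosts[0]);
      int fds[16], nfds = 0, winner = -1, i;

      for (i = 0; i < nhosts; i++) {
          int fd = open_nonblocking(hosts[i], 3632);
          if (fd != -1)
              fds[nfds++] = fd;
      }

      while (winner == -1 && nfds > 0) {
          fd_set readable;
          int maxfd = -1;

          FD_ZERO(&readable);
          for (i = 0; i < nfds; i++) {
              FD_SET(fds[i], &readable);
              if (fds[i] > maxfd)
                  maxfd = fds[i];
          }
          /* Wait until some server either speaks or fails.  A fuller
           * version would also watch the write set so that failed
           * connects are noticed promptly on every platform. */
          if (select(maxfd + 1, &readable, NULL, NULL, NULL) == -1)
              break;

          for (i = 0; i < nfds && winner == -1; i++) {
              char byte;
              if (!FD_ISSET(fds[i], &readable))
                  continue;
              if (read(fds[i], &byte, 1) == 1) {
                  winner = fds[i];           /* got an invitation */
              } else {
                  close(fds[i]);             /* refused, reset or closed */
                  fds[i--] = fds[--nfds];    /* drop it and keep looking */
              }
          }
      }

      for (i = 0; i < nfds; i++)
          if (fds[i] != winner)
              close(fds[i]);

      if (winner != -1)
          printf("would send the job on fd %d\n", winner);
      return 0;
  }

A real client would also need the lockfile slot for local compilation, a connect timeout, and some limit on how many connections it holds open at once.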
problems with new protocol

Does anyone actually want this? I really need an example of somewhere where it would be useful.

The server may need to know the right extension for the temporary file to make the compiler behave in the right way. In fact, knowing the acceptable temporary filenames is part of the application definition.

Compression

Can compression automatically be turned on, rather than requiring user configuration? I can't tell at the moment when would be the right time to do that. Is it cheap enough to always have it on? We not only pay the cost of compression, but we also need to give up on using sendfile() and therefore pay for more kernel-userspace transitions and some data copying. Therefore probably not, at least for GigE.

User Manual

The UML manual is very good.

 - Add some documentation of the benchmark system. Does this belong in the manual, or in a separate manual?

 - FAQ: Can't you check the gcc version? No, because gcc programs which report the same version number can have different behaviours, perhaps due to vendor/distributor patches.

Just cpp and linker?

Is it easy to describe how to install only the bits of gcc needed for distcc clients? Basically the driver, headers, linker, and specs. Would this save much space?

Certainly installing gcc is much easier than installing a full cross-development environment, because you don't need headers or libraries. So if you have a target machine that is a bit slower but not terrible (or you don't have many of them), it might be convenient to do most of your builds on the target, but rely on helpers with cross-compilers to help out.

-g support

I'm told that gcc may fix this properly in a future release. There would then be no need to kludge around it in distcc.

Perhaps detect the -g option, and then absolutify filenames passed to the compiler. This will cause absolute filenames to appear in error messages, but I don't see any easy way to have both correct stabs info and also correct error messages. Is anything else wrong with this approach?

kill compiler

If the client is killed, it will close the connection. The server ought to kill the compiler so as to prevent runaway processes on the server. This probably involves select()ing for read on the connection. The compilation will complete relatively soon anyhow, so it's not worth doing this unless there is a simple implementation.
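For what it's worth, the simple implementation could be little more than the following sketch. net_fd and child_pid are invented parameter names for the client connection and the running compiler (not distcc's actual interface), and the once-a-second waitpid() poll is just to keep the sketch free of SIGCHLD handling:

  /*
   * Sketch: kill the compiler if the client disappears first.  net_fd and
   * child_pid are invented names, not distcc's.
   */
  #include <signal.h>
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <sys/wait.h>
  #include <unistd.h>

  /* Returns the compiler's wait status, or -1 if the compiler was killed
   * because the client closed the connection before it finished. */
  int wait_for_compiler_or_client(int net_fd, pid_t child_pid)
  {
      for (;;) {
          fd_set readable;
          struct timeval poll_interval = { 1, 0 };  /* recheck the child every second */
          int status;
          pid_t ret;

          ret = waitpid(child_pid, &status, WNOHANG);
          if (ret == child_pid)
              return status;                 /* compiler finished on its own */
          if (ret == -1)
              return -1;                     /* no such child; give up */

          FD_ZERO(&readable);
          FD_SET(net_fd, &readable);
          if (select(net_fd + 1, &readable, NULL, NULL, &poll_interval) > 0) {
              char byte;
              /* The client has nothing to say at this point, so readability
               * almost certainly means EOF: it was killed or gave up. */
              if (recv(net_fd, &byte, 1, MSG_PEEK) <= 0) {
                  kill(child_pid, SIGTERM);
                  waitpid(child_pid, &status, 0);
                  return -1;
              }
          }
      }
  }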
tcp fiddling

I wonder if increasing the maximum window size (the net.core.wmem_default sysctl, etc.) will help anything? It's probably dominated by scheduling inefficiency at the moment. The client does seem to spend time in wait_for_tcp_memory, which might benefit from increasing the available memory.

benchmark

Try aspell and xmms, which may have strange Makefiles. Also try:

  glibc  gtk/glib  glibc++  qt  gcc  gdb  linux  openoffice  mozilla

rsync-like distributed caching

Look in the remote machine's cache as well. Perhaps use a Squid-like broadcast of the file digest and other critical details to find out if any machine in the workgroup has the file cached. Perhaps this could be built on top of a more general file-caching mechanism that maps from hash to body. At the moment this sounds like premature optimization.

Send source as an rdiff against the previous version. This needs to be able to fall back to just sending plain text, of course. Perhaps use different compression for source and binary. librsync is probably not stable enough to do this very well.

--ping option

It would be nice to have a --ping client option to contact all the remote servers, and perhaps return some kind of interesting information. Output should be machine-parseable, e.g. to use in removing unreachable machines from the host list. Perhaps send little fixed signatures, based on --version. Would this ever be useful?

non-CC-specific Protocol

Perhaps rather than getting the server to reinterpret the command line, we should mark the input and output parameters on the client. So what's sent across the network might be

  distcc -c @@INPUT@@ -o @@OUTPUT@@

although it's probably better to add additional protocol sections saying which words are the input and output files than to use magic values.

The attraction is that this would allow a particularly knotty part of the code to be included only in the client and run only once. If any bugs are fixed in it, then only the client will need to be upgraded. This might remove most of the gcc-specific knowledge from the server, and different clients might be used to support various very different distributable jobs. We ought to allow for running commands that don't take an input or output file, in case we want to run "gcc --version".

The drawback is that new servers probably need to be installed to handle the new protocol version. I don't know if there's really a compelling reason to do this. If the argument parser depends on things that can only be seen on the client, such as checking whether files exist, then this may be needed. The server needs to use an appropriately-named temporary file.

gcc weirdnesses

distcc needs to handle $COMPILER_PATH and $GCC_EXEC_PREFIX in some sensible way, if there is one. Not urgent, because I have never heard of them being used.

networking timeouts

Also, we want a timeout for name resolution. The GNU resolver has a specific feature to do this. On other systems we probably need to use alarm(), but that might be more trouble than it is worth.

Jonas Jensen says: Timing out the connect call could be done more easily than this, just by interrupting it with a SIGALRM, but that's not enough to abort gethostbyname. This method of longjmp'ing from a signal handler is what they use in curl, so it should be OK.
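To make that concrete, here is a minimal sketch of the alarm()-plus-siglongjmp() approach. The function name, host name and ten-second budget are invented, and it is only as safe as jumping out of gethostbyname() happens to be on a given platform, which is exactly the "more trouble than it is worth" risk above:

  /*
   * Sketch: bound the time spent in gethostbyname() with alarm() and
   * siglongjmp(), roughly as curl does.  The function name, host name
   * and timeout are invented.
   */
  #include <netdb.h>
  #include <setjmp.h>
  #include <signal.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  static sigjmp_buf dns_timed_out;

  static void alarm_handler(int sig)
  {
      (void) sig;
      siglongjmp(dns_timed_out, 1);          /* abandon the lookup */
  }

  /* Returns the host entry, or NULL if resolution failed or took longer
   * than timeout_secs. */
  static struct hostent *resolve_with_timeout(const char *name,
                                              unsigned timeout_secs)
  {
      struct hostent *volatile he = NULL;    /* volatile: survives the longjmp */
      struct sigaction act;

      memset(&act, 0, sizeof act);
      act.sa_handler = alarm_handler;
      sigemptyset(&act.sa_mask);
      sigaction(SIGALRM, &act, NULL);

      if (sigsetjmp(dns_timed_out, 1) == 0) {
          alarm(timeout_secs);
          he = gethostbyname(name);          /* may be abandoned by the handler */
      } else {
          fprintf(stderr, "timed out resolving %s\n", name);
      }

      alarm(0);                              /* cancel any pending alarm */
      /* A real version would restore the previous SIGALRM disposition here. */
      return he;
  }

  int main(void)
  {
      struct hostent *he = resolve_with_timeout("volunteer.example.com", 10);

      if (he)
          printf("resolved %s\n", he->h_name);
      return 0;
  }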
configurable timeout?

Maybe make the various timeouts configurable? Isn't it possible to choose values that suit everyone? Maybe the initial connection timeout should be shorter?

waitstatus

Make sure that native waitstatus formats are the same as the Unix/Linux/BSD formats used on the wire. (See <http://www.opengroup.org/onlinepubs/007904975/functions/wait.html>, which says they may only be interpreted by macros.) I don't know of any system where they're different.

override compiler name

distcc could support cross-compilation by a per-volunteer option to override the compiler name. On the local host it might invoke gcc directly, but on some volunteers it might be necessary to specify a more detailed description of the compiler to get the appropriate cross tool.

This might be insufficient for Makefiles that need to call several different compilers, perhaps gcc and g++ or different versions of gcc. Perhaps they can make do with changing the DISTCC host settings at appropriate times. I'm not convinced this complexity is justified. Rusty is doing this in ccontrol, which is possibly a better place for it.

use spawn() on Windows

fork() is very slow. Can we get away with only using spawn()?

Installable package for Windows

Also, it would be nice to have an easily installable package for Windows that makes the machine a Cygwin-based compile volunteer. It probably needs to include cross-compilers for Linux (or whatever), or at least simple instructions for building them.

autodetection (Rendezvous, etc)

http://dotlocal.org/mdnsd/ -- the Apple licence is apparently not GPL compatible. Brad reckons SLP is a better fit.

Automatic detection ("zero configuration") of compile volunteers is probably not a good idea, because it might be complicated to implement, and would possibly cause breakage by distributing to machines which are not properly configured.

OpenMOSIX autodiscovery

What is this?

central configuration

Notwithstanding the previous point, centralized configuration for a site would be good, and probably quite practical. Setting up a list of machines centrally, rather than configuring each one, sounds more friendly.

The most likely design is to use DNS SRV records (RFC 2052), or perhaps multi-RR A records, so that for example compile.ozlabs.foo.com would resolve to all relevant machines. Another possibility would be to use SLP, the Service Location Protocol, but that adds a larger dependency and it seems not to be widely deployed.
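As an illustration of the multi-RR A record option, the client could expand one central name into a volunteer list using nothing but the ordinary resolver; the name below is the hypothetical example used above:

  /*
   * Sketch: expand a single DNS name carrying several A records into a
   * list of candidate volunteers.  The name is the hypothetical one above.
   */
  #include <netdb.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  int main(void)
  {
      struct hostent *he = gethostbyname("compile.ozlabs.foo.com");
      int i;

      if (!he || he->h_addrtype != AF_INET) {
          fprintf(stderr, "no central host list found\n");
          return 1;
      }

      /* Each address would become one entry in the client's host list. */
      for (i = 0; he->h_addr_list[i]; i++) {
          struct in_addr addr;
          memcpy(&addr, he->h_addr_list[i], sizeof addr);
          printf("volunteer: %s\n", inet_ntoa(addr));
      }
      return 0;
  }

SRV records would additionally let the site publish the port and a preference order, at the cost of needing a resolver library that can query them.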
Large-scale Distribution

distcc in its present form works well on small numbers of nearby machines owned by the same people. It might be an interesting project to investigate scaling up to large numbers of machines, which potentially do not trust each other. This would make distcc somewhat more like other "peer-to-peer" systems such as Freenet and Napster.

preprocess remotely

Some people might like to assume that all the machines have the same headers installed, in which case we really can preprocess remotely and only ship the source. Imagine e.g. a ClearCase environment where the same filesystem view is mounted on all machines, and they're all running exactly the same system release. It's probably not really a good idea, because it will be marginally faster but much more risky. It is possible, though, and perhaps people building files with enormous headers would like it. Perhaps those people should just use a different tool like dmake, etc.

Local variables:
mode: indented-text
indent-tabs-mode: nil
End: