One approach would be to have a central controller (i.e. a receptionist), who knows which clients are waiting and which servers are free, but I don't really think the complexity is justified at this stage.

Imagine if the clients sat so that they could see which doctor had their door open and was ready to accept a new patient. The first client who sees that then gets up to go through that door. There is a possibility of a race when two patients head for the door at the same time, but we just need to make sure that only one of them wins, and that the other returns to her seat and keeps looking rather than getting stuck. Ideally this will be built on top of some mechanism that does not rely on polling.

I had wondered whether it would work to use refused TCP connections to indicate that a server's door is closed, but I think that is no good. It seems that at least on Linux, and probably on other platforms, you cannot set the TCP SYN backlog down to zero for a socket. The kernel will still accept new connections on behalf of the process if it is listening, even if it has asked for no backlog and is not accepting them yet; netstat shows these connections as established even though the server never accepted them. It looks like the only way to reliably have the server turn away connections is to either close its listening socket when it's too busy, or drop connections. This would work OK, but it forces the client into retrying, which is inefficient and ugly.

Suppose clients connect and then wait for a prompt from the server before they begin to send. For multiple servers the client would keep opening connections to new machines until it got an invitation to send a job. This requires a change to the protocol, but it can be made backward compatible if necessary, though perhaps that's not necessary. This would have the advantage of working over either TCP or SSH. The main problem is that the client will potentially need to open connections to many machines before it can proceed. We almost certainly need to do this with nonblocking IO, but that should be reasonably portable. Local compilation needs to be handled by lockfiles or some similar mechanism.

So in pseudocode this will be something like (see the C sketch at the end of this section):

  looking_fds = []
  while not accepted:
      select() on looking_fds:
          if any have failed, remove them
          if any have sent an invitation:
              close all others
              use the accepted connection
      open a new connection

I'm not sure if connections should be opened in random order or in the order they're listed. Clients are almost certainly not going to be accepted in the order in which they arrive. If the client sends its job early then it doesn't hurt anybody else. I suppose it could open a lot of connections, but that sort of fairness issue is not really something that distcc needs to handle. (Just block the user if they misbehave.)

We can't use select() to check for the ability to run a process locally. Perhaps the select() needs to time out, and we can then, say, check the load average.
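As a rough illustration of the connect-and-wait-for-an-invitation idea, here is a minimal C sketch. It is not distcc code: the addresses, the use of port 3632 (distcc's usual TCP port) and the single "invitation" byte are assumptions made up for the example. It opens nonblocking connections to a few candidate servers and then select()s until one of them speaks first:

  /*
   * Sketch only: race several volunteer servers and take the first one
   * that sends an "invitation" byte.  The addresses, the port and the
   * one-byte invitation are invented for the example.
   */
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  static int open_nonblocking(const char *addr, int port)
  {
      struct sockaddr_in sa;
      int fd = socket(AF_INET, SOCK_STREAM, 0);

      if (fd == -1)
          return -1;
      fcntl(fd, F_SETFL, O_NONBLOCK);
      memset(&sa, 0, sizeof sa);
      sa.sin_family = AF_INET;
      sa.sin_port = htons(port);
      inet_pton(AF_INET, addr, &sa.sin_addr);
      /* EINPROGRESS is the normal result for a nonblocking connect. */
      if (connect(fd, (struct sockaddr *) &sa, sizeof sa) == -1
          && errno != EINPROGRESS) {
          close(fd);
          return -1;
      }
      return fd;
  }

  int main(void)
  {
      /* Hypothetical volunteers; a real client would use its host list. */
      static const char *const hosts[] = { "10.0.0.11", "10.0.0.12", "10.0.0.13" };
      const int nhosts = (int) (sizeof hosts / sizeof hosts[0]);
      int fds[16], nfds = 0, winner = -1, i;

      for (i = 0; i < nhosts; i++) {
          int fd = open_nonblocking(hosts[i], 3632);
          if (fd != -1)
              fds[nfds++] = fd;
      }

      while (winner == -1 && nfds > 0) {
          fd_set readable;
          int maxfd = -1;

          FD_ZERO(&readable);
          for (i = 0; i < nfds; i++) {
              FD_SET(fds[i], &readable);
              if (fds[i] > maxfd)
                  maxfd = fds[i];
          }
          /* Wait until some server either speaks or fails.  A fuller
           * version would also watch the write set so that failed
           * connects are noticed promptly on every platform. */
          if (select(maxfd + 1, &readable, NULL, NULL, NULL) == -1)
              break;

          for (i = 0; i < nfds && winner == -1; i++) {
              char byte;
              if (!FD_ISSET(fds[i], &readable))
                  continue;
              if (read(fds[i], &byte, 1) == 1) {
                  winner = fds[i];           /* got an invitation */
              } else {
                  close(fds[i]);             /* refused, reset or closed */
                  fds[i--] = fds[--nfds];    /* drop it and keep looking */
              }
          }
      }

      for (i = 0; i < nfds; i++)
          if (fds[i] != winner)
              close(fds[i]);

      if (winner != -1)
          printf("would send the job on fd %d\n", winner);
      return 0;
  }

A real client would also need the lockfile slot for local compilation, a connect timeout, and some limit on how many connections it holds open at once.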
problems with new protocol

Does anyone actually want this? I really need an example of somewhere where it would be useful.

The server may need to know the right extension for the temporary file to make the compiler behave in the right way. In fact, knowing the acceptable temporary filenames is part of the application definition.

Compression

Can compression automatically be turned on, rather than requiring user configuration? I can't tell at the moment when would be the right time to do that. Is it cheap enough to always have it on? We not only pay the cost of compression, but we also need to give up on using sendfile() and therefore pay for more kernel-userspace transitions and some data copying. Therefore probably not, at least for GigE.

User Manual

The UML manual is very good.

 - Add some documentation of the benchmark system. Does this belong in the manual, or in a separate manual?

 - FAQ: Can't you check the gcc version? No, because gcc programs which report the same version number can have different behaviours, perhaps due to vendor/distributor patches.

Just cpp and linker?

Is it easy to describe how to install only the bits of gcc needed for distcc clients? Basically the driver, headers, linker, and specs. Would this save much space?

Certainly installing gcc is much easier than installing a full cross-development environment, because you don't need headers or libraries. So if you have a target machine that is a bit slower but not terrible (or you don't have many of them), it might be convenient to do most of your builds on the target, but rely on helpers with cross-compilers to help out.

-g support

I'm told that gcc may fix this properly in a future release. There would then be no need to kludge around it in distcc.

Perhaps detect the -g option, and then absolutify filenames passed to the compiler. This will cause absolute filenames to appear in error messages, but I don't see any easy way to have both correct stabs info and also correct error messages. Is anything else wrong with this approach?

kill compiler

If the client is killed, it will close the connection. The server ought to kill the compiler so as to prevent runaway processes on the server. This probably involves select()ing for read on the connection. The compilation will complete relatively soon anyhow, so it's not worth doing this unless there is a simple implementation.
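For what it's worth, the simple implementation could be little more than the following sketch. net_fd and child_pid are invented parameter names for the client connection and the running compiler (not distcc's actual interface), and the once-a-second waitpid() poll is just to keep the sketch free of SIGCHLD handling:

  /*
   * Sketch: kill the compiler if the client disappears first.  net_fd and
   * child_pid are invented names, not distcc's.
   */
  #include <signal.h>
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <sys/wait.h>
  #include <unistd.h>

  /* Returns the compiler's wait status, or -1 if the compiler was killed
   * because the client closed the connection before it finished. */
  int wait_for_compiler_or_client(int net_fd, pid_t child_pid)
  {
      for (;;) {
          fd_set readable;
          struct timeval poll_interval = { 1, 0 };  /* recheck the child every second */
          int status;
          pid_t ret;

          ret = waitpid(child_pid, &status, WNOHANG);
          if (ret == child_pid)
              return status;                 /* compiler finished on its own */
          if (ret == -1)
              return -1;                     /* no such child; give up */

          FD_ZERO(&readable);
          FD_SET(net_fd, &readable);
          if (select(net_fd + 1, &readable, NULL, NULL, &poll_interval) > 0) {
              char byte;
              /* The client has nothing to say at this point, so readability
               * almost certainly means EOF: it was killed or gave up. */
              if (recv(net_fd, &byte, 1, MSG_PEEK) <= 0) {
                  kill(child_pid, SIGTERM);
                  waitpid(child_pid, &status, 0);
                  return -1;
              }
          }
      }
  }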
tcp fiddling

I wonder if increasing the maximum window size (the net.core.wmem_default sysctl, etc.) will help anything? It's probably dominated by scheduling inefficiency at the moment. The client does seem to spend time in wait_for_tcp_memory, which might benefit from increasing the available memory.

benchmark

Try aspell and xmms, which may have strange Makefiles. Also try:

  glibc  gtk/glib  glibc++  qt  gcc  gdb  linux  openoffice  mozilla

rsync-like distributed caching

Look in the remote machine's cache as well. Perhaps use a Squid-like broadcast of the file digest and other critical details to find out if any machine in the workgroup has the file cached. Perhaps this could be built on top of a more general file-caching mechanism that maps from hash to body. At the moment this sounds like premature optimization.

Send source as an rdiff against the previous version. This needs to be able to fall back to just sending plain text, of course. Perhaps use different compression for source and binary. librsync is probably not stable enough to do this very well.

--ping option

It would be nice to have a --ping client option to contact all the remote servers, and perhaps return some kind of interesting information. Output should be machine-parseable, e.g. to use in removing unreachable machines from the host list. Perhaps send little fixed signatures, based on --version. Would this ever be useful?

non-CC-specific Protocol

Perhaps rather than getting the server to reinterpret the command line, we should mark the input and output parameters on the client. So what's sent across the network might be

  distcc -c @@INPUT@@ -o @@OUTPUT@@

although it's probably better to add additional protocol sections saying which words are the input and output files than to use magic values.

The attraction is that this would allow a particularly knotty part of the code to be included only in the client and run only once. If any bugs are fixed in it, then only the client will need to be upgraded. This might remove most of the gcc-specific knowledge from the server, and different clients might be used to support various very different distributable jobs. We ought to allow for running commands that don't take an input or output file, in case we want to run "gcc --version".

The drawback is that new servers probably need to be installed to handle the new protocol version. I don't know if there's really a compelling reason to do this. If the argument parser depends on things that can only be seen on the client, such as checking whether files exist, then this may be needed. The server needs to use an appropriately-named temporary file.

gcc weirdnesses

distcc needs to handle $COMPILER_PATH and $GCC_EXEC_PREFIX in some sensible way, if there is one. Not urgent, because I have never heard of them being used.

networking timeouts

Also, we want a timeout for name resolution. The GNU resolver has a specific feature to do this. On other systems we probably need to use alarm(), but that might be more trouble than it is worth.

Jonas Jensen says: Timing out the connect call could be done more easily than this, just by interrupting it with a SIGALRM, but that's not enough to abort gethostbyname. This method of longjmp'ing from a signal handler is what they use in curl, so it should be OK.
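To make that concrete, here is a minimal sketch of the alarm()-plus-siglongjmp() approach. The function name, host name and ten-second budget are invented, and it is only as safe as jumping out of gethostbyname() happens to be on a given platform, which is exactly the "more trouble than it is worth" risk above:

  /*
   * Sketch: bound the time spent in gethostbyname() with alarm() and
   * siglongjmp(), roughly as curl does.  The function name, host name
   * and timeout are invented.
   */
  #include <netdb.h>
  #include <setjmp.h>
  #include <signal.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  static sigjmp_buf dns_timed_out;

  static void alarm_handler(int sig)
  {
      (void) sig;
      siglongjmp(dns_timed_out, 1);          /* abandon the lookup */
  }

  /* Returns the host entry, or NULL if resolution failed or took longer
   * than timeout_secs. */
  static struct hostent *resolve_with_timeout(const char *name,
                                              unsigned timeout_secs)
  {
      struct hostent *volatile he = NULL;    /* volatile: survives the longjmp */
      struct sigaction act;

      memset(&act, 0, sizeof act);
      act.sa_handler = alarm_handler;
      sigemptyset(&act.sa_mask);
      sigaction(SIGALRM, &act, NULL);

      if (sigsetjmp(dns_timed_out, 1) == 0) {
          alarm(timeout_secs);
          he = gethostbyname(name);          /* may be abandoned by the handler */
      } else {
          fprintf(stderr, "timed out resolving %s\n", name);
      }

      alarm(0);                              /* cancel any pending alarm */
      /* A real version would restore the previous SIGALRM disposition here. */
      return he;
  }

  int main(void)
  {
      struct hostent *he = resolve_with_timeout("volunteer.example.com", 10);

      if (he)
          printf("resolved %s\n", he->h_name);
      return 0;
  }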
configurable timeout?

Maybe make the various timeouts configurable? Isn't it possible to choose values that suit everyone? Maybe the initial connection timeout should be shorter?

waitstatus

Make sure that native waitstatus formats are the same as the Unix/Linux/BSD formats used on the wire. (See <http://www.opengroup.org/onlinepubs/007904975/functions/wait.html>, which says they may only be interpreted by macros.) I don't know of any system where they're different.

override compiler name

distcc could support cross-compilation by a per-volunteer option to override the compiler name. On the local host it might invoke gcc directly, but on some volunteers it might be necessary to specify a more detailed description of the compiler to get the appropriate cross tool.

This might be insufficient for Makefiles that need to call several different compilers, perhaps gcc and g++ or different versions of gcc. Perhaps they can make do with changing the DISTCC host settings at appropriate times. I'm not convinced this complexity is justified. Rusty is doing this in ccontrol, which is possibly a better place for it.

use spawn() on Windows

fork() is very slow. Can we get away with only using spawn()?

Installable package for Windows

Also, it would be nice to have an easily installable package for Windows that makes the machine a Cygwin-based compile volunteer. It probably needs to include cross-compilers for Linux (or whatever), or at least simple instructions for building them.

autodetection (Rendezvous, etc)

http://dotlocal.org/mdnsd/ -- the Apple licence is apparently not GPL compatible. Brad reckons SLP is a better fit.

Automatic detection ("zero configuration") of compile volunteers is probably not a good idea, because it might be complicated to implement, and would possibly cause breakage by distributing to machines which are not properly configured.

OpenMOSIX autodiscovery

What is this?

central configuration

Notwithstanding the previous point, centralized configuration for a site would be good, and probably quite practical. Setting up a list of machines centrally, rather than configuring each one, sounds more friendly.

The most likely design is to use DNS SRV records (RFC 2052), or perhaps multi-RR A records, so that for example compile.ozlabs.foo.com would resolve to all relevant machines. Another possibility would be to use SLP, the Service Location Protocol, but that adds a larger dependency and it seems not to be widely deployed.
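As an illustration of the multi-RR A record option, the client could expand one central name into a volunteer list using nothing but the ordinary resolver; the name below is the hypothetical example used above:

  /*
   * Sketch: expand a single DNS name carrying several A records into a
   * list of candidate volunteers.  The name is the hypothetical one above.
   */
  #include <netdb.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  int main(void)
  {
      struct hostent *he = gethostbyname("compile.ozlabs.foo.com");
      int i;

      if (!he || he->h_addrtype != AF_INET) {
          fprintf(stderr, "no central host list found\n");
          return 1;
      }

      /* Each address would become one entry in the client's host list. */
      for (i = 0; he->h_addr_list[i]; i++) {
          struct in_addr addr;
          memcpy(&addr, he->h_addr_list[i], sizeof addr);
          printf("volunteer: %s\n", inet_ntoa(addr));
      }
      return 0;
  }

SRV records would additionally let the site publish the port and a preference order, at the cost of needing a resolver library that can query them.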
Large-scale Distribution

distcc in its present form works well on small numbers of nearby machines owned by the same people. It might be an interesting project to investigate scaling up to large numbers of machines, which potentially do not trust each other. This would make distcc somewhat more like other "peer-to-peer" systems such as Freenet and Napster.

preprocess remotely

Some people might like to assume that all the machines have the same headers installed, in which case we really can preprocess remotely and only ship the source. Imagine e.g. a ClearCase environment where the same filesystem view is mounted on all machines, and they're all running exactly the same system release. It's probably not really a good idea, because it will be marginally faster but much more risky. It is possible, though, and perhaps people building files with enormous headers would like it. Perhaps those people should just use a different tool like dmake, etc.

Local variables:
mode: indented-text
indent-tabs-mode: nil
End: