README.overcast

Contained in this file is a brief discussion log about the comments on
MACEDON's implementation of the Overcast protocol. Prior to releasing this
MAC file, we sent it to Jannotti and Kaashoek for comments. These are the
comments and responses. The main differences between our implementations
are that we use distributed locks instead of ancestor lists, and that we
are not calculating/using bottleneck bandwidths.

Please send additional comments to macedon@kcubes.com

Thanks to John and Adolfo for these comments:

> Ok, here are Overcast specific comments. I may have misinterpreted things
> occasionally; I'll be happy to do a couple of rounds of this if I get a
> lot of corrections.
>
> As I mentioned in my last mail, Overcast doesn't have anything that
> explicitly limits the children of a node. So, in that sense, I can't
> complain about the behaviour of your code in that case. On the other
> hand, if we DID do that, we would probably hand the joiner the list of
> all children and leave it up to them what to do (not hand them a random
> node). That just feels like the spirit behind the rest of the protocol.

This is a common issue with many overlay protocols. That is, we have seen
the need to explicitly state the maximum number of peers that a node may
have, allowing us to identify clear communication and memory usage bounds.
As Chip has noted, we need to allow for protocols that do not fit this
paradigm, such as Overcast. At any rate, during our tests, we have at
times just set the maximum to something "really large" (bigger than the
number of nodes) to gauge the behavior we would see with unlimited
neighbor list sizes. As an interesting side note, in Overcast, allowing
the root to have a really large number of children slows the rate at which
nodes can move, since probes conflict at a very high rate (at the root).
In many cases, as we have noticed, it makes sense to just randomly get rid
of children fast. As you mention, we could hand the joining node a set of
nodes to consider; perhaps it measures bandwidth to them and picks the
best one, though this process may take some time (and we didn't want to
delay joining).

> What's the difference between a "join" and "add" message?

A join corresponds to entering the system for the first time. add
corresponds to making a move in the overlay once already joined. Since the
behavior is slightly different (a node making a move needs to remove
itself from the old parent, for example), we decided to just have
different messages. We just as easily could have done it with the same
message, I am sure.

> On join_reply, you seem to be handling the case of multiple parents. But
> you don't allow that, right?

You are correct, we don't allow it. Your confusion is likely due to a
much-hated hack that uses neighbor_random to get the handle of a singleton
peer. The random call simply returns the only one there is.

> I'm confused by replay_*. What is that all about?

This just prints out tracing messages that allow our debug code to
reconstruct how the overlay changed over time.

> What are the locks (lock_requested)? Are you trying to ensure that only
> one probe experiment happens at a time?

Yes, distributed "locks" do two things. First, we are coordinating probes
so that a node does not probe, and is not probed by, more than one node at
the same time, since that would negatively affect the validity of the
measurement (the concurrent probes would be competing for bandwidth on the
NIC and local LAN, etc.). Second, we use these locks to avoid cycles. A
node "locks" the nodes it will consider, measures them, makes a potential
move, and then unlocks them all. In this way, no cycles can exist. Of
course, the locks aren't "real" locks; they only cause transitions to the
appropriate states.

> It's not clear to me how you avoid cycles (maybe this is, in fact, what
> the locking is about).
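The locking scheme described above can be made concrete with a minimal
Python sketch (hypothetical code, not MACEDON's actual implementation; the
names `Node`, `try_lock`, and `probe_and_maybe_move` are invented for
illustration). A node locks every candidate it will probe before
measuring, backs off entirely if any lock is already held, and releases
everything only after the potential move completes, so two overlapping
reparenting attempts cannot interleave:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.locked_by = None  # holder of this node's probe lock, or None

    def try_lock(self, requester):
        # Grant the lock only if no other probe experiment holds it.
        if self.locked_by is None or self.locked_by is requester:
            self.locked_by = requester
            return True
        return False

    def unlock(self, requester):
        if self.locked_by is requester:
            self.locked_by = None


def probe_and_maybe_move(mover, candidates, measure, move):
    """Lock the mover and all candidates, measure each one exclusively,
    make at most one move, then unlock everything."""
    acquired = []
    for node in [mover] + candidates:
        if node.try_lock(mover):
            acquired.append(node)
        else:
            # Someone else is probing one of these nodes; back off
            # completely so measurements never overlap.
            for n in acquired:
                n.unlock(mover)
            return False
    try:
        best = max(candidates, key=measure)
        move(mover, best)
        return True
    finally:
        for n in acquired:
            n.unlock(mover)
```

Because every node involved in a probe round is held for the duration of
that round, a node and its descendants cannot concurrently complete moves
that would close a loop in the tree.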
> We did it by refusing to accept a join from someone that we thought was
> an ancestor. This required that a node knew its ancestors in the tree,
> which you don't seem to track. We put that into the join_reply.

See above. Because tree height can be quite large (O(n)), ancestor lists
can also be that large. We wanted to avoid this; one mechanism is via the
"locks" as described above.

> Incidentally, we didn't send out parent_info periodically from the
> parent (all connections in Overcast are initiated from the downstream
> node, due to firewalls). Instead, nodes periodically sent "join"
> messages, and the join_reply contained things that you put in
> parent_info (plus ancestors).

We could change this easily as well....

> I don't think you're tracking bandwidth the way we intended. For
> example, it seems that you'll try to change to your grandpa if the
> bandwidth between you and him is much higher than the bandwidth to your
> parent. You should only move if the BOTTLENECK bandwidth to the source
> is higher by going through the grandparent than through the parent. From
> the paper:
>
> In each round the new node considers its bandwidth to \ttt{current} as
> well as the bandwidth to \ttt{current} \emph{through each of}
> \ttt{current}\emph{'s children}. If the bandwidth through any of the
> children is about as high as the direct bandwidth to \ttt{current}, then
> one of these children becomes \ttt{current} and a new round commences.
>
> So, for example, if my grandpa only had 10kbps of bandwidth to the
> source, then I don't bother moving up to him just because I have 2Mbps
> to him, but only 1Mbps to my parent. It's all pointless because I would
> only get 10kbps anyway. To implement this, probees have to tell their
> probers about their bottleneck bandwidth (and then you use the minimum
> of the computed bandwidth and the reported bandwidth). It's possible I'm
> misreading your code...
> I'm still getting the hang of the state-transition style.

It looks like we missed this detail from the paper. I definitely see and
agree with your point that you should only move to the grandparent if the
stream of received bandwidth is such that moving there is actually worth
it. Of course, the challenge here is in making sure that the bottleneck
measurement is not stale and is accurate. We have seen that bandwidth
fluctuates widely in many overlays, so keeping an accurate "average"
(exponentially weighted, etc.) is complicated.

> I also don't see where you break bandwidth ties by using hopcount. I
> suspect this is because you don't expect ties to happen very often. But
> that's probably because you're not using bottleneck bandwidth. If you
> do, then you'll get lots of ties (because the bottleneck is far up the
> tree, so it really doesn't matter who you attach to, with respect to
> bandwidth.)

Right. We did not observe very many ties. I assume that you are also
implying that the bottleneck measurement is made not just when considering
your grandparent, but also when considering siblings (is this right?). In
the extreme case, all siblings, parent, and grandparent are limited by the
bottleneck, in which case a node moves to the one with the lowest
hopcount, right? I'm unclear how much of a difference this would make in
our experiments, since we were building "long/skinny" trees already,
though not optimized for hopcount.
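To illustrate the rule being discussed, here is a small, hypothetical
Python sketch of bottleneck-aware parent selection (the names `Candidate`,
`bottleneck_bw`, and `choose_parent` are invented for this example). The
usable bandwidth through a candidate is the minimum of the bandwidth we
measured to it and the bottleneck it reports for its own path to the
source, and ties are broken by lower hopcount:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    measured_bw: float          # bandwidth we measured to this candidate (kbps)
    reported_bottleneck: float  # candidate's own bottleneck to the source (kbps)
    hopcount: int               # candidate's distance from the source

def bottleneck_bw(c: Candidate) -> float:
    # The stream can never flow faster than the narrowest link on the path.
    return min(c.measured_bw, c.reported_bottleneck)

def choose_parent(candidates):
    # Prefer the highest bottleneck bandwidth; break ties with the lowest
    # hopcount (i.e. attach as close to the source as possible).
    return max(candidates, key=lambda c: (bottleneck_bw(c), -c.hopcount))

# The example from the discussion: 2 Mbps to grandpa is irrelevant if
# grandpa's own bottleneck to the source is only 10 kbps.
grandpa = Candidate("grandpa", measured_bw=2000, reported_bottleneck=10,
                    hopcount=1)
parent = Candidate("parent", measured_bw=1000, reported_bottleneck=1000,
                   hopcount=2)
print(choose_parent([grandpa, parent]).name)  # -> parent
```

When many candidates share the same far-upstream bottleneck, their keys
tie on bandwidth and the hopcount term decides, matching the tie-breaking
behavior described in the comment above.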