📄 item_007.htm
字号:
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:"";
mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
mso-para-margin:0cm;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:"Times New Roman";
mso-fareast-font-family:"Times New Roman";
mso-ansi-language:#0400;
mso-fareast-language:#0400;
mso-bidi-language:#0400;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="6146"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1"/>
</o:shapelayout></xml><![endif]-->
</head>
<body lang=ZH-CN style='tab-interval:21.0pt;text-justify-trim:punctuation'>
<div class=Section1 style='layout-grid:15.6pt'>
<h3><span lang=EN-US>7</span><span style='font-family:宋体;mso-ascii-font-family:
Arial;mso-hansi-font-family:Arial'>.了解何时及如何为可伸缩性编写代码。(</span><span lang=EN-US>Know
when and how to code for scalability.</span><span style='font-family:宋体;
mso-ascii-font-family:Arial;mso-hansi-font-family:Arial'>)</span></h3>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>摘要</span><span lang=EN-US><o:p></o:p></span></b></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>要留意数据的爆炸性增长:即便没有过早地优化,也要留心渐近时间复杂度(</span><span
lang=EN-US>asymptotic complexity</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>)。根据被处理的数据量,用于用户数据的算法应该在一个可预计的时间内完成,而且最好是不差于线性时间。在优化被证实是需要的而且很重要时,应该集中精力改善</span><span
lang=EN-US>big-Oh</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>复杂度而不是集中精力于节省一个额外的加法之类的微优化(</span><span
lang=EN-US>micro-optimization</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>),如果数据量持续增长,那么尤其应该这样。</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>讨论</span></b></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>本条描述了位于第</span><span lang=EN-US>8</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>条和第</span><span lang=EN-US>9</span><span style='font-family:
宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:"Times New Roman"'>条,“不要过早地优化”和“不要过早地退而求次”,之间的一个重要平衡点。因为唯恐本条被曲解为“过早的优化”</span>
<span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>,这使得本条非常难写。实际情况并非如此。</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>下面是本条的写作背景和动机:内存和磁盘的容量持续地以指数级增长;例如,从</span><span
lang=EN-US>1988</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>年到</span><span lang=EN-US>2004</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>年间磁盘容量每年大约以</span><span lang=EN-US>112%</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>的速度增长(每十年将近有</span><span lang=EN-US>1900</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>倍的增长),然而即便是摩尔定律也只有</span><span lang=EN-US>59%</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>每年(每十年</span><span lang=EN-US>100</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>倍)。一个显然的结果就是无论今天代码做什么,明天它可能会被要求处理更多的数据——<i style='mso-bidi-font-style:
normal'>多得多</i>的数据。如果算法表现出某种糟糕的(比线性更差)时间复杂度,那么它迟早会让最强大的系统屈服:只要扔给它足够多的数据。</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>防范可能出现的未来意味着我们希望避免在设计中包含任何潜在的性能瓶颈,它们会在面对更大的文件,更大的数据库,更多的像素,更多的窗口,更多的线程,更多传输的数据时出现。事实证明</span><span
lang=EN-US>C++</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>标准库的一大成功因素就是它保证了</span><span
lang=EN-US>STL</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>的容器操作和算法的性能复杂度。</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>这就是平衡点:如果因为过早的优化而使用一个不怎么清晰的算法,那显然是错误的,因为你预计的大数据量可能永远也不会出现。但是如果过早地退而求次,对算法的复杂度(也称为“</span><span
lang=EN-US>big-Oh</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>”复杂度,即计算的成本为所处理数据的元素数量的一个数学函数)视而不见,那显然也是同样错误的。</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>本条的建议分为两部分。首先,对特定的计算来说,除非使用一个伸缩性较差的算法(参见第</span><span
lang=EN-US>6</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>条)能带来非常明显的清晰性和可读性,否则在默认情况下应该避免使用那些在处理用户数据(因为它可能会增长)时伸缩性不佳的算法,甚至在知道哪些数据量会变得足够大而成为问题之前也应该如此。我们时常会感到惊讶:我们编写了十段代码,认为它们永远不会被用来处理大批量的数据,然后事实证明十次中有九次我们完全正确。第十次,我们遇到了性能瓶颈——我们知道这样的事情曾在我们身上发生,我们也知道这样的事情曾经或将会在你身上发生。当然啦,我们会把问题修正并把修正发给客户,但如果能避免此类尴尬和返工,那会更好。所以,在各方面都相同时(包括清晰性和可读性),请在一开始就按下面的做:</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<ul style='margin-top:0cm' type=disc>
<li class=MsoNormal style='mso-list:l4 level1 lfo3;tab-stops:list 36.0pt'><i
style='mso-bidi-font-style:normal'><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>使用灵活的,动态分配的数据来代替固定大小的数组:</span></i><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>“永远比我会用到的更大”的数组是严重的错误,也无法保证安全性。(参见第</span><span
lang=EN-US>77</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>条。)如果大小确实可以在编译时就固定不变,那么数组是可以接受的。</span></li>
</ul>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US><o:p> </o:p></span></p>
<ul style='margin-top:0cm' type=disc>
<li class=MsoNormal style='mso-list:l4 level1 lfo3;tab-stops:list 36.0pt'><i
style='mso-bidi-font-style:normal'><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>了解你所使用的算法的真正复杂度:</span></i><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>要留意不易觉察的陷阱,比如看上去像是线性时间的算法,而实际上还调用了其它的线性操作,这使得算法事实上是二次方的。(实例请参见第</span><span
lang=EN-US>81</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>条。)</span></li>
</ul>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US><o:p> </o:p></span></p>
<ul style='margin-top:0cm' type=disc>
<li class=MsoNormal style='mso-list:l4 level1 lfo3;tab-stops:list 36.0pt'><i
style='mso-bidi-font-style:normal'><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>只要可能,最好是使用线性的或更快的算法:</span></i><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>常数时间复杂度,比如</span><b style='mso-bidi-font-weight:normal'><span
lang=EN-US>push_back</span></b><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>和哈希表的查找,是最完美的(参见第</span><span
lang=EN-US>76</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>和</span><span
lang=EN-US>80</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>条)。</span><span
lang=EN-US>O(log N)</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>对数复杂度,比如</span><b
style='mso-bidi-font-weight:normal'><span lang=EN-US>set/map</span></b><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>操作以及使用随机访问迭代器的</span><b style='mso-bidi-font-weight:
normal'><span lang=EN-US>lower_bound</span></b><span style='font-family:
宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:"Times New Roman"'>和</span><b
style='mso-bidi-font-weight:normal'><span lang=EN-US>upper_bound</span></b><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>,也不错(参见第</span><span lang=EN-US>76</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>,</span><span lang=EN-US>85</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>,和</span><span lang=EN-US>86</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>条)。</span><span lang=EN-US>O(N)</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>线性复杂度,比如</span><b style='mso-bidi-font-weight:normal'><span
lang=EN-US>vector::insert</span></b><span style='font-family:宋体;
mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:"Times New Roman"'>和</span><b
style='mso-bidi-font-weight:normal'><span lang=EN-US>for_each</span></b><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>,也是可以接受的(参见第</span><span lang=EN-US>76</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>,</span><span lang=EN-US>81</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>和</span><span lang=EN-US>84</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>条)。</span></li>
</ul>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US><o:p> </o:p></span></p>
<ul style='margin-top:0cm' type=disc>
<li class=MsoNormal style='mso-list:l4 level1 lfo3;tab-stops:list 36.0pt'><i
style='mso-bidi-font-style:normal'><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>如果合理,尽量避免使用比线性复杂度更差的算法:</span></i><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>例如,如果你面对一个</span><span lang=EN-US>O(N log N)</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>或</span><span lang=EN-US>O(N<sup>2</sup>)</span><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>算法,那么在默认的情况下,花些精力找一个替代算法,这样在数据量显著增长的情况下,代码也不会深陷于性能的泥潭。例如,第</span><span
lang=EN-US>81</span><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>条建议最好是使用范围成员函数(通常来说是线性的)而不要重复调用它们对应的单元素成员函数(如果一个线性操作调用另一个线性操作,那么很容易成为二次方的),这是一个主要的原因。</span></li>
</ul>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US><o:p> </o:p></span></p>
<ul style='margin-top:0cm' type=disc>
<li class=MsoNormal style='mso-list:l4 level1 lfo3;tab-stops:list 36.0pt'><i
style='mso-bidi-font-style:normal'><span style='font-family:宋体;mso-ascii-font-family:
"Times New Roman";mso-hansi-font-family:"Times New Roman"'>除非你无路可退而且你真的别无选择,否则绝对不要使用指数级的算法:</span></i><span
style='font-family:宋体;mso-ascii-font-family:"Times New Roman";mso-hansi-font-family:
"Times New Roman"'>在勉强接受一个指数级的算法之前,尽最大的努力去找一个替代算法,因为对指数级的算法来说,即使是数据量的适度增长都意味着性能的急剧下降。</span><i
style='mso-bidi-font-style:normal'><span lang=EN-US><o:p></o:p></span></i></li>
</ul>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>其次,在测量的结果显示优化是必需而且重要的之后,尤其是如果这是由于数据量的增长造成的,应该集中精力改善</span><span
lang=EN-US>big-Oh</span><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>复杂度而不是集中精力于节省一个额外的加法之类的微优化。</span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-family:宋体;mso-ascii-font-family:"Times New Roman";
mso-hansi-font-family:"Times New Roman"'>总之:只要可能,最好是使用线性的(或更好的)算法。如果合理,应该避免比线性复杂度更差的算法。要尽一切可能避免指数级的算法。</span></p>
</div>
</body>
</html>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -