@ -15,13 +15,17 @@
< link rel = "canonical" href = "https://www.hello-algo.com/chapter_sorting/bucket_sort/" >
< link rel = "prev" href = "../counting_sort/" >
< link rel = "next" href = "../summary/" >
< link rel = "icon" href = "../../assets/images/favicon.png" >
< meta name = "generator" content = "mkdocs-1.4.2, mkdocs-material-9.1.1" >
< title > Bucket Sort - Hello 算法< / title >
< title > 11.7. Bucket Sort (New) - Hello 算法< / title >
@ -75,7 +79,7 @@
< div data-md-component = "skip" >
< a href = "#_1" class = "md-skip" >
< a href = "#117" class = "md-skip" >
Skip to content
< / a >
@ -109,7 +113,7 @@
< div class = "md-header__topic" data-md-component = "header-topic" >
< span class = "md-ellipsis" >
Bucket Sort
11.7. Bucket Sort (New)
< / span >
< / div >
@ -1327,16 +1331,18 @@
< li class = "md-nav__item md-nav__item--section md-nav__item--nested">
< li class = "md-nav__item md-nav__item--active md-nav__item--section md-nav__item--nested">
< input class = "md-nav__toggle md-toggle " type = "checkbox" id = "__nav_12" >
< input class = "md-nav__toggle md-toggle " type = "checkbox" id = "__nav_12" checked >
@ -1354,6 +1360,8 @@
< label class = "md-nav__link" for = "__nav_12" id = "__nav_12_label" tabindex = "0" >
@ -1361,7 +1369,7 @@
< span class = "md-nav__icon md-icon" > < / span >
< / label >
< nav class = "md-nav" data-md-level = "1" aria-labelledby = "__nav_12_label" aria-expanded = "false">
< nav class = "md-nav" data-md-level = "1" aria-labelledby = "__nav_12_label" aria-expanded = "true">
< label class = "md-nav__title" for = "__nav_12" >
< span class = "md-nav__icon md-icon" > < / span >
11. Sorting Algorithms
@ -1456,10 +1464,79 @@
< li class = "md-nav__item md-nav__item--active" >
< input class = "md-nav__toggle md-toggle" type = "checkbox" id = "__toc" >
< label class = "md-nav__link md-nav__link--active" for = "__toc" >
11.7. Bucket Sort (New)
< span class = "md-nav__icon md-icon" > < / span >
< / label >
< a href = "./" class = "md-nav__link md-nav__link--active" >
11.7. Bucket Sort (New)
< / a >
< nav class = "md-nav md-nav--secondary" aria-label = "Table of contents" >
< label class = "md-nav__title" for = "__toc" >
< span class = "md-nav__icon md-icon" > < / span >
Table of contents
< / label >
< ul class = "md-nav__list" data-md-component = "toc" data-md-scrollfix >
< li class = "md-nav__item" >
< a href = "#1171" class = "md-nav__link" >
11.7.1. Algorithm Flow
< / a >
< / li >
< li class = "md-nav__item" >
< a href = "#1172" class = "md-nav__link" >
11.7.2. Algorithm Characteristics
< / a >
< / li >
< li class = "md-nav__item" >
< a href = "#1173" class = "md-nav__link" >
11.7.3. How to Achieve an Even Distribution
< / a >
< / li >
< / ul >
< / nav >
< / li >
< li class = "md-nav__item" >
< a href = "../summary/" class = "md-nav__link" >
11.7. Summary
11.8. Summary
< / a >
< / li >
@ -1610,6 +1687,35 @@
< label class = "md-nav__title" for = "__toc" >
< span class = "md-nav__icon md-icon" > < / span >
Table of contents
< / label >
< ul class = "md-nav__list" data-md-component = "toc" data-md-scrollfix >
< li class = "md-nav__item" >
< a href = "#1171" class = "md-nav__link" >
11.7.1. Algorithm Flow
< / a >
< / li >
< li class = "md-nav__item" >
< a href = "#1172" class = "md-nav__link" >
11.7.2. Algorithm Characteristics
< / a >
< / li >
< li class = "md-nav__item" >
< a href = "#1173" class = "md-nav__link" >
11.7.3. How to Achieve an Even Distribution
< / a >
< / li >
< / ul >
< / nav >
< / div >
< / div >
@ -1631,25 +1737,160 @@
< h1 id = "_1" > Bucket Sort< a class = "headerlink" href = "#_1" title = "Permanent link" > ¶ < / a > < / h1 >
< p > "Bucket Sort" sets up < span class = "arithmatex" > \(k\)< / span > buckets, distributes the < span class = "arithmatex" > \(n\)< / span > elements into the < span class = "arithmatex" > \(k\)< / span > buckets by value, < strong > and sorts each bucket separately< / strong > . Because the order between buckets is fixed, concatenating the buckets in order yields the sorted result.< / p >
< p > Assuming the elements are evenly distributed among the buckets, each bucket holds < span class = "arithmatex" > \(\frac{n}{k}\)< / span > elements. If quick sort is used inside each bucket, sorting a single bucket takes < span class = "arithmatex" > \(O(\frac{n}{k} \log\frac{n}{k})\)< / span > time and sorting all buckets takes < span class = "arithmatex" > \(O(n \log\frac{n}{k})\)< / span > time. < strong > When the number of buckets < span class = "arithmatex" > \(k\)< / span > is relatively large, the time complexity approaches < span class = "arithmatex" > \(O(n)\)< / span > < / strong > .< / p >
< div class = "admonition note 计数排序与桶排序的关系" >
< p class = "admonition-title" > Note< / p >
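< p > < strong > Counting sort can be viewed as a special case of bucket sort< / strong > . We can think of each index of < code > counter< / code > in counting sort as a bucket, and of the counting process as distributing the < span class = "arithmatex" > \(n\)< / span > elements into the corresponding buckets; the result is then output according to the order of the buckets, which completes the sort.< / p >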
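< h1 id = "117" > 11.7. Bucket Sort< a class = "headerlink" href = "#117" title = "Permanent link" > ¶ < / a > < / h1 >
< p > "Bucket Sort" is a typical application of the divide-and-conquer idea: it sets up a number of buckets, distributes the data evenly among them, sorts each bucket separately, and finally merges the elements of all buckets following the natural order between buckets to obtain the sorted result.< / p >
< h2 id = "1171" > 11.7.1. Algorithm Flow< a class = "headerlink" href = "#1171" title = "Permanent link" > ¶ < / a > < / h2 >
< p > Given an array of length < span class = "arithmatex" > \(n\)< / span > whose elements are floating-point numbers in the range < span class = "arithmatex" > \([0, 1)\)< / span > , bucket sort proceeds as follows:< / p >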
< ol >
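< li > Initialize < span class = "arithmatex" > \(k\)< / span > buckets and distribute the < span class = "arithmatex" > \(n\)< / span > elements into the < span class = "arithmatex" > \(k\)< / span > buckets;< / li >
< li > Sort each bucket separately (this section uses the programming language's built-in sort function);< / li >
< li > Merge the results in ascending bucket order;< / li >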
< / ol >
< p > < img alt = "Bucket sort algorithm flow" src = "../bucket_sort.assets/bucket_sort_overview.png" / > < / p >
< p align = "center" > Fig. Bucket sort algorithm flow < / p >
< div class = "tabbed-set tabbed-alternate" data-tabs = "1:10" > < input checked = "checked" id = "__tabbed_1_1" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_2" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_3" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_4" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_5" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_6" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_7" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_8" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_9" name = "__tabbed_1" type = "radio" / > < input id = "__tabbed_1_10" name = "__tabbed_1" type = "radio" / > < div class = "tabbed-labels" > < label for = "__tabbed_1_1" > Java< / label > < label for = "__tabbed_1_2" > C++< / label > < label for = "__tabbed_1_3" > Python< / label > < label for = "__tabbed_1_4" > Go< / label > < label for = "__tabbed_1_5" > JavaScript< / label > < label for = "__tabbed_1_6" > TypeScript< / label > < label for = "__tabbed_1_7" > C< / label > < label for = "__tabbed_1_8" > C#< / label > < label for = "__tabbed_1_9" > Swift< / label > < label for = "__tabbed_1_10" > Zig< / label > < / div >
< div class = "tabbed-content" >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.java< / span > < pre > < span > < / span > < code > < a id = "__codelineno-0-1" name = "__codelineno-0-1" href = "#__codelineno-0-1" > < / a > < span class = "cm" > /* Bucket sort */< / span >
< a id = "__codelineno-0-2" name = "__codelineno-0-2" href = "#__codelineno-0-2" > < / a > < span class = "kt" > void< / span > < span class = "w" > < / span > < span class = "nf" > bucketSort< / span > < span class = "p" > (< / span > < span class = "kt" > float< / span > < span class = "o" > []< / span > < span class = "w" > < / span > < span class = "n" > nums< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > {< / span >
< a id = "__codelineno-0-3" name = "__codelineno-0-3" href = "#__codelineno-0-3" > < / a > < span class = "w" > < / span > < span class = "c1" > // Initialize k = n/2 buckets, expecting 2 elements per bucket< / span >
< a id = "__codelineno-0-4" name = "__codelineno-0-4" href = "#__codelineno-0-4" > < / a > < span class = "w" > < / span > < span class = "kt" > int< / span > < span class = "w" > < / span > < span class = "n" > k< / span > < span class = "w" > < / span > < span class = "o" > =< / span > < span class = "w" > < / span > < span class = "n" > nums< / span > < span class = "p" > .< / span > < span class = "na" > length< / span > < span class = "w" > < / span > < span class = "o" > /< / span > < span class = "w" > < / span > < span class = "mi" > 2< / span > < span class = "p" > ;< / span >
< a id = "__codelineno-0-5" name = "__codelineno-0-5" href = "#__codelineno-0-5" > < / a > < span class = "w" > < / span > < span class = "n" > List< / span > < span class = "o" > < < / span > < span class = "n" > List< / span > < span class = "o" > < < / span > < span class = "n" > Float< / span > < span class = "o" > > > < / span > < span class = "w" > < / span > < span class = "n" > buckets< / span > < span class = "w" > < / span > < span class = "o" > =< / span > < span class = "w" > < / span > < span class = "k" > new< / span > < span class = "w" > < / span > < span class = "n" > ArrayList< / span > < span class = "o" > < > < / span > < span class = "p" > ();< / span >
< a id = "__codelineno-0-6" name = "__codelineno-0-6" href = "#__codelineno-0-6" > < / a > < span class = "w" > < / span > < span class = "k" > for< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "kt" > int< / span > < span class = "w" > < / span > < span class = "n" > i< / span > < span class = "w" > < / span > < span class = "o" > =< / span > < span class = "w" > < / span > < span class = "mi" > 0< / span > < span class = "p" > ;< / span > < span class = "w" > < / span > < span class = "n" > i< / span > < span class = "w" > < / span > < span class = "o" > < < / span > < span class = "w" > < / span > < span class = "n" > k< / span > < span class = "p" > ;< / span > < span class = "w" > < / span > < span class = "n" > i< / span > < span class = "o" > ++< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > {< / span >
< a id = "__codelineno-0-7" name = "__codelineno-0-7" href = "#__codelineno-0-7" > < / a > < span class = "w" > < / span > < span class = "n" > buckets< / span > < span class = "p" > .< / span > < span class = "na" > add< / span > < span class = "p" > (< / span > < span class = "k" > new< / span > < span class = "w" > < / span > < span class = "n" > ArrayList< / span > < span class = "o" > < > < / span > < span class = "p" > ());< / span >
< a id = "__codelineno-0-8" name = "__codelineno-0-8" href = "#__codelineno-0-8" > < / a > < span class = "w" > < / span > < span class = "p" > }< / span >
< a id = "__codelineno-0-9" name = "__codelineno-0-9" href = "#__codelineno-0-9" > < / a > < span class = "w" > < / span > < span class = "c1" > // 1. Distribute the array elements into the buckets< / span >
< a id = "__codelineno-0-10" name = "__codelineno-0-10" href = "#__codelineno-0-10" > < / a > < span class = "w" > < / span > < span class = "k" > for< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "kt" > float< / span > < span class = "w" > < / span > < span class = "n" > num< / span > < span class = "w" > < / span > < span class = "p" > :< / span > < span class = "w" > < / span > < span class = "n" > nums< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > {< / span >
< a id = "__codelineno-0-11" name = "__codelineno-0-11" href = "#__codelineno-0-11" > < / a > < span class = "w" > < / span > < span class = "c1" > // Input data range is [0, 1); use num * k to map to the index range [0, k-1]< / span >
< a id = "__codelineno-0-12" name = "__codelineno-0-12" href = "#__codelineno-0-12" > < / a > < span class = "w" > < / span > < span class = "kt" > int< / span > < span class = "w" > < / span > < span class = "n" > i< / span > < span class = "w" > < / span > < span class = "o" > =< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "kt" > int< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "n" > num< / span > < span class = "w" > < / span > < span class = "o" > *< / span > < span class = "w" > < / span > < span class = "n" > k< / span > < span class = "p" > );< / span >
< a id = "__codelineno-0-13" name = "__codelineno-0-13" href = "#__codelineno-0-13" > < / a > < span class = "w" > < / span > < span class = "c1" > // Add num to bucket i< / span >
< a id = "__codelineno-0-14" name = "__codelineno-0-14" href = "#__codelineno-0-14" > < / a > < span class = "w" > < / span > < span class = "n" > buckets< / span > < span class = "p" > .< / span > < span class = "na" > get< / span > < span class = "p" > (< / span > < span class = "n" > i< / span > < span class = "p" > ).< / span > < span class = "na" > add< / span > < span class = "p" > (< / span > < span class = "n" > num< / span > < span class = "p" > );< / span >
< a id = "__codelineno-0-15" name = "__codelineno-0-15" href = "#__codelineno-0-15" > < / a > < span class = "w" > < / span > < span class = "p" > }< / span >
< a id = "__codelineno-0-16" name = "__codelineno-0-16" href = "#__codelineno-0-16" > < / a > < span class = "w" > < / span > < span class = "c1" > // 2. Sort each bucket< / span >
< a id = "__codelineno-0-17" name = "__codelineno-0-17" href = "#__codelineno-0-17" > < / a > < span class = "w" > < / span > < span class = "k" > for< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "n" > List< / span > < span class = "o" > < < / span > < span class = "n" > Float< / span > < span class = "o" > > < / span > < span class = "w" > < / span > < span class = "n" > bucket< / span > < span class = "w" > < / span > < span class = "p" > :< / span > < span class = "w" > < / span > < span class = "n" > buckets< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > {< / span >
< a id = "__codelineno-0-18" name = "__codelineno-0-18" href = "#__codelineno-0-18" > < / a > < span class = "w" > < / span > < span class = "c1" > // Use the built-in sort function; it can be replaced with another sorting algorithm< / span >
< a id = "__codelineno-0-19" name = "__codelineno-0-19" href = "#__codelineno-0-19" > < / a > < span class = "w" > < / span > < span class = "n" > Collections< / span > < span class = "p" > .< / span > < span class = "na" > sort< / span > < span class = "p" > (< / span > < span class = "n" > bucket< / span > < span class = "p" > );< / span >
< a id = "__codelineno-0-20" name = "__codelineno-0-20" href = "#__codelineno-0-20" > < / a > < span class = "w" > < / span > < span class = "p" > }< / span >
< a id = "__codelineno-0-21" name = "__codelineno-0-21" href = "#__codelineno-0-21" > < / a > < span class = "w" > < / span > < span class = "c1" > // 3. Traverse the buckets and merge the results< / span >
< a id = "__codelineno-0-22" name = "__codelineno-0-22" href = "#__codelineno-0-22" > < / a > < span class = "w" > < / span > < span class = "kt" > int< / span > < span class = "w" > < / span > < span class = "n" > i< / span > < span class = "w" > < / span > < span class = "o" > =< / span > < span class = "w" > < / span > < span class = "mi" > 0< / span > < span class = "p" > ;< / span >
< a id = "__codelineno-0-23" name = "__codelineno-0-23" href = "#__codelineno-0-23" > < / a > < span class = "w" > < / span > < span class = "k" > for< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "n" > List< / span > < span class = "o" > < < / span > < span class = "n" > Float< / span > < span class = "o" > > < / span > < span class = "w" > < / span > < span class = "n" > bucket< / span > < span class = "w" > < / span > < span class = "p" > :< / span > < span class = "w" > < / span > < span class = "n" > buckets< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > {< / span >
< a id = "__codelineno-0-24" name = "__codelineno-0-24" href = "#__codelineno-0-24" > < / a > < span class = "w" > < / span > < span class = "k" > for< / span > < span class = "w" > < / span > < span class = "p" > (< / span > < span class = "kt" > float< / span > < span class = "w" > < / span > < span class = "n" > num< / span > < span class = "w" > < / span > < span class = "p" > :< / span > < span class = "w" > < / span > < span class = "n" > bucket< / span > < span class = "p" > )< / span > < span class = "w" > < / span > < span class = "p" > {< / span >
< a id = "__codelineno-0-25" name = "__codelineno-0-25" href = "#__codelineno-0-25" > < / a > < span class = "w" > < / span > < span class = "n" > nums< / span > < span class = "o" > [< / span > < span class = "n" > i< / span > < span class = "o" > ++]< / span > < span class = "w" > < / span > < span class = "o" > =< / span > < span class = "w" > < / span > < span class = "n" > num< / span > < span class = "p" > ;< / span >
< a id = "__codelineno-0-26" name = "__codelineno-0-26" href = "#__codelineno-0-26" > < / a > < span class = "w" > < / span > < span class = "p" > }< / span >
< a id = "__codelineno-0-27" name = "__codelineno-0-27" href = "#__codelineno-0-27" > < / a > < span class = "w" > < / span > < span class = "p" > }< / span >
< a id = "__codelineno-0-28" name = "__codelineno-0-28" href = "#__codelineno-0-28" > < / a > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
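< p > (Figure)< / p >
< p > In theory the time complexity of bucket sort is < span class = "arithmatex" > \(O(n)\)< / span > , < strong > but this requires the elements to be distributed evenly among the buckets< / strong > , which is not easy to achieve. Suppose we want to distribute < span class = "arithmatex" > \(1{,}000{,}000\)< / span > Taobao products evenly into < span class = "arithmatex" > \(100\)< / span > buckets by price range, but product prices are not uniformly distributed: for example, there are very many products under < span class = "arithmatex" > \(100\)< / span > yuan and very few above < span class = "arithmatex" > \(1000\)< / span > yuan. If we split the price range into < span class = "arithmatex" > \(100\)< / span > equal parts, the number of products per bucket will differ greatly. To achieve an even distribution, we usually do the following:< / p >
< ul >
< li > First set rough boundaries and distribute the elements, < strong > then keep splitting the buckets that hold more elements into several buckets< / strong > until every bucket holds a reasonable number of elements; this approach is essentially a recursion tree;< / li >
< li > If we know the probability distribution of product prices in advance, < strong > we can set the price boundary of each bucket according to that distribution< / strong > ; note that the distribution does not necessarily have to be measured: it can also be approximated with a common probability model chosen to suit the data, such as the normal distribution often seen in nature;< / li >
< / ul >
< p > (Figure)< / p >
< p > In addition, a suitable sorting algorithm, such as quick sort, must be chosen for sorting the elements inside each bucket.< / p >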
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.cpp< / span > < pre > < span > < / span > < code > < a id = "__codelineno-1-1" name = "__codelineno-1-1" href = "#__codelineno-1-1" > < / a > < span class = "p" > [< / span > < span class = "k" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "n" > func< / span > < span class = "p" > ]{< / span > < span class = "n" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.py< / span > < pre > < span > < / span > < code > < a id = "__codelineno-2-1" name = "__codelineno-2-1" href = "#__codelineno-2-1" > < / a > < span class = "p" > [< / span > < span class = "n" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "n" > func< / span > < span class = "p" > ]{< / span > < span class = "n" > bucket_sort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.go< / span > < pre > < span > < / span > < code > < a id = "__codelineno-3-1" name = "__codelineno-3-1" href = "#__codelineno-3-1" > < / a > < span class = "p" > [< / span > < span class = "nx" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "kd" > func< / span > < span class = "p" > ]{< / span > < span class = "nx" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.js< / span > < pre > < span > < / span > < code > < a id = "__codelineno-4-1" name = "__codelineno-4-1" href = "#__codelineno-4-1" > < / a > < span class = "p" > [< / span > < span class = "kd" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "nx" > func< / span > < span class = "p" > ]{< / span > < span class = "nx" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.ts< / span > < pre > < span > < / span > < code > < a id = "__codelineno-5-1" name = "__codelineno-5-1" href = "#__codelineno-5-1" > < / a > < span class = "p" > [< / span > < span class = "kd" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "nx" > func< / span > < span class = "p" > ]{< / span > < span class = "nx" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.c< / span > < pre > < span > < / span > < code > < a id = "__codelineno-6-1" name = "__codelineno-6-1" href = "#__codelineno-6-1" > < / a > < span class = "p" > [< / span > < span class = "n" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "n" > func< / span > < span class = "p" > ]{< / span > < span class = "n" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.cs< / span > < pre > < span > < / span > < code > < a id = "__codelineno-7-1" name = "__codelineno-7-1" href = "#__codelineno-7-1" > < / a > < span class = "na" > [class]< / span > < span class = "p" > {< / span > < span class = "n" > bucket_sort< / span > < span class = "p" > }< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "n" > func< / span > < span class = "p" > ]{< / span > < span class = "n" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.swift< / span > < pre > < span > < / span > < code > < a id = "__codelineno-8-1" name = "__codelineno-8-1" href = "#__codelineno-8-1" > < / a > < span class = "p" > [< / span > < span class = "kd" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "kd" > func< / span > < span class = "p" > ]{< / span > < span class = "n" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< div class = "tabbed-block" >
< div class = "highlight" > < span class = "filename" > bucket_sort.zig< / span > < pre > < span > < / span > < code > < a id = "__codelineno-9-1" name = "__codelineno-9-1" href = "#__codelineno-9-1" > < / a > < span class = "p" > [< / span > < span class = "n" > class< / span > < span class = "p" > ]{}< / span > < span class = "o" > -< / span > < span class = "p" > [< / span > < span class = "n" > func< / span > < span class = "p" > ]{< / span > < span class = "n" > bucketSort< / span > < span class = "p" > }< / span >
< / code > < / pre > < / div >
< / div >
< / div >
< / div >
< div class = "admonition note" >
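< p class = "admonition-title" > Bucket sort is a generalization of counting sort< / p >
< p > From the bucket sort point of view, we can think of each index of the counting array < code > counter< / code > in counting sort as a bucket, and of the counting process as distributing the elements into the corresponding buckets; the result is then output according to the order of the buckets, which completes the sort.< / p >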
< / div >
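< p > The note above can be sketched in code. This is a minimal illustration, assuming the input holds only non-negative integers; the class name < code > CountingAsBuckets< / code > and the method < code > countingSort< / code > are hypothetical names chosen for this sketch, not part of the book's source. < / p >

```java
import java.util.Arrays;

class CountingAsBuckets {
    /* Sort non-negative integers by treating each index of counter as one bucket */
    static void countingSort(int[] nums) {
        if (nums.length == 0) {
            return;
        }
        // Find the maximum value m; the bucket indices cover [0, m]
        int m = 0;
        for (int num : nums) {
            m = Math.max(m, num);
        }
        // "Distribute": bucket num only needs a count,
        // because every element placed in it has the same value
        int[] counter = new int[m + 1];
        for (int num : nums) {
            counter[num]++;
        }
        // "Merge": visit the buckets in index order and write the values back
        int i = 0;
        for (int value = 0; value <= m; value++) {
            for (int c = 0; c < counter[value]; c++) {
                nums[i++] = value;
            }
        }
    }

    public static void main(String[] args) {
        int[] nums = {1, 0, 1, 2, 0, 4, 0, 2, 2, 4};
        countingSort(nums);
        System.out.println(Arrays.toString(nums)); // [0, 0, 0, 1, 1, 2, 2, 2, 4, 4]
    }
}
```

< p > Because each bucket holds a single value, no sorting inside the buckets is needed, which is what makes counting sort the special case. < / p >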
< h2 id = "1172" > 11.7.2. Algorithm Characteristics< a class = "headerlink" href = "#1172" title = "Permanent link" > ¶ < / a > < / h2 >
< p > < strong > Time complexity < span class = "arithmatex" > \(O(n + k)\)< / span > < / strong > : assuming the elements are evenly distributed among the buckets, each bucket holds < span class = "arithmatex" > \(\frac{n}{k}\)< / span > elements. If sorting a single bucket takes < span class = "arithmatex" > \(O(\frac{n}{k} \log\frac{n}{k})\)< / span > time, sorting all buckets takes < span class = "arithmatex" > \(O(n \log\frac{n}{k})\)< / span > time, < strong > and when the number of buckets < span class = "arithmatex" > \(k\)< / span > is relatively large, the time complexity approaches < span class = "arithmatex" > \(O(n)\)< / span > < / strong > . Finally, merging the results requires traversing the < span class = "arithmatex" > \(k\)< / span > buckets, which takes < span class = "arithmatex" > \(O(k)\)< / span > time.< / p >
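< p > The average-case bound can be written out step by step. This is a short derivation under the same even-distribution assumption; the constant < span class = "arithmatex" > \(c\)< / span > (the expected number of elements per bucket) is introduced here only for illustration: < / p >
< div class = "arithmatex" > \[
k \cdot O\left(\frac{n}{k} \log\frac{n}{k}\right) = O\left(n \log\frac{n}{k}\right), \qquad k = \frac{n}{c} \;\Rightarrow\; O\left(n \log\frac{n}{k}\right) = O(n \log c) = O(n)
\] < / div >
< p > With the choice < span class = "arithmatex" > \(k = n/2\)< / span > used in the Java code of this section, < span class = "arithmatex" > \(c = 2\)< / span > . Distributing the elements adds < span class = "arithmatex" > \(O(n)\)< / span > and visiting the < span class = "arithmatex" > \(k\)< / span > buckets while merging adds < span class = "arithmatex" > \(O(k)\)< / span > , which gives the < span class = "arithmatex" > \(O(n + k)\)< / span > total. < / p >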
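< p > In the worst case, all data falls into a single bucket and the in-bucket sorting algorithm degrades to < span class = "arithmatex" > \(O(n^2)\)< / span > , so the total time is < span class = "arithmatex" > \(O(n^2)\)< / span > ; bucket sort is therefore an "adaptive sort".< / p >
< p > < strong > Space complexity < span class = "arithmatex" > \(O(n + k)\)< / span > < / strong > : the < span class = "arithmatex" > \(k\)< / span > buckets, holding < span class = "arithmatex" > \(n\)< / span > elements in total, require extra space, so it is a "non-in-place sort".< / p >
< p > Whether bucket sort is stable depends on whether the algorithm used to sort the elements inside each bucket is stable.< / p >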
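< h2 id = "1173" > 11.7.3. How to Achieve an Even Distribution< a class = "headerlink" href = "#1173" title = "Permanent link" > ¶ < / a > < / h2 >
< p > The time complexity of bucket sort can in theory reach < span class = "arithmatex" > \(O(n)\)< / span > ; < strong > the difficulty is that the elements must be distributed evenly among the buckets< / strong > , because real-world data is often not uniformly distributed. For example, suppose we want to distribute all Taobao products evenly into 10 buckets by price range, but product prices are not uniformly distributed: very many products cost less than 100 yuan and very few cost more than 1000 yuan. If we split the price range into 10 equal parts, the number of products per bucket will differ greatly.< / p >
< p > To achieve an even distribution, we can first set rough boundaries and split the data into 3 buckets; after that, < strong > we keep splitting the buckets that hold more products into 3 buckets until the number of elements in all buckets is roughly equal< / strong > . This method essentially builds a recursion tree whose leaf nodes hold as even a number of elements as possible. Of course, splitting into 3 buckets is not mandatory; the number can be chosen flexibly to suit the data.< / p >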
< p > < img alt = "Splitting buckets recursively" src = "../bucket_sort.assets/scatter_in_buckets_recursively.png" / > < / p >
< p align = "center" > Fig. Splitting buckets recursively < / p >
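< p > The recursive splitting idea might look like the following sketch. It is only one possible reading of the approach: the class < code > RecursiveBuckets< / code > , the constants < code > BRANCH< / code > , < code > LIMIT< / code > and < code > MAX_DEPTH< / code > , and the assumption that all values lie in a known range < code > [lo, hi)< / code > are choices made for this illustration. < / p >

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RecursiveBuckets {
    static final int BRANCH = 3;     // number of sub-buckets per split (assumed)
    static final int LIMIT = 4;      // a bucket with at most LIMIT elements is a leaf (assumed)
    static final int MAX_DEPTH = 16; // depth cap so that many equal values cannot recurse forever

    /* Append the elements of items, which lie in [lo, hi), to out in sorted order */
    static void sortRange(List<Double> items, double lo, double hi, int depth, List<Double> out) {
        // Leaf of the recursion tree: the bucket is small enough, sort it directly
        if (items.size() <= LIMIT || depth >= MAX_DEPTH) {
            Collections.sort(items);
            out.addAll(items);
            return;
        }
        // Split the range [lo, hi) into BRANCH equal sub-ranges
        List<List<Double>> children = new ArrayList<>();
        for (int i = 0; i < BRANCH; i++) {
            children.add(new ArrayList<>());
        }
        double width = (hi - lo) / BRANCH;
        for (double x : items) {
            int i = (int) ((x - lo) / width);
            // Clamp the index to guard against floating-point rounding at the edges
            i = Math.min(Math.max(i, 0), BRANCH - 1);
            children.get(i).add(x);
        }
        // Crowded children are split again; sparse ones become leaves
        for (int i = 0; i < BRANCH; i++) {
            sortRange(children.get(i), lo + i * width, lo + (i + 1) * width, depth + 1, out);
        }
    }

    /* Return a sorted copy of values, all of which are assumed to lie in [lo, hi) */
    static List<Double> sort(List<Double> values, double lo, double hi) {
        List<Double> out = new ArrayList<>(values.size());
        sortRange(new ArrayList<>(values), lo, hi, 0, out);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical, skewed price data: many cheap items and a few expensive ones
        List<Double> prices = Arrays.asList(5.0, 12.5, 8.0, 99.0, 3.0, 45.0, 7.5, 950.0, 20.0, 1.0, 60.0, 15.0);
        System.out.println(sort(prices, 0.0, 1000.0));
    }
}
```

< p > With this data, the crowded low-price range is split several times while the sparse high-price range becomes a leaf after the first split, which mirrors the uneven recursion tree described above. < / p >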
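< p > If we know the probability distribution of product prices in advance, < strong > we can also set the price boundary of each bucket according to that distribution< / strong > . Note that the distribution does not necessarily have to be measured; it can also be approximated with a probability model chosen to suit the data. As shown in the figure below, if we assume product prices follow a normal distribution, we can set the price ranges sensibly and distribute the products evenly among the buckets.< / p >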
< p > < img alt = "Splitting buckets according to a probability distribution" src = "../bucket_sort.assets/scatter_in_buckets_distribution.png" / > < / p >
< p align = "center" > Fig. Splitting buckets according to a probability distribution < / p >
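< p > Once the boundaries are chosen, distributing an element reduces to finding the interval that contains it. The sketch below assumes the boundaries are given as a strictly increasing array; in practice they could be the quantiles of the assumed distribution, but the values used here are hand-picked, hypothetical numbers, and the class < code > BoundaryBuckets< / code > is a name invented for this illustration. < / p >

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BoundaryBuckets {
    /* Return the bucket index of x, given strictly increasing boundaries.
       Bucket 0 holds x < boundaries[0], bucket j holds
       boundaries[j-1] <= x < boundaries[j], and the last bucket holds the rest */
    static int bucketIndex(double x, double[] boundaries) {
        int pos = Arrays.binarySearch(boundaries, x);
        // Found: x equals boundaries[pos], so it opens bucket pos + 1
        // Not found: binarySearch returns -(insertion point) - 1
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    /* Bucket sort with non-uniform bucket boundaries */
    static void bucketSort(double[] nums, double[] boundaries) {
        // boundaries.length cut points produce boundaries.length + 1 buckets
        List<List<Double>> buckets = new ArrayList<>();
        for (int j = 0; j <= boundaries.length; j++) {
            buckets.add(new ArrayList<>());
        }
        // 1. Distribute the elements according to the boundaries
        for (double num : nums) {
            buckets.get(bucketIndex(num, boundaries)).add(num);
        }
        // 2. Sort each bucket, 3. merge the buckets in order
        int i = 0;
        for (List<Double> bucket : buckets) {
            Collections.sort(bucket);
            for (double num : bucket) {
                nums[i++] = num;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical boundaries: narrow ranges where prices are dense, wide ranges where they are sparse
        double[] boundaries = {10.0, 30.0, 100.0, 1000.0};
        double[] prices = {5.0, 12.5, 8.0, 99.0, 3.0, 45.0, 7.5, 950.0, 20.0, 2500.0, 60.0, 15.0};
        bucketSort(prices, boundaries);
        System.out.println(Arrays.toString(prices));
    }
}
```

< p > Binary search makes each lookup cost < span class = "arithmatex" > \(O(\log k)\)< / span > instead of the constant-time multiplication used for uniform buckets; that is the price paid for unequal bucket widths in this sketch. < / p >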
< h2 id = "__comments" > Comments< / h2 >
<!-- Insert generated snippet here -->
< script
src="https://giscus.app/client.js"
data-repo="krahets/hello-algo"
data-repo-id="R_kgDOIXtSqw"
data-category="Announcements"
data-category-id="DIC_kwDOIXtSq84CSZk_"
data-mapping="pathname"
data-strict="1"
data-reactions-enabled="1"
data-emit-metadata="0"
data-input-position="bottom"
data-theme="preferred_color_scheme"
data-lang="zh-CN"
crossorigin="anonymous"
async
>
< / script >
<!-- Synchronize Giscus theme with palette -->
< script >
var giscus = document.querySelector("script[src*=giscus]")
/* Set palette on initial load */
var palette = __md_get("__palette")
if (palette && typeof palette.color === "object") {
var theme = palette.color.scheme === "slate" ? "dark" : "light"
giscus.setAttribute("data-theme", theme)
}
/* Register event handlers after the document has loaded */
document.addEventListener("DOMContentLoaded", function() {
var ref = document.querySelector("[data-md-component=palette]")
ref.addEventListener("change", function() {
var palette = __md_get("__palette")
if (palette && typeof palette.color === "object") {
var theme = palette.color.scheme === "slate" ? "dark" : "light"
/* Instruct Giscus to change theme */
var frame = document.querySelector(".giscus-frame")
frame.contentWindow.postMessage(
{ giscus: { setConfig: { theme } } },
"https://giscus.app"
)
}
})
})
< / script >
< / article >
@ -1670,6 +1911,42 @@
< footer class = "md-footer" >
< nav class = "md-footer__inner md-grid" aria-label = "Footer" >
< a href = "../counting_sort/" class = "md-footer__link md-footer__link--prev" aria-label = "Previous: 11.6. &nbsp; Counting Sort (New)" rel = "prev" >
< div class = "md-footer__button md-icon" >
< svg xmlns = "http://www.w3.org/2000/svg" viewBox = "0 0 24 24" > < path d = "M20 11v2H8l5.5 5.5-1.42 1.42L4.16 12l7.92-7.92L13.5 5.5 8 11h12Z" / > < / svg >
< / div >
< div class = "md-footer__title" >
< div class = "md-ellipsis" >
< span class = "md-footer__direction" >
Previous
< / span >
11.6. Counting Sort (New)
< / div >
< / div >
< / a >
< a href = "../summary/" class = "md-footer__link md-footer__link--next" aria-label = "Next: 11.8. &nbsp; Summary" rel = "next" >
< div class = "md-footer__title" >
< div class = "md-ellipsis" >
< span class = "md-footer__direction" >
Next
< / span >
11.8. Summary
< / div >
< / div >
< div class = "md-footer__button md-icon" >
< svg xmlns = "http://www.w3.org/2000/svg" viewBox = "0 0 24 24" > < path d = "M4 11v2h12l-5.5 5.5 1.42 1.42L19.84 12l-7.92-7.92L10.5 5.5 16 11H4Z" / > < / svg >
< / div >
< / a >
< / nav >
< div class = "md-footer-meta md-typeset" >
< div class = "md-footer-meta__inner md-grid" >