# Divide and conquer search strategy

We have learned that search algorithms fall into two main categories.

- **Brute-force search**: implemented by traversing the data structure, with a time complexity of $O(n)$.
- **Adaptive search**: utilizes a specialized data organization or prior information, and its time complexity can reach $O(\log n)$ or even $O(1)$.

In fact, **search algorithms with a time complexity of $O(\log n)$ are usually based on the divide-and-conquer strategy**, such as binary search and trees.

- Each step of binary search divides the problem (searching for a target element in an array) into a smaller problem (searching for the target element in half of the array), continuing until the array is empty or the target element is found.
- Trees embody the divide-and-conquer idea: in data structures such as binary search trees, AVL trees, and heaps, the time complexity of various operations is $O(\log n)$.

The divide-and-conquer strategy of binary search is as follows.

- **The problem can be divided**: binary search recursively divides the original problem (searching in an array) into subproblems (searching in half of the array), achieved by comparing the middle element with the target element.
- **Subproblems are independent**: each round of binary search handles one subproblem, unaffected by other subproblems.
- **The solutions of subproblems do not need to be merged**: binary search aims to find a specific element, so there is no need to merge the solutions of subproblems; when a subproblem is solved, the original problem is also solved.

Divide-and-conquer improves search efficiency because brute-force search can only eliminate one option per round, **whereas divide-and-conquer eliminates half of the options**.

### Implementing binary search based on divide-and-conquer

In previous chapters, binary search was implemented based on iteration. Now, we implement it based on divide-and-conquer (recursion).

!!! question

    Given an ordered array `nums` of length $n$, where all elements are unique, find the element `target`.

From a divide-and-conquer perspective, we denote the subproblem corresponding to the search interval $[i, j]$ as $f(i, j)$.

Starting from the original problem $f(0, n-1)$, perform the binary search through the following steps.

1. Calculate the midpoint $m$ of the search interval $[i, j]$, and use it to eliminate half of the search interval.
2. Recursively solve the subproblem reduced by half in size, which could be $f(i, m-1)$ or $f(m+1, j)$.
3. Repeat steps `1.` and `2.` until `target` is found or the interval is empty.

The diagram below shows the divide-and-conquer process of binary search for element $6$ in an array.

![The divide-and-conquer process of binary search](binary_search_recur.assets/binary_search_recur.png)

In the implementation code, we declare a recursive function `dfs()` to solve the problem $f(i, j)$:

```src
[file]{binary_search_recur}-[class]{}-[func]{binary_search}
```
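For reference, the recursion described above can be sketched in Python as follows (a minimal version; the names mirror the text's `dfs()` and $f(i, j)$, and may differ from the repository's listing):

```python
def dfs(nums: list[int], target: int, i: int, j: int) -> int:
    """Solve subproblem f(i, j): search for target in nums[i..j]."""
    # Empty interval: the target is absent
    if i > j:
        return -1
    m = (i + j) // 2  # midpoint of the search interval
    if nums[m] < target:
        # Target lies in the right half: solve f(m+1, j)
        return dfs(nums, target, m + 1, j)
    elif nums[m] > target:
        # Target lies in the left half: solve f(i, m-1)
        return dfs(nums, target, i, m - 1)
    else:
        return m  # found the target, return its index

def binary_search(nums: list[int], target: int) -> int:
    """Start from the original problem f(0, n-1)."""
    return dfs(nums, target, 0, len(nums) - 1)
```

Each recursive call discards half of the interval, so the recursion depth is $O(\log n)$.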
# Divide and conquer algorithms

<u>Divide and conquer</u>, known in full as "divide and rule", is an extremely important and common algorithm strategy. It is usually based on recursion and consists of two steps: "divide" and "conquer".

1. **Divide (partition phase)**: recursively decompose the original problem into two or more sub-problems until the smallest sub-problem is reached and the process terminates.
2. **Conquer (merge phase)**: starting from the smallest sub-problems, whose solutions are known, merge the solutions of the sub-problems from bottom to top to construct the solution to the original problem.

As shown in the figure below, "merge sort" is a typical application of the divide and conquer strategy.

1. **Divide**: recursively divide the original array (original problem) into two sub-arrays (sub-problems), until each sub-array has only one element (smallest sub-problem).
2. **Conquer**: merge the ordered sub-arrays (solutions to the sub-problems) from bottom to top to obtain an ordered original array (solution to the original problem).

![Merge sort's divide and conquer strategy](divide_and_conquer.assets/divide_and_conquer_merge_sort.png)
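The divide and conquer scheme of merge sort can be sketched as follows (a minimal out-of-place Python version; in-place implementations differ in detail):

```python
def merge_sort(nums: list[int]) -> list[int]:
    """Sort nums by dividing at the midpoint and merging ordered halves."""
    # Smallest sub-problem: an array of 0 or 1 elements is already sorted
    if len(nums) <= 1:
        return nums
    # Divide: split the array at the midpoint into two sub-arrays
    mid = len(nums) // 2
    left = merge_sort(nums[:mid])
    right = merge_sort(nums[mid:])
    # Conquer: merge the two ordered sub-arrays into one ordered array
    res, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            res.append(left[i])
            i += 1
        else:
            res.append(right[j])
            j += 1
    res.extend(left[i:])
    res.extend(right[j:])
    return res
```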
## How to identify divide and conquer problems

Whether a problem is suitable for a divide and conquer solution can usually be judged based on the following criteria.

1. **The problem can be decomposed**: the original problem can be decomposed into smaller, similar sub-problems and can be recursively divided in the same manner.
2. **Sub-problems are independent**: there is no overlap between sub-problems; they are independent and can be solved separately.
3. **Solutions to sub-problems can be merged**: the solution to the original problem is obtained by merging the solutions of the sub-problems.

Clearly, merge sort meets all three criteria.

1. **The problem can be decomposed**: recursively divide the array (original problem) into two sub-arrays (sub-problems).
2. **Sub-problems are independent**: each sub-array can be sorted independently (sub-problems can be solved independently).
3. **Solutions to sub-problems can be merged**: two ordered sub-arrays (solutions to the sub-problems) can be merged into one ordered array (solution to the original problem).
## Improving efficiency through divide and conquer

**Divide and conquer can not only solve algorithm problems effectively but often also improve algorithm efficiency**. Among sorting algorithms, quicksort, merge sort, and heap sort are faster than selection sort, bubble sort, and insertion sort because they apply the divide and conquer strategy.

We may then ask: **why can divide and conquer improve algorithm efficiency, and what is the underlying logic?** In other words, why is decomposing a large problem into sub-problems, solving the sub-problems, and merging their solutions more efficient than solving the original problem directly? This question can be examined from two aspects: the number of operations and parallel computation.

### Optimization of operation count

Taking "bubble sort" as an example, it requires $O(n^2)$ time to process an array of length $n$. Suppose we divide the array at its midpoint into two sub-arrays, as shown in the figure below. The division requires $O(n)$ time, sorting each sub-array requires $O((n / 2)^2)$ time, and merging the two sorted sub-arrays requires $O(n)$ time, so the total time complexity is:

$$
O(n + (\frac{n}{2})^2 \times 2 + n) = O(\frac{n^2}{2} + 2n)
$$

![Bubble sort before and after array partition](divide_and_conquer.assets/divide_and_conquer_bubble_sort.png)
Next, we evaluate the following inequality, whose left and right sides are the total numbers of operations before and after the partition, respectively:

$$
\begin{aligned}
n^2 & > \frac{n^2}{2} + 2n \newline
n^2 - \frac{n^2}{2} - 2n & > 0 \newline
\frac{n^2}{2} - 2n & > 0 \newline
n(n - 4) & > 0
\end{aligned}
$$

**This means that when $n > 4$, the number of operations after partitioning is smaller, so sorting should be more efficient**. Note that the time complexity after partitioning is still quadratic $O(n^2)$; only the constant factor has decreased.
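As a quick sanity check of this inequality, the hypothetical helpers below compute the idealized operation counts $n^2$ and $n + 2(n/2)^2 + n$ from the text (not measured costs):

```python
def ops_before(n: int) -> int:
    """Idealized operation count of bubble sort on the whole array: n^2."""
    return n * n

def ops_after(n: int) -> int:
    """Idealized count after one partition: divide O(n) + two half-size
    bubble sorts + merge O(n), i.e. n + 2 * (n/2)^2 + n."""
    return n + 2 * (n // 2) ** 2 + n

# For n > 4 the partitioned scheme performs fewer operations,
# matching n(n - 4) > 0; at n = 4 the two counts coincide.
for n in (4, 8, 16, 32):
    print(n, ops_before(n), ops_after(n))
```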
Going further, **what if we keep dividing the sub-arrays at their midpoints into two sub-arrays** until each sub-array has only one element left? This idea is exactly "merge sort", with a time complexity of $O(n \log n)$.

Furthermore, **what if we set several more partition points** and evenly divide the original array into $k$ sub-arrays? This situation is very similar to "bucket sort", which is well suited to sorting massive data; theoretically, its time complexity can reach $O(n + k)$.
### Optimization through parallel computation

We know that the sub-problems generated by divide and conquer are independent of each other, **so they can usually be solved in parallel**. This means that divide and conquer can not only reduce the algorithm's time complexity, **but also facilitate parallel optimization by the operating system**.

Parallel optimization is especially effective in environments with multiple cores or processors: the system can process multiple sub-problems simultaneously, making fuller use of computing resources and significantly reducing the overall runtime.

For example, in the "bucket sort" shown in the figure below, massive data is distributed evenly across the buckets; the sorting tasks for all buckets can then be dispatched to different computing units, and the results merged after completion.

![Bucket sort's parallel computation](divide_and_conquer.assets/divide_and_conquer_parallel_computing.png)
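As an illustrative sketch (not the book's implementation), the buckets below are sorted concurrently with Python's `concurrent.futures`; note that real speedups generally require process-level parallelism rather than threads:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_bucket_sort(nums: list[float], k: int = 4) -> list[float]:
    """Distribute data into k buckets, sort buckets concurrently, concatenate."""
    lo, hi = min(nums), max(nums)
    width = (hi - lo) / k or 1  # bucket width; guard against a zero range
    buckets: list[list[float]] = [[] for _ in range(k)]
    for x in nums:
        # Map each value to a bucket; clamp the maximum into the last bucket
        idx = min(int((x - lo) / width), k - 1)
        buckets[idx].append(x)
    # Each bucket is an independent sub-problem: sort them in parallel
    with ThreadPoolExecutor(max_workers=k) as pool:
        sorted_buckets = list(pool.map(sorted, buckets))
    # Buckets are ordered by value range, so concatenation yields a sorted array
    return [x for b in sorted_buckets for x in b]
```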
## Common applications of divide and conquer

On one hand, divide and conquer can be used to solve many classic algorithm problems.

- **Finding the closest pair of points**: this algorithm first divides the point set into two halves, then finds the closest pair within each half, and finally finds the closest pair that spans the two halves.
- **Large integer multiplication**: for example, the Karatsuba algorithm, which breaks large integer multiplication down into several multiplications and additions of smaller integers.
- **Matrix multiplication**: for example, the Strassen algorithm, which decomposes a large matrix multiplication into multiple multiplications and additions of smaller matrices.
- **Tower of Hanoi problem**: the Tower of Hanoi problem can be solved recursively, a typical application of the divide and conquer strategy.
- **Counting inversion pairs**: in a sequence, if an earlier number is greater than a later one, the two form an inversion pair. Counting inversion pairs can use the divide and conquer idea, with the aid of merge sort.
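To illustrate the Tower of Hanoi item: a problem of size $n$ decomposes into two sub-problems of size $n-1$ plus one single-disc move. A minimal sketch (names are illustrative):

```python
def hanoi(n: int, src: str, buf: str, dst: str, moves: list[tuple[str, str]]) -> None:
    """Move n discs from src to dst via buf, recording each move."""
    if n == 1:
        moves.append((src, dst))  # smallest sub-problem: move a single disc
        return
    # Divide: move the top n-1 discs out of the way onto the buffer, ...
    hanoi(n - 1, src, dst, buf, moves)
    # ... move the largest disc directly (sub-problem of size 1), ...
    moves.append((src, dst))
    # ... then move the n-1 discs from the buffer onto it
    hanoi(n - 1, buf, src, dst, moves)
```

Solving the two size-$n-1$ sub-problems and the single move in order solves the original problem, for a total of $2^n - 1$ moves.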
On the other hand, divide and conquer is applied very widely in the design of algorithms and data structures.

- **Binary search**: binary search splits an ordered array at its midpoint into two halves, decides which half to discard by comparing the target value with the middle element, and then performs the same operation on the remaining interval.
- **Merge sort**: already introduced at the beginning of this section; no further elaboration is needed.
- **Quicksort**: quicksort selects a pivot value, partitions the array into two sub-arrays, one with elements smaller than the pivot and the other with elements larger, and then applies the same partitioning to the two parts until each sub-array has only one element.
- **Bucket sort**: the basic idea of bucket sort is to distribute the data among multiple buckets, sort the elements within each bucket, and finally retrieve the elements from the buckets in order to obtain an ordered array.
- **Trees**: for example, binary search trees, AVL trees, red-black trees, B-trees, and B+ trees; their search, insertion, and deletion operations can all be regarded as applications of the divide and conquer strategy.
- **Heap**: a heap is a special kind of complete binary tree; its operations, such as insertion, deletion, and heapification, implicitly use the divide and conquer idea.
- **Hash table**: although hash tables do not directly apply divide and conquer, some hash collision resolution schemes apply it indirectly; for example, long lists under chained addressing may be converted to red-black trees to improve query efficiency.
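The quicksort partitioning described above can be sketched as follows (a simple out-of-place version for clarity; classic quicksort partitions in place):

```python
def quick_sort(nums: list[int]) -> list[int]:
    """Partition around a pivot, then sort the two sides recursively."""
    if len(nums) <= 1:
        return nums  # a sub-array of 0 or 1 elements is already sorted
    pivot = nums[0]
    # Divide: elements smaller than the pivot vs. the rest
    smaller = [x for x in nums[1:] if x < pivot]
    larger = [x for x in nums[1:] if x >= pivot]
    # Conquer: the two sides are independent and need only concatenation
    return quick_sort(smaller) + [pivot] + quick_sort(larger)
```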
It can be seen that **divide and conquer is a subtly pervasive algorithmic idea**, embedded within various algorithms and data structures.
# Divide and conquer

![Divide and Conquer](../assets/covers/chapter_divide_and_conquer.jpg)

!!! abstract

    Difficult problems are decomposed layer by layer, each decomposition making them simpler.

    Divide and conquer reveals an important truth: start with simplicity, and nothing is complex anymore.
# Summary

- Divide and conquer is a common algorithm design strategy consisting of two stages, divide (partition) and conquer (merge), and is usually implemented based on recursion.
- The criteria for judging whether a problem suits divide and conquer include: whether the problem can be decomposed, whether the sub-problems are independent, and whether the solutions of the sub-problems can be merged.
- Merge sort is a typical application of the divide and conquer strategy. It recursively divides the array into two equal-length sub-arrays until only one element remains, then merges them back layer by layer to complete the sorting.
- Introducing the divide and conquer strategy can often improve algorithm efficiency: on one hand, it reduces the number of operations; on the other hand, it facilitates parallel optimization after division.
- Divide and conquer solves many algorithm problems and is widely used in data structure and algorithm design; its presence is ubiquitous.
- Compared with brute-force search, adaptive search is more efficient. Search algorithms with a time complexity of $O(\log n)$ are usually based on the divide and conquer strategy.
- Binary search is another typical application of the divide and conquer strategy; it does not include the step of merging sub-problem solutions. We can implement binary search through recursive divide and conquer.
- In the problem of constructing a binary tree, building the tree (original problem) can be divided into building the left and right subtrees (sub-problems), which can be achieved by partitioning the index intervals of the preorder and inorder traversals.
- In the Tower of Hanoi problem, a problem of size $n$ can be divided into two sub-problems of size $n-1$ and one sub-problem of size $1$. Solving these three sub-problems in order solves the original problem.