translation: Add the initial translation of chapter "divide and conquer" (#1322)

* Add the initial translation of chapter "divide and conquer"

* Update index.md

* Update summary.md

* Update index.md

* Update summary.md
pull/1326/head
Yudong Jin 7 months ago committed by GitHub
parent 01a2e31203
commit 3bd416600e
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@ -0,0 +1,45 @@
# Divide and conquer search strategy
We have learned that search algorithms fall into two main categories.
- **Brute-force search**: It is implemented by traversing the data structure, with a time complexity of $O(n)$.
- **Adaptive search**: It utilizes a unique data organization form or prior information, and its time complexity can reach $O(\log n)$ or even $O(1)$.
In fact, **search algorithms with a time complexity of $O(\log n)$ are usually based on the divide-and-conquer strategy**, such as binary search and trees.
- Each step of binary search divides the problem (searching for a target element in an array) into a smaller problem (searching for the target element in half of the array), continuing until the array is empty or the target element is found.
- Trees embody the divide-and-conquer idea: in data structures such as binary search trees, AVL trees, and heaps, the time complexity of many operations is $O(\log n)$.
The divide-and-conquer strategy of binary search is as follows.
- **The problem can be divided**: Binary search recursively divides the original problem (searching in an array) into subproblems (searching in half of the array), achieved by comparing the middle element with the target element.
- **Subproblems are independent**: In binary search, each round handles one subproblem, unaffected by other subproblems.
- **The solutions of subproblems do not need to be merged**: Binary search aims to find a specific element, so there is no need to merge the solutions of subproblems. When a subproblem is solved, the original problem is also solved.
Divide-and-conquer can enhance search efficiency because brute-force search can only eliminate one option per round, **whereas divide-and-conquer can eliminate half of the options**.
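To make the difference concrete, the following minimal sketch counts the rounds used by each approach on a sorted array. The function names `linear_search_rounds` and `binary_search_rounds` are hypothetical helpers introduced only for this illustration.

```python
def linear_search_rounds(nums, target):
    """Scan left to right; return (index, rounds). Each round eliminates one option."""
    rounds = 0
    for idx, value in enumerate(nums):
        rounds += 1
        if value == target:
            return idx, rounds
    return -1, rounds


def binary_search_rounds(nums, target):
    """Binary search on a sorted list; return (index, rounds). Each round eliminates half."""
    i, j = 0, len(nums) - 1
    rounds = 0
    while i <= j:
        rounds += 1
        m = (i + j) // 2
        if nums[m] < target:
            i = m + 1  # target can only be in the right half
        elif nums[m] > target:
            j = m - 1  # target can only be in the left half
        else:
            return m, rounds
    return -1, rounds


nums = list(range(1024))
print(linear_search_rounds(nums, 1023))
print(binary_search_rounds(nums, 1023))
```

For the last element of a 1024-element array, the linear scan needs 1024 rounds, while binary search needs only 11.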
### Implementing binary search based on divide-and-conquer
In previous chapters, binary search was implemented based on iteration. Now, we implement it based on divide-and-conquer (recursion).
!!! question
Given an ordered array `nums` of length $n$, where all elements are unique, please find the element `target`.
From a divide-and-conquer perspective, we denote the subproblem corresponding to the search interval $[i, j]$ as $f(i, j)$.
Starting from the original problem $f(0, n-1)$, perform the binary search through the following steps.
1. Calculate the midpoint $m$ of the search interval $[i, j]$, and use it to eliminate half of the search interval.
2. Recursively solve the subproblem reduced by half in size, which could be $f(i, m-1)$ or $f(m+1, j)$.
3. Repeat steps `1.` and `2.` until `target` is found, or until the interval becomes empty and the search returns.
The diagram below shows the divide-and-conquer process of binary search for element $6$ in an array.
![The divide-and-conquer process of binary search](binary_search_recur.assets/binary_search_recur.png)
In the implementation code, we declare a recursive function `dfs()` to solve the problem $f(i, j)$:
```src
[file]{binary_search_recur}-[class]{}-[func]{binary_search}
```
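The placeholder above is filled in with the book's own code at build time. As a rough idea of what such a function may look like, here is a minimal Python sketch of the steps described earlier. Returning `-1` when the target is absent is an assumption of this sketch, not something the text specifies.

```python
def dfs(nums: list[int], target: int, i: int, j: int) -> int:
    """Solve subproblem f(i, j): search for target in the closed interval nums[i..j]."""
    # An empty interval means the target is not present (assumed return value: -1)
    if i > j:
        return -1
    # Midpoint m of the search interval
    m = i + (j - i) // 2
    if nums[m] < target:
        # Target can only be in the right half: recurse on f(m+1, j)
        return dfs(nums, target, m + 1, j)
    if nums[m] > target:
        # Target can only be in the left half: recurse on f(i, m-1)
        return dfs(nums, target, i, m - 1)
    # Found the target
    return m


def binary_search(nums: list[int], target: int) -> int:
    """Start from the original problem f(0, n-1)."""
    return dfs(nums, target, 0, len(nums) - 1)


print(binary_search([1, 3, 6, 8, 12, 15, 23, 26, 31, 35], 6))
```

Because no merging is needed, each call simply passes the result of its single subproblem back up to the caller.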

@ -0,0 +1,99 @@
# Building binary tree problem
!!! question
Given the preorder traversal `preorder` and inorder traversal `inorder` of a binary tree, construct the binary tree and return the root node of the binary tree. Assume that there are no duplicate values in the nodes of the binary tree (as shown in the diagram below).
![Example data for building a binary tree](build_binary_tree_problem.assets/build_tree_example.png)
### Determining if it is a divide and conquer problem
The original problem of constructing a binary tree from `preorder` and `inorder` is a typical divide and conquer problem.
- **The problem can be decomposed**: From the perspective of divide and conquer, we can divide the original problem into two subproblems: building the left subtree and building the right subtree, plus one operation: initializing the root node. For each subtree (subproblem), we can still use the above division method, dividing it into smaller subtrees (subproblems), until the smallest subproblem (empty subtree) is reached.
- **The subproblems are independent**: The left and right subtrees are independent of each other, with no overlap. When building the left subtree, we only need to focus on the parts of the inorder and preorder traversals that correspond to the left subtree. The same applies to the right subtree.
- **Solutions to subproblems can be combined**: Once the solutions for the left and right subtrees (solutions to subproblems) are obtained, we can link them to the root node to obtain the solution to the original problem.
### How to divide the subtrees
Based on the above analysis, this problem can be solved using divide and conquer, **but how do we use the preorder traversal `preorder` and inorder traversal `inorder` to divide the left and right subtrees?**
By definition, `preorder` and `inorder` can be divided into three parts.
- Preorder traversal: `[ Root | Left Subtree | Right Subtree ]`, for example, the tree in the diagram corresponds to `[ 3 | 9 | 2 1 7 ]`.
- Inorder traversal: `[ Left Subtree | Root | Right Subtree ]`, for example, the tree in the diagram corresponds to `[ 9 | 3 | 1 2 7 ]`.
Using the data in the diagram above, we can obtain the division results as shown in the steps below.
1. The first element 3 in the preorder traversal is the value of the root node.
2. Find the index of the root node 3 in `inorder`, and use this index to divide `inorder` into `[ 9 | 3 | 1 2 7 ]`.
3. Based on the division results of `inorder`, it is easy to determine the number of nodes in the left and right subtrees as 1 and 3, respectively, thus dividing `preorder` into `[ 3 | 9 | 2 1 7 ]`.
![Dividing the subtrees in preorder and inorder traversals](build_binary_tree_problem.assets/build_tree_preorder_inorder_division.png)
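The three division steps above can be sketched with plain list slicing. This is only an illustration on the example data; the helper name `split_traversals` is hypothetical, and slicing copies the lists, which is not required when working with index intervals instead.

```python
def split_traversals(preorder, inorder):
    """Divide both traversals into root, left-subtree and right-subtree parts."""
    # Step 1: the first element of the preorder traversal is the root
    root = preorder[0]
    # Step 2: locate the root in the inorder traversal
    m = inorder.index(root)
    # Step 3: everything left of the root in inorder belongs to the left subtree
    left_size = m
    return {
        "root": root,
        "inorder_left": inorder[:m],
        "inorder_right": inorder[m + 1:],
        "preorder_left": preorder[1:1 + left_size],
        "preorder_right": preorder[1 + left_size:],
    }


parts = split_traversals([3, 9, 2, 1, 7], [9, 3, 1, 2, 7])
print(parts)
```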
### Describing subtree intervals based on variables
Based on the above division method, **we have now obtained the index intervals of the root, left subtree, and right subtree in `preorder` and `inorder`**. To describe these index intervals, we need the help of several pointer variables.
- Let the index of the current tree's root node in `preorder` be denoted as $i$.
- Let the index of the current tree's root node in `inorder` be denoted as $m$.
- Let the index interval of the current tree in `inorder` be denoted as $[l, r]$.
As shown in the table below, the above variables can represent the index of the root node in `preorder` as well as the index intervals of the subtrees in `inorder`.
<p align="center"> Table <id> &nbsp; Indexes of the root node and subtrees in preorder and inorder traversals </p>
| | Root node index in `preorder` | Subtree index interval in `inorder` |
| ------------- | ----------------------------- | ----------------------------------- |
| Current tree | $i$ | $[l, r]$ |
| Left subtree | $i + 1$ | $[l, m-1]$ |
| Right subtree | $i + 1 + (m - l)$ | $[m+1, r]$ |
Note that $(m-l)$ in the right subtree's root index means "the number of nodes in the left subtree". It is best understood together with the diagram below.
![Indexes of the root node and left and right subtrees](build_binary_tree_problem.assets/build_tree_division_pointers.png)
### Code implementation
To improve the efficiency of querying $m$, we use a hash table `hmap` to store the mapping of elements in `inorder` to their indexes:
```src
[file]{build_tree}-[class]{}-[func]{build_tree}
```
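The placeholder above pulls in the book's own code at build time. The following Python sketch shows one way the variables $i$, $l$, $r$, and $m$ from the table could be turned into code. The `TreeNode` class and the `preorder_of` and `inorder_of` checking helpers are assumptions added for this illustration.

```python
class TreeNode:
    """Minimal binary tree node (assumed definition for this sketch)."""

    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def dfs(preorder, hmap, i, l, r):
    """Build the subtree whose root is preorder[i] and whose inorder interval is [l, r]."""
    # An empty interval corresponds to an empty subtree
    if r - l < 0:
        return None
    root = TreeNode(preorder[i])
    # Index of the root in inorder, found in O(1) through the hash table
    m = hmap[preorder[i]]
    # Left subtree: root index i + 1, inorder interval [l, m-1]
    root.left = dfs(preorder, hmap, i + 1, l, m - 1)
    # Right subtree: root index i + 1 + (m - l), inorder interval [m+1, r]
    root.right = dfs(preorder, hmap, i + 1 + m - l, m + 1, r)
    return root


def build_tree(preorder, inorder):
    """Map each inorder value to its index, then solve the original problem."""
    hmap = {val: idx for idx, val in enumerate(inorder)}
    return dfs(preorder, hmap, 0, 0, len(inorder) - 1)


def preorder_of(node):
    """Checking helper: preorder traversal of the built tree."""
    return [] if node is None else [node.val] + preorder_of(node.left) + preorder_of(node.right)


def inorder_of(node):
    """Checking helper: inorder traversal of the built tree."""
    return [] if node is None else inorder_of(node.left) + [node.val] + inorder_of(node.right)


root = build_tree([3, 9, 2, 1, 7], [9, 3, 1, 2, 7])
print(preorder_of(root), inorder_of(root))
```

Traversing the result reproduces the two input sequences, which is a quick way to check the index arithmetic.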
The diagram below shows the recursive process of building the binary tree, where each node is established during the "descending" process, and each edge (reference) is established during the "ascending" process.
=== "<1>"
![Recursive process of building a binary tree](build_binary_tree_problem.assets/built_tree_step1.png)
=== "<2>"
![built_tree_step2](build_binary_tree_problem.assets/built_tree_step2.png)
=== "<3>"
![built_tree_step3](build_binary_tree_problem.assets/built_tree_step3.png)
=== "<4>"
![built_tree_step4](build_binary_tree_problem.assets/built_tree_step4.png)
=== "<5>"
![built_tree_step5](build_binary_tree_problem.assets/built_tree_step5.png)
=== "<6>"
![built_tree_step6](build_binary_tree_problem.assets/built_tree_step6.png)
=== "<7>"
![built_tree_step7](build_binary_tree_problem.assets/built_tree_step7.png)
=== "<8>"
![built_tree_step8](build_binary_tree_problem.assets/built_tree_step8.png)
=== "<9>"
![built_tree_step9](build_binary_tree_problem.assets/built_tree_step9.png)
Each recursive function's division results of `preorder` and `inorder` are shown in the diagram below.
![Division results in each recursive function](build_binary_tree_problem.assets/built_tree_overall.png)
Assuming the number of nodes in the tree is $n$, initializing each node (executing a recursive function `dfs()`) takes $O(1)$ time. **Thus, the overall time complexity is $O(n)$**.
The hash table stores the mapping of `inorder` elements to their indexes, with a space complexity of $O(n)$. In the worst case, when the binary tree degenerates into a linked list, the recursive depth reaches $n$, using $O(n)$ stack frame space. **Therefore, the overall space complexity is $O(n)$**.

@ -0,0 +1,91 @@
# Divide and conquer algorithms
<u>Divide and conquer</u>, also known as "divide and rule", is an extremely important and common algorithm strategy. Divide and conquer is usually based on recursion and includes two steps: "divide" and "conquer".
1. **Divide (partition phase)**: Recursively decompose the original problem into two or more sub-problems until the smallest sub-problem is reached and the process terminates.
2. **Conquer (merge phase)**: Starting from the smallest sub-problem with a known solution, merge the solutions of the sub-problems from bottom to top to construct the solution to the original problem.
As shown in the figure below, "merge sort" is one of the typical applications of the divide and conquer strategy.
1. **Divide**: Recursively divide the original array (original problem) into two sub-arrays (sub-problems), until the sub-array has only one element (smallest sub-problem).
2. **Conquer**: Merge the ordered sub-arrays (solutions to the sub-problems) from bottom to top to obtain an ordered original array (solution to the original problem).
![Merge sort's divide and conquer strategy](divide_and_conquer.assets/divide_and_conquer_merge_sort.png)
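The two phases can be seen directly in a short merge sort sketch. This version returns a new list and uses slicing for brevity, which is a simplification chosen for this illustration rather than the most memory-efficient form.

```python
def merge(left, right):
    """Conquer: merge two sorted lists into one sorted list."""
    merged = []
    a = b = 0
    while a < len(left) and b < len(right):
        if left[a] <= right[b]:
            merged.append(left[a])
            a += 1
        else:
            merged.append(right[b])
            b += 1
    # One side is exhausted; append the remainder of the other
    merged.extend(left[a:])
    merged.extend(right[b:])
    return merged


def merge_sort(nums):
    """Divide: split at the midpoint until sub-arrays have at most one element."""
    if len(nums) <= 1:
        return list(nums)  # smallest sub-problem: already sorted
    mid = len(nums) // 2
    return merge(merge_sort(nums[:mid]), merge_sort(nums[mid:]))


print(merge_sort([7, 3, 2, 6, 0, 1, 5, 4]))
```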
## How to identify divide and conquer problems
Whether a problem is suitable for a divide and conquer solution can usually be judged based on the following criteria.
1. **The problem can be decomposed**: The original problem can be decomposed into smaller, similar sub-problems and can be recursively divided in the same manner.
2. **Sub-problems are independent**: There is no overlap between sub-problems, and they are independent and can be solved separately.
3. **Solutions to sub-problems can be merged**: The solution to the original problem is obtained by merging the solutions of the sub-problems.
Clearly, merge sort meets these three criteria.
1. **The problem can be decomposed**: Recursively divide the array (original problem) into two sub-arrays (sub-problems).
2. **Sub-problems are independent**: Each sub-array can be sorted independently (sub-problems can be solved independently).
3. **Solutions to sub-problems can be merged**: Two ordered sub-arrays (solutions to the sub-problems) can be merged into one ordered array (solution to the original problem).
## Improving efficiency through divide and conquer
**Divide and conquer can not only effectively solve algorithm problems but often also improve algorithm efficiency**. In sorting algorithms, quicksort, merge sort, and heap sort are faster than selection, bubble, and insertion sorts because they apply the divide and conquer strategy.
Then, we may ask: **Why can divide and conquer improve algorithm efficiency, and what is the underlying logic?** In other words, why are the steps of decomposing a large problem into multiple sub-problems, solving the sub-problems, and merging the solutions of the sub-problems into the solution of the original problem more efficient than directly solving the original problem? This question can be discussed from the aspects of the number of operations and parallel computation.
### Optimization of operation count
Taking "bubble sort" as an example, it requires $O(n^2)$ time to process an array of length $n$. Suppose we divide the array from the midpoint into two sub-arrays as shown in the figure below, then the division requires $O(n)$ time, sorting each sub-array requires $O((n / 2)^2)$ time, and merging the two sub-arrays requires $O(n)$ time, with the total time complexity being:
$$
O(n + (\frac{n}{2})^2 \times 2 + n) = O(\frac{n^2}{2} + 2n)
$$
![Bubble sort before and after array partition](divide_and_conquer.assets/divide_and_conquer_bubble_sort.png)
Next, we calculate the following inequality, where the left and right sides are the total number of operations before and after the partition, respectively:
$$
\begin{aligned}
n^2 & > \frac{n^2}{2} + 2n \newline
n^2 - \frac{n^2}{2} - 2n & > 0 \newline
n(n - 4) & > 0
\end{aligned}
$$
**This means that when $n > 4$, the number of operations after partitioning is smaller, and sorting should be more efficient**. Note that the time complexity after partitioning is still quadratic, $O(n^2)$; only the constant factor has decreased.
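As a quick numeric check of the inequality, the sketch below evaluates both operation counts for a few array lengths. The names `ops_before` and `ops_after` are hypothetical, and the counts are the idealized estimates from the formulas above rather than measurements of a real bubble sort.

```python
def ops_before(n):
    """Estimated operations for bubble sort on the whole array: n^2."""
    return n * n


def ops_after(n):
    """Estimated operations after one partition: divide + two half sorts + merge."""
    return n + 2 * (n / 2) ** 2 + n


for n in (2, 4, 8, 1000):
    print(n, ops_before(n), ops_after(n))
```

At $n = 4$ both estimates equal $16$; for $n = 8$ they are $64$ and $48$, and the gap approaches a factor of two as $n$ grows.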
Further, **what if we keep dividing the sub-arrays from their midpoints into two sub-arrays** until the sub-arrays have only one element left? This idea is actually "merge sort," with a time complexity of $O(n \log n)$.
Furthermore, **what if we set several more partition points** and evenly divide the original array into $k$ sub-arrays? This situation is very similar to "bucket sort," which is very suitable for sorting massive data, and theoretically, the time complexity can reach $O(n + k)$.
### Optimization through parallel computation
We know that the sub-problems generated by divide and conquer are independent of each other, **thus they can usually be solved in parallel**. This means that divide and conquer can not only reduce the algorithm's time complexity, **but also facilitate parallel optimization by the operating system**.
Parallel optimization is especially effective in environments with multiple cores or processors, as the system can process multiple sub-problems simultaneously, making fuller use of computing resources and significantly reducing the overall runtime.
For example, in the "bucket sort" shown in the figure below, we distribute massive data evenly across various buckets, then the sorting tasks of all buckets can be distributed to different computing units, and the results are merged after completion.
![Bucket sort's parallel computation](divide_and_conquer.assets/divide_and_conquer_parallel_computing.png)
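The idea can be sketched with Python's standard `concurrent.futures` module. The function name `bucket_sort_parallel`, the parameter `k`, and the assumption that inputs are floats in $[0, 1)$ are all choices made for this illustration. A thread pool is used here only to show the task structure; in CPython, CPU-bound work in threads is limited by the global interpreter lock, so a process pool would be the usual choice for real speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def bucket_sort_parallel(nums, k=4, max_workers=4):
    """Sort floats in [0, 1) with k buckets whose sorting tasks run concurrently."""
    # Divide: distribute elements across k buckets by value range
    buckets = [[] for _ in range(k)]
    for x in nums:
        idx = min(int(x * k), k - 1)  # clamp to guard against rounding at the top edge
        buckets[idx].append(x)
    # Conquer: each bucket is an independent sub-problem, so submit them as separate tasks
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        sorted_buckets = list(pool.map(sorted, buckets))
    # Merge: buckets cover increasing value ranges, so concatenation is enough
    return [x for bucket in sorted_buckets for x in bucket]


data = [0.49, 0.96, 0.82, 0.09, 0.57, 0.43, 0.91, 0.75, 0.15, 0.37]
print(bucket_sort_parallel(data))
```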
## Common applications of divide and conquer
On one hand, divide and conquer can be used to solve many classic algorithm problems.
- **Finding the closest point pair**: This algorithm first divides the set of points into two parts, then finds the closest point pair in each part, and finally finds the closest point pair that spans the two parts.
- **Large integer multiplication**: For example, the Karatsuba algorithm, which breaks down large integer multiplication into several smaller integer multiplications and additions.
- **Matrix multiplication**: For example, the Strassen algorithm, which decomposes large matrix multiplication into multiple small matrix multiplications and additions.
- **Tower of Hanoi problem**: The Tower of Hanoi problem can be solved recursively, a typical application of the divide and conquer strategy.
- **Counting inversions (inverse pairs)**: In a sequence, if an earlier number is greater than a later number, the two numbers form an inverse pair. This problem can be solved with the idea of divide and conquer, with the aid of merge sort.
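As one example from the list above, inverse pairs can be counted during the merge step of merge sort. The sketch below is a minimal version under that approach; the function name `count_inversions` and its tuple return value are choices made for this illustration.

```python
def count_inversions(nums):
    """Return (sorted copy, number of pairs p < q with nums[p] > nums[q])."""
    if len(nums) <= 1:
        return list(nums), 0
    mid = len(nums) // 2
    # Divide: count inverse pairs inside each half
    left, inv_left = count_inversions(nums[:mid])
    right, inv_right = count_inversions(nums[mid:])
    # Conquer: count pairs that span the two halves while merging
    merged = []
    cross = 0
    a = b = 0
    while a < len(left) and b < len(right):
        if left[a] <= right[b]:
            merged.append(left[a])
            a += 1
        else:
            # right[b] is smaller than every remaining element of left
            merged.append(right[b])
            b += 1
            cross += len(left) - a
    merged.extend(left[a:])
    merged.extend(right[b:])
    return merged, inv_left + inv_right + cross


print(count_inversions([2, 4, 1, 3, 5]))
```

For `[2, 4, 1, 3, 5]` the inverse pairs are $(2, 1)$, $(4, 1)$, and $(4, 3)$, so the count is $3$.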
On the other hand, divide and conquer is very widely applied in the design of algorithms and data structures.
- **Binary search**: Binary search divides an ordered array from the midpoint index into two parts, then decides which half to exclude based on the comparison result between the target value and the middle element value, and performs the same binary operation in the remaining interval.
- **Merge sort**: Already introduced at the beginning of this section, no further elaboration is needed.
- **Quicksort**: Quicksort selects a pivot value, then divides the array into two sub-arrays, one with elements smaller than the pivot and the other with elements larger than the pivot, and then performs the same partitioning operation on these two parts until the sub-array has only one element.
- **Bucket sort**: The basic idea of bucket sort is to distribute data to multiple buckets, then sort the elements within each bucket, and finally retrieve the elements from the buckets in order to obtain an ordered array.
- **Trees**: For example, binary search trees, AVL trees, red-black trees, B-trees, B+ trees, etc., their operations such as search, insertion, and deletion can all be considered applications of the divide and conquer strategy.
- **Heap**: A heap is a special type of complete binary tree, whose various operations, such as insertion, deletion, and heapification, actually imply the idea of divide and conquer.
- **Hash table**: Although hash tables do not directly apply divide and conquer, some hash collision resolution schemes apply it indirectly. For example, long lists in separate chaining can be converted to red-black trees to improve query efficiency.
It can be seen that **divide and conquer is a subtly pervasive algorithmic idea**, embedded within various algorithms and data structures.

@ -0,0 +1,97 @@
# Tower of Hanoi Problem
In both merge sort and binary tree construction, we decompose the original problem into two subproblems, each half the size of the original problem. However, for the Tower of Hanoi, we adopt a different decomposition strategy.
!!! question
We are given three pillars, denoted as `A`, `B`, and `C`. Initially, pillar `A` holds a stack of $n$ discs, arranged from smallest at the top to largest at the bottom. Our task is to move these $n$ discs to pillar `C` while maintaining their original order (as shown below). The following rules must be followed while moving the discs:
1. A disc can only be picked up from the top of a pillar and placed on top of another pillar.
2. Only one disc can be moved at a time.
3. A smaller disc must always be on top of a larger disc.
![Example of the Tower of Hanoi](hanota_problem.assets/hanota_example.png)
**We denote the Tower of Hanoi problem of size $i$ as $f(i)$**. For example, $f(3)$ represents the problem of moving $3$ discs from `A` to `C`.
### Consider the base case
As shown below, for the problem $f(1)$, i.e., when there is only one disc, we can directly move it from `A` to `C`.
=== "<1>"
![Solution for a problem of size 1](hanota_problem.assets/hanota_f1_step1.png)
=== "<2>"
![hanota_f1_step2](hanota_problem.assets/hanota_f1_step2.png)
As shown below, for the problem $f(2)$, i.e., when there are two discs, **since the smaller disc must always be above the larger disc, `B` is needed to assist in the movement**.
1. First, move the smaller disc from `A` to `B`.
2. Then move the larger disc from `A` to `C`.
3. Finally, move the smaller disc from `B` to `C`.
=== "<1>"
![Solution for a problem of size 2](hanota_problem.assets/hanota_f2_step1.png)
=== "<2>"
![hanota_f2_step2](hanota_problem.assets/hanota_f2_step2.png)
=== "<3>"
![hanota_f2_step3](hanota_problem.assets/hanota_f2_step3.png)
=== "<4>"
![hanota_f2_step4](hanota_problem.assets/hanota_f2_step4.png)
The process of solving the problem $f(2)$ can be summarized as: **moving two discs from `A` to `C` with the help of `B`**. Here, `C` is called the target pillar, and `B` is called the buffer pillar.
### Decomposition of subproblems
For the problem $f(3)$, i.e., when there are three discs, the situation becomes slightly more complicated.
Since we already know the solutions to $f(1)$ and $f(2)$, we can think from a divide-and-conquer perspective and **consider the two top discs on `A` as a unit**, performing the steps shown below. This way, the three discs are successfully moved from `A` to `C`.
1. Let `B` be the target pillar and `C` the buffer pillar, and move the two discs from `A` to `B`.
2. Move the remaining disc from `A` directly to `C`.
3. Let `C` be the target pillar and `A` the buffer pillar, and move the two discs from `B` to `C`.
=== "<1>"
![Solution for a problem of size 3](hanota_problem.assets/hanota_f3_step1.png)
=== "<2>"
![hanota_f3_step2](hanota_problem.assets/hanota_f3_step2.png)
=== "<3>"
![hanota_f3_step3](hanota_problem.assets/hanota_f3_step3.png)
=== "<4>"
![hanota_f3_step4](hanota_problem.assets/hanota_f3_step4.png)
Essentially, **we divide the problem $f(3)$ into two subproblems $f(2)$ and one subproblem $f(1)$**. By solving these three subproblems in order, the original problem is resolved. This indicates that the subproblems are independent, and their solutions can be merged.
From this, we can summarize the divide-and-conquer strategy for solving the Tower of Hanoi shown in the following image: divide the original problem $f(n)$ into two subproblems $f(n-1)$ and one subproblem $f(1)$, and solve these three subproblems in the following order.
1. Move $n-1$ discs with the help of `C` from `A` to `B`.
2. Move the remaining one disc directly from `A` to `C`.
3. Move $n-1$ discs with the help of `A` from `B` to `C`.
For these two subproblems $f(n-1)$, **they can be recursively divided in the same manner** until the smallest subproblem $f(1)$ is reached. The solution to $f(1)$ is already known and requires only one move.
![Divide and conquer strategy for solving the Tower of Hanoi](hanota_problem.assets/hanota_divide_and_conquer.png)
### Code implementation
In the code, we declare a recursive function `dfs(i, src, buf, tar)` whose role is to move the $i$ discs on top of pillar `src` with the help of buffer pillar `buf` to the target pillar `tar`:
```src
[file]{hanota}-[class]{}-[func]{solve_hanota}
```
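The placeholder above is replaced by the book's own code at build time. As a minimal sketch of the strategy, the version below models each pillar as a Python list whose last element is the top disc. That representation, the `move` helper, and the guard for an empty pillar are assumptions of this illustration.

```python
def move(src, tar):
    """Move the top disc of pillar src onto pillar tar."""
    tar.append(src.pop())


def dfs(i, src, buf, tar):
    """Solve f(i): move the top i discs from src to tar, using buf as the buffer."""
    # Base case f(1): move the single disc directly
    if i == 1:
        move(src, tar)
        return
    # Subproblem f(i-1): move the top i-1 discs from src to buf, with the help of tar
    dfs(i - 1, src, tar, buf)
    # Subproblem f(1): move the remaining disc from src to tar
    move(src, tar)
    # Subproblem f(i-1): move the i-1 discs from buf to tar, with the help of src
    dfs(i - 1, buf, src, tar)


def solve_hanota(A, B, C):
    """Move all discs from A to C, using B as the buffer pillar."""
    n = len(A)
    if n == 0:
        return  # nothing to move
    dfs(n, A, B, C)


A, B, C = [5, 4, 3, 2, 1], [], []
solve_hanota(A, B, C)
print(A, B, C)
```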
As shown below, the Tower of Hanoi forms a recursion tree of height $n$, in which each node represents a subproblem and corresponds to one call of `dfs()`. **Thus, the time complexity is $O(2^n)$ and the space complexity is $O(n)$**.
![Recursive tree of the Tower of Hanoi](hanota_problem.assets/hanota_recursive_tree.png)
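The $O(2^n)$ bound can be checked with a short recurrence. As a sketch, let $T(n)$ (a symbol introduced here only for this derivation) denote the number of single-disc moves needed to solve $f(n)$. Solving $f(n)$ requires two subproblems $f(n-1)$ plus one move, so:

$$
\begin{aligned}
T(1) & = 1 \newline
T(n) & = 2T(n-1) + 1 \newline
T(n) + 1 & = 2 \left( T(n-1) + 1 \right) = 2^{n-1} \left( T(1) + 1 \right) = 2^n \newline
T(n) & = 2^n - 1
\end{aligned}
$$

The number of moves therefore grows as $O(2^n)$. Since the recursion depth is at most $n$, no more than $n$ calls of `dfs()` are open at the same time, which is consistent with the $O(n)$ space complexity.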
!!! quote
The Tower of Hanoi originates from an ancient legend. In a temple in ancient India, monks had three tall diamond pillars and $64$ differently sized golden discs. The monks continuously moved the discs, believing that when the last disc was correctly placed, the world would end.
However, even if the monks moved a disc every second, it would take about $2^{64} \approx 1.84 \times 10^{19}$ seconds, approximately 585 billion years, far exceeding current estimates of the age of the universe. Thus, if the legend is true, we probably do not need to worry about the world ending.

@ -0,0 +1,9 @@
# Divide and conquer
![Divide and Conquer](../assets/covers/chapter_divide_and_conquer.jpg)
!!! abstract
Difficult problems are decomposed layer by layer, each decomposition making them simpler.
Divide and conquer reveals an important truth: start with simplicity, and nothing is complex anymore.

@ -0,0 +1,11 @@
# Summary
- Divide and conquer is a common algorithm design strategy that consists of two stages, dividing (partitioning) and conquering (merging), and is usually implemented based on recursion.
- The criteria for judging whether a problem is a divide and conquer problem are: whether the problem can be decomposed, whether the subproblems are independent, and whether the solutions of the subproblems can be merged.
- Merge sort is a typical application of the divide and conquer strategy: it recursively divides the array into two subarrays of (nearly) equal length until only one element remains, and then merges them layer by layer to complete the sorting.
- Introducing the divide and conquer strategy can often improve algorithm efficiency. On one hand, the divide and conquer strategy reduces the number of operations; on the other hand, it is conducive to parallel optimization of the system after division.
- Divide and conquer can solve many algorithm problems and is widely used in the design of data structures and algorithms.
- Compared to brute force search, adaptive search is more efficient. Search algorithms with a time complexity of $O(\log n)$ are usually based on the divide and conquer strategy.
- Binary search is another typical application of the divide and conquer strategy, which does not include the step of merging the solutions of subproblems. We can implement binary search through recursive divide and conquer.
- In the problem of constructing binary trees, building the tree (original problem) can be divided into building the left and right subtrees (subproblems), which can be achieved by partitioning the index intervals of the preorder and inorder traversals.
- In the Tower of Hanoi problem, a problem of size $n$ can be divided into two subproblems of size $n-1$ and one subproblem of size $1$. By solving these three subproblems in sequence, the original problem is consequently resolved.