Merge sort is similar to the quick sort algorithm as works on the concept of divide and conquer. It is one of the most popular and efficient sorting algorithm. It is the best example for divide and conquer category of algorithms.
It divides the given list in the two halves, calls itself for the two halves and then merges the two sorted halves. We define the merge() function used to merging two halves.
The sub lists are divided again and again into halves until we get the only one element each. Then we combine the pair of one element lists into two element lists, sorting them in the process. The sorted two element pairs is merged into the four element lists, and so on until we get the sorted list.
Merge Sort Concept
Let's see the following Merge sort diagram.
We have divided the given list in the two halves. The list couldn't be divided in equal parts it doesn't matter at all.
Merge sort can be implement using the two ways - top-down approach and bottom-up approach. We use the top down approach in the above example, which is Merge sort most often used.
The bottom-up approach provides the more optimization which we will define later.
The main part of the algorithm is that how we combine the two sorted sublists. Let's merge the two sorted merge list.
- A : [2, 4, 7, 8]
- B : [1, 3, 11]
- sorted : empty
First, we observe the first element of both lists. We find the B's first element is smaller, so we add this in our sorted list and move forward in the B list.
- A : [2, 4, 7, 8]
- B : [1, 3, 11]
- Sorted : 1
Now we look at the next pair of elements 2 and 3. 2 is smaller so we add it into our sorted list and move forward to the list.
- A : [2, 4, 7, 8]
- B : [1, 3, 11]
- Sorted : 1
Continue this process and we end up with the sorted list of {1, 2, 3, 4, 7, 8, 11}. There can be two special cases.
- What if both sublists have same elements - In such case, we can move either one sublist and add the element to the sorted list. Technically, we can move forward in both sublist and add the elements to the sorted list.
- We have no element left in one sublist. When we run out the in a sublist simply add the element of the second one after the other.
We should remember that we can sort the element in the any order. We sort the given list in ascending order but we can easily sort in descending order.
Implementation
The merge sort algorithm is implemented by suing the top-down approach. It can be look slightly difficult, so we will elaborate each step in details. Here, we will implement this algorithm on two types of collections - integer element's list (typically used to introduce sorting) and a custom objects (a more practical and realistic scenario).
Sorting Array
The main concept of algorithm is to divide (sub)list into halves and sort them recursively. We continue the process until we end up lists that have only one element. Let's understand the following function for division -
def merge_sort(array, left_index, right_index):
if left_index >= right_index:
return middle = (left_index + right_index)//2
merge_sort(array, left_index, middle)
merge_sort(array, middle + 1, right_index)
merge(array, left_index, right_index, middle)
Our primary focus to divide the list into subparts before the sorting happen. We need to get the integer value so we use the // operator for our indices.
Let's understand the above procedure by following steps.
- First step is to create copies of lists. The first list contains the lists from [left_index,...,middle] and the second from [middle+1,?,right_index].
- We traverse both copies of list by using the pointer, select the smaller value of the two values and add them to the sorted list. Once we add the element to the list and we move forward in the sorted list regardless.
- Add the remaining elements in the other copy to the sorted array.
Python Program
Output:
Sorting Custom Objects
We can also sort the custom objects by using the Python class. This algorithm is almost similar to the above but we need to make it more versatile and pass the comparison function.
We will create a custom class, Car and add a few fields to it. We make few changes in the below algorithm to make it more versatile. We can do this by using the lambda functions.
Let's understand the following example.
Python Program
Output:
Optimization
We can improve the performance of the merge sort algorithm. First let's understand the difference between the top-down and bottom-up merge sort. The bottom-up approach sorts the elements of adjacent lists iteratively where the top-down approach breaks down the lists into the two halves.
The given list is [10, 4, 2, 12, 1, 3], instead of breaking it down into [10], [4], [2], [12], [1], [3] - we divides into the sublists which may already sorted: [10, 4], [2], [1, 12], [3] and now are ready to sort them.
Merge sort is inefficient algorithm in both time and space for the smaller sublists. So insertion sort is more efficient algorithm than the merge sort for the smaller sublists.
Conclusion
Merge sort is popular and efficient algorithm. It is more efficient algorithm for the large lists. It doesn't depend on the any unfortunate decisions that lead to bad runtimes.
There is one major demerit in the merge sort. It uses the additional memory that is used to store the temporary copies of lists before merging them. However Merge sort is widely used in the software. Its performance is fast and produces the excellent result.
We have discussed the merge sort concept in brief and implement it both on simple integer list and on custom objects via a lambda function used for comparison.
need an explanation for this answer? contact us directly to get an explanation for this answer