# Unit-6:Sorting Technique

Sorting refers to arranging data in a particular format. Sorting algorithm specifies the way to arrange data in a particular order. Most common orders are in numerical or lexicographical order.

Insertion sort

The array is searched sequentially and unsorted items are moved and inserted into the sorted sub-list (in the same array). This algorithm is not suitable for large data sets as its average and worst case complexity are of Ο(n2), where n is the number of items.

## How Insertion Sort Works?

We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that both 14 and 33 are already in ascending order. For now, 14 is in sorted sub-list.

Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here we see that the sorted sub-list has only one element 14, and 27 is greater than 14. Hence, the sorted sub-list remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.

These values are not in a sorted order.

So we swap them.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.

We swap them again. By the end of third iteration, we have a sorted sub-list of 4 items.

This process goes on until all the unsorted values are covered in a sorted sub-list. Now we shall see some programming aspects of insertion sort.

### Algorithm

Now we have a bigger picture of how this sorting technique works, so we can derive simple steps by which we can achieve insertion sort.

`Step 1 − If it is the first element, it is already sorted. return 1;Step 2 − Pick next elementStep 3 − Compare with all elements in the sorted sub-listStep 4 − Shift all the elements in the sorted sub-list that is greater than the          value to be sortedStep 5 − Insert the valueStep 6 − Repeat until list is sorted`

Selection sort

Selection sort is a simple sorting algorithm. This sorting algorithm is an in-place comparison-based algorithm in which the list is divided into two parts, the sorted part at the left end and the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire list.

The smallest element is selected from the unsorted array and swapped with the leftmost element, and that element becomes a part of the sorted array. This process continues moving unsorted array boundary by one element to the right.

This algorithm is not suitable for large data sets as its average and worst case complexities are of Ο(n2), where n is the number of items.

### How Selection Sort Works?

Consider the following depicted array as an example.

For the first position in the sorted list, the whole list is scanned sequentially. The first position where 14 is stored presently, we search the whole list and find that 10 is the lowest value.

So we replace 14 with 10. After one iteration 10, which happens to be the minimum value in the list, appears in the first position of the sorted list.

For the second position, where 33 is residing, we start scanning the rest of the list in a linear manner.

We find that 14 is the second lowest value in the list and it should appear at the second place. We swap these values.

After two iterations, two least values are positioned at the beginning in a sorted manner.

The same process is applied to the rest of the items in the array.

Following is a pictorial depiction of the entire sorting process −

Now, let us learn some programming aspects of selection sort.

### Algorithm

`Step 1 − Set MIN to location 0Step 2 − Search the minimum element in the listStep 3 − Swap with value at location MINStep 4 − Increment MIN to point to next elementStep 5 − Repeat until list is sorted`

Merge sort is a sorting technique based on divide and conquer technique. With worst-case time complexity being Ο(n log n), it is one of the most respected algorithms.

Merge sort first divides the array into equal halves and then combines them in a sorted manner.

### How Merge Sort Works?

To understand merge sort, we take an unsorted array as the following −

We know that merge sort first divides the whole array iteratively into equal halves unless the atomic values are achieved. We see here that an array of 8 items is divided into two arrays of size 4.

This does not change the sequence of appearance of items in the original. Now we divide these two arrays into halves.

We further divide these arrays and we achieve atomic value which can no more be divided.

Now, we combine them in exactly the same manner as they were broken down. Please note the color codes given to these lists.

We first compare the element for each list and then combine them into another list in a sorted manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10 and in the target list of 2 values we put 10 first, followed by 27. We change the order of 19 and 35 whereas 42 and 44 are placed sequentially.

In the next iteration of the combining phase, we compare lists of two data values, and merge them into a list of found data values placing all in a sorted order.

After the final merging, the list should look like this −

Now we should learn some programming aspects of merge sorting.

### Algorithm

Merge sort keeps on dividing the list into equal halves until it can no more be divided. By definition, if it is only one element in the list, it is sorted. Then, merge sort combines the smaller sorted lists keeping the new list sorted too.

`Step 1 − if it is only one element in the list it is already sorted, return.Step 2 − divide the list recursively into two halves until it can no more be divided.Step 3 − merge the smaller lists into new list in sorted order.`

## What is Searching?

• Searching is the process of finding a given value position in a list of values.
• It decides whether a search key is present in the data or not.
• It is the algorithmic process of finding a particular item in a collection of items.
• It can be done on internal data structure or on external data structure.

## Searching Techniques

To search an element in a given array, it can be done in following ways:

1. Sequential Search
2. Binary Search

#### 1. Sequential Search

• Sequential search is also called as Linear Search.
• Sequential search starts at the beginning of the list and checks every element of the list.
• It is a basic and simple search algorithm.
• Sequential search compares the element with all the other elements given in the list. If the element is matched, it returns the value index, else it returns -1.

The above figure shows how sequential search works. It searches an element or value from an array till the desired element or value is not found. If we search the element 25, it will go step by step in a sequence order. It searches in a sequence order. Sequential search is applied on the unsorted or unordered list when there are fewer elements in a list.

#### Binary Search

• Binary Search is used for searching an element in a sorted array.
• It is a fast search algorithm with run-time complexity of O(log n).
• Binary search works on the principle of divide and conquer.
• This searching technique looks for a particular element by comparing the middle most element of the collection.
• It is useful when there are large number of elements in an array.
• The above array is sorted in ascending order. As we know binary search is applied on sorted lists only for fast searching.

For example, if searching an element 25 in the 7-element array, following figure shows how binary search works:

Binary searching starts with middle element. If the element is equal to the element that we are searching then return true. If the element is less than then move to the right of the list or if the element is greater than then move to the left of the list. Repeat this, till you find an element.

## What is Hashing?

• Hashing is the process of mapping large amount of data item to smaller table with the help of hashing function.
• Hashing is also known as Hashing Algorithm or Message Digest Function.
• It is a technique to convert a range of key values into a range of indexes of an array.
• It is used to facilitate the next level searching method when compared with the linear or binary search.
• Hashing allows to update and retrieve any data entry in a constant time O(1).
• Constant time O(1) means the operation does not depend on the size of the data.
• Hashing is used with a database to enable items to be retrieved more quickly.
• It is used in the encryption and decryption of digital signatures.

## What is Hash Function?

• A fixed process converts a key to a hash key is known as a Hash Function.
• This function takes a key and maps it to a value of a certain length which is called a Hash value or Hash.
• Hash value represents the original string of characters, but it is normally smaller than the original.
• It transfers the digital signature and then both hash value and signature are sent to the receiver. Receiver uses the same hash function to generate the hash value and then compares it to that received with the message.
• If the hash values are same, the message is transmitted without errors.

## What is Hash Table?

• Hash table or hash map is a data structure used to store key-value pairs.
• It is a collection of items stored to make it easy to find them later.
• It uses a hash function to compute an index into an array of buckets or slots from which the desired value can be found.
• It is an array of list where each list is known as bucket.
• It contains value based on the key.
• Hash table is used to implement the map interface and extends Dictionary class.
• Hash table is synchronized and contains only unique elements.
• The above figure shows the hash table with the size of n = 10. Each position of the hash table is called as Slot. In the above hash table, there are n slots in the table, names = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Slot 0, slot 1, slot 2 and so on. Hash table contains no items, so every slot is empty.
• As we know the mapping between an item and the slot where item belongs in the hash table is called the hash function. The hash function takes any item in the collection and returns an integer in the range of slot names between 0 to n-1.
• Suppose we have integer items {26, 70, 18, 31, 54, 93}. One common method of determining a hash key is the division method of hashing and the formula is :

Hash Key = Key Value % Number of Slots in the Table

• After computing the hash values, we can insert each item into the hash table at the designated position as shown in the above figure. In the hash table, 6 of the 10 slots are occupied, it is referred to as the load factor and denoted by, λ = No. of items / table size. For example , λ = 6/10.
• It is easy to search for an item using hash function where it computes the slot name for the item and then checks the hash table to see if it is present.
• Constant amount of time O(1) is required to compute the hash value and index of the hash table at that location.