Linear Search
| When we need to search for a value in an unsorted collection, we have no other option than check element by element for a match.
This technique is called the linear search technique. We iterate over all elements until we have found the element were looking for or until weve searched the complete collection.
Following example fills an array with the first 10 multiples of base 2. All values from 1 to 21 are then searched for within the array using a linear search technique.
The search algorithm uses Booleans which indicate whether or not the value is found, and whether or not the complete array was searched. When the search finishes these Booleans can be used to extract the necessary information. The search algorithm also uses an index variable which is continuously incremented until the value is found, or the complete array is searched. This index variable can then be used to retrieve the position of the found value. |
| Module LinearSearch
Sub Main()
Dim base As Integer = 2
Dim count As Integer = 10
Dim upperbound As Integer = count - 1
Dim numbers(upperbound) As Integer
Dim index As Integer
For index = 0 To upperbound
numbers(index) = (index + 1) * base
Next
Dim number As Integer
For number = base - 1 To base * count + 1
Dim found As Boolean = False
Dim exhausted As Boolean = False
index = -1
Do Until found OrElse exhausted
index += 1
found = (numbers(index) = number)
exhausted = (index = upperbound)
Loop
If found Then
Console.Write(number & " found at index " & index)
Else
Console.Write(number & " not found")
End If
If exhausted Then
Console.WriteLine(", search exhausted")
Else
Console.WriteLine(", search not exhausted")
End If
Next
Console.ReadLine()
End Sub
End Module Download Broncode |
| Output : 1 not found, search exhausted
2 found at index 0, search not exhausted
3 not found, search exhausted
4 found at index 1, search not exhausted
5 not found, search exhausted
6 found at index 2, search not exhausted
7 not found, search exhausted
8 found at index 3, search not exhausted
9 not found, search exhausted
10 found at index 4, search not exhausted
11 not found, search exhausted
12 found at index 5, search not exhausted
13 not found, search exhausted
14 found at index 6, search not exhausted
15 not found, search exhausted
16 found at index 7, search not exhausted
17 not found, search exhausted
18 found at index 8, search not exhausted
19 not found, search exhausted
20 found at index 9, search exhausted
21 not found, search exhausted |
Up
Exercise Linear Search
| Task :
Search for the name of a city with a certain zip code.
Use the linear search technique to search for the position of the entered zipcode in the zipCodes array. Then retrieve then name of the city on the corresponding position in the names array. |
| Module Exercise1Task
Sub Main()
Dim count As Integer = 3
Dim names() As String = {"Brussels", "Antwerp", "Ghent"}
Dim zipCodes() As Integer = {1000, 2000, 9000}
Console.WriteLine("Zip Code ?")
Dim zipCode As String = Console.ReadLine()
Console.ReadLine()
End Sub
End Module Download Broncode |
| Output : Zip Code ?
<i>2000</i>
Antwerp |
| Output : Zip Code ?
<i>4000</i>
City not found. |
| Module Exercise1Solution
Sub Main()
Dim count As Integer = 3
Dim names() As String = {"Brussels", "Antwerp", "Ghent"}
Dim zipCodes() As Integer = {1000, 2000, 9000}
Console.WriteLine("Zip Code ?")
Dim zipCode As String = Console.ReadLine()
Dim found, exhausted As Boolean
Dim index As Integer = -1
Do Until found OrElse exhausted
index += 1
found = (zipCodes(index) = zipCode)
exhausted = (index = count - 1)
Loop
If found Then
Console.WriteLine(names(index))
Else
Console.WriteLine("City not found.")
End If
Console.ReadLine()
End Sub
End Module Download Broncode |
Up
Binary Search
| When a collection is sorted, more intelligent techniques for searching are available. One of these techniques is the binary search technique ( also called logarithmic search ).
This technique will continuously divide the search range into two parts ( therefore called BINARY search ) and will use the sort mechanism of the collections to decide which part could contain the value we're looking for.
The steps of this technique are listed below :
(1) determine the element in the middle of the array (2) check if the element in the middle contains the value we're looking for, if so, we can stop searching, if not, we go on with the next steps (3) check to see if the complete array is searched for the value, if so, we can stop searching, if not, we continue with the next step (4) determine ( based on the sort mechanism used in the array ) which part ( left of right from the middle element ) could contain the value we're looking for
Repeat steps (1), (2), (3) and (4) until the value is found, or the complete array is searched for the value.
We'll use next array to illustrate this technique : |
| index 0 1 2 3 4 5 6 7 8 9
value 5 10 15 20 25 30 35 40 45 50 |
| Suppose we need to search for value 20.
First we'll determine the position of the middle element of this array : |
| lowerbound upperbound middle index middle element
0 9 (0 + 9) \ 2 = 4 25 |
| 25 is not the value we are looking for, so well decide to ignore the right part ( right of the middle element ), and search thru the left part, because value 20 were looking for is less than the middle element 25 and can therefore only be found in that left part : |
| lowerbound (*) upperbound (*) middle index middle element
0 4 - 1 = 3 (0 + 3) \ 2 = 1 10
(*) of the search range |
| Value 20 is more than 10, so we decide to look thru the left part : |
| lowerbound (*) upperbound (*) middle index middle element
1 + 1 = 2 3 (2 + 3) \ 2 = 2 15
(*) of the search range |
| Value 20 is more than 15, so we decide to look thru the left part : |
| lowerbound (*) upperbound (*) middle index middle element
2 + 1 = 3 3 (3 + 3) \ 2 = 3 20
(*) of the search range |
| The middle element 20 is what we're looking for so we can stop searching, and conclude that the value is found at index 3.
Suppose we were looking for value 21 ( in stead of 20 ). Then 21 was more than 20, but because there are no more elements to look thru ( the search range only contained one element, which was not the value we were looking for ), we can decide to stop looking, and conclude that the value ( 21 in this case ) could not be found. |
| Module BinarySearch
Sub Main()
Dim base As Integer = 2
Dim count As Integer = 10
Dim upperbound As Integer = count - 1
Dim numbers(upperbound) As Integer
Dim index As Integer
For index = 0 To upperbound
numbers(index) = (index + 1) * base
Next
Dim number As Integer
For number = base - 1 To base * count + 1
upperbound = count - 1
Dim lowerbound As Integer = 0
Dim found As Boolean = False
Dim exhausted As Boolean = False
Do Until found OrElse exhausted
index = (lowerbound + upperbound) \ 2
found = (number = numbers(index))
exhausted = (upperbound <= lowerbound)
If Not (found OrElse exhausted) Then
If number > numbers(index) Then
lowerbound = index + 1
Else
upperbound = index - 1
End If
End If
Loop
If found Then
Console.Write(number & " found at index " & index)
Else
Console.Write(number & " not found")
End If
If exhausted Then
Console.Write(", search exhausted")
Else
Console.Write(", search not exhausted")
End If
Console.WriteLine(", last searched " & lowerbound & " to " & upperbound)
Next
Console.ReadLine()
End Sub
End Module Download Broncode |
| Output : 1 not found, search exhausted, last searched 0 to 0
2 found at index 0, search exhausted, last searched 0 to 0
3 not found, search exhausted, last searched 0 to 0
4 found at index 1, search not exhausted, last searched 0 to 3
5 not found, search exhausted, last searched 2 to 1
6 found at index 2, search not exhausted, last searched 2 to 3
7 not found, search exhausted, last searched 3 to 3
8 found at index 3, search exhausted, last searched 3 to 3
9 not found, search exhausted, last searched 3 to 3
10 found at index 4, search not exhausted, last searched 0 to 9
11 not found, search exhausted, last searched 5 to 4
12 found at index 5, search not exhausted, last searched 5 to 6
13 not found, search exhausted, last searched 6 to 6
14 found at index 6, search exhausted, last searched 6 to 6
15 not found, search exhausted, last searched 6 to 6
16 found at index 7, search not exhausted, last searched 5 to 9
17 not found, search exhausted, last searched 8 to 7
18 found at index 8, search not exhausted, last searched 8 to 9
19 not found, search exhausted, last searched 9 to 9
20 found at index 9, search exhausted, last searched 9 to 9
21 not found, search exhausted, last searched 9 to 9 |
| The more elements the collection we're searching in contains, the more interesting it is to use a binary search in stead of a linear search.
When we need to look for many values within a large unsorted collection, we could benefit by first sorting the collections before we start looking for values. If you only have 10 books ( so a small collection ) on your bookshelf, you will not benefit from sorting them alphabetically. Neither would you benefit from sorting 1000 books ( so a large collection ) when you never search for a book. You would benefit from sorting 1000 books when you often search for books. But always take in to account that a certain effort is needed to maintain the correct sequence. For instance when you add a book, you'll need to insert it on the correct position. |
Up
Predefined Search Methods
| In stead of manually coding a search algorithm, a predefined search method can be used to search for a value in an one-dimensional array.
For instance the shared method IndexOf of class Array (1) can be used to retrieve the position ( the index ) of the searchvalue ( second argument ) within the one-dimensional array ( first argument ). If the value is not found, IndexOf will return -1. |
| Module Example1
Sub Main()
Dim base As Integer = 2
Dim count As Integer = 10
Dim upperbound As Integer = count - 1
Dim numbers(upperbound) As Integer
Dim index As Integer
For index = 0 To upperbound
numbers(index) = (index + 1) * base
Next
Dim number As Integer
For number = base - 1 To base * count + 1
index = Array.IndexOf(numbers, number)
Dim found As Boolean = index > -1
If found Then
Console.WriteLine(number & " found at index " & index)
Else
Console.WriteLine(number & " not found")
End If
Next
Console.ReadLine()
End Sub
End Module Download Broncode |
| Output : 1 not found
2 found at index 0
3 not found
4 found at index 1
5 not found
6 found at index 2
7 not found
8 found at index 3
9 not found
10 found at index 4
11 not found
12 found at index 5
13 not found
14 found at index 6
15 not found
16 found at index 7
17 not found
18 found at index 8
19 not found
20 found at index 9
21 not found |
| When the one-dimensional array is sorted we can use the shared method BinarySearch of class Array (1). This method will return the index of the searchvalue ( second argument ) within the array ( first argument ). If the searchvalue is not found this method will return -1. |
| Module Example2
Sub Main()
Dim base As Integer = 2
Dim count As Integer = 10
Dim upperbound As Integer = count - 1
Dim numbers(upperbound) As Integer
Dim index As Integer
For index = 0 To upperbound
numbers(index) = (index + 1) * base
Next
Dim number As Integer
For number = base - 1 To base * count + 1
index = Array.BinarySearch(numbers, number)
Dim found As Boolean = index > -1
If found Then
Console.WriteLine(number & " found at index " & index)
Else
Console.WriteLine(number & " not found")
End If
Next
Console.ReadLine()
End Sub
End Module Download Broncode |
| Output : 1 not found
2 found at index 0
3 not found
4 found at index 1
5 not found
6 found at index 2
7 not found
8 found at index 3
9 not found
10 found at index 4
11 not found
12 found at index 5
13 not found
14 found at index 6
15 not found
16 found at index 7
17 not found
18 found at index 8
19 not found
20 found at index 9
21 not found |
| Because of the predefined search functionality, it is rarely necessary to manually code a search algorithm to let a program look for a value within a one-dimensional array. Still it is useful to understand how these basic search techniques function, the understand the performance differences they have. Besides arrays other collection types exists. It is possible that these other types of collection don't have any predefined search functionalities. In a situation like that, it could be necessary to manually implement some search algorithms. |
Up
Exercise Binary Search
| Task :
Search for the name of a city based on its zip code.
Manually code a binary search to do this. |
| Module Exercise2Task
Sub Main()
Dim count As Integer = 3
Dim names() As String = {"Brussels", "Antwerp", "Ghent"}
Dim zipCodes() As Integer = {1000, 2000, 9000}
Console.WriteLine("Zip Code ?")
Dim zipCode As String = Console.ReadLine()
Console.ReadLine()
End Sub
End Module Download Broncode |
| Output : Zip Code ?
<i>2000</i>
Antwerp |
| Output : Zip Code ?
<i>4000</i>
City not found. |
| Module Exercise2Solution
Sub Main()
Dim count As Integer = 3
Dim names() As String = {"Brussels", "Antwerp", "Ghent"}
Dim zipCodes() As Integer = {1000, 2000, 9000}
Console.WriteLine("Zip Code ?")
Dim zipCode As String = Console.ReadLine()
Dim found, exhausted As Boolean
Dim lowerbound, index As Integer
Dim upperbound As Integer = count - 1
Do Until found OrElse exhausted
index = (lowerbound + upperbound) \ 2
found = (zipCode = zipCodes(index))
exhausted = (upperbound <= lowerbound)
If Not (found OrElse exhausted) Then
If zipCode > zipCodes(index) Then
lowerbound = index + 1
Else
upperbound = index - 1
End If
End If
Loop
If found Then
Console.WriteLine(names(index))
Else
Console.WriteLine("City not found.")
End If
Console.ReadLine()
End Sub
End Module Download Broncode |
Up
Performance in Big-O-Notation
| To compare the linear and binary search we could have a look at the maximum number of search-iterations used in these techniques.
In the linear search ( with N elements ) this will be N.
In the binary search this is equal to the number of times we can divide the search range into two parts. With R being the first power of 2 above N this will be log2 R ( logarithm of base 2 ). This is why binary search is sometimes called logarithmic search. When an array has 10 elements, the first power of 2 above 10 is 16, so log2 16 ( or 4 ) is the maximum of search-iterations we'll need.
The number of search-iterations needed for the binary search is way below that of the linear search. This still doesn't mean that the binary search will always be more efficient than the linear search. One search-iteration of the binary search takes lots more effort for the processor to perform. All kinds of calculations are needed, calculate the middle index, shift the bound of the search range, ... .
How long it takes to execute a certain algorithm is hard to define. It depends on to many circumstances, for instance the platform, the processor, ... . Profile tools are the best way to check how algorithms/programs perform in a specific environment. The result of the test taken by these tools always take the environment settings ( platform, processor, ... ) into account.
Still it would be useful to compare performances of different algorithms on different runtime environments in one or the other way. This could help us decide what algorithm to chose for a specific need.
A notation often used to express the performance of an algorithm is the Big-O-Notation. This notation does not try to express the exact time it would take to execute an algorithm, but tries to express the scalability of the algorithm. It gives us an idea of how the growth of the input will effect our running-time.
For instance the instruction "x = x + 1" would take up a constant time. If it's executed several times on the same system ( under the same conditions ) it will always take up the same time to execute. This is what the call "constant running-time", expressed in Big-O with "O(1)". |
| For i = 1 to N
x = x + 1
Next |
| Where "x = x + 1" take up a constant time to execute, will here take up N times that constant time. This is what is called "linear running-time", "O(N)" in Big-O. When N multiplies 100 times, it will take 100 times longer to execute this code. There is a linear associating between the growth of N and the growth of the running-time. |
| For i = 1 to N
For j = 1 to N
x = x + 1
Next
Next |
| The instruction "x = x + 1" is now executed N by N times. The running-time is "quadratic" ( "O(N^2)" ).
The Big-O-Notation always tries to express the performance as simple as possible, therefore constant factors are ignored. "O(10N^2)" for instance can be simplified to "O(N^2)". If N is 5, N^2 is 25, if N is 30, N^2 is 900. So if N multiplies by 6 ( 30 / 5 = 6 ) then the running-time will be multiplied by 36 ( 900 / 25 = 36 ). So the running-time is quadratic, 6 ^ 2 = 36. If N is 5, 10N^2 is 250, if N is 30, 10N^2 is 9000. So if N multiplies by 6 then the running-time will be multiplied by 36 ( 9000 / 250 = 36 ). Here as well quadratic running-time is expressed, therefore constant factor 10 can be ignored.
Bases of logarithms can be left out. "O(log2 N)" expresses the same growth as "O(log3 N)" or "O(log4 N)".
If N is 1024, log2 N is 10, log4 N is 5. If N is 4096, log2 N is 12, log4 N is 6. So if N multiplies by 4 ( 4096 / 1024 = 4 ) then the running-time will be multiplied by 1.2 ( 12 / 10 = 1.2 and 6 / 5 = 1.2 ). So running-time is logarithmic, therefore "O(log N)" expresses enough.
Often ( with a large N ) smaller factors are ignored, because they are not very relevant. "O(N^2 + N)" could be expressed as "O(N^2)"/
If N is 1000, N^2 = 1 000 000 and N^2 + N = 1 001 000. If N is 3000, N^2 = 9 000 000 and N^2 + N = 9 003 000. So if N multiplies with 3 ( 3000 / 1000 = 3 ) the running-time will roughly multiply by 9 ( 9 000 000 / 1 000 000 = 9 and 9 003 000 / 1 001 000 ~= 9 ).
The higher N is, the more irrelevant the small factor "+ N" becomes, therefore it is often left out.
The performance expressed in Big-O for the above search algorithms is : - "O(N)" ( linear ) for the linear search - "O(log N)" ( logarithmic ) for the binary search
Again this does not mean that the binary search is always more efficient than the linear search. This only means that the binary search is more scalable ( will suffer less from a raising input ) than the linear search. |
This version ( published on 2008-06-24 ) is printed from http://www.studyvb.com, visit the website for more recent information.
|