Mastering Search Algorithms in Python

Introduction

Search algorithms are fundamental in software and data engineering, used for finding elements in data structures efficiently. See Understanding Python Data Structures to learn for about python data structures. In this tutorial, we’ll cover essential search algorithms using the built-in Python data structures, from basic to advanced, using Python.


1. Linear Search Algorithm

Concept

Linear search is one of the simplest algorithms used to search for items in a data set. It sequentially checks each element until the target is found.

Implementation

def linear_search(arr, target):
    for i, value in enumerate(arr):
        if value == target:
            return i  # Return index of the target
    return -1  # Not found

# Example
arr = [10, 20, 30, 40, 50]
print(linear_search(arr, 30))  # Output: 2

Time Complexity:

  • Best case: O(1)O(1) (Target is the first element)
  • Worst case: O(n)O(n) (Target is at the end or not in the list)

2. Binary Search Algorithm

Concept

Binary search works on sorted lists by repetitively dividing the search space in half to locate the target item. It will continue to split the array or list in half, reducing the search space until the item is found or the search interval is empty.

Implementation (Iterative)

def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# Example
arr = [10, 20, 30, 40, 50]
print(binary_search(arr, 30))  # Output: 2

Implementation (Recursive)

def binary_search_recursive(arr, target, left, right):
    if left > right:
        return -1
    mid = (left + right) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)

# Example
arr = [10, 20, 30, 40, 50]
print(binary_search_recursive(arr, 30, 0, len(arr) - 1))  # Output: 2

Time Complexity:

  • Best case: O(1)O(1)
  • Worst case: O(log⁡n)O(\log n)

3. Jump Search Algorithm

Concept

Jump search algorithm builds on the linear search, but for sorted arrays. It begins by dividing the array into a jump size using the square root of the array length (n\sqrt{n}) and jumps ahead to the position of the jump size and evaluates the value of the element at that location and continues until it finds a value greater than or equal to the target element. If the element at the jump size position is greater than the target element, it will then perform a linear search backwards until it finds the target item or gets to the beginning of the jump size block. It is an improvement on the linear search algorithm, but not as effective as the binary search algorithm.

Implementation

import math

def jump_search(arr, target):
    n = len(arr)
    step = int(math.sqrt(n))
    prev, curr = 0, 0

    while curr < n and arr[curr] < target:
        prev = curr
        curr += step
        if curr >= n:
            break

    for i in range(prev, min(curr, n)):
        if arr[i] == target:
            return i
    return -1

# Example
arr = [10, 20, 30, 40, 50, 60, 70, 80]
print(jump_search(arr, 30))  # Output: 2

Time Complexity:

  • Best case: O(1)O(1)
  • Worst case: O(n)O(\sqrt{n})

4. Interpolation Search Algorithm

Concept

Interpolation search algorithm works on a sorted dataset and improves binary search by estimating the position of the target using linear interpolation. Instead of using the middle point to divide the array for sorting, like the binary search algorithm, the interpolation search algorithm attempts to divide the array into chunks closer to the target value. For example, if the target value is closer to the 1st element, the interpolation search algorithm will start closer to the beginning of the dataset.

Implementation

def interpolation_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right and arr[left] <= target <= arr[right]:
        pos = left + ((target - arr[left]) * (right - left) // (arr[right] - arr[left]))

        if arr[pos] == target:
            return pos
        elif arr[pos] < target:
            left = pos + 1
        else:
            right = pos - 1

    return -1

# Example
arr = [10, 20, 30, 40, 50, 60, 70]
print(interpolation_search(arr, 30))  # Output: 2

Time Complexity:

  • Best case: O(1)O(1)
  • Worst case: O(log⁡log⁡n)O(\log \log n)

5. Exponential Search Algorithm

Concept

Exponential search is useful for unbounded arrays. It finds the search range using exponential jumps and then applies binary search.

Implementation

def exponential_search(arr, target):
    if arr[0] == target:
        return 0

    i = 1
    while i < len(arr) and arr[i] <= target:
        i *= 2

    return binary_search(arr[:min(i, len(arr))], target)

# Example
arr = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(exponential_search(arr, 30))  # Output: 2

Time Complexity:

  • Best case: O(1)O(1)
  • Worst case: O(log⁡n)O(\log n)

Comparison of Search Algorithms

AlgorithmBest CaseAverage CaseWorst CaseUse Case
Linear SearchO(1)O(1)O(n)O(n)O(n)O(n)Unsorted data
Binary SearchO(1)O(1)O(log⁡n)O(\log n)O(log⁡n)O(\log n)Sorted data
Jump SearchO(1)O(1)O(n)O(\sqrt{n})O(n)O(\sqrt{n})Large sorted data
Interpolation SearchO(1)O(1)O(log⁡log⁡n)O(\log \log n)O(n)O(n)Uniformly distributed sorted data
Exponential SearchO(1)O(1)O(log⁡n)O(\log n)O(log⁡n)O(\log n)Unbounded sorted data

Hands-on Coding Challenge: Multi-Search in an E-Commerce Store

Problem Statement

You are building a product search feature for an e-commerce website. You have a list of product prices sorted in ascending order. Implement different search algorithms to find the first occurrence of multiple target prices in the list.

Your Task

  1. Implement Binary Search, Jump Search, and Interpolation Search to find the index of each target price.
  2. If a target price is not found, return -1.
  3. Compare the performance of these algorithms on large datasets.

Example Input

prices = [5, 12, 18, 25, 33, 40, 52, 60, 72, 85, 90, 100]targets = [25, 60, 100, 8]  # Searching for these prices

Expected Output

Binary Search Results:  [3, 7, 11, -1]
Jump Search Results:    [3, 7, 11, -1]
Interpolation Search Results:  [3, 7, 11, -1]

Starter Code

Use the following starter code to implement your solution:

import math

def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

def jump_search(arr, target):
    n = len(arr)
    step = int(math.sqrt(n))
    prev, curr = 0, 0

    while curr < n and arr[curr] < target:
        prev = curr
        curr += step
        if curr >= n:
            break

    for i in range(prev, min(curr, n)):
        if arr[i] == target:
            return i
    return -1

def interpolation_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right and arr[left] <= target <= arr[right]:
        pos = left + ((target - arr[left]) * (right - left) // (arr[right] - arr[left]))

        if pos >= len(arr):
            return -1
        if arr[pos] == target:
            return pos
        elif arr[pos] < target:
            left = pos + 1
        else:
            right = pos - 1

    return -1

# Test the algorithms
prices = [5, 12, 18, 25, 33, 40, 52, 60, 72, 85, 90, 100]
targets = [25, 60, 100, 8]  # Prices to search for

binary_results = [binary_search(prices, target) for target in targets]
jump_results = [jump_search(prices, target) for target in targets]
interpolation_results = [interpolation_search(prices, target) for target in targets]

print("Binary Search Results:  ", binary_results)
print("Jump Search Results:    ", jump_results)
print("Interpolation Search Results:  ", interpolation_results)

🔹 Bonus Challenge

  • Measure Execution Time: Use Python’s time module to compare the execution time of each algorithm on large datasets (e.g., 10^6 elements).
  • Extend for Unsorted Lists: Modify the code to handle unsorted lists by sorting the list first before performing searches.
  • Implement Exponential Search: Add exponential search to handle unbounded search spaces.

Conclusion

Mastering search algorithms is essential for optimizing data retrieval. Choose the right search algorithm based on data structure, size, and access patterns.

Try implementing the challenge and add your results to the comments or if you need hints or help with the explanations ask questions. I’ll post the answers after a few days.


Discover more from The Data Lead

Subscribe to get the latest posts sent to your email.