This project measures object motion speed in video using edge detection and clustering. With a stationary camera, edge points are extracted from each frame and clustered to identify objects, whose movement is then tracked across frames; the displacement between frames yields the motion speed.
The task involves processing video frames to detect and track object motion. First, frames are extracted and edge detection is applied to identify key points. These points are clustered to isolate objects, and the objects' positions are tracked across successive frames; the per-frame displacement gives the motion speed. Reliable results depend on well-separated objects and a stationary camera, since clustering and detection accuracy degrade otherwise.
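Concretely, if an object's position moves by $\Delta x$ and $\Delta y$ pixels between frames captured $\Delta t$ seconds apart (so $\Delta t = 1/\text{fps}$ for consecutive frames), the speed estimate is
$$ \text{speed} = \frac{\sqrt{\Delta x^2 + \Delta y^2}}{\Delta t} $$
in pixels per second; converting this to real-world units would additionally require a known pixel-to-distance scale for the scene.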
This project employs a systematic approach to object motion detection via video frame analysis. Video frames are extracted first, and an edge detection algorithm such as Sobel is applied to identify prominent features. The detected edge points are then clustered with DBSCAN, which groups points by spatial density without needing the number of clusters in advance. Each object's movement is tracked across frames through its position changes, from which the motion speed is calculated. Throughout, the data processing is kept lean so that detection and tracking remain efficient and accurate.
The algorithm formulation consists of several key stages for detecting and tracking objects in video frames. The process begins with video capture, followed by the extraction of individual frames. Each frame undergoes edge detection (e.g., Sobel) to highlight the significant contours and features of the objects present, and the resulting edge points are collected into a dataset suitable for clustering.
DBSCAN then groups these points by spatial density, distinguishing objects from one another and filtering out noise. Finally, the positions of the identified clusters are tracked across frames, allowing object displacement, and hence speed, to be calculated.
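To make the clustering step concrete, here is a minimal sketch of how edge points could be grouped into objects. It assumes scikit-learn's DBSCAN implementation is available; the edge threshold, eps, and min_samples values are illustrative placeholders that would need tuning for a given video.

import numpy as np
from sklearn.cluster import DBSCAN

def cluster_edge_points(edges, eps=5, min_samples=20):
    """ Group edge pixels into objects; return one centroid per cluster. """
    # (row, col) coordinates of sufficiently strong edge pixels
    # (the threshold of 50 is an arbitrary placeholder)
    points = np.column_stack(np.nonzero(edges > 50))
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(points).labels_
    # Average each cluster's points; label -1 marks noise and is skipped
    return [points[labels == k].mean(axis=0) for k in set(labels) if k != -1]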
Edge detection identifies object boundaries in video frames by highlighting areas of sharp intensity changes. The process begins by converting each frame to grayscale and applying a Gaussian blurring filter to reduce noise. Next, algorithms such as Sobel, Prewitt, or Laplacian use convolution filters to detect intensity gradients in both horizontal and vertical directions. These gradients reveal object edges, simplifying the image to key points essential for clustering and tracking object motion.
Let's first address the code and then the results.
Edge detection involves performing a convolution between the original image and a kernel matrix, which varies depending on the method used. After convolving for both the x and y gradients, the final gradient magnitude is calculated to highlight the edges.
$$ \text{gradient\_magnitude} = \sqrt{\text{grad\_x}^2 + \text{grad\_y}^2} $$
Here is the convolution function in Python, which takes an image and a kernel matrix as input and convolves them.
import numpy as np

def convolve(image, kernel):
    """ Apply convolution between the image and a kernel. """
    kernel_height, kernel_width = kernel.shape
    image_height, image_width = image.shape
    convolved_image = np.zeros((image_height, image_width), dtype=np.float32)
    # Pad the image to handle borders; the padding must be half the
    # kernel size on each side so the output keeps the input's shape
    pad_h, pad_w = kernel_height // 2, kernel_width // 2
    padded_image = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode='edge')
    # Convolution operation: slide the kernel over every pixel
    for i in range(image_height):
        for j in range(image_width):
            # Apply the kernel to the neighbourhood around (i, j)
            region = padded_image[i:i + kernel_height, j:j + kernel_width]
            convolved_image[i, j] = np.sum(region * kernel)
    return convolved_image
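As a quick sanity check of convolve, the illustrative snippet below (values chosen arbitrarily) applies a horizontal-gradient kernel, the Sobel x-kernel introduced next, to a tiny image containing a vertical step edge; the response is nonzero only along the boundary.

# Tiny 4x4 image with a vertical step edge between columns 1 and 2
step = np.array([[0, 0, 255, 255],
                 [0, 0, 255, 255],
                 [0, 0, 255, 255],
                 [0, 0, 255, 255]], dtype=np.float32)
kernel_x = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]])
# Prints 1020 in columns 1 and 2, and 0 elsewhere
print(convolve(step, kernel_x))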
The following are the kernels for Sobel edge detection: sobel_x responds to horizontal intensity changes (vertical edges) and sobel_y to vertical changes (horizontal edges). Combining the two gives the full edge picture.
$$ \text{sobel\_x} = \begin{bmatrix}-1 & 0 & 1 \\-2 & 0 & 2 \\-1 & 0 & 1\end{bmatrix} \hspace{1cm} \text{sobel\_y} = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}
$$
import numpy as np

def sobel_edge_detection(image):
    """ Perform Sobel edge detection. """
    # Define Sobel kernels
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]])
    sobel_y = np.array([[1, 2, 1],
                        [0, 0, 0],
                        [-1, -2, -1]])
    # Convolve the image with the Sobel kernels
    grad_x = convolve(image, sobel_x)
    grad_y = convolve(image, sobel_y)
    # Compute the gradient magnitude
    gradient_magnitude = np.sqrt(grad_x ** 2 + grad_y ** 2)
    # Clip to the 0-255 range and convert to 8-bit pixels
    gradient_magnitude = np.clip(gradient_magnitude, 0, 255).astype(np.uint8)
    return gradient_magnitude
The following are the kernels for Prewitt edge detection: prewitt_x responds to horizontal intensity changes and prewitt_y to vertical ones. Combining the two captures edges along both axes.
$$ \text{prewitt\_x} = \begin{bmatrix}-1 & 0 & 1 \\-1 & 0 & 1 \\-1 & 0 & 1\end{bmatrix} \hspace{1cm} \text{prewitt\_y} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}
$$
import numpy as np

def prewitt_edge_detection(image):
    """ Perform Prewitt edge detection. """
    # Define Prewitt kernels
    prewitt_x = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]])
    prewitt_y = np.array([[1, 1, 1],
                          [0, 0, 0],
                          [-1, -1, -1]])
    # Convolve the image with the Prewitt kernels
    grad_x = convolve(image, prewitt_x)
    grad_y = convolve(image, prewitt_y)
    # Compute the gradient magnitude
    gradient_magnitude = np.sqrt(grad_x ** 2 + grad_y ** 2)
    # Clip to the 0-255 range and convert to 8-bit pixels
    gradient_magnitude = np.clip(gradient_magnitude, 0, 255).astype(np.uint8)
    return gradient_magnitude
Laplacian edge detection uses the second derivative to identify edges by detecting rapid intensity changes, making it effective for detecting edges in all directions simultaneously.
$$ \text{laplacian\_kernel} = \begin{bmatrix}0 & 1 & 0 \\1 & -4 & 1 \\0 & 1 & 0\end{bmatrix} $$
import numpy as np

def laplacian_edge_detection(image):
    """ Perform Laplacian edge detection. """
    # Define Laplacian kernel
    laplacian_kernel = np.array([[0, 1, 0],
                                 [1, -4, 1],
                                 [0, 1, 0]])
    # Convolve the image with the Laplacian kernel
    laplacian_image = convolve(image, laplacian_kernel)
    # Take the absolute value (the second derivative is signed),
    # then clip to 0-255 and convert to 8-bit pixels
    laplacian_image = np.clip(np.abs(laplacian_image), 0, 255).astype(np.uint8)
    return laplacian_image
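Note that the second derivative is signed: each intensity transition produces paired positive and negative responses, which is why the code above takes the absolute value before clipping. Clipping the raw output to 0-255 would silently discard the negative half of every edge response.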
Gaussian blur applies a Gaussian function to smooth images, reducing noise and detail by averaging pixel values with surrounding ones, resulting in a softening effect.
$$ \text{gaussian\_kernel} = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}
$$
import numpy as np

def gaussian_blur(image):
    """ Apply Gaussian blur to the input image and normalize the output. """
    # Define the 3x3 Gaussian kernel (weights sum to 1)
    gaussian_kernel = np.array([[1 / 16, 2 / 16, 1 / 16],
                                [2 / 16, 4 / 16, 2 / 16],
                                [1 / 16, 2 / 16, 1 / 16]], dtype=np.float32)
    # Convolve the image with the Gaussian kernel
    blurred_image = convolve(image, gaussian_kernel)
    # Clip to 0-255 and convert to 8-bit (the normalized kernel
    # keeps the weighted average within the input range)
    gaussian_image = np.clip(blurred_image, 0, 255).astype(np.uint8)
    return gaussian_image
import numpy as np
import cv2

def detect_edges(initial_image, kernel):
    """ Convert the image to grayscale, denoise it, and apply the chosen edge detector. """
    gray_scale_image = cv2.cvtColor(initial_image, cv2.COLOR_BGR2GRAY)
    # Blur first so the gradient operators respond to edges rather than noise
    image = gaussian_blur(gray_scale_image)
    if kernel == 'sobel':
        edges = sobel_edge_detection(image)
    elif kernel == 'prewitt':
        edges = prewitt_edge_detection(image)
    elif kernel == 'laplacian':
        edges = laplacian_edge_detection(image)
    else:
        raise ValueError("Unsupported kernel type.")
    return edges
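Tying the stages together, the sketch below estimates speed in pixels per second. It is a minimal illustration under stated assumptions, not the full tracker: it assumes OpenCV for frame capture, relies on the hypothetical cluster_edge_points helper sketched earlier, and naively follows only the first detected cluster, so it behaves sensibly only for a single well-separated object.

import numpy as np
import cv2

def estimate_speed(video_path, kernel='sobel'):
    """ Estimate a single object's speed in pixels per second. """
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    prev_centroid, speeds = None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        edges = detect_edges(frame, kernel)
        centroids = cluster_edge_points(edges)
        if centroids:
            if prev_centroid is not None:
                # Displacement between consecutive frames, in pixels,
                # scaled by the frame rate to get pixels per second
                displacement = np.linalg.norm(centroids[0] - prev_centroid)
                speeds.append(displacement * fps)
            prev_centroid = centroids[0]
    cap.release()
    return np.mean(speeds) if speeds else None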