This project measures object motion speed in video using edge detection and clustering. With a stationary camera, edge points are extracted from each frame and clustered to identify objects, whose movement is then tracked across frames; the displacement between frames yields the motion speed.
The task involves processing video frames to detect and track object motion. First, frames are extracted and edge detection is applied to identify key points. These points are clustered to isolate objects, and the objects' positions are tracked across successive frames; the per-frame displacement gives the motion speed. Reliable results depend on well-separated objects and a stationary camera, since clustering and detection accuracy degrade otherwise.
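Concretely, if an object's position moves by $\Delta x$ and $\Delta y$ pixels between frames captured $\Delta t$ seconds apart (so $\Delta t = 1/\text{fps}$ for consecutive frames), the speed estimate is
$$ \text{speed} = \frac{\sqrt{\Delta x^2 + \Delta y^2}}{\Delta t} $$
in pixels per second; converting this to real-world units would additionally require a known pixel-to-distance scale for the scene.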
This project employs a systematic approach to object motion detection via video frame analysis. Video frames are extracted first, and an edge detection algorithm such as Sobel is applied to identify prominent features. The detected edge points are then clustered with DBSCAN, which groups points by spatial density without needing the number of clusters in advance. Each object's movement is tracked across frames through its position changes, from which the motion speed is calculated. Throughout, the data processing is kept lean so that detection and tracking remain efficient and accurate.
The algorithm formulation consists of several key stages for detecting and tracking objects in video frames. The process begins with video capture, followed by the extraction of individual frames. Each frame undergoes edge detection (e.g., Sobel) to highlight the significant contours and features of the objects present, and the resulting edge points are collected into a dataset suitable for clustering.
DBSCAN then groups these points by spatial density, distinguishing objects from one another and filtering out noise. Finally, the positions of the identified clusters are tracked across frames, allowing object displacement, and hence speed, to be calculated.
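To make the clustering step concrete, here is a minimal sketch of how edge points could be grouped into objects. It assumes scikit-learn's DBSCAN implementation is available; the edge threshold, eps, and min_samples values are illustrative placeholders that would need tuning for a given video.

import numpy as np
from sklearn.cluster import DBSCAN

def cluster_edge_points(edges, eps=5, min_samples=20):
    """ Group edge pixels into objects; return one centroid per cluster. """
    # (row, col) coordinates of sufficiently strong edge pixels
    # (the threshold of 50 is an arbitrary placeholder)
    points = np.column_stack(np.nonzero(edges > 50))
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(points).labels_
    # Average each cluster's points; label -1 marks noise and is skipped
    return [points[labels == k].mean(axis=0) for k in set(labels) if k != -1]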
Edge detection identifies object boundaries in video frames by highlighting areas of sharp intensity changes. The process begins by converting each frame to grayscale and applying a Gaussian blurring filter to reduce noise. Next, algorithms such as Sobel, Prewitt, or Laplacian use convolution filters to detect intensity gradients in both horizontal and vertical directions. These gradients reveal object edges, simplifying the image to key points essential for clustering and tracking object motion.
Let's first address the code and then the results.
Edge detection involves performing a convolution between the original image and a kernel matrix, which varies depending on the method used. After convolving for both the x and y gradients, the final gradient magnitude is calculated to highlight the edges.
$$ \text{gradient\_magnitude} = \sqrt{\text{grad\_x}^2 + \text{grad\_y}^2} $$
Here is the convolution function in Python, which takes an image and a kernel matrix as input and convolves them.
import numpy as np

def convolve(image, kernel):
    """ Apply convolution between the image and a kernel. """
    kernel_height, kernel_width = kernel.shape
    image_height, image_width = image.shape
    convolved_image = np.zeros((image_height, image_width), dtype=np.float32)
    # Pad the image to handle borders; the padding must be half the
    # kernel size on each side so the output keeps the input's shape
    pad_h, pad_w = kernel_height // 2, kernel_width // 2
    padded_image = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode='edge')
    # Convolution operation: slide the kernel over every pixel
    for i in range(image_height):
        for j in range(image_width):
            # Apply the kernel to the neighbourhood around (i, j)
            region = padded_image[i:i + kernel_height, j:j + kernel_width]
            convolved_image[i, j] = np.sum(region * kernel)
    return convolved_image
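As a quick sanity check of convolve, the illustrative snippet below (values chosen arbitrarily) applies a horizontal-gradient kernel, the Sobel x-kernel introduced next, to a tiny image containing a vertical step edge; the response is nonzero only along the boundary.

# Tiny 4x4 image with a vertical step edge between columns 1 and 2
step = np.array([[0, 0, 255, 255],
                 [0, 0, 255, 255],
                 [0, 0, 255, 255],
                 [0, 0, 255, 255]], dtype=np.float32)
kernel_x = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]])
# Prints 1020 in columns 1 and 2, and 0 elsewhere
print(convolve(step, kernel_x))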
The following are the kernels for Sobel edge detection: sobel_x responds to horizontal intensity changes (vertical edges) and sobel_y to vertical changes (horizontal edges). Combining the two gives the full edge picture.
$$ \text{sobel\_x} = \begin{bmatrix}-1 & 0 & 1 \\-2 & 0 & 2 \\-1 & 0 & 1\end{bmatrix} \hspace{1cm} \text{sobel\_y} = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}
$$
import numpy as np

def sobel_edge_detection(image):
    """ Perform Sobel edge detection. """
    # Define Sobel kernels
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]])
    sobel_y = np.array([[1, 2, 1],
                        [0, 0, 0],
                        [-1, -2, -1]])
    # Convolve the image with the Sobel kernels
    grad_x = convolve(image, sobel_x)
    grad_y = convolve(image, sobel_y)
    # Compute the gradient magnitude
    gradient_magnitude = np.sqrt(grad_x ** 2 + grad_y ** 2)
    # Clip to the 0-255 range and convert to 8-bit pixels
    gradient_magnitude = np.clip(gradient_magnitude, 0, 255).astype(np.uint8)
    return gradient_magnitude
The following are the kernels for Prewitt edge detection: prewitt_x responds to horizontal intensity changes and prewitt_y to vertical ones. Combining the two captures edges along both axes.
$$ \text{prewitt\_x} = \begin{bmatrix}-1 & 0 & 1 \\-1 & 0 & 1 \\-1 & 0 & 1\end{bmatrix} \hspace{1cm} \text{prewitt\_y} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}
$$
import numpy as np

def prewitt_edge_detection(image):
    """ Perform Prewitt edge detection. """
    # Define Prewitt kernels
    prewitt_x = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]])
    prewitt_y = np.array([[1, 1, 1],
                          [0, 0, 0],
                          [-1, -1, -1]])
    # Convolve the image with the Prewitt kernels
    grad_x = convolve(image, prewitt_x)
    grad_y = convolve(image, prewitt_y)
    # Compute the gradient magnitude
    gradient_magnitude = np.sqrt(grad_x ** 2 + grad_y ** 2)
    # Clip to the 0-255 range and convert to 8-bit pixels
    gradient_magnitude = np.clip(gradient_magnitude, 0, 255).astype(np.uint8)
    return gradient_magnitude
Laplacian edge detection uses the second derivative to identify edges by detecting rapid intensity changes, making it effective for detecting edges in all directions simultaneously.
$$ \text{laplacian\_kernel} = \begin{bmatrix}0 & 1 & 0 \\1 & -4 & 1 \\0 & 1 & 0\end{bmatrix} $$
import numpy as np

def laplacian_edge_detection(image):
    """ Perform Laplacian edge detection. """
    # Define Laplacian kernel
    laplacian_kernel = np.array([[0, 1, 0],
                                 [1, -4, 1],
                                 [0, 1, 0]])
    # Convolve the image with the Laplacian kernel
    laplacian_image = convolve(image, laplacian_kernel)
    # Take the absolute value (the second derivative is signed),
    # then clip to 0-255 and convert to 8-bit pixels
    laplacian_image = np.clip(np.abs(laplacian_image), 0, 255).astype(np.uint8)
    return laplacian_image
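Note that the second derivative is signed: each intensity transition produces paired positive and negative responses, which is why the code above takes the absolute value before clipping. Clipping the raw output to 0-255 would silently discard the negative half of every edge response.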
Gaussian blur applies a Gaussian function to smooth images, reducing noise and detail by averaging pixel values with surrounding ones, resulting in a softening effect.
$$ \text{gaussian\_kernel} = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}
$$
import numpy as np

def gaussian_blur(image):
    """ Apply Gaussian blur to the input image and normalize the output. """
    # Define the 3x3 Gaussian kernel (weights sum to 1)
    gaussian_kernel = np.array([[1 / 16, 2 / 16, 1 / 16],
                                [2 / 16, 4 / 16, 2 / 16],
                                [1 / 16, 2 / 16, 1 / 16]], dtype=np.float32)
    # Convolve the image with the Gaussian kernel
    blurred_image = convolve(image, gaussian_kernel)
    # Clip to 0-255 and convert to 8-bit (the normalized kernel
    # keeps the weighted average within the input range)
    gaussian_image = np.clip(blurred_image, 0, 255).astype(np.uint8)
    return gaussian_image
import numpy as np
import cv2

def detect_edges(initial_image, kernel):
    """ Convert the image to grayscale, denoise it, and apply the chosen edge detector. """
    gray_scale_image = cv2.cvtColor(initial_image, cv2.COLOR_BGR2GRAY)
    # Blur first so the gradient operators respond to edges rather than noise
    image = gaussian_blur(gray_scale_image)
    if kernel == 'sobel':
        edges = sobel_edge_detection(image)
    elif kernel == 'prewitt':
        edges = prewitt_edge_detection(image)
    elif kernel == 'laplacian':
        edges = laplacian_edge_detection(image)
    else:
        raise ValueError("Unsupported kernel type.")
    return edges
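Tying the stages together, the sketch below estimates speed in pixels per second. It is a minimal illustration under stated assumptions, not the full tracker: it assumes OpenCV for frame capture, relies on the hypothetical cluster_edge_points helper sketched earlier, and naively follows only the first detected cluster, so it behaves sensibly only for a single well-separated object.

import numpy as np
import cv2

def estimate_speed(video_path, kernel='sobel'):
    """ Estimate a single object's speed in pixels per second. """
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    prev_centroid, speeds = None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        edges = detect_edges(frame, kernel)
        centroids = cluster_edge_points(edges)
        if centroids:
            if prev_centroid is not None:
                # Displacement between consecutive frames, in pixels,
                # scaled by the frame rate to get pixels per second
                displacement = np.linalg.norm(centroids[0] - prev_centroid)
                speeds.append(displacement * fps)
            prev_centroid = centroids[0]
    cap.release()
    return np.mean(speeds) if speeds else None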