OpenCV Image Processing Complete Guide | OpenCV图像处理完全指南


Introduction: The Inner Logic of Image Processing | 引言:图像处理的内在逻辑

Fundamental Concepts | 核心概念

In computer vision, images are represented as matrices of pixel values. A color image typically consists of three channels (stored in BGR order in OpenCV), while a grayscale image has a single channel. For the default 8-bit (uint8) images, each value ranges from 0 to 255 and represents the intensity of that channel.

在计算机视觉中,图像被表示为像素值矩阵。彩色图像通常包含三个通道(OpenCV中为BGR格式),而灰度图像只有一个通道。每个像素值的范围是0到255,表示亮度强度。

Processing Pipeline | 处理流程

The operations in this guide follow a logical sequence:

本指南中的操作遵循以下逻辑流程:

  1. Image Acquisition → Reading images/videos from files or cameras 图像获取 → 从文件或摄像头读取图像/视频
  2. Preprocessing → Basic operations like resizing, cropping, color space conversion 预处理 → 调整大小、裁剪、颜色空间转换等基本操作
  3. Enhancement → Smoothing, sharpening, contrast adjustment 增强 → 平滑、锐化、对比度调整
  4. Feature Extraction → Edge detection, contour finding, gradient calculation 特征提取 → 边缘检测、轮廓查找、梯度计算
  5. Transformation → Pyramids, Fourier transform for frequency domain analysis 变换 → 金字塔、傅里叶变换用于频域分析

Core Principles | 核心原理

  • Convolution (卷积): The foundation of many operations (smoothing, edge detection). A kernel slides over the image, computing weighted sums. 卷积:许多操作的基础(平滑、边缘检测)。一个核在图像上滑动,计算加权和。
  • Thresholding (阈值化): Converts grayscale images to binary by comparing pixel values to a threshold. 阈值化:通过将像素值与阈值比较,将灰度图像转换为二值图像。
  • Morphology (形态学): Based on set theory, operations like erosion and dilation modify shape. 形态学:基于集合论,腐蚀和膨胀等操作修改形状。
  • Frequency Domain (频域): Transforms spatial information to frequency domain for filtering. 频域:将空间信息转换到频域进行滤波。
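
As a minimal sketch of the convolution principle above, the loop below slides a 3×3 averaging kernel over a small array using plain NumPy. The helper name convolve2d_valid is hypothetical, and border pixels are simply skipped here rather than padded as OpenCV's filters do.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` and return the weighted sum at each position.

    Only positions where the kernel fits entirely inside the image are computed.
    The kernel is not flipped (correlation), which is also what OpenCV's
    filtering functions compute; for symmetric kernels the result is identical.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * kernel)
    return out

image = np.arange(25, dtype=np.float64).reshape(5, 5)
mean_kernel = np.ones((3, 3)) / 9.0   # 3x3 averaging (box) kernel
smoothed = convolve2d_valid(image, mean_kernel)
print(smoothed.shape)  # (3, 3)
```

Swapping mean_kernel for a Gaussian or Sobel kernel gives smoothing or edge detection with the same sliding-window mechanism.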

Why These Operations? | 为什么需要这些操作?

These operations form the building blocks of computer vision systems. They enable:

这些操作构成了计算机视觉系统的基础模块,使我们能够:

  • Object Detection (目标检测): Identify and locate objects in images 目标检测:识别并定位图像中的物体
  • Image Segmentation (图像分割): Separate foreground from background 图像分割:分离前景和背景
  • Feature Matching (特征匹配): Compare and match patterns across images 特征匹配:在图像间比较和匹配模式
  • Image Restoration (图像恢复): Remove noise and artifacts 图像恢复:去除噪声和伪影

Understanding these operations is essential for developing robust computer vision applications, especially for RM (RoboMaster) competition scenarios where real-time processing and accuracy are critical.

理解这些操作对于开发强大的计算机视觉应用至关重要,特别是在RM(RoboMaster)竞赛场景中,实时处理和准确性至关重要。

Version Compatibility | 版本兼容性

This guide is based on OpenCV 4.x. Note that:

  • OpenCV 4.x: cv2.findContours() returns (contours, hierarchy)
  • OpenCV 3.x: cv2.findContours() returns (image, contours, hierarchy)

本指南基于OpenCV 4.x编写。注意:

  • OpenCV 4.x:cv2.findContours() 返回 (contours, hierarchy)
  • OpenCV 3.x:cv2.findContours() 返回 (image, contours, hierarchy)

Table of Contents | 目录

  1. Basic Image Operations
  2. Image Transformations
  3. Practical Tips & FAQ

1. Basic Image Operations | 图像基本操作

1.1 Reading Images (cv2.imread) | 图像读取

Syntax
img = cv2.imread(filename, flags)
Parameter Description (参数)
filename Path to image file (图像文件路径)
flags Reading mode (读取模式)

flags Parameter Details (flags参数详解):

Value (值) Constant (常量) Description (说明)
1 cv2.IMREAD_COLOR Color image (default), ignores transparency (彩色图像,默认,忽略透明度)
0 cv2.IMREAD_GRAYSCALE Grayscale image (灰度图像)
-1 cv2.IMREAD_UNCHANGED Keep original channels including alpha (保留原始通道,含透明度)

Note (注意): OpenCV reads images in BGR format, not RGB! When displaying with Matplotlib, convert to RGB using cv2.cvtColor(img, cv2.COLOR_BGR2RGB).

OpenCV读取图像的格式是BGR,不是RGB!用Matplotlib显示时,需要用cv2.cvtColor(img, cv2.COLOR_BGR2RGB)转换。

import cv2

img = cv2.imread('cat.jpg')  # Read color image (BGR format) | 读取彩色图像(BGR格式)
img_gray = cv2.imread('cat.jpg', cv2.IMREAD_GRAYSCALE)  # Read grayscale | 读取灰度图

1.2 Displaying Images (cv2.imshow) | 图像显示

Syntax
cv2.imshow(winname, mat)
Parameter Description (参数)
winname Window name (窗口名称)
mat Image array to display (要显示的图像)

Supporting Functions (配合使用的函数):

Function (函数) Syntax (语法) Description (说明)
cv2.waitKey() cv2.waitKey(delay) Wait up to delay milliseconds for a key press and return its key code; delay=0 waits indefinitely (等待按键,delay单位为毫秒,0=无限等待)
cv2.destroyAllWindows() cv2.destroyAllWindows() Destroy all windows (销毁所有窗口)
cv2.imshow('image', img)  # Display image | 显示图像
cv2.waitKey(0)  # Press any key to continue | 按任意键继续
cv2.destroyAllWindows()  # Close windows | 关闭窗口

# Convenience function | 便捷函数
def cv_show(name, img):
    cv2.imshow(name, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

1.3 Saving Images (cv2.imwrite) | 图像保存

Syntax
retval = cv2.imwrite(filename, img)
Parameter Description (参数)
filename Save path and filename (保存路径)
img Image to save (要保存的图像)
retval Returns True if successful (返回值)
cv2.imwrite('my_cat.png', img)  # Save as PNG | 保存为PNG格式

1.4 Image Properties | 图像属性

Property/Method (属性) Description (说明) Example (示例)
img.shape Image shape: (height, width, channels) (图像形状:高度, 宽度, 通道数) (414, 500, 3)
img.size Total number of array elements: height × width × channels (元素总数:高 × 宽 × 通道数) 621000
img.dtype Data type (数据类型) uint8 (0-255)
type(img) Object type (对象类型) numpy.ndarray
print(img.shape)   # (414, 500, 3)
print(img.size)    # 621000
print(img.dtype)   # uint8
print(type(img))   # <class 'numpy.ndarray'>

1.5 Reading Video (cv2.VideoCapture) | 视频读取

Syntax
cap = cv2.VideoCapture(index/filename)
Parameter Description (参数)
index Camera index, 0=default (摄像头索引,0=默认)
filename Path to video file (视频文件路径)

Common Methods (常用方法):

Method (方法) Description (说明)
cap.isOpened() Check if video opened successfully (检查视频是否成功打开)
cap.read() Read one frame, returns (ret, frame) (读取一帧,返回ret, frame)
cap.release() Release resources (释放资源)
vc = cv2.VideoCapture('test.mp4')  # Open video file | 打开视频文件

if not vc.isOpened():
    print("Cannot open video")  # Cannot open video | 无法打开视频
    exit()

while True:
    ret, frame = vc.read()  # Read frame | 读取帧
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('result', gray)
    if cv2.waitKey(100) & 0xFF == 27:  # ESC to exit | ESC键退出
        break

vc.release()
cv2.destroyAllWindows()

1.6 ROI Extraction | ROI截取

ROI (Region of Interest) | ROI(感兴趣区域)

img = cv2.imread('cat.jpg')
cat_face = img[0:50, 0:200]  # [row range, column range] | [行范围, 列范围]
cv_show('cat_face', cat_face)

1.7 Color Channel Operations | 颜色通道操作

Channel Split (cv2.split) | 通道分离

Syntax
b, g, r = cv2.split(img)

Note (注意): Order is B, G, R because OpenCV uses BGR format. 返回顺序是B、G、R,因为OpenCV使用BGR格式。

Channel Merge (cv2.merge) | 通道合并

Syntax
img = cv2.merge((b, g, r))
b, g, r = cv2.split(img)  # Split channels | 分离通道
img = cv2.merge((b, g, r))  # Merge channels | 合并通道

# Keep only red channel | 只保留红色通道
cur_img = img.copy()
cur_img[:, :, 0] = 0  # Set B channel to 0 | B通道置0
cur_img[:, :, 1] = 0  # Set G channel to 0 | G通道置0

1.8 Border Padding (cv2.copyMakeBorder) | 边界填充

Syntax
dst = cv2.copyMakeBorder(src, top, bottom, left, right, borderType, value)
Parameter Description (参数)
top/bottom/left/right Padding pixels in each direction (各方向填充像素数)
borderType Padding type (填充类型)
value Constant value for CONSTANT type (常量填充的值)

borderType Options (borderType填充类型):

Type (类型) Description (说明) Example (示例)
BORDER_REPLICATE Replicate edge pixels (复制边缘像素) aaaaaa|abcdefgh|hhhhhhh
BORDER_REFLECT Mirror reflection (镜像反射) fedcba|abcdefgh|hgfedcb
BORDER_REFLECT_101 Reflect around edge (以边缘为轴镜像) gfedcb|abcdefgh|gfedcba
BORDER_WRAP Wrap around (外包装) cdefgh|abcdefgh|abcdefg
BORDER_CONSTANT Constant value (常量填充) iiiiii|abcdefgh|iiiiii
top_size, bottom_size, left_size, right_size = (50, 50, 50, 50)

replicate = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_WRAP)
constant = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_CONSTANT, value=0)

1.9 Arithmetic Operations | 数值计算

Direct Arithmetic (numpy) | 直接算术运算(numpy方式)

img_cat = cv2.imread('cat.jpg')
img_cat2 = img_cat + 10  # Add 10 to each pixel | 每个像素加10

# Numpy wraps around on overflow | numpy加法溢出取模
result = img_cat + img_cat2  # 256 becomes 0, 257 becomes 1

OpenCV Addition (cv2.add) | OpenCV加法

Syntax
dst = cv2.add(src1, src2)

Feature (特点): Clamps values at 255 (saturation operation) | 超过255时取255(饱和操作)

result = cv2.add(img_cat, img_cat2)  # Values > 255 become 255 | 超过255变为255

1.10 Image Blending (cv2.addWeighted) | 图像融合

Syntax
dst = cv2.addWeighted(src1, alpha, src2, beta, gamma)
Parameter Description (参数)
src1, src2 Images to blend, must be same size (要融合的两幅图像,大小必须相同)
alpha Weight for src1 (src1的权重)
beta Weight for src2 (src2的权重)
gamma Bias value (偏置值)

Formula (公式): dst = src1 * alpha + src2 * beta + gamma

img_cat = cv2.imread('cat.jpg')
img_dog = cv2.imread('dog.jpg')

# Resize to same size | 调整到相同大小
img_dog = cv2.resize(img_dog, (500, 414))

# Blend: cat 40%, dog 60% | 融合:cat占40%,dog占60%
res = cv2.addWeighted(img_cat, 0.4, img_dog, 0.6, 0)

1.11 Image Resizing (cv2.resize) | 图像缩放

Syntax
dst = cv2.resize(src, dsize, fx, fy)
Parameter Description (参数)
dsize Target size (width, height); pass (0, 0) to scale by fx and fy instead (目标大小:宽, 高;传入(0, 0)时按fx、fy缩放)
fx Horizontal scaling factor (水平缩放因子)
fy Vertical scaling factor (垂直缩放因子)
res = cv2.resize(img, (500, 400))  # Specify target size | 指定目标大小
res = cv2.resize(img, (0, 0), fx=2, fy=2)  # Double size | 放大2倍
res = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Half size | 缩小一半

2. Image Transformations | 图像变换处理

2.1 Color Space Conversion (cv2.cvtColor) | 颜色空间转换

Syntax
dst = cv2.cvtColor(src, code)
Parameter Description (参数)
src Input image (输入图像)
code Conversion code (转换代码)

Common Conversion Codes (常用转换代码):

Code (代码) Description (说明)
cv2.COLOR_BGR2GRAY BGR to Grayscale (BGR转灰度)
cv2.COLOR_BGR2HSV BGR to HSV (BGR转HSV)
cv2.COLOR_BGR2RGB BGR to RGB (for Matplotlib) (BGR转RGB,用于Matplotlib)
cv2.COLOR_GRAY2BGR Grayscale to BGR (灰度转BGR)

2.1.1 Grayscale Conversion | 灰度转换

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

2.1.2 HSV Color Space | HSV颜色空间

HSV Channel Meanings (HSV三通道含义):

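Channel Name (名称) Range (范围) Description (说明)
H Hue (色调) 0-179 Color type; for 8-bit images OpenCV stores the hue angle divided by 2 (颜色种类,8位图像中为角度值除以2)
S Saturation (饱和度) 0-255 Color purity; higher is more vivid (颜色纯度,越高越鲜艳)
V Value (明度) 0-255 Brightness; higher is brighter (亮度,越大越亮)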

Why HSV for RM? (为什么RM常用HSV?)

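  • RGB/BGR values are sensitive to lighting: the same color gives very different R/G/B values in bright and dark scenes | RGB对光照敏感,同样颜色在亮暗环境下R/G/B值变化大
  • HSV is more robust: the H channel encodes only the color type, so brightness changes mostly affect V rather than H | HSV更鲁棒:H通道只表示颜色种类,光照变化影响小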
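import numpy as np

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Extract red region | 提取红色区域
# Note: red hue wraps around the ends of the 0-179 range, so this range
# covers only the low-hue reds; hues near 170-179 are also red.
lower_red = np.array([0, 100, 100])
upper_red = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower_red, upper_red)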

2.2 Image Thresholding (cv2.threshold) | 图像阈值

Syntax
ret, dst = cv2.threshold(src, thresh, maxval, type)
Parameter Description (参数)
src Input image, single-channel grayscale (输入图像,单通道灰度图)
thresh Threshold value (阈值)
maxval Value assigned by THRESH_BINARY and THRESH_BINARY_INV; the other types ignore it (BINARY类型中超过阈值时赋予的值)
type Threshold type (阈值类型)

Five Threshold Types (五种阈值类型):

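Type (类型) Formula (公式) Visual Effect (视觉效果) Notes (说明)
THRESH_BINARY dst = maxval if src > thresh else 0 Above threshold turns white, the rest black (超过阈值变白,否则变黑) Most common; extracts bright regions (最常用,提取亮区)
THRESH_BINARY_INV dst = 0 if src > thresh else maxval Above threshold turns black, the rest white (超过阈值变黑,否则变白) Inverse of THRESH_BINARY (THRESH_BINARY的反转)
THRESH_TRUNC dst = thresh if src > thresh else src Above threshold is set to the threshold, the rest unchanged (超过阈值变阈值,否则不变) Clips bright values; limits overexposure (限幅,保护过曝)
THRESH_TOZERO dst = src if src > thresh else 0 Above threshold unchanged, the rest black (超过阈值不变,否则变黑) Keeps detail in bright regions (保留亮区细节)
THRESH_TOZERO_INV dst = 0 if src > thresh else src Above threshold turns black, the rest unchanged (超过阈值变黑,否则不变) Keeps detail in dark regions (保留暗区细节)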

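Visual Summary (可视化说明), for a dark-to-bright gradient with thresh=127 and maxval=255:

Original:     dark → bright
BINARY:       > 127 becomes white (255), the rest black
BINARY_INV:   > 127 becomes black, the rest white (255)
TRUNC:        > 127 becomes 127, the rest unchanged
TOZERO:       ≤ 127 becomes black, the rest unchanged
TOZERO_INV:   > 127 becomes black, the rest unchanged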
ret, thresh1 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
ret, thresh2 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY_INV)
ret, thresh3 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TRUNC)
ret, thresh4 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TOZERO)
ret, thresh5 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TOZERO_INV)

2.3 Image Smoothing | 图像平滑

2.3.1 Mean Filter (cv2.blur) | 均值滤波

Syntax
dst = cv2.blur(src, ksize)

Feature (特点): Simple average, fast but blurs edges | 简单平均,速度快,但边缘模糊严重

blur = cv2.blur(img, (3, 3))

2.3.2 Box Filter (cv2.boxFilter) | 方框滤波

Syntax
dst = cv2.boxFilter(src, ddepth, ksize, normalize)
Parameter Description (参数)
normalize=True Normalized, equivalent to blur (归一化,等价于均值滤波)
normalize=False Not normalized; with uint8 output, sums above 255 saturate to 255 (不归一化,uint8输出时超过255的和被截断为255)
box = cv2.boxFilter(img, -1, (3, 3), normalize=True)  # Equivalent to blur | 等价于blur

2.3.3 Gaussian Filter (cv2.GaussianBlur) | 高斯滤波

Syntax
dst = cv2.GaussianBlur(src, ksize, sigmaX)

Feature (特点): Gaussian weights, better edge preservation | 高斯权重,边缘保留较好

gaussian = cv2.GaussianBlur(img, (5, 5), 1)

2.3.4 Median Filter (cv2.medianBlur) | 中值滤波

Syntax
dst = cv2.medianBlur(src, ksize)

Feature (特点): Uses median value, best for salt-and-pepper noise | 用中值替代,对椒盐噪声效果最好

median = cv2.medianBlur(img, 5)

Comparison (对比):

Filter (滤波器) Best For (最适合) Feature (特点)
blur (均值) General smoothing (一般平滑) Fast but blurs edges (速度快,但模糊边缘)
GaussianBlur (高斯) Gaussian noise (高斯噪声) Better edge preservation (边缘保留较好)
medianBlur (中值) Salt-and-pepper noise (椒盐噪声) Best for impulse noise (对脉冲噪声最好)

2.4 Morphological Operations | 形态学操作

2.4.1 Erosion (cv2.erode) | 腐蚀

Syntax
dst = cv2.erode(src, kernel, iterations)

Effect (效果): Shrinks foreground, removes small noise | 前景缩小,去除毛刺

kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(img, kernel, iterations=1)

2.4.2 Dilation (cv2.dilate) | 膨胀

Syntax
dst = cv2.dilate(src, kernel, iterations)

Effect (效果): Expands foreground, fills holes | 前景扩大,填补空洞

dilation = cv2.dilate(img, kernel, iterations=1)

2.4.3 Advanced Operations (cv2.morphologyEx) | 高级形态学操作

Syntax
dst = cv2.morphologyEx(src, op, kernel)
Operation (操作) Formula (公式) Description (说明)
MORPH_OPEN (开运算) Erode then Dilate Removes noise (去噪点)
MORPH_CLOSE (闭运算) Dilate then Erode Fills holes (填空洞)
MORPH_GRADIENT (梯度) Dilate - Erode Extracts edges (提取轮廓)
MORPH_TOPHAT (礼帽) Original - Open Extracts small objects (提取小物体)
MORPH_BLACKHAT (黑帽) Close - Original Extracts small holes (提取小空洞)
kernel = np.ones((5, 5), np.uint8)
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)  # 开运算
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)  # 闭运算
gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)  # 梯度

2.5 Image Gradient | 图像梯度

2.5.1 Sobel Operator (cv2.Sobel) | Sobel算子

Syntax
dst = cv2.Sobel(src, ddepth, dx, dy, ksize)
Parameter Description (参数)
ddepth Output depth, commonly cv2.CV_64F so that negative gradients are kept (输出深度,常用CV_64F)
dx Order of the derivative in x (1=compute, 0=ignore) (X方向导数)
dy Order of the derivative in y (Y方向导数)
ksize Kernel size: 1, 3, 5, or 7 (核大小)

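Important (重要): A white-to-black transition produces a negative gradient, which is clipped to 0 when the output depth is uint8. Compute with cv2.CV_64F, then take the absolute value with cv2.convertScaleAbs. | Sobel计算时白到黑是负数,会被截断为0。解决:取绝对值 + convertScaleAbs

Recommended approach: compute the x and y gradients separately, then blend them with cv2.addWeighted. | 正确做法:分离计算x和y方向,然后加权融合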

# Separate calculation | 分离计算
sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
sobelx = cv2.convertScaleAbs(sobelx)  # Take absolute value | 取绝对值

sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
sobely = cv2.convertScaleAbs(sobely)

# Combine | 融合
sobelxy = cv2.addWeighted(sobelx, 0.5, sobely, 0.5, 0)

2.5.2 Scharr Operator (cv2.Scharr) | Scharr算子

Syntax
dst = cv2.Scharr(src, ddepth, dx, dy)

Feature (特点): More sensitive than Sobel, fixed ksize=3 | 比Sobel更灵敏,ksize固定为3

scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)

2.5.3 Laplacian Operator (cv2.Laplacian) | Laplacian算子

Syntax
dst = cv2.Laplacian(src, ddepth)

Feature (特点): Second derivative, noise sensitive, usually needs prior smoothing | 二阶导数,对噪声敏感,通常需要先滤波

blurred = cv2.GaussianBlur(img, (3, 3), 0)  # Prior smoothing | 先去噪
laplacian = cv2.Laplacian(blurred, cv2.CV_64F)
laplacian = cv2.convertScaleAbs(laplacian)

2.6 Edge Detection (cv2.Canny) | 边缘检测

Syntax
dst = cv2.Canny(src, threshold1, threshold2)
Parameter Description (参数)
threshold1 Lower threshold (低阈值)
threshold2 Upper threshold (高阈值)

Canny Algorithm Steps (Canny算法步骤):

  1. Gaussian blur for noise reduction (高斯滤波去噪)
  2. Calculate gradient magnitude and direction (计算梯度)
  3. Non-maximum suppression (非极大值抑制)
  4. Double threshold detection (双阈值检测)
  5. Edge tracking by hysteresis (抑制孤立弱边缘)

Threshold Selection Guide (阈值选择指南):

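  • threshold2 ≈ 2 to 3 × threshold1 is a common starting ratio | threshold2 ≈ 2~3倍 threshold1,最佳平衡
  • Both thresholds low → more edges detected (may include noise) | 阈值都低 → 检测多(可能含噪声)
  • Both thresholds high → fewer edges detected (may miss real edges) | 阈值都高 → 检测少(可能丢失真正边缘)
  • Tuning tip: start from (50, 150) and adjust based on the result | 调试建议:先从(50, 150)开始,根据结果调整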
v1 = cv2.Canny(img, 50, 100)   # Loose (宽松)
v2 = cv2.Canny(img, 80, 150)   # Moderate (适中) - commonly used for RM
v3 = cv2.Canny(img, 100, 200)  # Strict (严格)

2.7 Image Pyramid | 图像金字塔

2.7.1 Up-sampling (cv2.pyrUp) | 向上采样

Syntax
dst = cv2.pyrUp(src)

Effect (效果): Width and height each double, so the pixel count quadruples (宽和高各放大2倍)

up = cv2.pyrUp(img)

2.7.2 Down-sampling (cv2.pyrDown) | 向下采样

Syntax
dst = cv2.pyrDown(src)

Effect (效果): Width and height each halve (宽和高各缩小为1/2)

down = cv2.pyrDown(img)

2.7.3 Laplacian Pyramid | 拉普拉斯金字塔

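# Laplacian = Original - pyrUp(pyrDown(Original))
down = cv2.pyrDown(img)
# Pass dstsize so the result matches img even when its dimensions are odd
down_up = cv2.pyrUp(down, dstsize=(img.shape[1], img.shape[0]))
# Subtract into a signed type; plain uint8 subtraction would wrap around
laplacian = cv2.subtract(img, down_up, dtype=cv2.CV_16S)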

2.8 Image Contours | 图像轮廓

2.8.1 Find Contours (cv2.findContours) | 查找轮廓

Syntax
contours, hierarchy = cv2.findContours(src, mode, method)

OpenCV Version Note (版本注意):

  • OpenCV 4.x: returns (contours, hierarchy)
  • OpenCV 3.x: returns (image, contours, hierarchy)
Parameter Description (参数)
mode Contour retrieval mode (轮廓检索模式)
method Contour approximation method (轮廓逼近方法)

Retrieval Modes (检索模式):

Mode (模式) Description (说明)
RETR_EXTERNAL Only outer contours (只检索最外层)
RETR_LIST All contours, no hierarchy (所有轮廓,无层级)
RETR_CCOMP Two-level hierarchy (两层层级)
RETR_TREE Full hierarchy tree (完整层级) - most commonly used

Approximation Methods (逼近方法):

Method (方法) Description (说明)
CHAIN_APPROX_NONE Keep all points (保留所有点)
CHAIN_APPROX_SIMPLE Compress to endpoints (压缩只保留端点) - recommended
ret, thresh = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

2.8.2 Draw Contours (cv2.drawContours) | 绘制轮廓

Syntax
image = cv2.drawContours(image, contours, contourIdx, color, thickness)
draw_img = img.copy()
res = cv2.drawContours(draw_img, contours, -1, (0, 0, 255), 2)  # -1 = all contours

2.8.3 Contour Features | 轮廓特征

Function (函数) Description (说明)
cv2.contourArea(cnt) Calculate area (计算面积)
cv2.arcLength(cnt, closed) Calculate perimeter (计算周长)
cv2.approxPolyDP(cnt, epsilon, closed) Approximate polygon (多边形逼近)
cv2.boundingRect(cnt) Get bounding rectangle (边界矩形)
cv2.minEnclosingCircle(cnt) Get minimum enclosing circle (外接圆)
cnt = contours[0]
area = cv2.contourArea(cnt)  # Area | 面积
perimeter = cv2.arcLength(cnt, True)  # Perimeter | 周长

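# Polygon approximation | 多边形逼近
# epsilon is the maximum allowed deviation from the contour: about
# 0.01 * perimeter follows it closely, 0.1 and above (as here) gives a
# coarse polygon | 精细到粗糙
epsilon = 0.15 * perimeter
approx = cv2.approxPolyDP(cnt, epsilon, True)

# Bounding rectangle | 边界矩形
x, y, w, h = cv2.boundingRect(cnt)

# Minimum enclosing circle | 外接圆
# Separate names so the rectangle's x, y are not overwritten
(cx, cy), radius = cv2.minEnclosingCircle(cnt)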

2.9 Fourier Transform | 傅里叶变换

2.9.1 DFT (cv2.dft) | 离散傅里叶变换

Syntax
dst = cv2.dft(src, flags)

Key Concepts (核心概念):

Concept Description (说明)
High Frequency (高频) Rapid changes: edges, noise (快速变化:边缘、噪声)
Low Frequency (低频) Slow changes: background (缓慢变化:背景)
Low-pass Filter (低通滤波) Keep low frequency → Blur (保留低频 → 模糊)
High-pass Filter (高通滤波) Keep high frequency → Sharpen (保留高频 → 锐化)
import numpy as np

img = cv2.imread('lena.jpg', 0)
img_float = np.float32(img)  # Must be float32 | 必须是浮点型

# DFT | 傅里叶变换
dft = cv2.dft(img_float, flags=cv2.DFT_COMPLEX_OUTPUT)
dft_shift = np.fft.fftshift(dft)  # Center low frequency | 将低频移到中心

# Magnitude spectrum | 幅值谱
# Add 1 before the log so zero-magnitude bins do not produce -inf
magnitude = 20 * np.log(cv2.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1]) + 1)

2.9.2 IDFT (cv2.idft) | 逆傅里叶变换

Syntax
dst = cv2.idft(src)

Low-pass Filter Example (低通滤波示例):

rows, cols = img.shape
crow, ccol = rows // 2, cols // 2  # Center | 中心

# Create low-pass mask | 创建低通掩码
mask = np.zeros((rows, cols, 2), np.uint8)
mask[crow-30:crow+30, ccol-30:ccol+30] = 1  # Keep central region | 保留中心区域

# Apply mask | 应用掩码
fshift = dft_shift * mask

# Inverse shift | 逆中心化
f_ishift = np.fft.ifftshift(fshift)

# Inverse DFT | 逆变换
img_back = cv2.idft(f_ishift)
img_back = cv2.magnitude(img_back[:, :, 0], img_back[:, :, 1])  # Get magnitude | 取模

3. Practical Tips & FAQ | 实用技巧与常见问题

Performance Tips for RM | RM性能优化建议

Tip                         Description (说明)
Avoid unnecessary copies    Use img.copy() only when needed (避免不必要的拷贝)
Use in-place operations     Some functions can modify in-place (使用原地操作)
Choose right kernel size    Larger kernels are slower (大核更慢)
Prefer simple filters       blur is faster than GaussianBlur (简单滤波更快)
Use uint8 when possible     Avoid unnecessary type conversions (避免类型转换)

Common Issues & Solutions | 常见问题与解决

Issue                 Cause                           Solution
Image won’t display   No waitKey()                    Add cv2.waitKey()
Video won’t open      Wrong path or codec             Check file path, install codecs
findContours error    Wrong number of return values   Check OpenCV version
Colors look wrong     BGR vs RGB confusion            Use cv2.COLOR_BGR2RGB
Noise in result       Input image too noisy           Apply smoothing filter first

Troubleshooting Guide | 故障排除指南

Q: Image reading returns None? | 图像读取返回None?
A: Check if the file path is correct and the file exists. 检查文件路径是否正确,文件是否存在。

Q: How to handle 4-channel images (RGBA)? | 如何处理4通道图像?
A: Use cv2.IMREAD_UNCHANGED to preserve alpha channel, or convert to 3-channel BGR. 使用cv2.IMREAD_UNCHANGED保留alpha通道,或转为3通道BGR。

Q: Morphology results are unexpected? | 形态学结果不符合预期?
A: Try different kernel sizes and iterations. Start with (3,3) and iterations=1. 尝试不同的核大小和迭代次数。从(3,3)、iterations=1开始。


Appendix: Quick Reference | 附录:常用函数速查

Basic Operations | 基础操作

Function Purpose (功能)
cv2.imread() Read image (读取图像)
cv2.imshow() Display image (显示图像)
cv2.imwrite() Save image (保存图像)
cv2.waitKey() Wait for key (等待按键)
cv2.destroyAllWindows() Close windows (销毁窗口)

Color Space | 颜色空间

Function Purpose (功能)
cv2.cvtColor() Convert color space (颜色空间转换)
cv2.split() Split channels (通道分离)
cv2.merge() Merge channels (通道合并)
cv2.inRange() Color range filter (颜色范围筛选)

Image Processing | 图像处理

Function Purpose (功能)
cv2.threshold() Thresholding (阈值处理)
cv2.blur() Mean filter (均值滤波)
cv2.GaussianBlur() Gaussian filter (高斯滤波)
cv2.medianBlur() Median filter (中值滤波)
cv2.erode() Erosion (腐蚀)
cv2.dilate() Dilation (膨胀)
cv2.morphologyEx() Morphological ops (形态学操作)
cv2.Sobel() Sobel operator (Sobel算子)
cv2.Canny() Canny edge detection (Canny边缘检测)

Contours & Features | 轮廓与特征

Function Purpose (功能)
cv2.findContours() Find contours (查找轮廓)
cv2.drawContours() Draw contours (绘制轮廓)
cv2.contourArea() Calculate area (计算面积)
cv2.arcLength() Calculate perimeter (计算周长)
cv2.approxPolyDP() Approximate contour (轮廓逼近)

Transformations | 变换

Function Purpose (功能)
cv2.resize() Resize image (图像缩放)
cv2.addWeighted() Blend images (图像融合)
cv2.pyrUp() Up-sample (向上采样)
cv2.pyrDown() Down-sample (向下采样)
cv2.dft() Fourier transform (傅里叶变换)
cv2.idft() Inverse Fourier transform (逆傅里叶变换)

Document Version | 文档版本: 1.0 Based on | 基于: OpenCV 4.x Last Updated | 最后更新: 2026-05-03