OpenCV Image Processing Complete Guide | OpenCV图像处理完全指南

Introduction: The Inner Logic of Image Processing | 引言：图像处理的内在逻辑

Fundamental Concepts | 核心概念

In computer vision, images are represented as matrices of pixel values. A color image typically consists of three channels (stored in BGR order in OpenCV), while a grayscale image has a single channel. For the default 8-bit (`uint8`) images, each value ranges from 0 to 255 and represents intensity.

在计算机视觉中，图像被表示为像素值矩阵。彩色图像通常包含三个通道（OpenCV中为BGR格式），而灰度图像只有一个通道。每个像素值的范围是0到255，表示亮度强度。

Processing Pipeline | 处理流程

The operations in this guide follow a logical sequence:

本指南中的操作遵循以下逻辑流程：

Image Acquisition → Reading images/videos from files or cameras 图像获取 → 从文件或摄像头读取图像/视频
Preprocessing → Basic operations like resizing, cropping, color space conversion 预处理 → 调整大小、裁剪、颜色空间转换等基本操作
Enhancement → Smoothing, sharpening, contrast adjustment 增强 → 平滑、锐化、对比度调整
Feature Extraction → Edge detection, contour finding, gradient calculation 特征提取 → 边缘检测、轮廓查找、梯度计算
Transformation → Pyramids, Fourier transform for frequency domain analysis 变换 → 金字塔、傅里叶变换用于频域分析

Core Principles | 核心原理

Convolution (卷积): The foundation of many operations (smoothing, edge detection). A kernel slides over the image, computing weighted sums. 卷积：许多操作的基础（平滑、边缘检测）。一个核在图像上滑动，计算加权和。
Thresholding (阈值化): Converts grayscale images to binary by comparing pixel values to a threshold. 阈值化：通过将像素值与阈值比较，将灰度图像转换为二值图像。
Morphology (形态学): Based on set theory, operations like erosion and dilation modify shape. 形态学：基于集合论，腐蚀和膨胀等操作修改形状。
Frequency Domain (频域): Transforms spatial information to frequency domain for filtering. 频域：将空间信息转换到频域进行滤波。
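
The convolution principle above can be sketched in plain NumPy. This is a minimal illustration, not how OpenCV implements filtering: `filter2d_valid` is a hypothetical helper name, it skips border padding, and it does not flip the kernel (which makes no difference for symmetric kernels such as the averaging kernel used here).

```python
import numpy as np

def filter2d_valid(image, kernel):
    """Slide `kernel` over `image` and return the weighted sum at every
    position where the kernel fits entirely inside the image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * kernel)  # weighted sum under the kernel
    return out

# 4x4 test image with one bright pixel
image = np.zeros((4, 4), dtype=np.float64)
image[1, 1] = 9.0

mean_kernel = np.full((3, 3), 1.0 / 9.0)  # 3x3 averaging (smoothing) kernel
smoothed = filter2d_valid(image, mean_kernel)
print(smoothed)
```

Every 3x3 window in this small image contains the bright pixel, so its value is spread evenly over the output, which is the smoothing effect in miniature.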

Why These Operations? | 为什么需要这些操作？

These operations form the building blocks of computer vision systems. They enable:

这些操作构成了计算机视觉系统的基础模块，使我们能够：

Object Detection (目标检测): Identify and locate objects in images 目标检测：识别并定位图像中的物体
Image Segmentation (图像分割): Separate foreground from background 图像分割：分离前景和背景
Feature Matching (特征匹配): Compare and match patterns across images 特征匹配：在图像间比较和匹配模式
Image Restoration (图像恢复): Remove noise and artifacts 图像恢复：去除噪声和伪影

Understanding these operations is essential for developing robust computer vision applications, especially for RM (RoboMaster) competition scenarios where real-time processing and accuracy are critical.

理解这些操作对于开发强大的计算机视觉应用至关重要，特别是在RM（RoboMaster）竞赛场景中，实时处理和准确性至关重要。

Version Compatibility | 版本兼容性

This guide is based on OpenCV 4.x. Note that:

OpenCV 4.x: cv2.findContours() returns (contours, hierarchy)
OpenCV 3.x: cv2.findContours() returns (image, contours, hierarchy)

本指南基于OpenCV 4.x编写。注意：

OpenCV 4.x：cv2.findContours()返回(contours, hierarchy)
OpenCV 3.x：cv2.findContours()返回(image, contours, hierarchy)

1. Basic Image Operations | 图像基本操作

1.1 Reading Images (cv2.imread) | 图像读取

Syntax
`img = cv2.imread(filename, flags)`

Parameter	Description (参数)
`filename`	Path to image file (图像文件路径)
`flags`	Reading mode (读取模式)

flags Parameter Details (flags参数详解):

Value (值)	Constant (常量)	Description (说明)
`1`	`cv2.IMREAD_COLOR`	Color image (default), ignores transparency (彩色图像，默认，忽略透明度)
`0`	`cv2.IMREAD_GRAYSCALE`	Grayscale image (灰度图像)
`-1`	`cv2.IMREAD_UNCHANGED`	Keep original channels including alpha (保留原始通道，含透明度)

Note (注意): OpenCV reads images in BGR format, not RGB! When displaying with Matplotlib, convert to RGB using cv2.cvtColor(img, cv2.COLOR_BGR2RGB).

OpenCV读取图像的格式是BGR，不是RGB！用Matplotlib显示时，需要用cv2.cvtColor(img, cv2.COLOR_BGR2RGB)转换。

import cv2

img = cv2.imread('cat.jpg')  # Read color image (BGR format) | 读取彩色图像（BGR格式）
img_gray = cv2.imread('cat.jpg', cv2.IMREAD_GRAYSCALE)  # Read grayscale | 读取灰度图

1.2 Displaying Images (cv2.imshow) | 图像显示

Syntax
`cv2.imshow(winname, mat)`

Parameter	Description (参数)
`winname`	Window name (窗口名称)
`mat`	Image array to display (要显示的图像)

Supporting Functions (配合使用的函数):

Function (函数)	Syntax (语法)	Description (说明)
`cv2.waitKey()`	`cv2.waitKey(delay)`	Wait up to `delay` milliseconds for a key press; `delay=0` waits indefinitely (等待按键，0=无限等待)
`cv2.destroyAllWindows()`	`cv2.destroyAllWindows()`	Destroy all windows (销毁所有窗口)

cv2.imshow('image', img)  # Display image | 显示图像
cv2.waitKey(0)  # Press any key to continue | 按任意键继续
cv2.destroyAllWindows()  # Close windows | 关闭窗口

# Convenience function | 便捷函数
def cv_show(name, img):
    cv2.imshow(name, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

1.3 Saving Images (cv2.imwrite) | 图像保存

Syntax
`retval = cv2.imwrite(filename, img)`

Parameter	Description (参数)
`filename`	Save path and filename (保存路径)
`img`	Image to save (要保存的图像)
`retval`	Returns True if successful (返回值)

cv2.imwrite('my_cat.png', img)  # Save as PNG | 保存为PNG格式

1.4 Image Properties | 图像属性

Property/Method (属性)	Description (说明)	Example (示例)
`img.shape`	Image shape: (height, width, channels) (图像形状：高度, 宽度, 通道数)	`(414, 500, 3)`
`img.size`	Total number of array elements: height × width × channels (元素总数 = 高 × 宽 × 通道数)	`621000`
`img.dtype`	Data type (数据类型)	`uint8` (0-255)
`type(img)`	Object type (对象类型)	`numpy.ndarray`

print(img.shape)   # (414, 500, 3)
print(img.size)    # 621000
print(img.dtype)   # uint8
print(type(img))   # <class 'numpy.ndarray'>
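
The relationship between `shape` and `size` can be checked without an image file. The sketch below uses a zero-filled NumPy array as a stand-in for the 414x500 image in the example above.

```python
import numpy as np

# Stand-ins for a 414x500 color image and its grayscale version
img = np.zeros((414, 500, 3), dtype=np.uint8)
gray = np.zeros((414, 500), dtype=np.uint8)

height, width, channels = img.shape
pixel_count = height * width   # number of pixels
element_count = img.size       # number of array elements (pixels x channels)

print(pixel_count)      # 207000
print(element_count)    # 621000
print(len(gray.shape))  # 2: a grayscale image has no channel axis
```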

1.5 Reading Video (cv2.VideoCapture) | 视频读取

Syntax
`cap = cv2.VideoCapture(index)` or `cap = cv2.VideoCapture(filename)`

Parameter	Description (参数)
`index`	Camera index, 0=default (摄像头索引，0=默认)
`filename`	Path to video file (视频文件路径)

Common Methods (常用方法):

Method (方法)	Description (说明)
`cap.isOpened()`	Check if video opened successfully (检查视频是否成功打开)
`cap.read()`	Read one frame, returns (ret, frame) (读取一帧，返回ret, frame)
`cap.release()`	Release resources (释放资源)

vc = cv2.VideoCapture('test.mp4')  # Open video file | 打开视频文件

if not vc.isOpened():
    print("Cannot open video")  # Cannot open video | 无法打开视频
    exit()

while True:
    ret, frame = vc.read()  # Read frame | 读取帧
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('result', gray)
    if cv2.waitKey(100) & 0xFF == 27:  # ESC to exit | ESC键退出
        break

vc.release()
cv2.destroyAllWindows()

1.6 ROI Extraction | ROI截取

ROI (Region of Interest) | ROI（感兴趣区域）

img = cv2.imread('cat.jpg')
cat_face = img[0:50, 0:200]  # [row range, column range] | [行范围, 列范围]
cv_show('cat_face', cat_face)
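
Because an image is a NumPy array, an ROI obtained by slicing is a view that shares memory with the original. The sketch below, using a synthetic array instead of `cat.jpg`, shows the difference between a view and an explicit copy.

```python
import numpy as np

img = np.zeros((100, 200, 3), dtype=np.uint8)  # stand-in for a loaded image

roi_view = img[0:50, 0:100]         # a view: shares memory with img
roi_copy = img[0:50, 0:100].copy()  # an independent copy

roi_copy[:] = 255              # does not touch img
untouched = int(img.max())     # still 0 at this point

roi_view[:] = 128              # writes through to img

print(untouched, int(img[0, 0, 0]), int(img[60, 0, 0]))
```

Use `.copy()` whenever the ROI will be drawn on or modified and the source image must stay intact.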

1.7 Color Channel Operations | 颜色通道操作

Channel Split (cv2.split) | 通道分离

Syntax
`b, g, r = cv2.split(img)`

Note (注意): Order is B, G, R because OpenCV uses BGR format. 返回顺序是B、G、R，因为OpenCV使用BGR格式。

Channel Merge (cv2.merge) | 通道合并

Syntax
`img = cv2.merge((b, g, r))`

b, g, r = cv2.split(img)  # Split channels | 分离通道
img = cv2.merge((b, g, r))  # Merge channels | 合并通道

# Keep only red channel | 只保留红色通道
cur_img = img.copy()
cur_img[:, :, 0] = 0  # Set B channel to 0 | B通道置0
cur_img[:, :, 1] = 0  # Set G channel to 0 | G通道置0

1.8 Border Padding (cv2.copyMakeBorder) | 边界填充

Syntax
`dst = cv2.copyMakeBorder(src, top, bottom, left, right, borderType, value=...)`

Parameter	Description (参数)
`top/bottom/left/right`	Padding pixels in each direction (各方向填充像素数)
`borderType`	Padding type (填充类型)
`value`	Constant value for CONSTANT type (常量填充的值)

borderType Options (borderType填充类型):

Type (类型)	Description (说明)	Example (示例)
`BORDER_REPLICATE`	Replicate edge pixels (复制边缘像素)	`aaaaaa|abcdefgh|hhhhhhh`
`BORDER_REFLECT`	Mirror reflection, edge pixel repeated (镜像反射)	`fedcba|abcdefgh|hgfedcb`
`BORDER_REFLECT_101`	Mirror reflection around the edge pixel, which is not repeated (以边缘为轴镜像)	`gfedcb|abcdefgh|gfedcba`
`BORDER_WRAP`	Wrap around from the opposite side (外包装)	`cdefgh|abcdefgh|abcdefg`
`BORDER_CONSTANT`	Constant value (常量填充)	`iiiiii|abcdefgh|iiiiiii`

top_size, bottom_size, left_size, right_size = (50, 50, 50, 50)

replicate = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_WRAP)
constant = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_CONSTANT, value=0)

1.9 Arithmetic Operations | 数值计算

Direct Arithmetic (numpy) | 直接算术运算（numpy方式）

img_cat = cv2.imread('cat.jpg')
img_cat2 = img_cat + 10  # Add 10 to each pixel | 每个像素加10

# Numpy wraps around on overflow | numpy加法溢出取模
result = img_cat + img_cat2  # 256 becomes 0, 257 becomes 1

OpenCV Addition (cv2.add) | OpenCV加法

Syntax
`dst = cv2.add(src1, src2)`

Feature (特点): Clamps values at 255 (saturation operation) | 超过255时取255（饱和操作）

result = cv2.add(img_cat, img_cat2)  # Values > 255 become 255 | 超过255变为255

1.10 Image Blending (cv2.addWeighted) | 图像融合

Syntax
`dst = cv2.addWeighted(src1, alpha, src2, beta, gamma)`

Parameter	Description (参数)	说明
`src1, src2`	Images to blend, must be same size (要融合的两幅图像，大小必须相同)	要融合的两幅图像
`alpha`	Weight for src1 (src1的权重)	src1的权重
`beta`	Weight for src2 (src2的权重)	src2的权重
`gamma`	Bias value (偏置值)	偏置值

Formula (公式): dst = src1 * alpha + src2 * beta + gamma

img_cat = cv2.imread('cat.jpg')
img_dog = cv2.imread('dog.jpg')

# Resize to same size | 调整到相同大小
img_dog = cv2.resize(img_dog, (500, 414))

# Blend: cat 40%, dog 60% | 融合：cat占40%，dog占60%
res = cv2.addWeighted(img_cat, 0.4, img_dog, 0.6, 0)

1.11 Image Resizing (cv2.resize) | 图像缩放

Syntax
`dst = cv2.resize(src, dsize, fx=..., fy=...)`

Parameter	Description (参数)	说明
`dsize`	Target size (width, height) (目标大小：宽, 高)	目标大小
`fx`	Horizontal scaling factor (水平缩放因子)	水平缩放因子
`fy`	Vertical scaling factor (垂直缩放因子)	垂直缩放因子

res = cv2.resize(img, (500, 400))  # Specify target size | 指定目标大小
res = cv2.resize(img, (0, 0), fx=2, fy=2)  # Double size | 放大2倍
res = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Half size | 缩小一半

2. Image Transformations | 图像变换处理

2.1 Color Space Conversion (cv2.cvtColor) | 颜色空间转换

Syntax
`dst = cv2.cvtColor(src, code)`

Parameter	Description (参数)	说明
`src`	Input image (输入图像)	输入图像
`code`	Conversion code (转换代码)	转换代码

Common Conversion Codes (常用转换代码):

Code (代码)	Description (说明)
`cv2.COLOR_BGR2GRAY`	BGR to Grayscale (BGR转灰度)
`cv2.COLOR_BGR2HSV`	BGR to HSV (BGR转HSV)
`cv2.COLOR_BGR2RGB`	BGR to RGB (for Matplotlib) (BGR转RGB，用于Matplotlib)
`cv2.COLOR_GRAY2BGR`	Grayscale to BGR (灰度转BGR)

2.1.1 Grayscale Conversion | 灰度转换

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

2.1.2 HSV Color Space | HSV颜色空间

HSV Channel Meanings (HSV三通道含义):

Channel	Name (名称)	Range (范围)	Description (说明)
H	Hue (色调)	0-179	Color type; for 8-bit images OpenCV stores hue as degrees / 2 (颜色种类，与光照无关)
S	Saturation (饱和度)	0-255	Color purity (颜色纯度，越高越鲜艳)
V	Value (明度)	0-255	Brightness (亮度，越大越亮)

Why HSV for RM? (为什么RM常用HSV?)

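RGB values are sensitive to lighting: the same color produces very different R/G/B values in bright and dark scenes.

HSV is more robust: the H channel encodes only the color type, so it changes far less when the illumination changes.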

import numpy as np

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Extract red region (low-hue end only; red also occupies hues near 179) | 提取红色区域
lower_red = np.array([0, 100, 100])
upper_red = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower_red, upper_red)

2.2 Image Thresholding (cv2.threshold) | 图像阈值

Syntax
`ret, dst = cv2.threshold(src, thresh, maxval, type)`

Parameter	Description (参数)	说明
`src`	Input image, single channel grayscale (单通道灰度图)	输入图像（单通道灰度图）
`thresh`	Threshold value (阈值)	阈值
`maxval`	Value assigned by `THRESH_BINARY` and `THRESH_BINARY_INV` (超过阈值赋予的值)	超过阈值赋予的值
`type`	Threshold type (阈值类型)	阈值类型

Five Threshold Types (五种阈值类型):

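Type (类型)	Formula (公式)	Visual Effect (视觉效果)	Typical Use (说明)
`THRESH_BINARY`	dst = maxval if src > thresh else 0	Above threshold becomes white, otherwise black	Most common; extracts bright regions
`THRESH_BINARY_INV`	dst = 0 if src > thresh else maxval	Above threshold becomes black, otherwise white	Inverse of THRESH_BINARY
`THRESH_TRUNC`	dst = thresh if src > thresh else src	Above threshold is set to the threshold, otherwise unchanged	Clipping; limits overexposed areas
`THRESH_TOZERO`	dst = src if src > thresh else 0	Above threshold unchanged, otherwise black	Keeps detail in bright regions
`THRESH_TOZERO_INV`	dst = 0 if src > thresh else src	Above threshold becomes black, otherwise unchanged	Keeps detail in dark regions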

Worked Example (示例), thresh = 127, maxval = 255:

Input:        50    127   200
BINARY:       0     0     255
BINARY_INV:   255   255   0
TRUNC:        50    127   127
TOZERO:       0     0     200
TOZERO_INV:   50    127   0

ret, thresh1 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
ret, thresh2 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY_INV)
ret, thresh3 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TRUNC)
ret, thresh4 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TOZERO)
ret, thresh5 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TOZERO_INV)

2.3 Image Smoothing | 图像平滑

2.3.1 Mean Filter (cv2.blur) | 均值滤波

Syntax
`dst = cv2.blur(src, ksize)`

Feature (特点): Simple average, fast but blurs edges | 简单平均，速度快，但边缘模糊严重

blur = cv2.blur(img, (3, 3))

2.3.2 Box Filter (cv2.boxFilter) | 方框滤波

Syntax
`dst = cv2.boxFilter(src, ddepth, ksize, normalize=...)`

Parameter	Description (参数)	说明
`normalize=True`	Normalized, equivalent to blur (归一化，等价于均值滤波)	归一化
`normalize=False`	Not normalized; window sums above 255 saturate to 255 for 8-bit output (不归一化，可能溢出)	不归一化

box = cv2.boxFilter(img, -1, (3, 3), normalize=True)  # Equivalent to blur | 等价于blur

2.3.3 Gaussian Filter (cv2.GaussianBlur) | 高斯滤波

Syntax
`dst = cv2.GaussianBlur(src, ksize, sigmaX)`

Feature (特点): Gaussian weights, better edge preservation | 高斯权重，边缘保留较好

gaussian = cv2.GaussianBlur(img, (5, 5), 1)

2.3.4 Median Filter (cv2.medianBlur) | 中值滤波

Syntax
`dst = cv2.medianBlur(src, ksize)`

Feature (特点): Uses median value, best for salt-and-pepper noise | 用中值替代，对椒盐噪声效果最好

median = cv2.medianBlur(img, 5)

Comparison (对比):

Filter (滤波器)	Best For (最适合)	Feature (特点)
`blur` (均值)	General smoothing (一般平滑)	Fast but blurs edges (速度快，但模糊边缘)
`GaussianBlur` (高斯)	Gaussian noise (高斯噪声)	Better edge preservation (边缘保留较好)
`medianBlur` (中值)	Salt-and-pepper noise (椒盐噪声)	Best for impulse noise (对脉冲噪声最好)

2.4 Morphological Operations | 形态学操作

2.4.1 Erosion (cv2.erode) | 腐蚀

Syntax
`dst = cv2.erode(src, kernel, iterations=...)`

Effect (效果): Shrinks foreground, removes small noise | 前景缩小，去除毛刺

kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(img, kernel, iterations=1)

2.4.2 Dilation (cv2.dilate) | 膨胀

Syntax
`dst = cv2.dilate(src, kernel, iterations=...)`

Effect (效果): Expands foreground, fills holes | 前景扩大，填补空洞

dilation = cv2.dilate(img, kernel, iterations=1)

2.4.3 Advanced Operations (cv2.morphologyEx) | 高级形态学操作

Syntax
`dst = cv2.morphologyEx(src, op, kernel)`

Operation (操作)	Formula (公式)	Description (说明)
`MORPH_OPEN` (开运算)	Erode then Dilate	Removes noise (去噪点)
`MORPH_CLOSE` (闭运算)	Dilate then Erode	Fills holes (填空洞)
`MORPH_GRADIENT` (梯度)	Dilate - Erode	Extracts edges (提取轮廓)
`MORPH_TOPHAT` (礼帽)	Original - Open	Extracts small objects (提取小物体)
`MORPH_BLACKHAT` (黑帽)	Close - Original	Extracts small holes (提取小空洞)

kernel = np.ones((5, 5), np.uint8)
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)  # 开运算
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)  # 闭运算
gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)  # 梯度

2.5 Image Gradient | 图像梯度

2.5.1 Sobel Operator (cv2.Sobel) | Sobel算子

Syntax
`dst = cv2.Sobel(src, ddepth, dx, dy, ksize=...)`

Parameter	Description (参数)	说明
`ddepth`	Output depth, commonly CV_64F (输出深度，常用CV_64F)	输出深度
`dx`	X derivative (1=compute, 0=ignore) (X方向导数)	X方向导数
`dy`	Y derivative (Y方向导数)	Y方向导数
`ksize`	Kernel size (1, 3, 5, 7) (核大小)	核大小

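Important (重要): With an 8-bit output depth, the negative derivative of a white-to-black transition is clipped to 0. Compute in `cv2.CV_64F` and then take the absolute value with `cv2.convertScaleAbs`.

Recommended approach: compute the x and y gradients separately, then blend them with weights.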

# Separate calculation | 分离计算
sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
sobelx = cv2.convertScaleAbs(sobelx)  # Take absolute value | 取绝对值

sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
sobely = cv2.convertScaleAbs(sobely)

# Combine | 融合
sobelxy = cv2.addWeighted(sobelx, 0.5, sobely, 0.5, 0)

2.5.2 Scharr Operator (cv2.Scharr) | Scharr算子

Syntax
`dst = cv2.Scharr(src, ddepth, dx, dy)`

Feature (特点): More sensitive than Sobel, fixed ksize=3 | 比Sobel更灵敏，ksize固定为3

scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)

2.5.3 Laplacian Operator (cv2.Laplacian) | Laplacian算子

Syntax
`dst = cv2.Laplacian(src, ddepth)`

Feature (特点): Second derivative, noise sensitive, usually needs prior smoothing | 二阶导数，对噪声敏感，通常需要先滤波

blurred = cv2.GaussianBlur(img, (3, 3), 0)  # Prior smoothing | 先去噪
laplacian = cv2.Laplacian(blurred, cv2.CV_64F)
laplacian = cv2.convertScaleAbs(laplacian)

2.6 Edge Detection (cv2.Canny) | 边缘检测

Syntax
`dst = cv2.Canny(src, threshold1, threshold2)`

Parameter	Description (参数)	说明
`threshold1`	Lower threshold (低阈值)	低阈值
`threshold2`	Upper threshold (高阈值)	高阈值

Canny Algorithm Steps (Canny算法步骤):

Gaussian blur for noise reduction (高斯滤波去噪)
Calculate gradient magnitude and direction (计算梯度)
Non-maximum suppression (非极大值抑制)
Double threshold detection (双阈值检测)
Edge tracking by hysteresis (抑制孤立弱边缘)

Threshold Selection Guide (阈值选择指南):

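threshold2 ≈ 2 to 3 × threshold1 usually gives a good balance.

Both thresholds low → more edges detected (may include noise)

Both thresholds high → fewer edges detected (real edges may be lost)

Tuning tip: start from (50, 150) and adjust based on the result.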

v1 = cv2.Canny(img, 50, 100)   # Loose (宽松)
v2 = cv2.Canny(img, 80, 150)   # Moderate (适中) - commonly used for RM
v3 = cv2.Canny(img, 100, 200)  # Strict (严格)

2.7 Image Pyramid | 图像金字塔

2.7.1 Up-sampling (cv2.pyrUp) | 向上采样

Syntax
`dst = cv2.pyrUp(src)`

Effect (效果): Width and height each double, giving 4× the pixels (图像放大2倍)

up = cv2.pyrUp(img)

2.7.2 Down-sampling (cv2.pyrDown) | 向下采样

Syntax
`dst = cv2.pyrDown(src)`

Effect (效果): Width and height are each halved, giving 1/4 of the pixels (图像缩小2倍)

down = cv2.pyrDown(img)

2.7.3 Laplacian Pyramid | 拉普拉斯金字塔

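# Laplacian layer = Original - pyrUp(pyrDown(Original))
down = cv2.pyrDown(img)
# Pass dstsize so the up-sampled image matches img even for odd widths/heights
down_up = cv2.pyrUp(down, dstsize=(img.shape[1], img.shape[0]))
# cv2.subtract saturates at 0; plain `img - down_up` would wrap around for uint8
laplacian = cv2.subtract(img, down_up)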

2.8 Image Contours | 图像轮廓

2.8.1 Find Contours (cv2.findContours) | 查找轮廓

Syntax
`contours, hierarchy = cv2.findContours(src, mode, method)`

OpenCV Version Note (版本注意):

OpenCV 4.x: returns (contours, hierarchy)

OpenCV 3.x: returns (image, contours, hierarchy)

Parameter	Description (参数)	说明
`mode`	Contour retrieval mode (轮廓检索模式)	轮廓检索模式
`method`	Contour approximation method (轮廓逼近方法)	轮廓逼近方法

Retrieval Modes (检索模式):

Mode (模式)	Description (说明)
`RETR_EXTERNAL`	Only outer contours (只检索最外层)
`RETR_LIST`	All contours, no hierarchy (所有轮廓，无层级)
`RETR_CCOMP`	Two-level hierarchy (两层层级)
`RETR_TREE`	Full hierarchy tree (完整层级) - most commonly used

Approximation Methods (逼近方法):

Method (方法)	Description (说明)
`CHAIN_APPROX_NONE`	Keep all points (保留所有点)
`CHAIN_APPROX_SIMPLE`	Compress to endpoints (压缩只保留端点) - recommended

ret, thresh = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

2.8.2 Draw Contours (cv2.drawContours) | 绘制轮廓

Syntax
`image = cv2.drawContours(image, contours, contourIdx, color, thickness)`

draw_img = img.copy()
res = cv2.drawContours(draw_img, contours, -1, (0, 0, 255), 2)  # -1 = all contours

2.8.3 Contour Features | 轮廓特征

Function (函数)	Description (说明)
`cv2.contourArea(cnt)`	Calculate area (计算面积)
`cv2.arcLength(cnt, closed)`	Calculate perimeter (计算周长)
`cv2.approxPolyDP(cnt, epsilon, closed)`	Approximate polygon (多边形逼近)
`cv2.boundingRect(cnt)`	Get bounding rectangle (边界矩形)
`cv2.minEnclosingCircle(cnt)`	Get minimum enclosing circle (外接圆)

cnt = contours[0]
area = cv2.contourArea(cnt)  # Area | 面积
perimeter = cv2.arcLength(cnt, True)  # Perimeter | 周长

# Polygon approximation | 多边形逼近
epsilon = 0.15 * perimeter  # Coarse setting; about 0.01-0.1 of the perimeter gives fine to rough results | 精细到粗糙
approx = cv2.approxPolyDP(cnt, epsilon, True)

# Bounding rectangle | 边界矩形
x, y, w, h = cv2.boundingRect(cnt)

# Minimum enclosing circle (separate names so x, y above are not overwritten) | 外接圆
(cx, cy), radius = cv2.minEnclosingCircle(cnt)

2.9 Fourier Transform | 傅里叶变换

2.9.1 DFT (cv2.dft) | 离散傅里叶变换

Syntax
`dst = cv2.dft(src, flags=...)`

Key Concepts (核心概念):

Concept	Description (说明)
High Frequency (高频)	Rapid changes: edges, noise (快速变化：边缘、噪声)
Low Frequency (低频)	Slow changes: background (缓慢变化：背景)
Low-pass Filter (低通滤波)	Keep low frequency → Blur (保留低频 → 模糊)
High-pass Filter (高通滤波)	Keep high frequency → Sharpen (保留高频 → 锐化)

import numpy as np

img = cv2.imread('lena.jpg', 0)
img_float = np.float32(img)  # Must be float32 | 必须是浮点型

# DFT | 傅里叶变换
dft = cv2.dft(img_float, flags=cv2.DFT_COMPLEX_OUTPUT)
dft_shift = np.fft.fftshift(dft)  # Center low frequency | 将低频移到中心

# Magnitude spectrum | 幅值谱
magnitude = 20 * np.log(cv2.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1]) + 1)  # +1 avoids log(0)

2.9.2 IDFT (cv2.idft) | 逆傅里叶变换

Syntax
`dst = cv2.idft(src)`

Low-pass Filter Example (低通滤波示例):

rows, cols = img.shape
crow, ccol = rows // 2, cols // 2  # Center | 中心

# Create low-pass mask | 创建低通掩码
mask = np.zeros((rows, cols, 2), np.uint8)
mask[crow-30:crow+30, ccol-30:ccol+30] = 1  # Keep central region | 保留中心区域

# Apply mask | 应用掩码
fshift = dft_shift * mask

# Inverse shift | 逆中心化
f_ishift = np.fft.ifftshift(fshift)

# Inverse DFT | 逆变换
img_back = cv2.idft(f_ishift)
img_back = cv2.magnitude(img_back[:, :, 0], img_back[:, :, 1])  # Get magnitude | 取模

3. Practical Tips & FAQ | 实用技巧与常见问题

Performance Tips for RM | RM性能优化建议

Tip	Description (说明)
Avoid unnecessary copies	Use `img.copy()` only when needed (避免不必要的拷贝)
Use in-place operations	Some functions can modify in-place (使用原地操作)
Choose right kernel size	Larger kernels are slower (大核更慢)
Prefer simple filters	`blur` is faster than `GaussianBlur` (简单滤波更快)
Use uint8 when possible	Avoid unnecessary type conversions (避免类型转换)

Common Issues & Solutions | 常见问题与解决

Issue	Cause	Solution
Image won’t display	No `waitKey()`	Add `cv2.waitKey()`
Video won’t open	Wrong path or codec	Check file path, install codecs
`findContours` error	Wrong number of return values	Check OpenCV version
Colors look wrong	BGR vs RGB confusion	Use `cv2.COLOR_BGR2RGB`
Noise in result	Input image too noisy	Apply smoothing filter first

Troubleshooting Guide | 故障排除指南

Q: Image reading returns None? | 图像读取返回None？
A: Check if the file path is correct and the file exists. 检查文件路径是否正确，文件是否存在。

Q: How to handle 4-channel images (RGBA)? | 如何处理4通道图像？
A: Use cv2.IMREAD_UNCHANGED to preserve alpha channel, or convert to 3-channel BGR. 使用cv2.IMREAD_UNCHANGED保留alpha通道，或转为3通道BGR。

Q: Morphology results are unexpected? | 形态学结果不符合预期？
A: Try different kernel sizes and iterations. Start with (3,3) and iterations=1. 尝试不同的核大小和迭代次数。从(3,3)和iterations=1开始。

Appendix: Quick Reference | 附录：常用函数速查

Basic Operations | 基础操作

Function	Purpose (功能)
`cv2.imread()`	Read image (读取图像)
`cv2.imshow()`	Display image (显示图像)
`cv2.imwrite()`	Save image (保存图像)
`cv2.waitKey()`	Wait for key (等待按键)
`cv2.destroyAllWindows()`	Close windows (销毁窗口)

Color Space | 颜色空间

Function	Purpose (功能)
`cv2.cvtColor()`	Convert color space (颜色空间转换)
`cv2.split()`	Split channels (通道分离)
`cv2.merge()`	Merge channels (通道合并)
`cv2.inRange()`	Color range filter (颜色范围筛选)

Image Processing | 图像处理

Function	Purpose (功能)
`cv2.threshold()`	Thresholding (阈值处理)
`cv2.blur()`	Mean filter (均值滤波)
`cv2.GaussianBlur()`	Gaussian filter (高斯滤波)
`cv2.medianBlur()`	Median filter (中值滤波)
`cv2.erode()`	Erosion (腐蚀)
`cv2.dilate()`	Dilation (膨胀)
`cv2.morphologyEx()`	Morphological ops (形态学操作)
`cv2.Sobel()`	Sobel operator (Sobel算子)
`cv2.Canny()`	Canny edge detection (Canny边缘检测)

Contours & Features | 轮廓与特征

Function	Purpose (功能)
`cv2.findContours()`	Find contours (查找轮廓)
`cv2.drawContours()`	Draw contours (绘制轮廓)
`cv2.contourArea()`	Calculate area (计算面积)
`cv2.arcLength()`	Calculate perimeter (计算周长)
`cv2.approxPolyDP()`	Approximate contour (轮廓逼近)

Transformations | 变换

Function	Purpose (功能)
`cv2.resize()`	Resize image (图像缩放)
`cv2.addWeighted()`	Blend images (图像融合)
`cv2.pyrUp()`	Up-sample (向上采样)
`cv2.pyrDown()`	Down-sample (向下采样)
`cv2.dft()`	Fourier transform (傅里叶变换)
`cv2.idft()`	Inverse Fourier transform (逆傅里叶变换)

Document Version | 文档版本: 1.0
Based on | 基于: OpenCV 4.x
Last Updated | 最后更新: 2026-05-03
