OpenCV Image Processing Complete Guide | OpenCV图像处理完全指南#
Introduction: The Inner Logic of Image Processing | 引言:图像处理的内在逻辑#
Fundamental Concepts | 核心概念#
In computer vision, images are represented as matrices of pixel values. A color image typically consists of three channels (BGR in OpenCV), while a grayscale image has a single channel. Each pixel value ranges from 0 to 255, representing intensity.
在计算机视觉中,图像被表示为像素值矩阵。彩色图像通常包含三个通道(OpenCV中为BGR格式),而灰度图像只有一个通道。每个像素值的范围是0到255,表示亮度强度。
Processing Pipeline | 处理流程#
The operations in this guide follow a logical sequence:
本指南中的操作遵循以下逻辑流程:
- Image Acquisition → Reading images/videos from files or cameras
图像获取 → 从文件或摄像头读取图像/视频
- Preprocessing → Basic operations like resizing, cropping, color space conversion
预处理 → 调整大小、裁剪、颜色空间转换等基本操作
- Enhancement → Smoothing, sharpening, contrast adjustment
增强 → 平滑、锐化、对比度调整
- Feature Extraction → Edge detection, contour finding, gradient calculation
特征提取 → 边缘检测、轮廓查找、梯度计算
- Transformation → Pyramids, Fourier transform for frequency domain analysis
变换 → 金字塔、傅里叶变换用于频域分析
Core Principles | 核心原理#
- Convolution (卷积): The foundation of many operations (smoothing, edge detection). A kernel slides over the image, computing weighted sums.
卷积:许多操作的基础(平滑、边缘检测)。一个核在图像上滑动,计算加权和。
- Thresholding (阈值化): Converts grayscale images to binary by comparing pixel values to a threshold.
阈值化:通过将像素值与阈值比较,将灰度图像转换为二值图像。
- Morphology (形态学): Based on set theory, operations like erosion and dilation modify shape.
形态学:基于集合论,腐蚀和膨胀等操作修改形状。
- Frequency Domain (频域): Transforms spatial information to frequency domain for filtering.
频域:将空间信息转换到频域进行滤波。
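To make the "kernel slides over the image, computing weighted sums" idea concrete, here is a minimal NumPy sketch. The helper name `filter2d_naive` is hypothetical and exists only for illustration; it skips padding and does not flip the kernel, which is equivalent to convolution for symmetric kernels such as the averaging kernel used here.

```python
import numpy as np

def filter2d_naive(image, kernel):
    """Slide `kernel` over `image` and return the weighted sum at each position.

    Illustration only: no padding ("valid" output) and no kernel flip,
    which equals convolution whenever the kernel is symmetric.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + kh, x:x + kw]   # region under the kernel
            out[y, x] = np.sum(window * kernel)  # weighted sum
    return out

image = np.arange(25, dtype=np.float64).reshape(5, 5)
mean_kernel = np.ones((3, 3)) / 9.0              # 3x3 averaging kernel
smoothed = filter2d_naive(image, mean_kernel)
```

In practice the library filters (cv2.blur, cv2.GaussianBlur, cv2.Sobel) do the same job far faster; the loop above is only meant to show what they compute.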
Why These Operations? | 为什么需要这些操作?#
These operations form the building blocks of computer vision systems. They enable:
这些操作构成了计算机视觉系统的基础模块,使我们能够:
- Object Detection (目标检测): Identify and locate objects in images
目标检测:识别并定位图像中的物体
- Image Segmentation (图像分割): Separate foreground from background
图像分割:分离前景和背景
- Feature Matching (特征匹配): Compare and match patterns across images
特征匹配:在图像间比较和匹配模式
- Image Restoration (图像恢复): Remove noise and artifacts
图像恢复:去除噪声和伪影
Understanding these operations is essential for developing robust computer vision applications, especially for RM (RoboMaster) competition scenarios where real-time processing and accuracy are critical.
理解这些操作对于开发强大的计算机视觉应用至关重要,特别是在RM(RoboMaster)竞赛场景中,实时处理和准确性至关重要。
Version Compatibility | 版本兼容性#
This guide is based on OpenCV 4.x. Note that:
- OpenCV 4.x: cv2.findContours() returns (contours, hierarchy)
- OpenCV 3.x: cv2.findContours() returns (image, contours, hierarchy)
本指南基于OpenCV 4.x编写。注意:
- OpenCV 4.x: cv2.findContours()返回(contours, hierarchy)
- OpenCV 3.x: cv2.findContours()返回(image, contours, hierarchy)
Table of Contents | 目录#
- Basic Image Operations
- Image Transformations
- Practical Tips & FAQ
1. Basic Image Operations | 图像基本操作#
1.1 Reading Images (cv2.imread) | 图像读取#
| Syntax |
| --- |
| img = cv2.imread(filename, flags) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| filename | Path to image file (图像文件路径) |
| flags | Reading mode (读取模式) |
flags Parameter Details (flags参数详解):
| Value (值) | Constant (常量) | Description (说明) |
| --- | --- | --- |
| 1 | cv2.IMREAD_COLOR | Color image (default), ignores transparency (彩色图像,默认,忽略透明度) |
| 0 | cv2.IMREAD_GRAYSCALE | Grayscale image (灰度图像) |
| -1 | cv2.IMREAD_UNCHANGED | Keep original channels including alpha (保留原始通道,含透明度) |
Note (注意): OpenCV reads images in BGR format, not RGB! When displaying with Matplotlib, convert to RGB using cv2.cvtColor(img, cv2.COLOR_BGR2RGB).
OpenCV读取图像的格式是BGR,不是RGB!用Matplotlib显示时,需要用cv2.cvtColor(img, cv2.COLOR_BGR2RGB)转换。
import cv2
img = cv2.imread('cat.jpg') # Read color image (BGR format) | 读取彩色图像(BGR格式)
img_gray = cv2.imread('cat.jpg', cv2.IMREAD_GRAYSCALE) # Read grayscale | 读取灰度图
1.2 Displaying Images (cv2.imshow) | 图像显示#
| Syntax |
| --- |
| cv2.imshow(winname, mat) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| winname | Window name (窗口名称) |
| mat | Image array to display (要显示的图像) |
Supporting Functions (配合使用的函数):
| Function (函数) | Syntax (语法) | Description (说明) |
| --- | --- | --- |
| cv2.waitKey() | cv2.waitKey(delay) | Wait for a key press for delay milliseconds; delay=0 waits indefinitely (等待按键,0=无限等待) |
| cv2.destroyAllWindows() | cv2.destroyAllWindows() | Destroy all windows (销毁所有窗口) |
cv2.imshow('image', img) # Display image | 显示图像
cv2.waitKey(0) # Press any key to continue | 按任意键继续
cv2.destroyAllWindows() # Close windows | 关闭窗口
# Convenience function | 便捷函数
def cv_show(name, img):
    cv2.imshow(name, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
1.3 Saving Images (cv2.imwrite) | 图像保存#
| Syntax |
| --- |
| retval = cv2.imwrite(filename, img) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| filename | Save path and filename; the extension selects the format (保存路径) |
| img | Image to save (要保存的图像) |
| retval | Return value: True if successful (返回值) |
cv2.imwrite('my_cat.png', img) # Save as PNG | 保存为PNG格式
1.4 Image Properties | 图像属性#
| Property/Method (属性) | Description (说明) | Example (示例) |
| --- | --- | --- |
| img.shape | Image shape: (height, width, channels) (图像形状:高度, 宽度, 通道数) | (414, 500, 3) |
| img.size | Total number of elements = height × width × channels (元素总数 = 高 × 宽 × 通道数) | 621000 |
| img.dtype | Data type (数据类型) | uint8 (0-255) |
| type(img) | Object type (对象类型) | numpy.ndarray |
print(img.shape) # (414, 500, 3)
print(img.size) # 621000
print(img.dtype) # uint8
print(type(img)) # <class 'numpy.ndarray'>
1.5 Reading Video (cv2.VideoCapture) | 视频读取#
| Syntax |
| --- |
| cap = cv2.VideoCapture(index/filename) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| index | Camera index, 0=default (摄像头索引,0=默认) |
| filename | Path to video file (视频文件路径) |

Common Methods (常用方法):
| Method (方法) | Description (说明) |
| --- | --- |
| cap.isOpened() | Check if video opened successfully (检查视频是否成功打开) |
| cap.read() | Read one frame, returns (ret, frame) (读取一帧,返回ret, frame) |
| cap.release() | Release resources (释放资源) |
vc = cv2.VideoCapture('test.mp4') # Open video file | 打开视频文件
if not vc.isOpened():
    print("Cannot open video") # 无法打开视频
    exit()
while True:
    ret, frame = vc.read() # Read frame; ret is False when no frame is left | 读取帧
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('result', gray)
    if cv2.waitKey(100) & 0xFF == 27: # ESC to exit | ESC键退出
        break
vc.release()
cv2.destroyAllWindows()
1.6 ROI (Region of Interest) | ROI(感兴趣区域)#
img = cv2.imread('cat.jpg')
cat_face = img[0:50, 0:200] # [row range, column range] | [行范围, 列范围]
cv_show('cat_face', cat_face)
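One detail worth knowing about ROI slicing: a NumPy slice is a view into the original array, not a copy. The sketch below uses a synthetic black image instead of cat.jpg (an assumption made so that it runs without any file) to show the difference.

```python
import numpy as np

img = np.zeros((100, 200, 3), dtype=np.uint8)  # synthetic black BGR image

roi_view = img[0:50, 0:200]         # a view: shares memory with img
roi_copy = img[0:50, 0:200].copy()  # an independent copy

roi_view[:] = 255                   # also paints the top half of img white
roi_copy[:] = 0                     # does not touch img
```

Use .copy() whenever the ROI will be modified and the source image must stay intact.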
1.7 Color Channel Operations | 颜色通道操作#
Channel Split (cv2.split) | 通道分离#
| Syntax |
| --- |
| b, g, r = cv2.split(img) |
Note (注意): Order is B, G, R because OpenCV uses BGR format.
返回顺序是B、G、R,因为OpenCV使用BGR格式。
Channel Merge (cv2.merge) | 通道合并#
| Syntax |
| --- |
| img = cv2.merge((b, g, r)) |
b, g, r = cv2.split(img) # Split channels | 分离通道
img = cv2.merge((b, g, r)) # Merge channels | 合并通道
# Keep only red channel | 只保留红色通道
cur_img = img.copy()
cur_img[:, :, 0] = 0 # Set B channel to 0 | B通道置0
cur_img[:, :, 1] = 0 # Set G channel to 0 | G通道置0
1.8 Border Padding (cv2.copyMakeBorder) | 边界填充#
| Syntax |
| --- |
| dst = cv2.copyMakeBorder(src, top, bottom, left, right, borderType, value) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| top/bottom/left/right | Padding pixels in each direction (各方向填充像素数) |
| borderType | Padding type (填充类型) |
| value | Constant value, used only with BORDER_CONSTANT (常量填充的值) |
borderType Options (borderType填充类型):
- BORDER_REPLICATE: Replicate edge pixels (复制边缘像素). Example: aaaaaa|abcdefgh|hhhhhhh
- BORDER_REFLECT: Mirror reflection, edge pixel repeated (镜像反射). Example: fedcba|abcdefgh|hgfedcb
- BORDER_REFLECT_101: Mirror reflection around the edge pixel, which is not repeated (以边缘为轴镜像). Example: gfedcb|abcdefgh|gfedcba
- BORDER_WRAP: Wrap around to the opposite side (外包装). Example: cdefgh|abcdefgh|abcdefg
- BORDER_CONSTANT: Constant value i (常量填充). Example: iiiiii|abcdefgh|iiiiiii
top_size, bottom_size, left_size, right_size = (50, 50, 50, 50)
replicate = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_WRAP)
constant = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_CONSTANT, value=0)
1.9 Arithmetic Operations | 数值计算#
Direct Arithmetic (numpy) | 直接算术运算(numpy方式)#
img_cat = cv2.imread('cat.jpg')
img_cat2 = img_cat + 10 # Add 10 to each pixel | 每个像素加10
# Numpy wraps around on overflow | numpy加法溢出取模
result = img_cat + img_cat2 # 256 becomes 0, 257 becomes 1
OpenCV Addition (cv2.add) | OpenCV加法#
| Syntax |
| --- |
| dst = cv2.add(src1, src2) |
Feature (特点): Clamps values at 255 (saturation operation) | 超过255时取255(饱和操作)
result = cv2.add(img_cat, img_cat2) # Values > 255 become 255 | 超过255变为255
1.10 Image Blending (cv2.addWeighted) | 图像融合#
| Syntax |
| --- |
| dst = cv2.addWeighted(src1, alpha, src2, beta, gamma) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| src1, src2 | Images to blend, must be same size (要融合的两幅图像,大小必须相同) |
| alpha | Weight for src1 (src1的权重) |
| beta | Weight for src2 (src2的权重) |
| gamma | Bias value (偏置值) |
Formula (公式): dst = src1 * alpha + src2 * beta + gamma
img_cat = cv2.imread('cat.jpg')
img_dog = cv2.imread('dog.jpg')
# Resize to same size | 调整到相同大小
img_dog = cv2.resize(img_dog, (500, 414))
# Blend: cat 40%, dog 60% | 融合:cat占40%,dog占60%
res = cv2.addWeighted(img_cat, 0.4, img_dog, 0.6, 0)
1.11 Image Resizing (cv2.resize) | 图像缩放#
| Syntax |
| --- |
| dst = cv2.resize(src, dsize, fx, fy) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| dsize | Target size (width, height); pass (0, 0) to use fx and fy instead (目标大小:宽, 高) |
| fx | Horizontal scaling factor (水平缩放因子) |
| fy | Vertical scaling factor (垂直缩放因子) |
res = cv2.resize(img, (500, 400)) # Specify target size | 指定目标大小
res = cv2.resize(img, (0, 0), fx=2, fy=2) # Double size | 放大2倍
res = cv2.resize(img, (0, 0), fx=0.5, fy=0.5) # Half size | 缩小一半
2. Image Transformations | 图像变换#
2.1 Color Space Conversion (cv2.cvtColor) | 颜色空间转换#
| Syntax |
| --- |
| dst = cv2.cvtColor(src, code) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| src | Input image (输入图像) |
| code | Conversion code (转换代码) |
Common Conversion Codes (常用转换代码):
| Code (代码) | Description (说明) |
| --- | --- |
| cv2.COLOR_BGR2GRAY | BGR to Grayscale (BGR转灰度) |
| cv2.COLOR_BGR2HSV | BGR to HSV (BGR转HSV) |
| cv2.COLOR_BGR2RGB | BGR to RGB (for Matplotlib) (BGR转RGB,用于Matplotlib) |
| cv2.COLOR_GRAY2BGR | Grayscale to BGR (灰度转BGR) |
2.1.1 Grayscale Conversion | 灰度转换#
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
2.1.2 HSV Color Space | HSV颜色空间#
HSV Channel Meanings (HSV三通道含义):
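| Channel | Name (名称) | Range (范围) | Description (说明) |
| --- | --- | --- | --- |
| H | Hue (色调) | 0-179 (degrees / 2, for 8-bit images) | Color type, largely independent of lighting (颜色种类,与光照无关) |
| S | Saturation (饱和度) | 0-255 | Color purity; higher is more vivid (颜色纯度,越高越鲜艳) |
| V | Value (明度) | 0-255 | Brightness; higher is brighter (亮度,越大越亮) |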
Why HSV for RM? (为什么RM常用HSV?)
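- RGB is sensitive to lighting: the same color yields very different R/G/B values in bright and dark scenes (RGB对光照敏感,同样颜色在亮暗环境下R/G/B值变化大)
- HSV is more robust: the H channel encodes only the color type, so lighting changes affect it far less (HSV更鲁棒:H通道只表示颜色种类,光照变化影响小)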
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
import numpy as np
# Extract red region (low-hue band only; red also wraps around to H ≈ 170-179) | 提取红色区域
lower_red = np.array([0, 100, 100])
upper_red = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower_red, upper_red)
2.2 Image Thresholding (cv2.threshold) | 图像阈值#
| Syntax |
| --- |
| ret, dst = cv2.threshold(src, thresh, maxval, type) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| src | Input image, single-channel grayscale (输入图像,单通道灰度图) |
| thresh | Threshold value (阈值) |
| maxval | Value assigned when the threshold is exceeded, used by the BINARY types (超过阈值赋予的值) |
| type | Threshold type (阈值类型) |
Five Threshold Types (五种阈值类型):
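| Type (类型) | Formula (公式) | Visual Effect (视觉效果) | Notes (说明) |
| --- | --- | --- | --- |
| THRESH_BINARY | dst = maxval if src > thresh else 0 | Above threshold → white, otherwise black (超过阈值变白,否则变黑) | Most common; extracts bright regions (最常用,提取亮区) |
| THRESH_BINARY_INV | dst = 0 if src > thresh else maxval | Above threshold → black, otherwise white (超过阈值变黑,否则变白) | Inverse of THRESH_BINARY (THRESH_BINARY的反转) |
| THRESH_TRUNC | dst = thresh if src > thresh else src | Above threshold → clipped to thresh, otherwise unchanged (超过阈值变阈值,否则不变) | Clipping; tames overexposed areas (限幅,保护过曝) |
| THRESH_TOZERO | dst = src if src > thresh else 0 | Above threshold unchanged, otherwise black (超过阈值不变,否则变黑) | Keeps bright-region detail (保留亮区细节) |
| THRESH_TOZERO_INV | dst = 0 if src > thresh else src | Above threshold → black, otherwise unchanged (超过阈值变黑,否则不变) | Keeps dark-region detail (保留暗区细节) |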
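Visual Summary (可视化图示), for thresh=127 and maxval=255:
- BINARY: pixels > 127 become white, the rest black (>127变白)
- BINARY_INV: pixels > 127 become black, the rest white (>127变黑)
- TRUNC: pixels > 127 are clipped to 127 (>127变127)
- TOZERO: pixels ≤ 127 become black, the rest are unchanged (≤127变黑)
- TOZERO_INV: pixels > 127 become black, the rest are unchanged (>127变黑)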
ret, thresh1 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
ret, thresh2 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY_INV)
ret, thresh3 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TRUNC)
ret, thresh4 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TOZERO)
ret, thresh5 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_TOZERO_INV)
2.3 Image Smoothing | 图像平滑#
2.3.1 Mean Filter (cv2.blur) | 均值滤波#
| Syntax |
| --- |
| dst = cv2.blur(src, ksize) |
Feature (特点): Simple average, fast but blurs edges | 简单平均,速度快,但边缘模糊严重
blur = cv2.blur(img, (3, 3))
2.3.2 Box Filter (cv2.boxFilter) | 方框滤波#
| Syntax |
| --- |
| dst = cv2.boxFilter(src, ddepth, ksize, normalize) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| normalize=True | Normalized, equivalent to blur (归一化,等价于均值滤波) |
| normalize=False | Not normalized; with uint8 output, sums above 255 saturate to 255 (不归一化,可能溢出) |
box = cv2.boxFilter(img, -1, (3, 3), normalize=True) # Equivalent to blur | 等价于blur
2.3.3 Gaussian Filter (cv2.GaussianBlur) | 高斯滤波#
| Syntax |
| --- |
| dst = cv2.GaussianBlur(src, ksize, sigmaX) |
Feature (特点): Gaussian weights, better edge preservation | 高斯权重,边缘保留较好
gaussian = cv2.GaussianBlur(img, (5, 5), 1)
2.3.4 Median Filter (cv2.medianBlur) | 中值滤波#
| Syntax |
| --- |
| dst = cv2.medianBlur(src, ksize) |
Feature (特点): Uses median value, best for salt-and-pepper noise | 用中值替代,对椒盐噪声效果最好
median = cv2.medianBlur(img, 5)
Comparison (对比):
| Filter (滤波器) | Best For (最适合) | Feature (特点) |
| --- | --- | --- |
| blur (均值) | General smoothing (一般平滑) | Fast but blurs edges (速度快,但模糊边缘) |
| GaussianBlur (高斯) | Gaussian noise (高斯噪声) | Better edge preservation (边缘保留较好) |
| medianBlur (中值) | Salt-and-pepper noise (椒盐噪声) | Best for impulse noise (对脉冲噪声最好) |
2.4 Morphological Operations | 形态学操作#
2.4.1 Erosion (cv2.erode) | 腐蚀#
| Syntax |
| --- |
| dst = cv2.erode(src, kernel, iterations) |
Effect (效果): Shrinks foreground, removes small noise | 前景缩小,去除毛刺
kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(img, kernel, iterations=1)
2.4.2 Dilation (cv2.dilate) | 膨胀#
| Syntax |
| --- |
| dst = cv2.dilate(src, kernel, iterations) |
Effect (效果): Expands foreground, fills holes | 前景扩大,填补空洞
dilation = cv2.dilate(img, kernel, iterations=1)
2.4.3 Advanced Operations (cv2.morphologyEx) | 高级形态学操作#
| Syntax |
| --- |
| dst = cv2.morphologyEx(src, op, kernel) |

| Operation (操作) | Formula (公式) | Description (说明) |
| --- | --- | --- |
| MORPH_OPEN (开运算) | Erode then Dilate | Removes noise (去噪点) |
| MORPH_CLOSE (闭运算) | Dilate then Erode | Fills holes (填空洞) |
| MORPH_GRADIENT (梯度) | Dilate - Erode | Extracts edges (提取轮廓) |
| MORPH_TOPHAT (礼帽) | Original - Open | Extracts small objects (提取小物体) |
| MORPH_BLACKHAT (黑帽) | Close - Original | Extracts small holes (提取小空洞) |
kernel = np.ones((5, 5), np.uint8)
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel) # 开运算
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel) # 闭运算
gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel) # 梯度
2.5 Image Gradient | 图像梯度#
2.5.1 Sobel Operator (cv2.Sobel) | Sobel算子#
| Syntax |
| --- |
| dst = cv2.Sobel(src, ddepth, dx, dy, ksize) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| ddepth | Output depth, commonly CV_64F (输出深度,常用CV_64F) |
| dx | X derivative order (1=compute, 0=ignore) (X方向导数) |
| dy | Y derivative order (Y方向导数) |
| ksize | Kernel size (1, 3, 5, 7) (核大小) |
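Important (重要): A white-to-black transition produces a negative derivative. With an 8-bit output depth, negative values are clipped to 0 and that edge is lost, so compute with CV_64F and then take the absolute value with cv2.convertScaleAbs (Sobel计算时白到黑是负数,会被截断为0。解决:取绝对值 + convertScaleAbs)
Recommended approach: compute the x and y gradients separately, then blend them with cv2.addWeighted (正确做法:分离计算x和y方向,然后加权融合)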
# Separate calculation | 分离计算
sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
sobelx = cv2.convertScaleAbs(sobelx) # Take absolute value | 取绝对值
sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
sobely = cv2.convertScaleAbs(sobely)
# Combine | 融合
sobelxy = cv2.addWeighted(sobelx, 0.5, sobely, 0.5, 0)
2.5.2 Scharr Operator (cv2.Scharr) | Scharr算子#
| Syntax |
| --- |
| dst = cv2.Scharr(src, ddepth, dx, dy) |
Feature (特点): More sensitive than Sobel, fixed ksize=3 | 比Sobel更灵敏,ksize固定为3
scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)
2.5.3 Laplacian Operator (cv2.Laplacian) | Laplacian算子#
| Syntax |
| --- |
| dst = cv2.Laplacian(src, ddepth) |
Feature (特点): Second derivative, noise sensitive, usually needs prior smoothing | 二阶导数,对噪声敏感,通常需要先滤波
blurred = cv2.GaussianBlur(img, (3, 3), 0) # Prior smoothing | 先去噪
laplacian = cv2.Laplacian(blurred, cv2.CV_64F)
laplacian = cv2.convertScaleAbs(laplacian)
2.6 Edge Detection (cv2.Canny) | 边缘检测#
| Syntax |
| --- |
| dst = cv2.Canny(src, threshold1, threshold2) |

| Parameter (参数) | Description (说明) |
| --- | --- |
| threshold1 | Lower threshold (低阈值) |
| threshold2 | Upper threshold (高阈值) |
Canny Algorithm Steps (Canny算法步骤):
- Gaussian blur for noise reduction (高斯滤波去噪)
- Calculate gradient magnitude and direction (计算梯度)
- Non-maximum suppression (非极大值抑制)
- Double threshold detection (双阈值检测)
- Edge tracking by hysteresis (抑制孤立弱边缘)
Threshold Selection Guide (阈值选择指南):
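- Rule of thumb: threshold2 ≈ 2 to 3 × threshold1 gives a good balance (threshold2 ≈ 2~3倍 threshold1,最佳平衡)
- Both thresholds low → more edges detected, possibly including noise (阈值都低 → 检测多,可能含噪声)
- Both thresholds high → fewer edges detected, real edges may be lost (阈值都高 → 检测少,可能丢失真正边缘)
- Tuning tip: start from (50, 150) and adjust based on the result (调试建议:先从(50, 150)开始,根据结果调整)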
v1 = cv2.Canny(img, 50, 100) # Loose (宽松)
v2 = cv2.Canny(img, 80, 150) # Moderate (适中) - commonly used for RM
v3 = cv2.Canny(img, 100, 200) # Strict (严格)
2.7 Image Pyramid | 图像金字塔#
2.7.1 Up-sampling (cv2.pyrUp) | 向上采样#
| Syntax |
| --- |
| dst = cv2.pyrUp(src) |

Effect (效果): Width and height each double, so the area grows 4× (图像宽高各放大2倍)
2.7.2 Down-sampling (cv2.pyrDown) | 向下采样#
| Syntax |
| --- |
| dst = cv2.pyrDown(src) |

Effect (效果): Width and height each halve, so the area shrinks to 1/4 (图像宽高各缩小2倍)
2.7.3 Laplacian Pyramid | 拉普拉斯金字塔#
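# Laplacian = Original - pyrUp(pyrDown(Original))
down = cv2.pyrDown(img)
# Pass dstsize so that odd-sized images still match the original shape
down_up = cv2.pyrUp(down, dstsize=(img.shape[1], img.shape[0]))
# Subtract in a signed type: uint8 subtraction would wrap around on negative values
laplacian = img.astype(np.int16) - down_up.astype(np.int16)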
2.8 Image Contours | 图像轮廓#
2.8.1 Find Contours (cv2.findContours) | 查找轮廓#
| Syntax |
| --- |
| contours, hierarchy = cv2.findContours(src, mode, method) |

OpenCV Version Note (版本注意):
- OpenCV 4.x: returns (contours, hierarchy)
- OpenCV 3.x: returns (image, contours, hierarchy)

| Parameter (参数) | Description (说明) |
| --- | --- |
| mode | Contour retrieval mode (轮廓检索模式) |
| method | Contour approximation method (轮廓逼近方法) |
Retrieval Modes (检索模式):
| Mode (模式) | Description (说明) |
| --- | --- |
| RETR_EXTERNAL | Only outer contours (只检索最外层) |
| RETR_LIST | All contours, no hierarchy (所有轮廓,无层级) |
| RETR_CCOMP | Two-level hierarchy (两层层级) |
| RETR_TREE | Full hierarchy tree (完整层级), most commonly used |

Approximation Methods (逼近方法):
| Method (方法) | Description (说明) |
| --- | --- |
| CHAIN_APPROX_NONE | Keep all points (保留所有点) |
| CHAIN_APPROX_SIMPLE | Compress straight segments to their endpoints (压缩只保留端点), recommended |
ret, thresh = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
2.8.2 Draw Contours (cv2.drawContours) | 绘制轮廓#
| Syntax |
| --- |
| image = cv2.drawContours(image, contours, contourIdx, color, thickness) |
draw_img = img.copy()
res = cv2.drawContours(draw_img, contours, -1, (0, 0, 255), 2) # -1 = all contours
2.8.3 Contour Features | 轮廓特征#
| Function (函数) | Description (说明) |
| --- | --- |
| cv2.contourArea(cnt) | Calculate area (计算面积) |
| cv2.arcLength(cnt, closed) | Calculate perimeter (计算周长) |
| cv2.approxPolyDP(cnt, epsilon, closed) | Approximate polygon (多边形逼近) |
| cv2.boundingRect(cnt) | Get bounding rectangle (边界矩形) |
| cv2.minEnclosingCircle(cnt) | Get minimum enclosing circle (外接圆) |
cnt = contours[0]
area = cv2.contourArea(cnt) # Area | 面积
perimeter = cv2.arcLength(cnt, True) # Perimeter | 周长
# Polygon approximation | 多边形逼近
epsilon = 0.15 * perimeter # Fraction of the perimeter: about 0.01 is fine, 0.1 and above is rough | 精细到粗糙
approx = cv2.approxPolyDP(cnt, epsilon, True)
# Bounding rectangle | 边界矩形
x, y, w, h = cv2.boundingRect(cnt)
# Minimum enclosing circle (separate names so x, y above are not overwritten) | 外接圆
(cx, cy), radius = cv2.minEnclosingCircle(cnt)
2.9 Fourier Transform | 傅里叶变换#
2.9.1 DFT (cv2.dft) | 离散傅里叶变换#
| Syntax |
| --- |
| dst = cv2.dft(src, flags) |
Key Concepts (核心概念):
| Concept | Description (说明) |
| --- | --- |
| High Frequency (高频) | Rapid changes: edges, noise (快速变化:边缘、噪声) |
| Low Frequency (低频) | Slow changes: background (缓慢变化:背景) |
| Low-pass Filter (低通滤波) | Keep low frequency → Blur (保留低频 → 模糊) |
| High-pass Filter (高通滤波) | Keep high frequency → Sharpen (保留高频 → 锐化) |
import numpy as np
img = cv2.imread('lena.jpg', 0)
img_float = np.float32(img) # Must be float32 | 必须是浮点型
# DFT | 傅里叶变换
dft = cv2.dft(img_float, flags=cv2.DFT_COMPLEX_OUTPUT)
dft_shift = np.fft.fftshift(dft) # Center low frequency | 将低频移到中心
# Magnitude spectrum | 幅值谱
magnitude = 20 * np.log(cv2.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1]) + 1) # +1 avoids log(0)
2.9.2 IDFT (cv2.idft) | 逆傅里叶变换#
| Syntax |
| --- |
| dst = cv2.idft(src) |
Low-pass Filter Example (低通滤波示例):
rows, cols = img.shape
crow, ccol = rows // 2, cols // 2 # Center | 中心
# Create low-pass mask | 创建低通掩码
mask = np.zeros((rows, cols, 2), np.uint8)
mask[crow-30:crow+30, ccol-30:ccol+30] = 1 # Keep central region | 保留中心区域
# Apply mask | 应用掩码
fshift = dft_shift * mask
# Inverse shift | 逆中心化
f_ishift = np.fft.ifftshift(fshift)
# Inverse DFT | 逆变换
img_back = cv2.idft(f_ishift)
img_back = cv2.magnitude(img_back[:, :, 0], img_back[:, :, 1]) # Get magnitude | 取模
3. Practical Tips & FAQ | 实用技巧与常见问题#
| Tip | Description (说明) |
| --- | --- |
| Avoid unnecessary copies | Use img.copy() only when needed (避免不必要的拷贝) |
| Use in-place operations | Some functions can modify in-place (使用原地操作) |
| Choose right kernel size | Larger kernels are slower (大核更慢) |
| Prefer simple filters | blur is faster than GaussianBlur (简单滤波更快) |
| Use uint8 when possible | Avoid unnecessary type conversions (避免类型转换) |
Common Issues & Solutions | 常见问题与解决#
| Issue | Cause | Solution |
| --- | --- | --- |
| Image won’t display | No waitKey() | Add cv2.waitKey() |
| Video won’t open | Wrong path or codec | Check file path, install codecs |
| findContours error | Wrong number of return values | Check OpenCV version |
| Colors look wrong | BGR vs RGB confusion | Use cv2.COLOR_BGR2RGB |
| Noise in result | Input image too noisy | Apply smoothing filter first |
Troubleshooting Guide | 故障排除指南#
Q: Image reading returns None? | 图像读取返回None?
A: Check if the file path is correct and the file exists.
检查文件路径是否正确,文件是否存在。
Q: How to handle 4-channel images (RGBA)? | 如何处理4通道图像?
A: Use cv2.IMREAD_UNCHANGED to preserve alpha channel, or convert to 3-channel BGR.
使用cv2.IMREAD_UNCHANGED保留alpha通道,或转为3通道BGR。
Q: Morphology results are unexpected? | 形态学结果不符合预期?
A: Try different kernel sizes and iterations. Start with (3,3) and iterations=1.
尝试不同的核大小和迭代次数。从(3,3)和iterations=1开始。
Appendix: Quick Reference | 附录:常用函数速查#
Basic Operations | 基础操作#
| Function | Purpose (功能) |
| --- | --- |
| cv2.imread() | Read image (读取图像) |
| cv2.imshow() | Display image (显示图像) |
| cv2.imwrite() | Save image (保存图像) |
| cv2.waitKey() | Wait for key (等待按键) |
| cv2.destroyAllWindows() | Close windows (销毁窗口) |
Color Space | 颜色空间#
| Function | Purpose (功能) |
| --- | --- |
| cv2.cvtColor() | Convert color space (颜色空间转换) |
| cv2.split() | Split channels (通道分离) |
| cv2.merge() | Merge channels (通道合并) |
| cv2.inRange() | Color range filter (颜色范围筛选) |
Image Processing | 图像处理#
| Function | Purpose (功能) |
| --- | --- |
| cv2.threshold() | Thresholding (阈值处理) |
| cv2.blur() | Mean filter (均值滤波) |
| cv2.GaussianBlur() | Gaussian filter (高斯滤波) |
| cv2.medianBlur() | Median filter (中值滤波) |
| cv2.erode() | Erosion (腐蚀) |
| cv2.dilate() | Dilation (膨胀) |
| cv2.morphologyEx() | Morphological ops (形态学操作) |
| cv2.Sobel() | Sobel operator (Sobel算子) |
| cv2.Canny() | Canny edge detection (Canny边缘检测) |
Contours & Features | 轮廓与特征#
| Function | Purpose (功能) |
| --- | --- |
| cv2.findContours() | Find contours (查找轮廓) |
| cv2.drawContours() | Draw contours (绘制轮廓) |
| cv2.contourArea() | Calculate area (计算面积) |
| cv2.arcLength() | Calculate perimeter (计算周长) |
| cv2.approxPolyDP() | Approximate contour (轮廓逼近) |
| Function | Purpose (功能) |
| --- | --- |
| cv2.resize() | Resize image (图像缩放) |
| cv2.addWeighted() | Blend images (图像融合) |
| cv2.pyrUp() | Up-sample (向上采样) |
| cv2.pyrDown() | Down-sample (向下采样) |
| cv2.dft() | Fourier transform (傅里叶变换) |
| cv2.idft() | Inverse Fourier transform (逆傅里叶变换) |
Document Version | 文档版本: 1.0
Based on | 基于: OpenCV 4.x
Last Updated | 最后更新: 2026-05-03