Document Scanner with OpenCV and Python

Introduction

In this article, we walk through building a simple document scanner with OpenCV in Python. The project suits beginners: it covers core OpenCV operations and runs in real time on a webcam feed. You will also learn how to save a scanned image by pressing a key on your keyboard.

Project Workflow

The project consists of several clear steps:

Capture Image from Webcam: We start by reading the input image from your webcam. Alternatively, you can load a preset image for reference.
Grayscale Conversion: The captured image is converted to grayscale and lightly blurred to simplify further processing.
Edge Detection: We apply the Canny edge detector to identify edges in the image.
Find Contours: All contours in the edge image are detected. We then filter them to find the largest one, which in a typical shot corresponds to the document's outline.
Perspective Transformation: Using the four corner points of the largest contour, we apply a perspective transformation to get a flat, properly aligned image of the document.
Adaptive Thresholding: We apply adaptive thresholding to give the result the high-contrast look of a scanned page.
Save Functionality: We also implement a feature that saves the processed image to a specific folder.

Getting Started with the Code

We will be using two scripts for this project:

document_scanner.py: This script contains the main functionality of the document scanner.
utilities.py: This file holds the supporting functions used by the processing steps, such as reordering corner points and stacking images for display.

Required Libraries

Make sure to import the necessary libraries. For this project, you will need OpenCV and NumPy:

import cv2
import numpy as np
from utilities import *

If you haven't installed OpenCV yet, you can do so via pip:

pip install opencv-python

Initializing Parameters

We will set up parameters for the webcam, including the brightness, width, and height of the captured image. The script initializes a camera object and uses a flag to toggle between the webcam and a reference image.

Image Pre-processing

In the main loop, we will convert the input image into grayscale and apply Gaussian blur to reduce noise. The Canny edge detector will then be employed to find edges within the image. Real-time adjustments can be made to the blur and threshold values via trackbars.

Contour Detection

Using OpenCV's findContours function, we will detect all contours in the pre-processed image. We will filter these contours to find the largest one, which should correspond to the document. To reduce false matches, we will check that the contour's polygon approximation has four corners, as a sheet of paper should.

Perspective Transformation

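Next, we will reorder the four corner points of the largest contour into a fixed order, because findContours does not guarantee one. The source corners must line up with the destination corners passed to OpenCV's getPerspectiveTransform function, whose result is then applied with warpPerspective. This step will give us a rectangular output of the document.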

Final Adjustments

To enhance the scanned document's appearance, we will convert the warped image to grayscale and apply adaptive thresholding. Finally, we will use a custom stacking function to display the original image alongside the processed versions.

Saving the Document

The script will monitor for a key press (specifically the ‘s’ key) to save the processed image into a predefined folder. When the image is saved, a confirmation message will be displayed.

Conclusion

By following the steps outlined above, you will have a functional document scanner leveraging OpenCV in Python. This beginner-friendly project is a fantastic way to solidify your understanding of image processing concepts. Stay tuned for more OpenCV projects coming up in future articles!

Keywords

OpenCV
Document Scanner
Python
Real-time Processing
Grayscale Conversion
Edge Detection
Contours
Perspective Transformation
Adaptive Thresholding
Save Functionality

FAQ

Q1: What is OpenCV?
A1: OpenCV is an open-source computer vision and machine learning software library that provides a range of tools for image and video analysis.

Q2: Is this project suitable for beginners?
A2: Yes! This document scanner project is designed specifically for beginners and covers fundamental concepts in image processing.

Q3: What libraries do I need to install?
A3: You need the opencv-python and numpy packages. Both can be installed with pip; installing opencv-python also pulls in numpy as a dependency.

Q4: Can I use an image instead of a webcam for input?
A4: Yes, you can use a preset image by modifying the code to load an image file instead of capturing from a webcam.

Q5: How can I modify the output quality of the scanned document?
A5: You can raise the capture resolution of your webcam, or tune the processing parameters in the code, such as the blur strength, the Canny thresholds, and the adaptive thresholding settings.
