Document Deskewer


Introduction

The scanning of printed documents involves several challenges that result from the quality of the material to be scanned and the scanning process itself. One of these challenges is the accidental production of skewed images due to an imprecise alignment of the printed document (Figure 1).

Figure 1: Skewed image due to an imprecise alignment of the document

The Document Deskewer is an easy-to-use command-line tool for automatically correcting skewed pages. Given a skewed input image, the tool detects the skew angle and corrects the rotation over the full range of 0 to 360 degrees. Parameters let the user select the resampling method and the colour used to fill the blank areas that the rotation introduces.

This tool is particularly useful for institutions whose printed material is difficult to scan and therefore prone to skewed images. Correcting the skew can improve the results of subsequent post-processing steps such as OCR, as well as the visual appearance of images that are shown to users.

The Document Deskewer can be integrated into the existing scanning and post-processing workflow to correct these images. The best results are achieved for documents written in Roman scripts.

Requirements

Operating system: Windows, Linux
Hardware dependencies: -
Software dependencies: -

Non-technical requirements

The Document Deskewer does not have a graphical user interface (GUI) and therefore requires some basic knowledge of how to run command-line programs.

Licensing

The tool is produced at the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) in Sankt Augustin, Germany. For more information on the terms and conditions for using the tool, please contact Fraunhofer IAIS, Department NetMedia: Dr. Joachim Köhler, Joachim.koehler(at)iais.fraunhofer.de

Usage

Installation

Files

The installation package of this tool contains the following files:

Filename: deskew.exe (Windows) / deskew (Linux)

Description: The Document Deskewer command-line executable. The executable is available for Windows and Linux operating systems.

Installation Instructions

The Document Deskewer is a command-line tool and does not require any installation. After the executable has been extracted from the installation package, it can be called directly from the command line.

Quick Start Guide

An easy way to start using the Document Deskewer is to try one of the examples in the “Examples” section. A detailed description of the configuration parameters can be found in the following section.

Documentation

To use the Document Deskewer effectively, it is important to understand the configuration options explained in the following sections. For example, the Document Deskewer refuses to rotate an image if the confidence in the calculated angle is below a certain value, and produces the following error message:

Error: confidence in calculated skew angle -90 is low, use --force to deskew anyway

However, the “-f” parameter forces the deskew even if the confidence in the calculated angle is low.
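The refusal and the forced retry can also be scripted. The following Python sketch is only an illustration and not part of the tool. It assumes that the `deskew` executable is on the PATH and that the error message quoted above appears in the tool's standard output or standard error. The exit code behaviour is not documented here, so the sketch relies on the message text only. The names `is_low_confidence` and `deskew_with_fallback` are hypothetical.

```python
import subprocess

# Fragment of the error message quoted above. Matching on this text is an
# assumption about how the tool reports low confidence.
LOW_CONFIDENCE_MARKER = "use --force to deskew anyway"


def is_low_confidence(tool_output: str) -> bool:
    """Return True if the tool output contains the low-confidence error."""
    return LOW_CONFIDENCE_MARKER in tool_output


def deskew_with_fallback(input_file: str, output_file: str,
                         executable: str = "deskew") -> bool:
    """Run the tool once and retry with -f only after a low-confidence refusal.

    Returns True if the forced retry was needed.
    """
    command = [executable, "-i", input_file, "-o", output_file]
    first = subprocess.run(command, capture_output=True, text=True)
    if not is_low_confidence(first.stdout + first.stderr):
        return False
    # Second attempt with the documented -f (force) parameter.
    subprocess.run(command + ["-f"], capture_output=True, text=True)
    return True
```

Because a forced deskew uses an angle the tool itself considers unreliable, a workflow might record the files for which `deskew_with_fallback` returns True and review them manually.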

Configuration and Customization

The Document Deskewer is configured using command-line parameters. The following table gives an overview of the available parameters and explains how they affect the output image. Required parameters are marked with an X in the “Req.” column.

Parameter | Req. | Description
-h [ --help ] | | Print usage help.
--version | | Print the version of the tool.
-v [ --verbose ] | | Print verbose status messages while processing (see “Examples”).
-f [ --force ] | | Force the deskew even if the confidence in the calculated angle is low. Without this parameter, the tool refuses to rotate such images and reports: Error: confidence in calculated skew angle -90 is low, use --force to deskew anyway
--fill-black | | Use black instead of white for pixels where the gray value cannot be computed from the input image (see “Examples”).
--resampling-method {arg} | | Resampling method used when the image is rotated by the calculated angle: none, triangle or cubic (default: cubic).
-i [ --input-file ] {arg} | X | Input file
-o [ --output-file ] {arg} | X | Output file
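When the tool is called from a script, it can help to assemble the parameters in one place. The following Python sketch builds the argument list from the parameters in the table above. It is an illustration under assumptions: the function name `build_deskew_command` and its keyword arguments are hypothetical, and the executable is assumed to be reachable as `deskew`.

```python
from typing import List

# Resampling methods listed in the parameter table.
RESAMPLING_METHODS = ("none", "triangle", "cubic")


def build_deskew_command(input_file: str, output_file: str, *,
                         force: bool = False, fill_black: bool = False,
                         resampling_method: str = "cubic",
                         verbose: bool = False,
                         executable: str = "deskew") -> List[str]:
    """Assemble the argument list for one call of the tool."""
    if resampling_method not in RESAMPLING_METHODS:
        raise ValueError(f"unknown resampling method: {resampling_method}")
    # -i and -o are the two required parameters.
    command = [executable, "-i", input_file, "-o", output_file]
    if force:
        command.append("-f")
    if fill_black:
        command.append("--fill-black")
    if resampling_method != "cubic":  # cubic is the documented default
        command += ["--resampling-method", resampling_method]
    if verbose:
        command.append("-v")
    return command


print(build_deskew_command("sample.tif", "result.tif",
                           force=True, fill_black=True))
# prints ['deskew', '-i', 'sample.tif', '-o', 'result.tif', '-f', '--fill-black']
```

Returning a list rather than a single string means the result can be passed to `subprocess.run` without shell quoting, which matters for file names that contain spaces.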

Workflow Integration

The Document Deskewer can be integrated into any workflow or application that allows the execution of command-line tools. See the section “Configuration and Customization” for details on how the tool can be configured using parameters.
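As one possible form of such an integration, the following Python sketch processes a whole directory of scans. It is a minimal illustration under assumptions: the executable is assumed to be on the PATH as `deskew`, the `*.tif` pattern and the function names `plan_batch` and `run_batch` are hypothetical, and the output directory must already exist.

```python
import subprocess
from pathlib import Path
from typing import List


def plan_batch(input_dir: Path, output_dir: Path,
               pattern: str = "*.tif",
               executable: str = "deskew") -> List[List[str]]:
    """Return one command per matching image, sorted by file name."""
    commands = []
    for image in sorted(input_dir.glob(pattern)):
        target = output_dir / image.name
        commands.append([executable, "-i", str(image), "-o", str(target)])
    return commands


def run_batch(commands: List[List[str]]) -> None:
    """Execute the planned commands one after another."""
    for command in commands:
        # check=False: the tool's exit code on failure is not documented
        # here, so the sketch does not rely on it.
        subprocess.run(command, check=False)
```

Separating planning from execution makes it possible to inspect or log the commands before any image is processed.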

Examples

1. Deskew a sample image, display verbose output:

deskew -i sample.tif -o result.tif -v

Output:

Loading input file: sample.tif
Determining skew angle
maxpos 0 max0 24.47806 max1 18.97955 max2 4.14367 res 1
Determined skew angle: -1.1
Rotating image
Writing output file: result.tif
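The verbose output can be used to record the detected angle, for example in a processing log. The following Python sketch extracts the value from the “Determined skew angle” message shown above. It assumes that the message always has this form. The name `parse_skew_angle` is hypothetical, and the unit is presumably degrees, which the output itself does not state.

```python
import re
from typing import Optional

# Matches the "Determined skew angle: <number>" message of the verbose output.
ANGLE_PATTERN = re.compile(r"Determined skew angle:\s*(-?\d+(?:\.\d+)?)")


def parse_skew_angle(verbose_output: str) -> Optional[float]:
    """Return the reported skew angle, or None if the message is missing."""
    match = ANGLE_PATTERN.search(verbose_output)
    return float(match.group(1)) if match else None


sample = ("Loading input file: sample.tif\n"
          "Determining skew angle\n"
          "Determined skew angle: -1.1\n"
          "Rotating image\n"
          "Writing output file: result.tif\n")
print(parse_skew_angle(sample))  # prints -1.1
```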

2. Deskew a sample image, force deskew and fill pixels for which the gray value cannot be computed with black:

deskew -i sample.tif -o result.tif -f --fill-black

Output:

[Images: result without --fill-black and result with --fill-black]

3. Deskew a sample image, choose “triangle” as the resampling method:

deskew -i sample.tif -o result.tif --resampling-method triangle