Single Document Multiple Images

25-Aug-2004

Updated: 15-Aug-2006

Table of Contents

Overview

ICAP_PIXELTYPE and DAT_FILESYSTEM Overview

ICAP_PIXELTYPE

DAT_FILESYSTEM

DAT_FILESYSTEM vs. ICAP_PIXELTYPE

CAP_CAMERAENABLE vs. CAP_DUPLEXENABLED

CAP_CAMERAORDER

Entire session (i.e. machine) vs. a single "camera"

METADATA

Overview

This document describes the Single Document Multiple Images (SDMI) behavior in TWAIN: how it is negotiated and what additional metadata associated with each image needs to be collected during image capture. Since this is a moderately complex subject, a number of other capabilities that have uses beyond SDMI are discussed as well.

SDMI behavior is easy to view graphically:

Original         Color       Bitonal
Document         Image       Image
+-------+        +-------+   +-------+
|       |        |       |   |       |
|   R   |        |   R   |   |   R   |
|       |        |       |   |       |
|   G   |  --->  |   G   | + |   G   |
|       |        |       |   |       |
|   B   |        |   B   |   |   B   |
|       |        |       |   |       |
+-------+        +-------+   +-------+

In this example a color document results in the capture of two images, one that is color and one that is bitonal (black & white). Configurations of this form have a variety of applications, but the most common is when the application needs a faithful replication of the document for archival purposes and an image suitable for data collection, such as OCR.

SDMI puts no limit on the number of images that can result from a document.

SDMI is not the same as image segmentation. Image segmentation divides a document into sub-images that are optimized for quality and compression. In this example the driver could save the text images as Group-4 and the picture as JPEG:

Original
Document
+-------+
| a bit |        Segment 1   Segment 2   Segment 3
|of text|        Text        Picture     Text
| $#$#$ |        +-------+   +-------+   +-------+
| #$#$# |  --->  | a bit |   | #$#$# |   | more  |
| $#$#$ |        |of text| + | #$#$# | + | text  |
| more  |        +-------+   | #$#$# |   +-------+
| text  |                    +-------+
+-------+

Image segmentation is typically used to store images efficiently. SDMI is used to capture images that are then directed to different parts of the workflow. It is possible to mix image segmentation with SDMI (e.g., using image segmentation to produce the faithful replication image). The two technologies have different goals, though, so it’s not advisable to use one to replace the other.

ICAP_PIXELTYPE and DAT_FILESYSTEM Overview

TWAIN did not start with duplex scanning built into the standard; it was added in version 1.7. Prior to 1.7, ICAP_PIXELTYPE selected the pixel type (i.e. color vs. grayscale vs. bitonal) for the entire session. This could also be considered the "color space". An application could configure a driver to output color, grayscale, or bitonal images, but only one of the three, so ICAP_PIXELTYPE cannot produce multiple images for a side.

TWAIN 1.8 introduced DAT_FILESYSTEM. This allows an application to set up multiple images for a side. It also supports setting different values for the front and rear, for example color on the front and grayscale on the rear.

To help maintain backwards compatibility, ICAP_PIXELTYPE needs to continue to apply to the entire session. This means ICAP_PIXELTYPE should never be negotiated together with DAT_FILESYSTEM.

ICAP_PIXELTYPE

Setting ICAP_PIXELTYPE will set both the front and rear images to the given TWPT_ value and automatically set CAP_DUPLEXENABLED to true. Use ICAP_BITDEPTH to determine how many bits make a single pixel, such as 8 for 8-bit grayscale or 24 for 3-channel/8-bits-per-channel RGB.

DAT_FILESYSTEM

DAT_FILESYSTEM addresses individual “cameras”. The term “camera” doesn’t mean that the image capture device uses a camera; rather, it’s a generic term for an image capture source. DAT_FILESYSTEM refers to the front side of the paper as the 'top' "camera" and to the rear as the 'bottom'. This has nothing to do with the physical position of the camera; it describes what the user considers the top (i.e. front) of the sheet of paper versus the bottom (i.e. rear).

The driver will output images based on CAP_CAMERAENABLED. So while a "camera" can be individually set via DAT_FILESYSTEM, you must also set CAP_CAMERAENABLED to true for each "camera" you want the driver to actually produce.

The camera names used with DAT_FILESYSTEM are typically:

Camera name	Side	Image
/Camera_Color_Top	front	color or grayscale
/Camera_Color_Bottom	rear	color or grayscale
/Camera_Color_Both	front and rear	color or grayscale
/Camera_Bitonal_Top	front	bitonal
/Camera_Bitonal_Bottom	rear	bitonal
/Camera_Bitonal_Both	front and rear	bitonal

Selecting a camera whose name ends in '_Both' means that subsequent settings are applied to both the front and rear images.
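
The naming scheme in the table can be decoded mechanically. The following is a minimal sketch, not taken from any driver, of how an application might map a camera name to the pixel family and sides it addresses. The `CameraTarget` type and `parse_camera_name` function are hypothetical names introduced for illustration; only the six camera name strings come from the table above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical description of what one camera name addresses. */
typedef struct {
    bool color;  /* true: color or grayscale camera; false: bitonal camera */
    bool front;  /* name addresses the front ("top") image */
    bool rear;   /* name addresses the rear ("bottom") image */
} CameraTarget;

/* Decode one of the six camera names listed in the table.
 * Returns false and leaves *out untouched for any other string. */
bool parse_camera_name(const char *name, CameraTarget *out)
{
    static const char color_prefix[]   = "/Camera_Color_";
    static const char bitonal_prefix[] = "/Camera_Bitonal_";
    CameraTarget t;
    const char *side;

    assert(name != NULL && out != NULL);

    if (strncmp(name, color_prefix, sizeof color_prefix - 1) == 0) {
        t.color = true;
        side = name + (sizeof color_prefix - 1);
    } else if (strncmp(name, bitonal_prefix, sizeof bitonal_prefix - 1) == 0) {
        t.color = false;
        side = name + (sizeof bitonal_prefix - 1);
    } else {
        return false;
    }

    if (strcmp(side, "Top") == 0) {
        t.front = true;
        t.rear = false;
    } else if (strcmp(side, "Bottom") == 0) {
        t.front = false;
        t.rear = true;
    } else if (strcmp(side, "Both") == 0) {
        t.front = true;
        t.rear = true;
    } else {
        return false;
    }

    *out = t;
    return true;
}
```

With this sketch, "/Camera_Color_Both" decodes to a color target covering front and rear, while a string outside the table is rejected.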

DAT_FILESYSTEM vs. ICAP_PIXELTYPE

If DAT_FILESYSTEM is set, then ICAP_PIXELTYPE must reflect the current value of the "camera". For instance, if DAT_FILESYSTEM is set to /Camera_Color_Both, then ICAP_PIXELTYPE should be set to TWPT_RGB (this is a basic sanity check for the driver to prevent DAT_FILESYSTEM and ICAP_PIXELTYPE from ever reporting conflicting values).

However, if ICAP_PIXELTYPE is set, then the following things must happen to DAT_FILESYSTEM and CAP_CAMERAENABLE:

If ICAP_PIXELTYPE is set to TWPT_RGB:

    DAT_FILESYSTEM changes to /Camera_Color_Both

    CAP_CAMERAENABLE changes to:

        /Camera_Color_Top: TRUE
        /Camera_Color_Bottom: TRUE
        /Camera_Bitonal_Top: FALSE
        /Camera_Bitonal_Bottom: FALSE

If ICAP_PIXELTYPE is set to TWPT_BW:

    DAT_FILESYSTEM changes to /Camera_Bitonal_Both

    CAP_CAMERAENABLE changes to:

        /Camera_Color_Top: FALSE
        /Camera_Color_Bottom: FALSE
        /Camera_Bitonal_Top: TRUE
        /Camera_Bitonal_Bottom: TRUE

This behavior guarantees that older applications and newer applications can work with the same driver. Application writers need to decide whether to use ICAP_PIXELTYPE or DAT_FILESYSTEM when negotiating with a particular driver, and should never use both together. As a guideline, if DAT_FILESYSTEM is supported by a driver, use it, since it offers more functionality than ICAP_PIXELTYPE.
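
The side effects of setting ICAP_PIXELTYPE described above can be sketched from the driver's point of view. This is an illustrative model only: `DriverState`, the `CAM_*` indices, and `apply_pixel_type` are hypothetical names, not TWAIN structures or calls. The `TWPT_BW` and `TWPT_RGB` values match those in twain.h. Only the two pixel types covered by the text are handled; how a driver treats other pixel types is not specified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Pixel type values as defined in twain.h. */
#define TWPT_BW  0
#define TWPT_RGB 2

/* Hypothetical indices for the four individually addressable cameras. */
enum { CAM_COLOR_TOP, CAM_COLOR_BOTTOM, CAM_BITONAL_TOP, CAM_BITONAL_BOTTOM, CAM_COUNT };

/* Hypothetical driver-side state; not a TWAIN structure. */
typedef struct {
    const char *current_camera;  /* current DAT_FILESYSTEM selection */
    bool enabled[CAM_COUNT];     /* CAP_CAMERAENABLE value per camera */
} DriverState;

/* Apply the side effects of setting ICAP_PIXELTYPE.
 * Returns false, changing nothing, for pixel types the text does not cover. */
bool apply_pixel_type(DriverState *s, int pixel_type)
{
    bool color;

    assert(s != NULL);

    if (pixel_type == TWPT_RGB) {
        color = true;
    } else if (pixel_type == TWPT_BW) {
        color = false;
    } else {
        return false;
    }

    s->current_camera = color ? "/Camera_Color_Both" : "/Camera_Bitonal_Both";
    s->enabled[CAM_COLOR_TOP]      = color;
    s->enabled[CAM_COLOR_BOTTOM]   = color;
    s->enabled[CAM_BITONAL_TOP]    = !color;
    s->enabled[CAM_BITONAL_BOTTOM] = !color;
    return true;
}
```

Because the function always rewrites all four enable flags, any SDMI configuration made earlier through DAT_FILESYSTEM is lost, which is one reason not to mix the two negotiation styles.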

CAP_CAMERAENABLE vs. CAP_DUPLEXENABLED

Care needs to be taken when mixing CAP_CAMERAENABLE and CAP_DUPLEXENABLED. The recommendation is to use one or the other. Here is an example of the interdependency:

Table-1 shows an example of creating one color and one bitonal image from the front of every sheet of paper fed during the scanning session. In this case, CAP_DUPLEXENABLED would have been set to False.

Table-1

DAT_FILESYSTEM	CAP_CAMERAENABLE
/Camera_Color_Top	TRUE
/Camera_Color_Bottom	FALSE
/Camera_Bitonal_Top	TRUE
/Camera_Bitonal_Bottom	FALSE

If the application then sets CAP_DUPLEXENABLED to True, we would expect the table to change to the following:

Table-2

DAT_FILESYSTEM	CAP_CAMERAENABLE
/Camera_Color_Top	TRUE
/Camera_Color_Bottom	TRUE
/Camera_Bitonal_Top	TRUE
/Camera_Bitonal_Bottom	TRUE

NOTE: Rear-only scanning is considered to be a special duplex operation, so for the following table CAP_DUPLEXENABLED would be True:

Table-3

DAT_FILESYSTEM	CAP_CAMERAENABLE
/Camera_Color_Top	FALSE
/Camera_Color_Bottom	TRUE
/Camera_Bitonal_Top	FALSE
/Camera_Bitonal_Bottom	TRUE

CAP_CAMERAORDER

The output order of the images can be adjusted using CAP_CAMERAORDER (using the CAP_CAMERA TWCM_*_BOTH values). This is a TW_ARRAY container that holds the name of each of the cameras in the order they will be transferred from the driver to the application. For example, if CAP_CAMERAORDER is set to TWCM_BW_BOTH, TWCM_CL_BOTH, then the bitonal image will be transferred before the color image. For a duplex session this would look like the following:

Bitonal Front

Color Front

Bitonal Rear

Color Rear

To simplify the validation rules between CAP_CAMERAENABLED and CAP_CAMERAORDER, the following rules apply:

1) If CAP_CAMERAORDER includes a "camera" whose CAP_CAMERAENABLED value is False, then the driver will ignore that entry.

2) If CAP_CAMERAORDER does not include a "camera" whose CAP_CAMERAENABLED value is True, then the driver is free to output the images in whatever ordering it wants.
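
The two rules can be sketched as a small resolution step that turns the requested order plus the enable flags into the per-side transfer order. This is one possible reading, with hypothetical names throughout: `CAM_BITONAL` and `CAM_COLOR` stand in for the TWCM_BW_BOTH and TWCM_CL_BOTH entries, and `resolve_camera_order` is not a TWAIN call. For rule 2 the driver may pick any ordering; this sketch keeps the listed cameras in their requested order and appends the unlisted enabled ones, which is only one of the permitted choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ids standing in for the TWCM_BW_BOTH / TWCM_CL_BOTH entries. */
enum { CAM_BITONAL, CAM_COLOR, CAM_KINDS };

/* Resolve the per-side transfer order.
 *   order, order_len : requested CAP_CAMERAORDER contents
 *   enabled          : whether each camera kind is enabled
 *   out              : receives the resolved order (at most CAM_KINDS ids)
 * Returns the number of entries written to out. */
size_t resolve_camera_order(const int *order, size_t order_len,
                            const bool enabled[CAM_KINDS],
                            int out[CAM_KINDS])
{
    bool seen[CAM_KINDS] = { false };
    size_t n = 0;

    assert(enabled != NULL && out != NULL);

    for (size_t i = 0; i < order_len; i++) {
        int cam = order[i];
        if (cam < 0 || cam >= CAM_KINDS || seen[cam]) {
            continue;  /* skip unknown ids and duplicates */
        }
        seen[cam] = true;
        if (enabled[cam]) {
            out[n++] = cam;  /* rule 1: entries for disabled cameras are ignored */
        }
    }

    /* Rule 2: enabled cameras missing from the list. The driver may choose
     * any order; this sketch appends them in ascending id order. */
    for (int cam = 0; cam < CAM_KINDS; cam++) {
        if (enabled[cam] && !seen[cam]) {
            out[n++] = cam;
        }
    }
    return n;
}
```

In a duplex session the resolved order would be applied once for the front and once for the rear, giving the bitonal-then-color sequence listed earlier when the requested order is bitonal first.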

Entire session (i.e. machine) vs. a single "camera"

The addition of independent front and rear capability negotiation immediately raises the question: which capabilities belong to the machine (like CAP_DUPLEX) and which ones belong to a "camera" (like ICAP_COMPRESSION)? There is no easy answer to this, since the hardware of the device dictates the capabilities. For instance, scanner ABC may allow independent selection of ICAP_COMPRESSION for front and rear cameras because the designers put in dedicated compression chips for each side. Scanner XYZ, in an effort to save costs, uses only one chip for this operation, so it has no way to set the front independently from the rear for this one capability.

So, to help figure out where each capability goes, Kodak scanners extend DG_CONTROL / DAT_CAPABILITY / MSG_QUERYSUPPORT with additional TWQC_ flags:

#define TWQC_MACHINE 0x1000 // applies to entire session/machine
#define TWQC_BITONAL 0x2000 // applies to Bitonal "cameras"
#define TWQC_COLOR   0x4000 // applies to Color "cameras"

A capability cannot mix TWQC_MACHINE with any of the other items listed above; otherwise all combinations are valid (e.g. a capability could have TWQC_BITONAL and TWQC_COLOR).

Capabilities that describe themselves as TWQC_MACHINE are accessible at all times, regardless of the current setting of DAT_FILESYSTEM. This means that a capability like CAP_DUPLEXENABLED can always be negotiated, even if the current camera is set to something like /Camera_Bitonal_Bottom.
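
An application can check these scope flags after a MSG_QUERYSUPPORT call. The sketch below assumes the three flag values shown above and introduces two hypothetical helpers, `twqc_scope_is_valid` and `cap_is_negotiable`. The rule that a camera-scoped capability is negotiable only while a camera of the matching family is selected is an inference from the flag descriptions, not something stated explicitly; treating a value with none of the three flags as valid is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Kodak-specific MSG_QUERYSUPPORT scope flags, as listed above. */
#define TWQC_MACHINE 0x1000u  /* applies to entire session/machine */
#define TWQC_BITONAL 0x2000u  /* applies to Bitonal "cameras" */
#define TWQC_COLOR   0x4000u  /* applies to Color "cameras" */

/* TWQC_MACHINE must not be combined with a camera flag;
 * every other combination is accepted. */
bool twqc_scope_is_valid(unsigned flags)
{
    bool machine = (flags & TWQC_MACHINE) != 0;
    bool camera  = (flags & (TWQC_BITONAL | TWQC_COLOR)) != 0;
    return !(machine && camera);
}

/* Assumed accessibility rule: machine-scoped capabilities are always
 * negotiable; camera-scoped ones only while a camera of the matching
 * family is the current DAT_FILESYSTEM selection. */
bool cap_is_negotiable(unsigned flags, bool current_camera_is_color)
{
    assert(twqc_scope_is_valid(flags));
    if (flags & TWQC_MACHINE) {
        return true;
    }
    return (flags & (current_camera_is_color ? TWQC_COLOR : TWQC_BITONAL)) != 0;
}
```

The masks ignore all other bits, so the standard TWQC_ support bits returned by the same query do not affect the result.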

METADATA

Metadata is the descriptive data that accompanies an image. TWAIN has two primary ways of communicating this information to an application: DAT_IMAGEINFO and DAT_EXTIMAGEINFO. Since DAT_EXTIMAGEINFO is extensible, it’s the only way to introduce new metadata items to the TWAIN specification without creating a new DAT operation (and we don’t really need any more of those right now).

SDMI presents a bit of a problem for the application because the stream of images makes it difficult to tell which ones go with which document. This problem becomes compounded with things like automatic color detection (imagine not knowing if the application will get color or bitonal data on the next image).

Since the problem takes the form of a lack-of-communication problem, the solution is more data. With the Kodak drivers the following additional items are added to the list of DAT_EXTIMAGEINFO fields:

#define TWEI_HDR_PAGESIDE        0x8001
#define TWEI_HDR_IMAGENUMBER     0x8017
#define TWEI_HDR_PAGENUMBER      0x8018
#define TWEI_HDR_PAGEIMAGENUMBER 0x8019

TWEI_HDR_PAGESIDE returns 0 for a front image and 1 for a rear image.

TWEI_HDR_IMAGENUMBER counts from 1 to 2^32-1 the number of images captured since the application first MSG_OPENDS’d the driver.

TWEI_HDR_PAGENUMBER counts from 1 to 2^32-1 the number of pages of paper captured since the application first MSG_OPENDS’d the driver.

TWEI_HDR_PAGEIMAGENUMBER counts from 1 to the number of images captured from the document. For instance, given an SDMI session where the driver is transferring a color and a bitonal image for the front and a bitonal image for the rear we get the following sequence:

Image	Page Side	Image Number	Page Number	PageImageNumber
Color	Front	1	1	1
Bitonal	Front	2	1	2
Bitonal	Rear	3	1	3
Color	Front	4	2	1
Bitonal	Front	5	2	2
Bitonal	Rear	6	2	3

Note: if TWAIN standardizes these names, they will most likely lose the _HDR part.
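
How the three counters advance can be sketched as a small state update performed once per transferred image. `ImageHeader` and `next_image` are hypothetical names; a real driver reports these values through DAT_EXTIMAGEINFO rather than through a struct like this. The sketch assumes all counters start at zero when the driver is opened and does not address what happens once a counter reaches 2^32-1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four metadata items. */
typedef struct {
    uint32_t page_side;          /* TWEI_HDR_PAGESIDE: 0 = front, 1 = rear */
    uint32_t image_number;       /* TWEI_HDR_IMAGENUMBER */
    uint32_t page_number;        /* TWEI_HDR_PAGENUMBER */
    uint32_t page_image_number;  /* TWEI_HDR_PAGEIMAGENUMBER */
} ImageHeader;

/* Compute the header of the next image from the previous one.
 * Pass an all-zero header as prev for the first image after MSG_OPENDS.
 *   first_image_of_page : true when a new sheet of paper starts
 *   rear                : true when the image comes from the rear side */
ImageHeader next_image(ImageHeader prev, bool first_image_of_page, bool rear)
{
    ImageHeader h = prev;

    /* The very first image of a session must start a page. */
    assert(first_image_of_page || prev.page_number > 0);

    h.image_number = prev.image_number + 1;      /* counts every image */
    if (first_image_of_page) {
        h.page_number = prev.page_number + 1;    /* counts sheets of paper */
        h.page_image_number = 1;                 /* restarts on each sheet */
    } else {
        h.page_image_number = prev.page_image_number + 1;
    }
    h.page_side = rear ? 1u : 0u;
    return h;
}
```

On the application side, a change in the page number (or a page image number of 1) marks the start of a new document, which is enough to group the image stream without knowing in advance how many images each sheet produces.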