Single Document Multiple Images

25-Aug-2004

Updated: 15-Aug-2006

 

 

Table of Contents

Overview

ICAP_PIXELTYPE and DAT_FILESYSTEM Overview

ICAP_PIXELTYPE

DAT_FILESYSTEM

DAT_FILESYSTEM vs. ICAP_PIXELTYPE

CAP_CAMERAENABLE vs. CAP_DUPLEXENABLED

CAP_CAMERAORDER

Entire session (i.e. machine) vs. a single "camera"

METADATA

 

 

Overview

 

This document describes the Single Document Multiple Images (SDMI) behavior in TWAIN: how it is negotiated and what additional metadata associated with the image needs to be collected during image capture.  Since this is a moderately complex subject, a number of other capabilities that have uses beyond SDMI are discussed as well.

 

SDMI behavior is easy to view graphically:

 

Original         Color       Bitonal
Document         Image       Image
+-------+        +-------+   +-------+
|       |        |       |   |       |
|   R   |        |   R   |   |   R   |
|       |        |       |   |       |
|   G   |  --->  |   G   | + |   G   |
|       |        |       |   |       |
|   B   |        |   B   |   |   B   |
|       |        |       |   |       |
+-------+        +-------+   +-------+
 

In this example a color document results in the capture of two images, one that is color and one that is bitonal (black & white). Configurations of this form have a variety of applications, but the most common is when the application needs a faithful replication of the document for archival purposes and an image suitable for data collection, such as OCR.

 

SDMI puts no limit on the number of images that can result from a document.

 

SDMI is not the same as image segmentation.  Image segmentation divides a document into sub-images that are optimized for quality and compression.  In this example the driver could save the text images as Group-4 and the picture as JPEG:

 

Original
Document
+-------+
| a bit |        Segment 1   Segment 2   Segment 3
|of text|        Text        Picture     Text
| $#$#$ |        +-------+   +-------+   +-------+
| #$#$# |  --->  | a bit |   | #$#$# |   | more  |
| $#$#$ |        |of text| + | #$#$# | + | text  |
| more  |        +-------+   | #$#$# |   +-------+
| text  |                    +-------+
+-------+
 

Image segmentation is typically used to store images efficiently.  SDMI is used to capture images that are then directed to different parts of the workflow.  It is possible to mix image segmentation with SDMI (e.g. using image segmentation to produce the faithful replication image).  The two technologies have different goals, though, so it’s not advisable to use one to replace the other.

 
 
 
ICAP_PIXELTYPE and DAT_FILESYSTEM Overview

 

TWAIN did not start with duplex scanning built into the standard; this was added in version 1.7.  Prior to 1.7, ICAP_PIXELTYPE selected the pixel type (i.e. color vs. grayscale vs. bitonal) for the entire session. This could also be considered the "color space". An application could configure a driver to output color or grayscale or bitonal images, but only one of the three, so you cannot get multiple images for a side via ICAP_PIXELTYPE.

 

TWAIN 1.8 introduced DAT_FILESYSTEM.  This allows an application to set up multiple images for a side. It also supports setting different values for the front and rear, for example getting color on the front and grayscale on the rear.

 

To help maintain backwards compatibility, ICAP_PIXELTYPE needs to continue to apply to the entire session. This means ICAP_PIXELTYPE should never be negotiated with DAT_FILESYSTEM.

 

 

 

ICAP_PIXELTYPE
 

Setting ICAP_PIXELTYPE will set both the front and rear images to the given TWPT_ value and automatically set CAP_DUPLEXENABLED to true. Use ICAP_BITDEPTH to determine how many bits make a single pixel, such as 8 for 8-bit grayscale or 24 for 3-channel/8-bits-per-channel RGB. 

 
 
 
DAT_FILESYSTEM
 

DAT_FILESYSTEM addresses individual “cameras”.  The term “camera” doesn’t mean that the image capture device uses a camera; rather, it’s a generic term for an image capture source.  DAT_FILESYSTEM refers to the front side of the paper as the 'top' "camera" and the rear as the 'bottom'. This has nothing to do with the physical position of the camera; the terms describe what the user considers the top (i.e. front) of the sheet of paper versus the bottom (i.e. rear).

 

The driver will output images based on CAP_CAMERAENABLED. So while a "camera" can be individually set via DAT_FILESYSTEM, you must also set CAP_CAMERAENABLED to true for each "camera" you want the driver to actually produce.

 

The values for DAT_FILESYSTEM are typically:

Camera name              Side             Image
/Camera_Color_Top        front            color or grayscale
/Camera_Color_Bottom     rear             color or grayscale
/Camera_Color_Both       front and rear   color or grayscale
/Camera_Bitonal_Top      front            bitonal
/Camera_Bitonal_Bottom   rear             bitonal
/Camera_Bitonal_Both     front and rear   bitonal

 

Using a camera that ends in '_Both' means future settings will be applied to both the front and rear images.

 

Sample source code
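
The camera names listed above can be modeled as a small lookup table. What follows is a minimal sketch, not driver code: it does not call TWAIN at all, and CameraSide, CameraImage, CameraInfo, find_camera and camera_covers_side are hypothetical names introduced only for illustration. In a real session the chosen name would be handed to the driver through DAT_FILESYSTEM; that step is left out here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types for illustration; none of these come from twain.h. */
typedef enum { SIDE_FRONT, SIDE_REAR, SIDE_BOTH } CameraSide;
typedef enum { IMAGE_COLOR_OR_GRAY, IMAGE_BITONAL } CameraImage;

typedef struct {
    const char *name;   /* camera name as used with DAT_FILESYSTEM */
    CameraSide  side;   /* side(s) of the sheet the camera addresses */
    CameraImage image;  /* kind of image the camera produces */
} CameraInfo;

/* The six camera names from the table in this section. */
static const CameraInfo kCameras[] = {
    { "/Camera_Color_Top",      SIDE_FRONT, IMAGE_COLOR_OR_GRAY },
    { "/Camera_Color_Bottom",   SIDE_REAR,  IMAGE_COLOR_OR_GRAY },
    { "/Camera_Color_Both",     SIDE_BOTH,  IMAGE_COLOR_OR_GRAY },
    { "/Camera_Bitonal_Top",    SIDE_FRONT, IMAGE_BITONAL },
    { "/Camera_Bitonal_Bottom", SIDE_REAR,  IMAGE_BITONAL },
    { "/Camera_Bitonal_Both",   SIDE_BOTH,  IMAGE_BITONAL },
};

/* Returns the table entry for a camera name, or NULL if the name is unknown. */
const CameraInfo *find_camera(const char *name)
{
    for (size_t i = 0; i < sizeof kCameras / sizeof kCameras[0]; ++i) {
        if (strcmp(kCameras[i].name, name) == 0) {
            return &kCameras[i];
        }
    }
    return NULL;
}

/* True when a setting made on this camera applies to the given side.
   A '_Both' camera covers the front and the rear. */
bool camera_covers_side(const CameraInfo *cam, CameraSide side)
{
    return cam->side == SIDE_BOTH || cam->side == side;
}
```

For example, find_camera("/Camera_Bitonal_Both") yields an entry for which camera_covers_side() is true for both SIDE_FRONT and SIDE_REAR, which mirrors the rule that settings made on a '_Both' camera apply to the front and rear images.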

 
 

 

DAT_FILESYSTEM vs. ICAP_PIXELTYPE
 

If DAT_FILESYSTEM is set, then ICAP_PIXELTYPE must reflect the current value of the "camera".  For instance, if DAT_FILESYSTEM is set to /Camera_Color_Both, then ICAP_PIXELTYPE should be set to TWPT_RGB (this is a basic sanity check for the driver to prevent DAT_FILESYSTEM and ICAP_PIXELTYPE from ever reporting conflicting values).

 

However, if ICAP_PIXELTYPE is set, then the following things must happen to DAT_FILESYSTEM and CAP_CAMERAENABLE:

 

If ICAP_PIXELTYPE is set to      TWPT_RGB
DAT_FILESYSTEM changes to        /Camera_Color_Both
CAP_CAMERAENABLE changes to:
    /Camera_Color_Top:           TRUE
    /Camera_Color_Bottom:        TRUE
    /Camera_Bitonal_Top:         FALSE
    /Camera_Bitonal_Bottom:      FALSE

If ICAP_PIXELTYPE is set to      TWPT_BW
DAT_FILESYSTEM changes to        /Camera_Bitonal_Both
CAP_CAMERAENABLE changes to:
    /Camera_Color_Top:           FALSE
    /Camera_Color_Bottom:        FALSE
    /Camera_Bitonal_Top:         TRUE
    /Camera_Bitonal_Bottom:      TRUE

 

This behavior guarantees that older applications and newer applications can work with the same driver.  Application writers need to decide whether to use ICAP_PIXELTYPE or DAT_FILESYSTEM when negotiating with a particular driver; never use both together.  As a guideline, if DAT_FILESYSTEM is supported by a driver, use it, since it offers more functionality than ICAP_PIXELTYPE.
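
The side effects of setting ICAP_PIXELTYPE can be sketched as a small state model. This is an illustration under stated assumptions, not driver source: DriverModel, CameraId and set_pixel_type are hypothetical names, and the TWPT_ values are repeated from twain.h only so the fragment compiles on its own. TWPT_GRAY is deliberately rejected because the text above spells out the behavior for TWPT_RGB and TWPT_BW only.

```c
#include <stdbool.h>

/* Pixel type values mirroring twain.h, repeated so the sketch stands alone. */
#define TWPT_BW   0
#define TWPT_GRAY 1
#define TWPT_RGB  2

/* Hypothetical model of the driver state discussed above. */
typedef enum {
    CAM_COLOR_TOP, CAM_COLOR_BOTTOM, CAM_BITONAL_TOP, CAM_BITONAL_BOTTOM,
    CAM_COUNT
} CameraId;

typedef struct {
    const char *current_camera;      /* what DAT_FILESYSTEM would report */
    bool        enabled[CAM_COUNT];  /* CAP_CAMERAENABLE value per camera */
} DriverModel;

/* Applies the documented side effects of setting ICAP_PIXELTYPE.
   Returns false and leaves the model untouched for pixel types the
   text does not cover (an assumption of this sketch). */
bool set_pixel_type(DriverModel *d, int pixel_type)
{
    bool color;

    switch (pixel_type) {
    case TWPT_RGB: color = true;  break;
    case TWPT_BW:  color = false; break;
    default:       return false;
    }

    d->current_camera = color ? "/Camera_Color_Both" : "/Camera_Bitonal_Both";
    d->enabled[CAM_COLOR_TOP]      = color;
    d->enabled[CAM_COLOR_BOTTOM]   = color;
    d->enabled[CAM_BITONAL_TOP]    = !color;
    d->enabled[CAM_BITONAL_BOTTOM] = !color;
    return true;
}
```

An older application that only ever sets ICAP_PIXELTYPE therefore always leaves the model in one of the two configurations listed above, which is what lets it share a driver with applications that use DAT_FILESYSTEM.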

 

 

 

CAP_CAMERAENABLE vs. CAP_DUPLEXENABLED
 

Care needs to be taken when mixing CAP_CAMERAENABLE and CAP_DUPLEXENABLED. The recommendation is to use one or the other. Here is an example of the interdependency:

Table-1 shows an example of creating one color and one bitonal image from the front of every sheet of paper fed during the scanning session. In this case, CAP_DUPLEXENABLED would have been set to False.

 
Table-1
DAT_FILESYSTEM           CAP_CAMERAENABLE
/Camera_Color_Top        TRUE
/Camera_Color_Bottom     FALSE
/Camera_Bitonal_Top      TRUE
/Camera_Bitonal_Bottom   FALSE

 

If the application then sets CAP_DUPLEXENABLED to True, we would expect the table to change to the following:

 
Table-2
DAT_FILESYSTEM           CAP_CAMERAENABLE
/Camera_Color_Top        TRUE
/Camera_Color_Bottom     TRUE
/Camera_Bitonal_Top      TRUE
/Camera_Bitonal_Bottom   TRUE
 

NOTE: Rear only scanning is considered to be a special duplex operation.  So for the following table CAP_DUPLEXENABLED would be True:

Table-3
DAT_FILESYSTEM           CAP_CAMERAENABLE
/Camera_Color_Top        FALSE
/Camera_Color_Bottom     TRUE
/Camera_Bitonal_Top      FALSE
/Camera_Bitonal_Bottom   TRUE
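
The relationship shown in Table-1 through Table-3 can be sketched in a few lines. This is one reading of the tables, not a statement of how any driver implements it: CameraEnable, duplex_enabled and enable_duplex are hypothetical names, and the rule "turning duplex on enables each bottom camera whose top camera is enabled" is an assumption inferred from the change between Table-1 and Table-2.

```c
#include <stdbool.h>

/* Hypothetical snapshot of CAP_CAMERAENABLE for the four cameras. */
typedef struct {
    bool color_top;
    bool color_bottom;
    bool bitonal_top;
    bool bitonal_bottom;
} CameraEnable;

/* CAP_DUPLEXENABLED as implied by the tables: True whenever any rear
   ("bottom") camera is enabled, which also covers rear-only scanning. */
bool duplex_enabled(const CameraEnable *c)
{
    return c->color_bottom || c->bitonal_bottom;
}

/* Models the application setting CAP_DUPLEXENABLED to True.
   Assumption: each bottom camera is enabled when its matching top
   camera is enabled; bottom cameras already enabled stay enabled. */
void enable_duplex(CameraEnable *c)
{
    c->color_bottom   = c->color_bottom   || c->color_top;
    c->bitonal_bottom = c->bitonal_bottom || c->bitonal_top;
}
```

Starting from the Table-1 configuration, enable_duplex() produces the Table-2 configuration, and duplex_enabled() reports True for the rear-only configuration in Table-3.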
 
 
CAP_CAMERAORDER
 

The output order of the images can be adjusted using CAP_CAMERAORDER (using the CAP_CAMERA TWCM_*_BOTH values).  This is a TW_ARRAY container that holds the name of each of the cameras in the order they will be transferred from the driver to the application.  For example, if CAP_CAMERAORDER is set to TWCM_BW_BOTH followed by TWCM_CL_BOTH, then the bitonal image will be transferred before the color image.  For a duplex session this would look like the following:

 

        Bitonal Front

        Color Front

        Bitonal Rear

        Color Rear

 

To simplify the validation rules between CAP_CAMERAENABLED and CAP_CAMERAORDER, the driver behaves as follows:

 

1)    If CAP_CAMERAORDER includes a "camera" whose CAP_CAMERAENABLED value is False, then the driver will ignore that entry.

2)    If CAP_CAMERAORDER does not include a "camera" whose CAP_CAMERAENABLED value is True, then the driver is free to output the images in whatever order it wants.
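
The two rules above can be sketched as a function that resolves a requested order against the enabled cameras. This is a minimal illustration with hypothetical names (resolve_camera_order, MAX_CAMERAS); cameras are represented by small integer indexes rather than TWCM_ values. Rule 2 leaves the driver free to choose, so the choice made here (listed cameras first, then any unlisted enabled cameras in index order) is only one permissible outcome.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CAMERAS 8   /* arbitrary limit for this sketch */

/* Resolves the transfer order of cameras.
   order/order_len      requested CAP_CAMERAORDER, as camera indexes
   enabled/camera_count CAP_CAMERAENABLED value per camera index
   out                  receives the resolved order; needs room for
                        camera_count entries
   Returns the number of entries written to out. */
size_t resolve_camera_order(const int *order, size_t order_len,
                            const bool *enabled, size_t camera_count,
                            int *out)
{
    bool emitted[MAX_CAMERAS] = { false };
    size_t n = 0;

    if (camera_count > MAX_CAMERAS) {
        return 0;
    }

    /* Rule 1: entries naming a disabled camera are ignored. Unknown
       indexes and repeated entries are skipped as well. */
    for (size_t i = 0; i < order_len; ++i) {
        int cam = order[i];
        if (cam < 0 || (size_t)cam >= camera_count) continue;
        if (!enabled[cam] || emitted[cam]) continue;
        emitted[cam] = true;
        out[n++] = cam;
    }

    /* Rule 2: enabled cameras missing from the list may go anywhere;
       this sketch appends them in index order. */
    for (size_t cam = 0; cam < camera_count; ++cam) {
        if (enabled[cam] && !emitted[cam]) {
            out[n++] = (int)cam;
        }
    }
    return n;
}
```

With index 0 standing for the color camera and index 1 for the bitonal camera, a requested order of {1, 0} with both cameras enabled resolves to bitonal first and color second, matching the example above.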

 

 

 
Entire session (i.e. machine) vs. a single "camera"
 

The addition of independent front and rear capability negotiation immediately raises the question: which capabilities belong to the machine (like CAP_DUPLEX) and which ones belong to a "camera" (like ICAP_COMPRESSION)?  There is no easy answer, since the hardware of the device dictates the capabilities.  For instance, scanner ABC may allow independent selection of ICAP_COMPRESSION for the front and rear cameras because the designers put in dedicated compression chips for each side, whereas scanner XYZ, in an effort to save costs, uses only one chip for this operation and so has no way to set the front independently of the rear for this one capability.

 

So, to help figure out where each capability goes, Kodak scanners enhance DG_CONTROL / DAT_CAPABILITY / MSG_QUERYSUPPORT with additional TWQC_ flags:

 

#define TWQC_MACHINE        0x1000    // applies to entire session/machine

#define TWQC_BITONAL        0x2000    // applies to Bitonal "cameras"

#define TWQC_COLOR          0x4000    // applies to Color "cameras"

A capability cannot mix TWQC_MACHINE with any of the other items listed above; otherwise all combinations are valid (e.g. a capability could have TWQC_BITONAL and TWQC_COLOR).

 

Capabilities that describe themselves as TWQC_MACHINE are accessible at all times, regardless of the current setting of DAT_FILESYSTEM.  This means that a capability like CAP_DUPLEXENABLED can always be negotiated (i.e., even if the current camera is set to something like /Camera_Bitonal_Bottom).
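
An application could interpret these flags along the following lines. This is a sketch under assumptions: twqc_scope_is_valid and cap_is_negotiable are hypothetical helpers, the three flag values are repeated from the listing above, and the camera-scoped branch (a TWQC_BITONAL capability is negotiable only while a bitonal camera is current, and likewise for color) is an inference from the flag comments rather than documented behavior. A value with none of the three flags set is treated as valid here, as might be returned by a driver without this enhancement.

```c
#include <stdbool.h>

/* Flag values repeated from the listing above. */
#define TWQC_MACHINE 0x1000   /* applies to entire session/machine */
#define TWQC_BITONAL 0x2000   /* applies to Bitonal "cameras" */
#define TWQC_COLOR   0x4000   /* applies to Color "cameras" */

/* TWQC_MACHINE must not be combined with TWQC_BITONAL or TWQC_COLOR;
   every other combination is accepted. */
bool twqc_scope_is_valid(unsigned int flags)
{
    unsigned int scope = flags & (TWQC_MACHINE | TWQC_BITONAL | TWQC_COLOR);

    if (scope & TWQC_MACHINE) {
        return scope == TWQC_MACHINE;
    }
    return true;
}

/* Machine-scoped capabilities are negotiable regardless of the current
   camera. For camera-scoped capabilities this sketch assumes the flag
   must match the kind of camera currently selected. */
bool cap_is_negotiable(unsigned int flags, bool current_camera_is_bitonal)
{
    if (flags & TWQC_MACHINE) {
        return true;
    }
    return (flags & (current_camera_is_bitonal ? TWQC_BITONAL : TWQC_COLOR)) != 0;
}
```

Under this model a TWQC_MACHINE capability such as CAP_DUPLEXENABLED passes cap_is_negotiable() whichever camera is current, while a capability flagged only TWQC_COLOR does not pass while a bitonal camera is selected.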

 
 
METADATA
 

Metadata is the descriptive data that accompanies an image.  TWAIN has two primary ways of communicating this information to an application: DAT_IMAGEINFO and DAT_EXTIMAGEINFO.  Since DAT_EXTIMAGEINFO is extensible, it’s the only way to introduce new metadata items to the TWAIN specification without creating a new DAT operation (and we don’t really need any more of those right now).

 

SDMI presents a bit of a problem for the application because the stream of images makes it difficult to tell which ones go with which document.  This problem becomes compounded with things like automatic color detection (imagine not knowing if the application will get color or bitonal data on the next image).

 

Since the problem takes the form of a lack-of-communication problem, the solution is more data.  With the Kodak drivers the following additional items are added to the list of DAT_EXTIMAGEINFO fields:

 

#define TWEI_HDR_PAGESIDE          0x8001

#define TWEI_HDR_IMAGENUMBER       0x8017

#define TWEI_HDR_PAGENUMBER        0x8018

#define TWEI_HDR_PAGEIMAGENUMBER   0x8019

 

TWEI_HDR_PAGESIDE returns 0 for a front image and 1 for a rear image.

 

TWEI_HDR_IMAGENUMBER counts from 1 to 2^32-1 the number of images captured since the application first MSG_OPENDS’d the driver.

 

TWEI_HDR_PAGENUMBER counts from 1 to 2^32-1 the number of pages of paper captured since the application first MSG_OPENDS’d the driver.

 

TWEI_HDR_PAGEIMAGENUMBER counts from 1 to the number of images captured from the document.  For instance, given an SDMI session where the driver is transferring a color and a bitonal image for the front and a color image for the rear, we get the following sequence:

 

Image     Page Side   Image Number   Page Number   PageImageNumber
Color     Front       1              1             1
Bitonal   Front       2              1             2
Color     Rear        3              1             3
Color     Front       4              2             1
Bitonal   Front       5              2             2
Color     Rear        6              2             3
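
An application can use these fields to decide where one document ends and the next begins. The following sketch assumes the four values have already been read from DAT_EXTIMAGEINFO into a plain struct; ImageHeader, starts_new_page and sequence_is_consistent are hypothetical names, and the consistency rules are derived from the field descriptions above rather than from driver documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the four metadata values of one image. */
typedef struct {
    uint32_t page_side;          /* TWEI_HDR_PAGESIDE: 0 = front, 1 = rear */
    uint32_t image_number;       /* TWEI_HDR_IMAGENUMBER */
    uint32_t page_number;        /* TWEI_HDR_PAGENUMBER */
    uint32_t page_image_number;  /* TWEI_HDR_PAGEIMAGENUMBER */
} ImageHeader;

/* True when cur is the first image of a new sheet of paper.
   Pass prev == NULL for the first image the application sees. */
bool starts_new_page(const ImageHeader *prev, const ImageHeader *cur)
{
    return prev == NULL || cur->page_number != prev->page_number;
}

/* Checks a run of headers against the counting rules described above:
   the image number grows by one per image, and the page image number
   restarts at 1 on each new page and grows by one within a page. */
bool sequence_is_consistent(const ImageHeader *h, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        const ImageHeader *prev = (i == 0) ? NULL : &h[i - 1];

        if (prev != NULL && h[i].image_number != prev->image_number + 1) {
            return false;
        }
        if (starts_new_page(prev, &h[i])) {
            if (h[i].page_image_number != 1) {
                return false;
            }
        } else if (h[i].page_image_number != prev->page_image_number + 1) {
            return false;
        }
    }
    return true;
}
```

Feeding the six rows of the sequence above into sequence_is_consistent() returns true, and starts_new_page() fires between the third and fourth images, which is where the application would close the first document and open the second.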