TextCaptureX SDK

From Screen Scraper Studio

Introduction

TextCaptureX is a COM library that facilitates screen scraping on Windows. It is also exposed as a .NET managed interface, as part of the ScreenScraper.dll assembly. The object model is somewhat inconsistent because it evolved over six years of customer-driven changes. We plan to bring some clarity with the next version of the interfaces.

Creating the TextCaptureX object

Create the COM object before calling any of its methods. See the examples below in different languages.


EndCaptureSession method

DEPRECATED: replaced by Cleanup method

Declared in ITextCaptureX7 interface

HRESULT EndCaptureSession(void);

Cleans up the capture session data. It is highly recommended that you call this function before destroying the TextCaptureX object.

TextCaptureX Properties

BringWindowToTop

Declared in the ITextCaptureX5 interface.

VARIANT_BOOL BringWindowToTop;

The default value is FALSE.

If set to TRUE, the target window is brought into the foreground before capturing.

FormattedText

Declared in the ITextCaptureX6 interface.

VARIANT_BOOL FormattedText; 

The default value is FALSE.

If set to TRUE, the captured text is formatted using a fixed font, to preserve the screen layout. It is useful when capturing tables, folder content etc.

NativeCaptureTimeoutSec

Declared in the ITextCaptureX14 interface.

VARIANT_BOOL NativeCaptureTimeoutSec;

It represents the maximum duration in seconds in which the TextCaptureX object should finish capturing using the Native method.

The default value is 2.

Valid values range from 2 to 31. Setting the property outside this interval does not raise an error: the property stores the specified value, and TextCaptureX clamps it to the valid range internally.
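
The clamping described above can be illustrated with a minimal sketch. It assumes the library simply limits the stored value to the documented 2 to 31 range; the function name effective_native_timeout is hypothetical and is not part of the TextCaptureX API.

```python
# Bounds taken from the text above: valid values are 2..31 seconds.
MIN_TIMEOUT_SEC = 2
MAX_TIMEOUT_SEC = 31


def effective_native_timeout(requested_sec: int) -> int:
    """Return the timeout that would presumably be used for a requested value.

    Hypothetical helper: it only mirrors the documented clamping rule.
    """
    return max(MIN_TIMEOUT_SEC, min(MAX_TIMEOUT_SEC, requested_sec))
```

A caller can use such a helper to predict which timeout applies before setting the property with an out-of-range value.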


Common properties (TextCaptureX, GetOCRText and GetAAText)

ExtractHighlightInfo

Declared in the ITextCaptureX13, IGetAAText2 and IGetOCRText5 interfaces.

VARIANT_BOOL ExtractHighlightInfo; 

The default value is FALSE.

If this property is set to TRUE, the selected capture method will extract text highlighting information into the HighlightInfo property.

This property is supported by all the capturing methods: Native, FullText and OCR.

HighlightInfo

Declared in the ITextCaptureX13, IGetAAText2 and IGetOCRText5 interfaces.

ITHighlightInfo* HighlightInfo;

This property will contain text highlighting information after calling the selected capturing method, if the ExtractHighlightInfo property is set to TRUE.

The highlight information consists of an array of texts and their bounding rectangles.

This property is supported by all the capturing methods: Native, FullText and OCR.

Hwnd property

long Hwnd;

This property specifies the handle of the window to which the information in this structure belongs.

Count property

long Count;

Read-only property. Indicates the number of elements in the array of texts and rectangles.

Add method

HRESULT Add([in] BSTR text, [in] LONG left, [in] LONG top, [in] LONG right, [in] LONG bottom);

This method adds a text and its corresponding rectangle to the array of text information.

Remove method

HRESULT Remove([in] LONG index);

This method removes the array element specified by the index.

Get method

HRESULT Get([in] LONG index, [out] BSTR* text, [out] LONG* left, [out] LONG* top, [out] LONG* right, [out] LONG* bottom);

Obtains the text and rectangle information found at the specified index in the array.

The returned rectangle is consistent with the UseClientCoordinates property.

PerformHighlight method

HRESULT PerformHighlight([in] LONG delay_milliseconds);

Shows a visual highlighting rectangle for each member of the array. The highlight will last for the number of milliseconds specified in the delay_milliseconds parameter.

The visual highlight is performed by a separate thread, so make sure that the thread that calls this function is put to sleep to allow the highlight to last as long as specified.

PerformHighlightFull method

HRESULT PerformHighlightFull([in] LONG delay_milliseconds);

Shows a visual highlighting rectangle containing all the members of the array.

The visual highlight is performed by a separate thread, so make sure that the thread that calls this function is put to sleep to allow the highlight to last as long as specified.
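
Because the highlight runs on a separate thread, the caller has to wait for it. A possible pattern is sketched below. The perform_highlight argument is assumed to be any callable that starts the highlight (for example a bound PerformHighlight method), and the helper name highlight_and_wait is hypothetical.

```python
import time


def highlight_and_wait(perform_highlight, delay_milliseconds, sleep=time.sleep):
    """Start a highlight, then block the calling thread for its duration.

    perform_highlight: callable taking the delay in milliseconds
                       (assumed to return immediately).
    sleep:             injectable for testing; defaults to time.sleep.
    """
    perform_highlight(delay_milliseconds)
    # Keep the calling thread idle so the highlight can last as long as requested.
    sleep(delay_milliseconds / 1000.0)
```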

UseClientCoordinates

Declared in the ITextCaptureX5, IGetAAText3 and IGetOCRText5 interfaces.

VARIANT_BOOL UseClientCoordinates; 

This property specifies whether the selected capturing method should use client or screen coordinates. In the case of client coordinates, the rectangles or points specified as input or output parameters are in the client space of the window handle parameter that the function receives.

This property is supported by all the capturing methods: Native, FullText and OCR.

For ITextCaptureX5 and IGetAAText3, the default value is FALSE.

For IGetOCRText5, the default value is TRUE for compatibility reasons.
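
The relation between the two coordinate spaces can be sketched as plain arithmetic. This is only an illustration: it assumes you already know the screen position of the window's client-area origin (on Windows this is normally obtained from the system, not computed by hand), and both function names are hypothetical.

```python
def screen_to_client(x, y, client_origin_x, client_origin_y):
    """Convert a screen point to client coordinates.

    client_origin_x/y: screen position of the client area's top-left corner
                       (assumed to be known by the caller).
    """
    return x - client_origin_x, y - client_origin_y


def client_to_screen(x, y, client_origin_x, client_origin_y):
    """Convert a client point back to screen coordinates."""
    return x + client_origin_x, y + client_origin_y
```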


Sample code for using highlight info
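
The following is a minimal sketch of how the Count and Get members might be used to walk the highlight array. It does not call the real COM object: FakeHighlightInfo is a hypothetical in-memory stand-in that mirrors the member names documented above, and it assumes 0-based indexes and that Get hands back its output values as a tuple.

```python
class FakeHighlightInfo:
    """In-memory stand-in mirroring the Count, Add, Remove and Get members."""

    def __init__(self):
        self._items = []

    @property
    def Count(self):
        return len(self._items)

    def Add(self, text, left, top, right, bottom):
        self._items.append((text, left, top, right, bottom))

    def Remove(self, index):
        del self._items[index]

    def Get(self, index):
        # Assumption: the output parameters come back as one tuple.
        return self._items[index]


def collect_highlights(info):
    """Return every (text, left, top, right, bottom) entry as a list."""
    return [info.Get(i) for i in range(info.Count)]
```

With a real object, the same loop would presumably run after a capture call made with ExtractHighlightInfo set to TRUE.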


Screen Scraping methods

Note - for all methods the coordinates are in pixels.

Native methods

GetTextFromRect

Declared in Interface ITextCaptureX

HRESULT GetTextFromRect([in] LONG hwnd, [in] LONG left, [in] LONG top, [in] LONG width, [in] LONG height, [out,retval] BSTR* result);

This method captures the text from the rectangle specified (left, top, width, height) in screen coordinates, in the window specified by hwnd.

Parameters

LONG hwnd   : handle of the window to capture from
LONG left   : X coordinate of the upper left point of the rectangle
LONG top    : Y coordinate of the upper left point of the rectangle
LONG width  : width of the rectangle
LONG height : height of the rectangle

Remarks: You can use the CaptureInteractive method to obtain these parameters, that is, the window handle and the coordinates.

Examples
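
GetTextFromRect takes a width and a height, while the highlight array described earlier stores rectangles as left, top, right and bottom. A small conversion helper, sketched here under the assumption that right and bottom are plain edge coordinates, can bridge the two; the name ltrb_to_ltwh is hypothetical.

```python
def ltrb_to_ltwh(left, top, right, bottom):
    """Convert (left, top, right, bottom) to (left, top, width, height)."""
    return left, top, right - left, bottom - top
```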


GetTextFromRectWithFont

Declared in Interface ITextCaptureX5

HRESULT GetTextFromRectWithFont([in] LONG hwnd, [in] LONG left, [in] LONG top, [in] LONG width, 
                                [in] LONG height, [out] IFontDisp** font, [out,retval] BSTR* result);

Same as GetTextFromRect plus returning font information.


CaptureWindow

Declared in the ITextCaptureX5 interface

HRESULT CaptureWindow([in] LONG hwnd, [out,retval] BSTR* result);

This method captures the text from the window specified by hwnd. Only visible text is captured.


CaptureWindowWithFont

Declared in Interface ITextCaptureX5

HRESULT CaptureWindowWithFont([in] LONG hwnd, [out] IFontDisp** font, [out,retval] BSTR* result);

Same as CaptureWindow plus returning font information.


CaptureActiveWindow

Declared in Interface ITextCaptureX

HRESULT CaptureActiveWindow([out,retval] BSTR* result);

This method captures the text from the window that is currently active, that is the window returned by the GetActiveWindowHwnd method.

Examples


CaptureActiveWindowWithFont

Declared in Interface ITextCaptureX4

HRESULT CaptureActiveWindowWithFont([out] IFontDisp** font, [out,retval] BSTR* result);

Same as CaptureActiveWindow plus returning font information.


CaptureActiveWindow2

Declared in Interface ITextCaptureX4

HRESULT CaptureActiveWindow2([in] LONG pid, [out,retval] BSTR* result);

Same as CaptureActiveWindow, but it allows specifying a process identifier: if the active window belongs to the specified process, it is skipped and the next window in the z-order is used instead. This is useful for capturing the window that was active before your own process came to the foreground.
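
The skipping rule can be pictured with a small sketch. It models the z-order as a list of (hwnd, owner pid) pairs ordered from top to bottom and assumes that every window owned by the given process is skipped; the real library may only skip the topmost one. The function name is hypothetical.

```python
def first_window_not_owned_by(z_order, pid):
    """Return the first hwnd in z_order whose owner is not pid, or None.

    z_order: list of (hwnd, owner_pid) tuples, topmost window first.
    """
    for hwnd, owner_pid in z_order:
        if owner_pid != pid:
            return hwnd
    return None
```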


CaptureActiveWindow2WithFont

Declared in Interface ITextCaptureX5

HRESULT CaptureActiveWindow2WithFont([in] LONG pid, [out] IFontDisp** font, [out,retval] BSTR* result);

Same as CaptureActiveWindow2 plus returning font information.


GetTextFromPoint

Declared in Interface ITextCaptureX

HRESULT GetTextFromPoint([in] LONG hwnd, [in] LONG x, [in] LONG y, [out,retval] BSTR* result);

This method captures the word specified by the point (x, y) in screen coordinates, in the window specified by hwnd.

Parameters

LONG hwnd : handle of the window to capture from
LONG  x,y : coordinate of the point, depending on the UseClientCoordinates property.

Remarks: You can use the CaptureInteractive method to obtain these parameters, that is, the window handle and the coordinates.

Examples


GetTextFromPointWithFont

Declared in Interface ITextCaptureX5

HRESULT GetTextFromPointWithFont([in] LONG hwnd, [in] LONG x, [in] LONG y, [out] IFontDisp** font, [out,retval] BSTR* result);

Same as GetTextFromPoint plus returning font information.


GetTextFromPoint2

Declared in Interface ITextCaptureX2

HRESULT GetTextFromPoint2([in] LONG hwnd, [in] LONG x, [in] LONG y, 
                          [in] BSTR separators, [out,retval] BSTR* result);

GetTextFromPoint2 is similar to GetTextFromPoint, except that it lets you supply a custom list of separators. The separators are the characters that are treated as word delimiters when detecting words.
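
To show what a separator list changes, here is a minimal sketch of word detection on a plain string. It is not the library's algorithm: it assumes the point has already been mapped to a character index, and the function name word_at and the default whitespace separators are hypothetical.

```python
def word_at(text, index, separators=" \t\r\n"):
    """Return the word around text[index], delimited by any separator character.

    Returns an empty string if index is out of range or sits on a separator.
    """
    if not 0 <= index < len(text) or text[index] in separators:
        return ""
    start = index
    while start > 0 and text[start - 1] not in separators:
        start -= 1
    end = index
    while end < len(text) and text[end] not in separators:
        end += 1
    return text[start:end]
```

With whitespace separators, "key=value;mode=fast" is a single word; with "=;" as separators, the same index yields only "value".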


GetTextFromPoint2WithFont

Declared in Interface ITextCaptureX5

HRESULT GetTextFromPoint2WithFont([in] LONG hwnd, [in] LONG x, [in] LONG y, [in] BSTR separators, [out] IFontDisp** font, [out,retval] BSTR* result);

Same as GetTextFromPoint2 plus returning font information.


GetTextFromUIElem

Declared in Interface ITextCaptureX11

HRESULT GetTextFromUIElem([in] IDispatch *pUIElemDisp, [out,retval] BSTR* result);

Returns the text from inside a UIElement object.

Parameters

IDispatch* pUIElemDisp : pointer to an IDispatch object which has to support the UIElement interface.

Examples


GetTextFromUIElemWithFont

Declared in Interface ITextCaptureX11

HRESULT GetTextFromUIElemWithFont([in] IDispatch *pUIElemDisp, [out] IFontDisp** font, [out,retval] BSTR* result);

Returns the text from inside a UIElement object. In addition to GetTextFromUIElem, it also gets font information.


FullText methods

GetFullTextAA

Declared in ITextCaptureX5 interface

HRESULT GetFullTextAA([in] LONG hwnd, [out,retval] BSTR* result);

Parameters

LONG hwnd : the handle of the window you want to capture from.


GetFullTextFromRectangle

Declared in the IGetAAText3 interface.

HRESULT GetFullTextFromRectangle([in] LONG hwnd, [in] LONG left, [in] LONG top, [in] LONG width, [in] LONG height, 
                                 [out,retval] BSTR* result);

Returns the text from the rectangle inside a window using the document object model provided by the underlying technology used to create the target app.

Parameters

LONG hwnd                  : the handle of the window you want to capture from.
LONG left,top,width,height : the rectangle of interest. The rectangle is interpreted according to the UseClientCoordinates property.


GetFullTextFromUIElem

Declared in the IGetAAText4 interface.

HRESULT GetFullTextFromUIElem([in] IDispatch *uielem, [out,retval] BSTR* result);

Returns the text from inside a UIElement object using the underlying technology used to create the target app.

Parameters

IDispatch* uielem - pointer to an IDispatch object which has to support the UIElement interface.


OCR methods

These methods use the OCR engine that comes with Microsoft Office. Specifically, the Microsoft Office Document Imaging (MODI) component must be installed to use them.

IsMODIAvailable

Declared in the IGetOCRText interface.

HRESULT IsMODIAvailable([out, retval] VARIANT_BOOL* pVal);

Returns VARIANT_TRUE if the Microsoft Office Document Imaging (MODI) component is installed and available for recognizing text in bitmaps.

Examples


GetTextFromRectUsingMODI

Declared in the IGetOCRText interface.

HRESULT GetTextFromRectUsingMODI([in] LONG hwnd, [in] LONG left, [in] LONG top, [in] LONG width, [in] LONG height,
                                 [in] BSTR language, [out,retval] BSTR* result);

Returns the text that appears in the specified client rectangle of the specified window.

Parameters

LONG   hwnd     : the handle of the window on which the function will operate
LONG   left, 
       top, 
       width, 
       height   : the client rectangle. If the rectangle is empty, the whole client area is used.
BSTR   language : the language used by the MODI engine. The MODI language component must be installed with Office in
                  order to work. The string can be empty, in which case the function will use the default system language,
                  or it can have one of the following values: "chinese simplified", "chinese traditional", "czech", "dutch",
                  "english", "finnish", "french", "german", "greek", "hungarian", "italian", "japanese", "korean", "norwegian",
                  "polish", "portuguese", "russian", "spanish", "swedish", "turkish". The string is not case sensitive.
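
A caller may want to validate the language string before passing it on. The sketch below uses only the values listed above and the stated case-insensitivity; the helper name normalize_modi_language is hypothetical, and rejecting unknown names with ValueError is a design choice of this sketch, not documented library behavior.

```python
# Language names copied from the parameter description above.
MODI_LANGUAGES = frozenset({
    "chinese simplified", "chinese traditional", "czech", "dutch",
    "english", "finnish", "french", "german", "greek", "hungarian",
    "italian", "japanese", "korean", "norwegian", "polish",
    "portuguese", "russian", "spanish", "swedish", "turkish",
})


def normalize_modi_language(language):
    """Lower-case a language name and check it against the documented list.

    An empty string is accepted and means "use the default system language".
    """
    normalized = language.lower()
    if normalized == "" or normalized in MODI_LANGUAGES:
        return normalized
    raise ValueError(f"unsupported MODI language: {language!r}")
```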

Examples


GetTextFromUIElemUsingMODI

Declared in the IGetOCRText3 interface.

HRESULT GetTextFromUIElemUsingMODI([in] IDispatch *pUIElemDisp, [in] BSTR language, [out,retval] BSTR* result);

Captures text from a UIElement object using Microsoft Office Document Imaging.

Parameters

IDispatch* pUIElemDisp : the UIElement object whose rectangle will be used for capturing the text
BSTR       language    : the language used by the MODI engine. See GetTextFromRectUsingMODI for possible values.

Examples


IsTesseractAvailable

Declared in the IGetOCRText4 interface.

HRESULT IsTesseractAvailable([out, retval] VARIANT_BOOL* pVal);

Returns TRUE if the integrated TextCaptureX OCR module is installed and available for recognizing text in bitmaps, FALSE otherwise.

Examples


GetTextFromRect OCR

Declared in the IGetOCRText4 interface.

HRESULT GetTextFromRect([in] LONG hwnd, [in] LONG left, [in] LONG top, [in] LONG width, [in] LONG height,
                        [in] BSTR language, [in] VARIANT_BOOL invert, [out,retval] BSTR* result);

Returns the text that appears in the specified client rectangle of the specified window.

Parameters

LONG hwnd                  : the handle of the window on which the function will operate.
LONG left,top,width,height : the client rectangle. If the rectangle is empty, the whole client area is used.
BSTR language              : the language used by the integrated OCR engine. The language component must be installed in the
                             installation directory. The string can be empty, in which case the function will use the English
                             language. If you want to specify a language, look into the installation directory and observe the
                             files having the 'TRAINEDDATA' extension. The names of these files are the language string (for
                             example 'eng' or 'fra').
                             This string is not case sensitive.
VARIANT_BOOL invert        : set this parameter to true/VARIANT_TRUE to invert the bitmap before recognizing it. Useful for cases 
                             in which the background color is darker than the text color (white on black).
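
The language lookup described above (file names with the 'TRAINEDDATA' extension) can be automated. This is a sketch under the assumption that the installation directory is known to the caller; the function name available_ocr_languages is hypothetical.

```python
from pathlib import Path


def available_ocr_languages(install_dir):
    """List language strings found as *.traineddata files in install_dir.

    The extension check is case-insensitive, and names are returned in
    lower case because the language string is not case sensitive.
    """
    return sorted(
        p.stem.lower()
        for p in Path(install_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".traineddata"
    )
```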

Examples


GetTextFromUIElem OCR

Declared in the IGetOCRText4 interface.

HRESULT GetTextFromUIElem([in] IDispatch *uielem, [in] BSTR language, [in] VARIANT_BOOL invert, [out,retval] BSTR* result);

Captures text from a UIElement object using the integrated OCR engine.

The parameters have the same meaning as for the IGetOCRText4::GetTextFromRect method.


Find text in a window/region methods

GetRectFromText

Declared in the ITextCaptureX8 interface.

HRESULT GetRectFromText([in] LONG hwnd, [in] BSTR text, [in] IRect *target_rect, [in] LONG occurrence, 
                        [out,retval] IRect **rect);

This function searches for text in a given window rectangle and returns the bounding rectangle of the found text.

Parameters

LONG           hwnd : the window which this function will work on.
BSTR           text : the text that you want to search for. It is case sensitive.
IRect  *target_rect : pointer to an IRect object that belongs to the TextCapture library. It specifies the rectangle
                      in which this function will search for the desired text. If the UseClientCoordinates
                      property is set to VARIANT_TRUE, the rectangle specifies client coordinates, else, it specifies
                      screen coordinates.
LONG     occurrence : specifies how many times the text has to be found in order to have a final match. This value is
                      0 based, which means that you must specify 0 for the first occurrence. If you want to locate the 3rd 
                      occurrence of the text, set this parameter to 2.
IRect        **rect : the rectangle of the located text. It will contain client or screen coordinates, depending
                      on the UseClientCoordinates property.
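
The 0-based occurrence parameter can be illustrated on a plain string. The sketch below is not how the library locates text on screen; it only shows the indexing convention and the case-sensitive match. The function name find_occurrence is hypothetical, and counting overlapping matches is an assumption of this sketch.

```python
def find_occurrence(haystack, needle, occurrence):
    """Return the index of the given 0-based occurrence of needle, or -1.

    The comparison is case sensitive, like the text parameter above.
    """
    if not needle or occurrence < 0:
        return -1
    position = -1
    for _ in range(occurrence + 1):
        # Continue searching just after the previous match.
        position = haystack.find(needle, position + 1)
        if position == -1:
            return -1
    return position
```

For the 3rd occurrence, pass 2, exactly as described for the occurrence parameter.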

Examples


GetRectFromTextUsingMODI

This function belongs to the IGetOCRText2 interface.

HRESULT GetRectFromTextUsingMODI([in] LONG hwnd, [in] BSTR text, [in] IRect* target_rect, [in] BSTR bstrLanguage,
                                 [in] LONG occurence, [out,retval] IRect** rect);

Searches for the specified text in the client rectangle of the specified window. It uses the Microsoft Office Document Imaging (MODI) engine to recognize the text.

Parameters

LONG           hwnd : the window which this function will work on.
BSTR           text : the text that you want to search for. It is case sensitive.
IRect  *target_rect : pointer to an IRect object that belongs to the TextCapture library. It specifies the rectangle
                      in which this function will search for the desired text. This rectangle must be given in client
                      coordinates relative to the window identified by the hwnd parameter.
LONG     occurrence : specifies how many times the text has to be found in order to have a final match. This value is
                      0 based, which means that you must specify 0 for the first occurrence. If you want to locate the 3rd 
                      occurrence of the text, set this parameter to 2.
BSTR   bstrLanguage : the language string used by the MODI engine. It is case insensitive. Can be empty, in which case the
                      function will consider the default system language. For a complete list of supported languages, please
                      read the GetTextFromRectUsingMODI section.
IRect        **rect : the rectangle of the located text. It will contain client coordinates.

Examples


GetRectFromText with integrated OCR

This function belongs to the IGetOCRText4 interface.

HRESULT GetRectFromText([in] LONG hwnd, [in] BSTR text, [in] IRect* target_rect, [in] BSTR language, [in] VARIANT_BOOL invert,
                        [in] LONG occurence, [out,retval] IRect** rect);

Searches for the specified text in the client rectangle of the specified window. It uses the integrated OCR engine to recognize the text.

Parameters

LONG         hwnd         : the window which this function will work on.
BSTR         text         : the text that you want to search for. It is case sensitive.
IRect*       target_rect  : pointer to an IRect object that belongs to the TextCapture library. It specifies the rectangle
                            in which this function will search for the desired text. This rectangle must be given in client
                            coordinates relative to the window identified by the hwnd parameter.
LONG         occurrence   : specifies how many times the text has to be found in order to have a final match. This value is
                            0 based, which means that you must specify 0 for the first occurrence. If you want to locate the 3rd 
                            occurrence of the text, set this parameter to 2.
BSTR         language     : the language string used by the integrated OCR engine. It is case insensitive. It can be empty; for
                            the default used in that case and for the available language strings, please read the
                            GetTextFromRect OCR section.
VARIANT_BOOL invert       : set this parameter to 'true' to invert the bitmap before recognizing it. Useful for cases in which
                            the background color is darker than the text color (white on black).
IRect**      rect         : the rectangle of the located text. It will contain client coordinates.

Examples


Get active window handle methods

These methods return the handle of the window that is in the foreground.

GetActiveWindowHwnd

Declared in Interface ITextCaptureX

HRESULT GetActiveWindowHwnd([out,retval] LONG* hwnd);

It returns the handle of the active window. If the active window is an MDI container, it returns the handle of the active MDI child.

Examples


GetActiveWindowHwnd2

Declared in Interface ITextCaptureX3

HRESULT GetActiveWindowHwnd2([in] LONG pid, [out,retval] LONG* hwnd);

Same as GetActiveWindowHwnd, but it allows specifying a process identifier: if the active window belongs to the specified process, it is skipped and the next window in the z-order is returned. This is useful for getting the window that was active before your own process came to the foreground.


Interactive selection of a window or region on the screen methods

These methods let your application offer interactive selection of regions, windows, or controls on the screen, similar to the selection methods offered by ScreenScraper Studio.

CaptureInteractive

Declared in Interface ITextCaptureX

HRESULT CaptureInteractive([out] LONG* hwnd, [out] LONG* left, [out] LONG* top, 
                           [out] LONG* width, [out] LONG* height, [out,retval] LONG* selection);

This method lets the user select the capture window and coordinates interactively. After the call, the user can select a rectangle anywhere on the screen with the mouse, regardless of the window. The first left click marks the point where the rectangle starts, and the second left click marks where the selection ends. Pressing the ESC key at any time aborts the selection.

Parameters

LONG* hwnd     : will contain the handle of the window
LONG* left     : the X coordinate of the selected rectangle
LONG* top      : the Y coordinate of the selected rectangle
LONG* width    : the width of the selected rectangle
LONG* height   : the height of the selected rectangle
LONG* selection: a custom value specifying the capture result:
                 0, if the selection was ok
                 1, if the user canceled the capture
                 2, if an error occurred
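
The three selection values can be wrapped in an enumeration so that calling code does not compare against bare numbers. The values come from the list above; the names SelectionResult and describe_selection are hypothetical.

```python
from enum import IntEnum


class SelectionResult(IntEnum):
    """Result codes documented for the selection output value."""
    OK = 0
    CANCELED = 1
    ERROR = 2


def describe_selection(code):
    """Return a readable description of a selection result code."""
    try:
        result = SelectionResult(code)
    except ValueError:
        return f"unknown selection result {code}"
    return {
        SelectionResult.OK: "the selection was ok",
        SelectionResult.CANCELED: "the user canceled the capture",
        SelectionResult.ERROR: "an error occurred",
    }[result]
```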

Examples


CaptureInteractiveFreeHand

Declared in ITextCaptureX5 interface

HRESULT CaptureInteractiveFreeHand([out] LONG* hwnd, [out] VARIANT* arPoints, 
                                   [out,retval] LONG* selection);

Same as CaptureInteractive, except that it allows a freehand selection. The area is returned as an array of points.
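
A freehand area often has to be reduced to a rectangle before it can be passed to the rectangle-based methods. The sketch below computes the bounding rectangle; it assumes the points have already been unpacked into (x, y) pairs, since the exact layout of the returned VARIANT is not described here. The function name bounding_rect is hypothetical.

```python
def bounding_rect(points):
    """Return (left, top, width, height) enclosing all (x, y) points."""
    if not points:
        raise ValueError("at least one point is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top
```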


Caret methods

These methods allow getting the caret coordinates.

GetCaret

Declared in ITextCaptureX4 interface

HRESULT GetCaret([in] LONG hwnd, [out] LONG* x, [out] LONG* y); 

Returns the caret position in a given window.


GetCaret2

Declared in ITextCaptureX4 interface

HRESULT GetCaret2([out] LONG* hwnd, [out] LONG* x, [out] LONG* y); 

Detects the active window and finds the caret coordinates within it.


Clipboard operation methods

SetClipboardText

Declared in the ITextCaptureX12 interface.

HRESULT SetClipboardText([in] BSTR text);

Clears the clipboard and sets the Unicode text content of the Windows clipboard to the specified string value.

GetClipboardText

Declared in the ITextCaptureX12 interface.

HRESULT GetClipboardText([out,retval] BSTR* result);

Returns the Unicode text content of the Windows clipboard.


Global synchronization methods

These functions synchronize applications that use the TextCaptureX library. For example, suppose a program P1 launches a JavaScript process P2 that captures text in a loop, and you want to end that loop from P1 without killing P2. P1 can set the TextCaptureX synchronization event, which is globally available to any application using TextCaptureX. P2 periodically checks this event and, if it is set, ends the loop cooperatively.

CheckStopEvent

Declared in the ITextCaptureX10 interface.

HRESULT CheckStopEvent([out,retval] VARIANT_BOOL *result);

The TextCaptureX object has an internal stop event flag. This function returns TRUE if the internal stop event is set and FALSE otherwise.


SignalStopEvent

Declared in the ITextCaptureX10 interface.

HRESULT SignalStopEvent([in] VARIANT_BOOL bSignal);

The TextCaptureX object has an internal stop event flag. You can set this flag to TRUE with this function and then use the CheckStopEvent function to perform synchronization actions.
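
The cooperative pattern behind these two functions can be sketched without the library. In the example below, stop_requested stands in for a call to CheckStopEvent and capture_once stands in for any capture call; both parameter names, the function name capture_loop and the iteration cap are hypothetical.

```python
import threading


def capture_loop(capture_once, stop_requested, max_iterations=1000):
    """Call capture_once repeatedly until stop_requested() returns True.

    stop_requested: callable returning a bool (stand-in for CheckStopEvent).
    max_iterations: safety cap so the sketch can never loop forever.
    """
    results = []
    for _ in range(max_iterations):
        if stop_requested():
            break  # end the loop cooperatively, without killing the process
        results.append(capture_once())
    return results
```

Within a single process, a threading.Event can play the role of the stop flag for experimentation: its set method corresponds to signaling the event and its is_set method to checking it.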


Error Codes

0x80040211: Trial version expired. Please activate the library.
0x80040212: Capture timeout.
0x80040213: Capture error.
0x80040214: Text not found.
0x80040215: The x64 function server is not available.
0x80040216: The 'Stop' event handle is invalid.
0x80040217: Cannot instantiate the MODI object. Please install Microsoft Office Document Imaging.
0x80040218: Cannot instantiate the TSelection object.
0x80040219: Language not available.
0x8004022A: OCR not available.
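
For logging or diagnostics, the codes above can be kept in a lookup table. The sketch copies the codes and messages exactly as listed; the names TEXTCAPTUREX_ERRORS and describe_hresult are hypothetical. The masking step assumes that some COM bridges report HRESULT values as negative signed 32-bit integers.

```python
# Codes and messages copied from the list above.
TEXTCAPTUREX_ERRORS = {
    0x80040211: "Trial version expired. Please activate the library.",
    0x80040212: "Capture timeout.",
    0x80040213: "Capture error.",
    0x80040214: "Text not found.",
    0x80040215: "The x64 function server is not available.",
    0x80040216: "The 'Stop' event handle is invalid.",
    0x80040217: "Cannot instantiate the MODI object. "
                "Please install Microsoft Office Document Imaging.",
    0x80040218: "Cannot instantiate the TSelection object.",
    0x80040219: "Language not available.",
    0x8004022A: "OCR not available.",
}


def describe_hresult(hresult):
    """Return the documented message for an HRESULT, or a generic fallback."""
    code = hresult & 0xFFFFFFFF  # normalize negative signed values to unsigned
    return TEXTCAPTUREX_ERRORS.get(code, f"Unknown HRESULT 0x{code:08X}")
```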