UIElement SDK
From Screen Scraper Studio
Introduction
UIElement is designed for programmatically manipulating user interface objects. User interface objects come in various shapes and can be built with various technologies. UIElement aims to be a single, simple screen scraping interface to the myriad UI objects you see on your screen.
What is a UIElement object?
This interface is used to interact with any visual object that the user can interact with (editable control, button, static text etc.). It works with the major technologies used to create UIs, such as Windows common controls, GDI, HTML, .Net Forms, Java, WPF, Flash and Flex.
What is a Selector?
A selector is a plain text query used to find a particular GUI element among the running applications. More precisely, it is an XML fragment specifying a few attributes of the GUI element you are looking for and of some of its parents. It is very similar to the notion of locating an element in an XML document using an XPath query. We chose our own simplified syntax over the very complex jQuery, which can sometimes be hard to read.
Selectors are generated automatically by the Selection tool in ScreenScraper Studio (either Window Selection or Region Selection), but you can also write them manually. To create a customized selector, we recommend starting from an automatically generated one and then changing a few attributes.
Why Selectors?
Screen scraping and GUI automation work with graphical user interface elements. These elements usually change position between runs of their application because of many factors that are out of your control: windows being resized and moved, OS version, screen color and resolution, user preferences, framework versions, etc.
A selector is a means of finding the target element regardless of its position on the screen. The selector uses only attributes that are independent of the position of the GUI element on the screen.
A selector is also a useful means of separating the code that deals with GUI elements from their representation. With most libraries, one would write code like this to type some text in Notepad:
GUIElement gui = new GUIElement();
gui.FindWindow("title = 'Notepad'").Edit.Type("mytext");
Nothing is wrong with this, but if the title changes from Notepad to MyNotepad you need to go back to your code, change it and recompile it. Imagine having hundreds of scripts that start with the Notepad window.
Using selectors you write:
String selector = "<wnd app='notepad.exe' cls='Notepad' title='Untitled - Notepad'/> <wnd cls='Edit'/> <ctrl role='editable text'/>";
UIElement uiElem = new UIElement();
uiElem.InitializeFromSelector(selector, 0);
uiElem.WriteText(1, "mytext");
Now imagine that instead of setting the selector in source code you load it from a database:
String selector = LoadNotepadSelectorFromDatabase();
When the app changes, you only need to change the selector in the database.
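The idea of keeping selectors outside the source code can be sketched as follows. This is only an illustration: the "name=selector" text format and the helpers LoadSelectors and SelectorFor are hypothetical stand-ins for the database mentioned above and are not part of the SDK.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the SDK): reads "name=selector" lines
// from any stream (a file, a database export, ...) into a lookup table.
std::map<std::string, std::string> LoadSelectors(std::istream& in) {
    std::map<std::string, std::string> selectors;
    std::string line;
    while (std::getline(in, line)) {
        // Split at the FIRST '=' only, because selectors contain '=' themselves.
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0) continue;  // skip malformed lines
        selectors[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return selectors;
}

// Hypothetical helper: looks up a selector by name; the name must exist.
const std::string& SelectorFor(const std::map<std::string, std::string>& table,
                               const std::string& name) {
    const auto it = table.find(name);
    assert(it != table.end() && "unknown selector name");
    return it->second;
}
```

With such a table, a script asks for a selector by a stable name (for example "notepad.edit"), and a change in the application only requires editing the stored text.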
Syntax
A selector has the following format:
<node_1><node_2>...<node_N>
Each node represents a parent of the desired GUI element, with the last one identifying the element itself.
<node_1> is the root of all the nodes. <node_K> is an ancestor of <node_(K+1)>, but not necessarily a direct parent. <node_(K+1)> uniquely describes the child node in the context of the entire sub-tree of <node_K>.
Each node has the following format:
<ui_system attr_name_1="attr_value_1" ... attr_name_N="attr_value_N"/>
The starting "<" and ending "/>" sequences are part of the syntax. They delimit the node identifiers.
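The node format above can be illustrated with a small helper that assembles a selector string from its parts. This is a minimal sketch: SelectorNode and BuildSelector are hypothetical names, not SDK types, and the sketch assumes single-quoted attribute values (as used in the quoted selector examples elsewhere in this document) that contain no quote characters.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One selector node: a UI system name plus its attribute name/value pairs.
struct SelectorNode {
    std::string uiSystem;
    std::vector<std::pair<std::string, std::string>> attributes;
};

// Serializes the nodes in order, root first, as <ui_system name='value' .../>.
// Assumption: attribute values contain no single quotes (no escaping is done).
std::string BuildSelector(const std::vector<SelectorNode>& nodes) {
    std::string selector;
    for (const SelectorNode& node : nodes) {
        assert(!node.uiSystem.empty());
        selector += "<" + node.uiSystem;
        for (const auto& attr : node.attributes) {
            selector += " " + attr.first + "='" + attr.second + "'";
        }
        selector += "/>";
    }
    return selector;
}
```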
UI systems and attributes
ui_system represents the user interface technology used to create the specific node. This is best understood with an example. Imagine a Java application. Every Java application has a top-level window. The selector naturally starts from this top-level window, which is created using the Windows API, so the first node is something like this: <wnd app="javaw.exe" cls="SunAwtFrame" title="Java Control Panel"/>. Inside this window there is a tab control with a few buttons. The tab and the buttons are created using the Java Swing framework. Their node in the selector is <java name="General" role="page tab list"/>
Here is a complete list of all the supported UI systems and their specific attributes:
WND - Win32 window object
The parent nodes of these nodes can only be wnd nodes.
They can be root nodes.
The specific attributes for a Win32 window are:
app - the name of the application which owns the window. It is the lowercase name of the EXE.
- only the root node (see <node_1> above) can have this attribute. It is ignored in child nodes.
Example: app="notepad.exe".
cls - the class name of the window.
Example: cls="Edit".
title - the title or text content of the window.
Example: title="Untitled - Notepad".
aaname - a special internal attribute used for better differentiation. The meaning depends on the control implementation.
HTML - Internet Explorer window object
Nodes which belong to this UI system indicate the Internet Explorer main browser window, which normally has the address bar and url attributes. They are usually the root node for webctrl nodes, which are described below.
The parent nodes for this type of node can only be wnd.
They can be root nodes.
The valid attributes for the html nodes are:
app - the name of the process which owns the window.
- if this attribute is missing, the default value of "iexplore.exe" is taken.
title - the page title as it appears in the browser caption area.
Example: title="Yahoo!".
url - the URL of the page as it appears in the address bar (if any).
Example: url="http://www.yahoo.com/".
WEBCTRL - Web control
These nodes describe HTML elements inside an Internet Explorer browser window.
The parent nodes can be webctrl or html, but there has to be exactly one html parent.
As a result, they cannot be root nodes.
The attributes for a Web control are the following:
tag - the tag name of the HTML element.
Example: tag="DIV".
type - the type of the HTML element.
Example: type="BUTTON".
CTRL - generic control
This is an internal interpretation of controls and it depends on the control implementation.
The parent nodes can be ctrl or wnd, but there has to be at least one wnd parent.
In consequence, they cannot be root nodes.
The valid attributes are:
name - the name of the control.
Example: name="Manufacturer".
role - the role of the control.
Example: role="combobox".
JAVA - Java specific control
This indicates a UI element which is part of a Java window.
The parent nodes can be java or wnd, but there has to be at least one wnd parent.
In consequence, they cannot be root nodes.
The valid attributes are:
name - the name of the control.
role - the role of the control.
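The parent rules listed for each UI system can be summarized as a small check on the order of nodes in a selector. The sketch below is one reading of those rules, not the library's own validation: it assumes that "parent" refers to the preceding node in the selector, and IsValidNodeOrder is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Checks the order of UI systems in a selector, root first.
// Assumption: the "parent" of a node is the node written just before it.
bool IsValidNodeOrder(const std::vector<std::string>& systems) {
    if (systems.empty()) return false;
    // Only wnd and html nodes can be root nodes.
    if (systems[0] != "wnd" && systems[0] != "html") return false;
    for (std::size_t i = 1; i < systems.size(); ++i) {
        const std::string& parent = systems[i - 1];
        const std::string& node = systems[i];
        if (node == "wnd" || node == "html") {
            if (parent != "wnd") return false;        // parents can only be wnd
        } else if (node == "webctrl") {
            if (parent != "webctrl" && parent != "html") return false;
        } else if (node == "ctrl" || node == "java") {
            if (parent != node && parent != "wnd") return false;
        } else {
            return false;                             // unknown UI system
        }
    }
    return true;
}
```

Under this reading, the "exactly one html parent" and "at least one wnd parent" conditions follow from the per-node checks, because a webctrl chain can only start at an html node and a ctrl or java chain can only start at a wnd node.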
The INDEX attribute
Sometimes, the desired UI element does not provide enough information to make it unique in the context of its node hierarchy. In this case, a special index attribute named idx is used to reach the node, and consequently the UI element.
The syntax is:
idx="numeric_value"
Example: idx="3".
If the index is not specified, the default value is "1".
In the sequence "<wnd app='myapp.exe'/><wnd cls='Edit' idx='5'/>", we will look for the 5-th node with the cls attribute set to "Edit" in the whole sub-tree of the parent node "<wnd app='myapp.exe'/>".
If the idx attribute is specified for the root node as in "<wnd cls='Explorer' idx='5'/>", it will indicate the 5-th top-level window in the internal enumeration which has the cls attribute set to "Explorer".
This attribute applies to all UI systems.
Wildcards: * and ?
Sometimes attribute values change over time because the application UI changes. In this case, part of a dynamic attribute value may remain unchanged, and we need a way to specify the variable parts.
There are wildcards for that:
* - this character means zero or more characters can be there.
Example: title="* - No*epad".
? - this character means exactly one character.
Example: title="Expl?rer".
A direct effect of wildcards is that the more characters we replace with wildcards, the more likely it is that several nodes match the Selector. That is why adding an idx attribute sometimes becomes mandatory when using wildcards: we still have to reach the same element.
Example 1: we have two windows with the following titles: "Report <current_time>" and "Status <current_date>". The variable parts can be easily noticed. We want to reach the second window, but our original identifier "<wnd cls='AfxWnd' title='Status 7/10/2010'/>" no longer works once the date changes. We can reach it with wildcards like this: "<wnd cls='AfxWnd' title='Status *'/>".
Example 2: we have three windows with the following titles: "Report <current_time>", "Status <current_date>" and "Status <tomorrow_date>". We can reach the third window, which is the second one matching the pattern, like this: "<wnd cls='AfxWnd' title='Status *' idx='2'/>".
Attributes which support wildcards: app, cls, title, aaname, url.
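The meaning of the two wildcards, and the way idx picks one of several matches, can be illustrated with a short standalone sketch. It is not the library's matching code: WildcardMatch and FindNthMatch are hypothetical names, the comparison is assumed to be case-sensitive, and the order of the list stands in for the library's internal enumeration order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true when `text` matches `pattern`, where '*' stands for zero or
// more characters and '?' for exactly one character.
bool WildcardMatch(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos;  // position of the last '*' seen
    std::size_t starText = 0;                 // text position when it was seen
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starPos = p++;
            starText = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (starPos != std::string::npos) {
            // Backtrack: let the last '*' absorb one more character.
            p = starPos + 1;
            t = ++starText;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing '*'
    return p == pattern.size();
}

// Returns the position in `titles` of the idx-th (1-based) title matching
// `pattern`, or -1 if there are fewer than idx matches. idx defaults to 1.
int FindNthMatch(const std::vector<std::string>& titles,
                 const std::string& pattern, int idx = 1) {
    assert(idx >= 1);
    int seen = 0;
    for (std::size_t i = 0; i < titles.size(); ++i) {
        if (WildcardMatch(pattern, titles[i]) && ++seen == idx) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Applied to the three titles of Example 2, the pattern "Status *" with idx 2 selects the third title, because it is the second one that matches.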
Creating the UIElement object
Visual C++
In order to use the COM object in your C++ application, add the following import statement to your C++ source file, like this:
#import "<path>\UIElement.dll"
Optionally, you may also add:
using namespace UIElementLib;
in order to reference the objects without specifying the namespace.
Then create the object in one of the following two ways:
UIElementPtr obj(__uuidof(UIElement));
Or
UIElementPtr obj = NULL;
HRESULT hr = obj.CreateInstance(__uuidof(UIElement));
In order to destroy the object, call:
obj.Release();
Remark: before creating the object, the CoInitialize API function must be called.
UIElement Properties
autoRefresh
Declared in IUIElem3 interface.
VARIANT_BOOL autoRefresh;
When set to VARIANT_TRUE, it makes the UIElement automatically re-initialize from its selector when the internal data of the object is no longer valid. Default value: VARIANT_FALSE.
The property is taken into account when calling the following methods: Activate, Click, GetChecked, GetRectangle, GetSelectedItem, GetState, GetValue, Highlight, IsFocused, IsInForeground, IsMinimized, SetChecked, SetFocus, SetSelectedItem, WriteText.
UIElements that do not have a selector will not be re-initialized (e.g. those obtained by InitializeFromPoint, FindAll, FindFirstElem).
hwnd
Declared in IUIElem interface.
LONG hwnd;
Specifies the handle of the window that visually contains the UIElement object. If you want to initialize a UIElement starting from a given handle, set this property to the window handle.
UIElement Methods
Activate
Declared in IUIElem interface.
HRESULT Activate();
Forces the top parent window of the current UI element into the foreground.
Cleanup
Declared in IUIElem3 interface.
HRESULT Cleanup();
Releases any resources used internally by a UIElement and brings the element back to the uninitialized state.
Click
Declared in IUIElem interface.
HRESULT Click([in] LONG dx, [in] LONG dy, [in] LONG flags);
Clicks the current UI element using the specified mouse button.
Parameters
dx - specifies the horizontal displacement of the clicking position.
dy - specifies the vertical displacement of the clicking position.
flags - indicates how the mouse click will be simulated. Its significance is the following:
Bits 0-1 - specify the type of click (single, double or hover). The bit mask for these bits is UIE_CF_CLICK_MASK = 3.
The corresponding flag values are:
UIE_CF_SINGLE = 0 - a single click is issued.
UIE_CF_DOUBLE = 1 - double click.
UIE_CF_HOVER = 2 - only change the mouse position, without clicking.
Bits 2-3 - they specify which button will be clicked. The bit mask for these bits is UIE_CF_BUTTON_MASK = 12.
The corresponding flag values are:
UIE_CF_LEFT = 0 - the left mouse button is clicked.
UIE_CF_RIGHT = 4 - right button.
UIE_CF_MIDDLE = 8 - middle button.
UIE_CF_SCREEN_COORDS = 16 - if this flag is set, the function interprets the given coordinates as screen coordinates.
If this flag is cleared, the coordinates are relative to the upper-left corner of the UI element.
UIE_CF_MOVE_CURSOR = 32 - if this flag is set, the position of the mouse cursor is moved to the clicking position.
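The way these values combine into a single flags argument can be sketched as follows. The constants are redeclared locally with the values listed above so that the sketch stands alone; MakeClickFlags, ClickTypeOf and ButtonOf are hypothetical helpers, not SDK functions.

```cpp
#include <cassert>

// Flag values as listed above, redeclared locally for this sketch.
const long UIE_CF_SINGLE = 0, UIE_CF_DOUBLE = 1, UIE_CF_HOVER = 2;
const long UIE_CF_CLICK_MASK = 3;
const long UIE_CF_LEFT = 0, UIE_CF_RIGHT = 4, UIE_CF_MIDDLE = 8;
const long UIE_CF_BUTTON_MASK = 12;
const long UIE_CF_SCREEN_COORDS = 16;
const long UIE_CF_MOVE_CURSOR = 32;

// Hypothetical helper: combines a click type (bits 0-1), a button (bits 2-3)
// and the two independent option bits into one flags value.
long MakeClickFlags(long clickType, long button,
                    bool screenCoords, bool moveCursor) {
    assert((clickType & ~UIE_CF_CLICK_MASK) == 0);
    assert((button & ~UIE_CF_BUTTON_MASK) == 0);
    long flags = clickType | button;
    if (screenCoords) flags |= UIE_CF_SCREEN_COORDS;
    if (moveCursor) flags |= UIE_CF_MOVE_CURSOR;
    return flags;
}

// Hypothetical helpers: extract the individual fields with the bit masks.
long ClickTypeOf(long flags) { return flags & UIE_CF_CLICK_MASK; }
long ButtonOf(long flags) { return flags & UIE_CF_BUTTON_MASK; }
```

For example, a double click with the right button in screen coordinates is 1 | 4 | 16 = 21, which is the value that would be passed as the flags parameter of Click.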
FindFirstElem
Declared in IUIElem3 interface.
HRESULT FindFirstElem([in] LONG scope, [in] BSTR childSelector, [in] LONG timeout, [out,retval] IUIElem** ppResult);
Retrieves a child object of the current UIElement according to childSelector search criteria.
scope - if this parameter is zero then only direct children are taken into account. If it's 1 then all descendants are considered.
childSelector - a Selector in XML format specifying the condition that the child UIElement to be found should meet.
FindAll
Declared in IUIElem interface.
HRESULT FindAll([in] LONG scope, [in] BSTR nodeID, [out,retval] IUIElemCollection** result);
Retrieves a collection of UIElement objects according to nodeID search criteria.
scope - if this parameter is zero then only direct children are taken into account. If it's 1 then all descendants are considered.
nodeID - a Selector in XML format specifying the condition that the UIElement objects in the collection should meet.
Note: more info about how to use FindAll method here and also a sample here
GetChecked
Declared in IUIElem3 interface.
HRESULT GetChecked([out,retval] VARIANT_BOOL* pIsChecked);
Retrieves the "checked" state of the UIElement. Applies to UIElements that are check-boxes or radios.
GetID
Declared in IUIElem interface. This method is deprecated; use GetSelector instead.
HRESULT GetID([in] VARIANT_BOOL refresh, [out,retval] BSTR* result);
Computes the Selector of the current UIElement object.
Parameters
refresh - if set to VARIANT_TRUE, the function will recompute the Selector. If the function is called for the first time, the Selector is calculated anyway.
GetJREPath
Declared in IUIElem2 interface.
HRESULT GetJREPath([out,retval] BSTR* result);
Call this function to find out the JRE path for the application of the current UI element.
GetRectangle
Declared in IUIElem interface.
HRESULT GetRectangle([out] LONG* left, [out] LONG* top, [out] LONG* right, [out] LONG* bottom);
Retrieves the dimensions of the bounding rectangle of the UIElement.
GetSelectedItem
Declared in IUIElem3 interface.
HRESULT GetSelectedItem([out,retval] BSTR* pBstrSelItem);
Retrieves the text of the selected item in a container (list-box, combo-box, tree).
GetSelector
Declared in IUIElem3 interface. This method is the new alias for GetID, which is deprecated.
HRESULT GetSelector([in] VARIANT_BOOL refresh, [out,retval] BSTR* result);
Computes the Selector of the current UIElement object.
Parameters
refresh - if set to VARIANT_TRUE, the function will recompute the Selector. If the function is called for the first time, the Selector is calculated anyway.
GetState
Declared in IUIElem interface.
HRESULT GetState([out, retval] BSTR* ppOutState);
Retrieves the "MS Active Accessibility" or "Java Accessibility" state of the object as a string. E.g.: "focusable, floating, checked".
GetValue
Declared in IUIElem interface.
HRESULT GetValue([out,retval] BSTR* result);
Retrieves the text value of a UI element. E.g.: for an edit box it scrapes the text written in it.
Highlight
Declared in IUIElem3 interface.
HRESULT Highlight([in] LONG nDelay);
Shows a visual highlighting rectangle around the UIElement. The highlight lasts for the number of milliseconds specified in the nDelay parameter. The visual highlight is performed by a separate thread, so make sure that the thread that calls this function is put to sleep to allow the highlight to last as long as specified.
InitializeFromID
DEPRECATED: replaced by InitializeFromSelector
Declared in IUIElem interface.
HRESULT InitializeFromID([in] BSTR bstrID);
Initializes the internal data of the current object according to the given Selector.
InitializeFromPoint
Declared in IUIElem interface.
HRESULT InitializeFromPoint([in] LONG x, [in] LONG y, [in] LONG rootWindow);
Initializes the internal data of the current object according to the user interface object located at the specified screen location.
InitializeFromSelector
Declared in IUIElem3 interface.
HRESULT InitializeFromSelector([in] BSTR bstrSelector, [in] LONG nTimeout);
Tries to initialize the internal data of the current object according to bstrSelector within the given timeout.
InstallJavaBridge
Declared in IUIElem3 interface.
HRESULT InstallJavaBridge();
Installs the Java support files for the application corresponding to the current UIElement. The UIElement should point to a top-level Java app window. In order to install the Java Accessibility Bridge, the method invokes ScreenScrapeJavaSupport.exe.
For more info see: Java screen scraping
IsFocused
Declared in IUIElem3 interface.
HRESULT IsFocused([out, retval] VARIANT_BOOL* pIsFocused);
Checks whether the UIElement has the focus.
IsInForeground
Declared in IUIElem interface.
HRESULT IsInForeground([out,retval] VARIANT_BOOL* pbRet);
Call this function to find out if the top level parent window of the current UI element is in the foreground.
IsJavaBridgeEnabled
Declared in IUIElem2 interface.
HRESULT IsJavaBridgeEnabled([out,retval] VARIANT_BOOL *pbIsJavaBridgeEnabled);
Checks whether the Java bridge is enabled for the current UI element.
IsJavaWindow
Declared in IUIElem2 interface.
HRESULT IsJavaWindow ([out,retval] VARIANT_BOOL *pbIsJavaWindow);
Call this function to find out if the window of the current UI element is a Java window.
IsMinimized
Declared in IUIElem3 interface.
HRESULT IsMinimized([out,retval] VARIANT_BOOL* pbIsMinimized);
Checks whether the top level window containing the UIElement is in minimized state.
IsValid
Declared in IUIElem interface.
HRESULT IsValid([out,retval] VARIANT_BOOL* pbIsValid);
Checks whether the internal data of the object designates a valid user interface object.
SetChecked
Declared in IUIElem3 interface.
HRESULT SetChecked([in] VARIANT_BOOL vbChecked);
Sets the "checked" state of a control. Applies to check-boxes and radios. Radio controls can only be checked. To un-check a radio you have to check another radio in the group.
SetFocus
Declared in IUIElem3 interface.
HRESULT SetFocus();
Sets the keyboard focus to the specified UIElement.
SetSelectedItem
Declared in IUIElem3 interface.
HRESULT SetSelectedItem([in] BSTR bstrSelItem);
Selects an item inside a container control (list-box, combo-box, tree).
Start
Declared in IUIElem3 interface.
HRESULT Start([in] BSTR bstrCmd, [in] BSTR bstrSelector, [in] LONG nTimeout);
Executes a shell command line and then tries to initialize the internal data of the current object according to bstrSelector within the given timeout. If the selector is NULL or empty then only the command line is executed.
WriteText
Declared in IUIElem interface.
HRESULT WriteText([in] LONG method, [in] BSTR text);
Puts text in the UI control identified by the current UIElement object.
Parameters:
method - specifies the way the text is output into the control. The possible values are:
UIE_WTM_NATIVE = 0 - the Control API method, which uses the API exposed by a particular control.
UIE_WTM_SENDKEYS = 1 - the send keys method, which synthesizes keyboard events to simulate the desired keys.
text - specifies the text that the user wants to write into the control.
Special characters
When the SendKeys method is specified, special keys like Alt, Shift and Ctrl can be simulated. The following convention is used to specify them:
1. A sequence of special characters is enclosed in [].
2. A sequence of special characters consists of tokens, each describing a key that is pressed or released. A token has the form:
TOKEN:=OPERATION(SPECIAL_CHARS)
OPERATION:=k|d|u
SPECIAL_CHARS:=alt|lalt|ralt|shift|lshift|rshift|ctrl|lctrl|rctrl|ins|del|home|end|pgup|pgdn|enter|left|right|up|down|tab|esc|back|pause|f1|f2|f3|f4|f5|f6|f7|f8|f9|f10|f11|f12|caps|num|add|sub|mul|div|decimal|break|num0|num1|num2|num3|num4|num5|num6|num7|num8|num9|scroll|sleep
k:=synthesize a key down followed by a key up event for the specified SPECIAL_CHARS.
d:=synthesize a key down event for the specified SPECIAL_CHARS.
u:=synthesize a key up event for the specified SPECIAL_CHARS.
Note: to send the [ character, escape it as [[.
Example: text = "AbC[d(ctrl)]a[u(ctrl)k(alt)]"
This sample sequence outputs "AbC", then holds down the Ctrl key (d), presses "a", then releases Ctrl (u) and hits Alt (k = press and release).
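To make the token convention concrete, here is a minimal standalone parser for such strings. It only illustrates the grammar described above and is not the library's implementation: ParseSendKeys is a hypothetical name, the event labels ("char:", "down:", "up:", "press:") are invented for the sketch, key names are not checked against the SPECIAL_CHARS list, and a ] outside a sequence is assumed to be an ordinary character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Parses a SendKeys string into a flat list of events such as "char:A",
// "down:ctrl", "up:ctrl" or "press:alt". Returns false on a syntax error.
bool ParseSendKeys(const std::string& text, std::vector<std::string>& events) {
    events.clear();
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] != '[') {                              // ordinary character
            events.push_back(std::string("char:") + text[i]);
            ++i;
            continue;
        }
        if (i + 1 < text.size() && text[i + 1] == '[') {   // "[[" escapes '['
            events.push_back("char:[");
            i += 2;
            continue;
        }
        ++i;                                               // skip the opening '['
        bool sawToken = false;
        while (i < text.size() && text[i] != ']') {
            std::string op;
            switch (text[i]) {
                case 'k': op = "press:"; break;            // key down + key up
                case 'd': op = "down:"; break;
                case 'u': op = "up:"; break;
                default: return false;                     // unknown OPERATION
            }
            if (i + 1 >= text.size() || text[i + 1] != '(') return false;
            const std::size_t close = text.find(')', i + 2);
            if (close == std::string::npos || close == i + 2) return false;
            const std::string key = text.substr(i + 2, close - (i + 2));
            if (key.find(']') != std::string::npos) return false;
            events.push_back(op + key);
            i = close + 1;
            sawToken = true;
        }
        if (i >= text.size() || !sawToken) return false;   // no ']' or empty "[]"
        ++i;                                               // skip the closing ']'
    }
    return true;
}
```

For the sample string above, the sketch yields seven events: three characters, a Ctrl key down, the character "a", a Ctrl key up and an Alt press.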
UIElement Error Codes
0x80040211: Trial version expired. Please activate the library.
0x80040212: Cannot find the window corresponding to this ID.
0x80040213: Syntax error in the window ID.
0x80040214: Uninitialized UI element.
0x80040215: Invalid UI element.
0x80040216: Cannot use the Control API method on this UI element. Please use the SendKeys method.
0x80040217: Cannot get the screen rectangle of this UI element.
0x80040218: Syntax error in the key list.
0x80040219: Invalid FindAll call.
0x8004021F: Invalid command line in Start method.
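When logging failures, it can be convenient to translate these codes into their descriptions. The following is a minimal sketch of such a lookup: DescribeUIElementError is a hypothetical helper, and HRESULT is modelled as a plain 32-bit unsigned value so that the sketch does not depend on Windows headers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: maps the error codes listed above to their text.
std::string DescribeUIElementError(std::uint32_t hr) {
    switch (hr) {
        case 0x80040211u: return "Trial version expired. Please activate the library.";
        case 0x80040212u: return "Cannot find the window corresponding to this ID.";
        case 0x80040213u: return "Syntax error in the window ID.";
        case 0x80040214u: return "Uninitialized UI element.";
        case 0x80040215u: return "Invalid UI element.";
        case 0x80040216u: return "Cannot use the Control API method on this UI element. "
                                 "Please use the SendKeys method.";
        case 0x80040217u: return "Cannot get the screen rectangle of this UI element.";
        case 0x80040218u: return "Syntax error in the key list.";
        case 0x80040219u: return "Invalid FindAll call.";
        case 0x8004021Fu: return "Invalid command line in Start method.";
        default:          return "Unknown error code.";  // not in the list above
    }
}
```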
SSSystemObj object
In order to use the SSSystemObj COM object in your C++ application, add the following import statement to your C++ source file, like this:
#import "<path>\SSSystemObj.dll"
using namespace SSSystemObjLib; // Optionally add this line in order to reference the objects without specifying the namespace.
Then create the object in one of the following two ways:
SSSystemObjPtr obj(__uuidof(SSSystemObj));
Or
SSSystemObjPtr obj = NULL;
HRESULT hr = obj.CreateInstance(__uuidof(SSSystemObj));
In order to destroy the object, call:
obj.Release();
Remark: before creating the object, the CoInitialize API function must be called.
Cleanup method
Declared in ISSSystem interface.
HRESULT Cleanup(void);
Performs a global clean-up of the resources loaded in target applications. It should be used at the end of your script/application.
Remark: use it with care, because after the clean-up all UIElement objects are invalidated.