The U-nome project

Uniface Underground -> The.Uniface.geNOME.project.

Introduction

Understanding the format of the Uniface binary files is useful for designing new tools and utilities that work with this kind of file. In fact, the $UUU site already hosts a tool that allows you to switch off the MUM (the mark set by the /nodebug idf switch).

The goal of the UNOME (Uniface + Genome) project is to provide a place where all the knowledge of the Uniface Community about binary files can be collected and ordered.

The information provided here is unofficial. It comes from a study of the binary files, so it may be inexact or wrong, or contain some imprecision.

State of this document: ALIVE!!!.

Index

General Format

Segments description

First segment. File type
Second segment. Component information
Third segment. MUM and other stuff
A-segment. Item properties
K-segment. Initial values.
O-segment. Object enumeration
P-segment. Proc code.
T-segment. Window properties
U-segment. Operations enumeration
V-segment. Time stamp
Z-segment. Filler

Appendix A. Entry name for triggers

Appendix B. Proc instruction codes

General format

The binary files are divided into segments, each holding a different kind of information needed to execute the Uniface component.

Each segment ends with the character 0x01. Several segments contain attributes that are delimited by the character 0x02 (or by other characters…).

Unless stated otherwise, all the information here relates to NON ZIPPED components.

When a letter identifies a segment (such as the P-segment or the V-segment), that letter is the first character of the segment.
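As a rough illustration of the general format, the following Python sketch cuts a component image into segments at every 0x01. It assumes that 0x01 never occurs inside a segment (not verified here), and the sample bytes are hand-made, not taken from a real component. The names split_segments and segment_letter are ours.

```python
SEGMENT_END = 0x01  # segment terminator described above


def split_segments(data: bytes) -> list[bytes]:
    """Split a NON ZIPPED component image into segments (terminator removed)."""
    segments = data.split(bytes([SEGMENT_END]))
    # Nothing follows the last terminator in our sample, so drop the empty tail.
    if segments and segments[-1] == b"":
        segments.pop()
    return segments


def segment_letter(segment: bytes) -> str:
    """Return the first character of a segment (its letter, if it has one)."""
    return chr(segment[0]) if segment else ""


# Hypothetical, hand-made sample with three tiny segments.
sample = b"NOZ\x01Pabc\x01VID2000042408040200\x01"
parts = split_segments(sample)
print([segment_letter(p) for p in parts])
```

A real file also ends with the 0x00 filler, which this sketch does not strip.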

Segments description

First segment. File type.

The first three characters of a binary file specify the kind of file in which the component is stored. The possible values are:

NOZ. For NON ZIPPED files

ZIP. For compressed files.
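A minimal check of the file type could look like this in Python (the function name file_kind and its return strings are ours):

```python
def file_kind(data: bytes) -> str:
    """Classify a component image by its first three characters."""
    marker = data[:3]
    if marker == b"NOZ":
        return "non-zipped"
    if marker == b"ZIP":
        return "zipped"
    return "unknown"


print(file_kind(b"NOZ\x01"))
```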

Second segment. Component information.

This segment contains these attributes:

Attr.#	Description
1	!17
2
3	Component name
4	Name of the attached library
5
6
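The attribute table can be read with a sketch like the one below. It assumes that the attributes of this segment are separated by 0x02, as the general format suggests. The sample segment, the component name MYFORM and the library name MYLIB are invented for illustration.

```python
ATTR_SEP = b"\x02"  # attribute delimiter (assumed to apply to this segment)


def component_attributes(segment: bytes) -> list[bytes]:
    """Return the raw attributes of the second segment, in order."""
    return segment.split(ATTR_SEP)


def component_name(segment: bytes):
    """Attribute 3 is the component name; None if the segment is too short."""
    attrs = component_attributes(segment)
    return attrs[2].decode("latin-1") if len(attrs) > 2 else None


# Hypothetical segment: "!17", an empty attribute, the name, the library, two empty ones.
sample = b"!17\x02\x02MYFORM\x02MYLIB\x02\x02"
print(component_name(sample))
```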

Third segment. MUM and other stuff.

The use of this segment is not well understood, but some facts about it are known:

MUM lives here. MUM is the Miserable Undebug Mark. The end of this segment is marked with the string "i21" when the component is compiled with the undocumented "/nodebug" switch on the idf command line. A "debuggable" component contains three other digits in this position.
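Detecting the mark is then a one-line test. The sketch assumes the segment is passed without its 0x01 terminator. The sample payloads are invented.

```python
def has_mum(third_segment: bytes) -> bool:
    """True when the third segment ends with the /nodebug mark "i21"."""
    return third_segment.endswith(b"i21")


# Hypothetical payloads, only the tail matters here.
print(has_mum(b"....i21"), has_mum(b"....999"))
```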

A-segment. Item properties.

This segment contains the description of the properties of each item in the form. Notice that some properties need a "link" to other segments (e.g. "Initial value" links to the K-segment).

K-segment. Initial values.

The initial value of each item is stored (if needed) in its own structure, separated from the others by the character 0x03.

Each structure contains this information:

Character 0x04.
Length of the string used to assign the initial value to the item. Notice that the length is in "charHex" code (see Numeric constant codification). Notice also that the length is not physical, because an "escaped" character such as %%^ (0x07 0x2D) counts as 1 character.
Separator between the length and the "string" (character 0x06).
Character 0x03, which closes the structure.

O-segment. Objects enumeration.

All objects and their properties are enumerated here.

The structure of this segment consists of a set of values enclosed by the character 0x03. Between two 0x03 characters we find the length of the item name in "charHex" code (see Numeric constant codification), the character 0x06 and the item name. Notice that the length is not physical, because an "escaped" character such as %%^ (0x07 0x2D) counts as 1 character.

0x03 0x37 0x06 0x45 0x4E 0x54 0x49 0x54 0x41 0x54 0x03 is the definition of an object called "ENTITAT".

Objects are enumerated according to their position in the component (left to right and top to bottom).

The component variables are defined in this segment in the same way as other objects.
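The entry layout can be read back with a short Python sketch. It assumes the value before 0x06 is the charHex length of the name, and that every chunk between two 0x03 characters holds exactly one entry. The sample is an entry for an object named "ENTITAT", written as a Python byte string.

```python
def charhex_to_int(digits: bytes) -> int:
    """Decode charHex digits: "0".."9" and ":".."?" stand for hex 0..F."""
    value = 0
    for b in digits:
        nibble = b - 0x30
        if not 0 <= nibble <= 15:
            raise ValueError(f"not a charHex digit: 0x{b:02X}")
        value = value * 16 + nibble
    return value


def parse_o_segment(payload: bytes) -> list[tuple[int, str]]:
    """Return (declared length, name) for every entry in an O-segment."""
    entries = []
    for chunk in payload.split(b"\x03"):
        if b"\x06" not in chunk:
            continue  # the leading "O" or an empty gap between two 0x03
        length_code, name = chunk.split(b"\x06", 1)
        entries.append((charhex_to_int(length_code), name.decode("latin-1")))
    return entries


sample = b"O\x037\x06ENTITAT\x03"
print(parse_o_segment(sample))
```

The declared length is returned as it is and not compared with the physical length, because escaped characters count as one.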

P-segment. Proc code.

General structure

The P-segment (proc code segment) basically uses two characters as separators between its structures. The first character of this segment is the character "P".

The main structure in this segment is the BCB (binary code block), which is delimited by the character 0x7F. A BCB contains a set of "fields", which we call "tokens"; each token is delimited by the character 0x08 (0x08 is the codification of the value 0 (zero) as a small number; see how small numbers are codified in the "Numeric constant codification" section). A sentence can consist of one or more tokens.
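A naive splitter for this structure is sketched below. It assumes that 0x7F and 0x08 never appear as data inside a token, which is probably too optimistic since 0x08 is also the small number 0. The sample reuses the Message "ANY" bytes listed in Appendix B.

```python
BCB_SEP = b"\x7f"    # delimiter between binary code blocks
TOKEN_SEP = b"\x08"  # delimiter between tokens inside a block


def split_bcbs(p_segment: bytes) -> list[list[bytes]]:
    """Split a P-segment into BCBs, and each BCB into its tokens."""
    if not p_segment.startswith(b"P"):
        raise ValueError("not a P-segment")
    blocks = []
    for raw in p_segment[1:].split(BCB_SEP):
        if raw:  # skip the empty piece after the last delimiter
            blocks.append(raw.split(TOKEN_SEP))
    return blocks


# "P", then the tokens of Message "ANY", then the BCB delimiter.
sample = b"P" + bytes([0x24, 0x08, 0xC8, 0x41, 0x4E, 0x59]) + BCB_SEP
print(split_bcbs(sample))
```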

String constant codification

The character 0xC8 indicates that the following characters, until the end of the token, are a constant string.

A substitution (something like "The current date is %$DATE$ ") is codified by placing, at the position where the substitution must be done, the character 0x09 followed by the variable address (2 bytes) and type (1 byte).

Notice that LISTs are stored as normal strings (the only difference is that they can contain <GOLD => and <GOLD ;>).

GOLD characters

0x10: Group wildcard (<GOLD *> or *)

0x11: Single wildcard (<GOLD ?> or ?)

0x12: Value and representation separator (<GOLD => or ?)

0x1B: List item separator (<GOLD ;> or ;)

Others

Carriage Return %%^ (0x07 0x2D)
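Since lists are plain strings that use these GOLD codes, a stored list can be taken apart with a few lines of Python. This is only a sketch: the function names and the sample list are ours, and the text is decoded as latin-1 simply to keep every byte.

```python
GOLD_NAMES = {0x10: "<GOLD *>", 0x11: "<GOLD ?>", 0x12: "<GOLD =>", 0x1B: "<GOLD ;>"}


def render(raw: bytes) -> str:
    """Show a stored string with its GOLD characters spelled out."""
    return "".join(GOLD_NAMES.get(b, chr(b)) for b in raw)


def parse_list(raw: bytes) -> list[tuple[str, object]]:
    """Split a stored list into (value, representation or None) pairs."""
    items = []
    for item in raw.decode("latin-1").split("\x1b"):
        value, sep, representation = item.partition("\x12")
        items.append((value, representation if sep else None))
    return items


sample = b"1\x12One\x1b2\x12Two\x1b3"  # invented list with three items
print(render(sample))
print(parse_list(sample))
```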

Numeric constant codification

Numeric constants are codified in different ways, depending on their values. The first character indicates how the value is stored, and it can be:

Store code	Codification
0xFD	Small numbers. The integers 0 to 99 are codified using this method. The value is normally the binary value of the byte. The problem with this method is that the bytes 0x01 to 0x07 are reserved as delimiters, so the values 0 to 15 are codified differently: if the byte is between 0x08 and 0x0F, subtract 0x08 from it to get the value; if the byte is 0x07, take the next byte and subtract 0x20 from it. Using this method you will see that 0x63 = 99; 0x32 = 50; 0x09 = 1; 0x0B = 3; 0x08 = 0; 0x07 0x28 = 8; 0x07 0x2A = 10
0xFC	Characters. The value is stored as an ASCII string (e.g. 2: 0x32, 0: 0x30…). The end of this string is 0x08
0xFE	2 bytes. Get the real value of each byte as if it were a small number. Then calculate the value as: byte1 + byte2*256. Example: the number 1000 is stored as "0xE8; 0x0B" (after 0xFE), which corresponds to "232; 3"; then 232 + 3*256 = 1000.
0xFF	4 bytes. Get the real value of each byte as if it were a small number. Then calculate the value as: byte1 + byte2*256 + byte3*256^2 + byte4*256^3. Example: the number 123456 is stored as "0x40; 0xE2; 0x09; 0x08" (after 0xFF), which corresponds to "64; 226; 1; 0"; then 64 + 226*256 + 1*256*256 = 123456.
0x04 ?	CharHEX. The value is codified using several bytes (until 0x06?). These bytes contain the "character representation" of the hex digits. It means that "0x19" is codified as "0x31 0x39" (the ASCII values of the characters "1" and "9"). Be careful, because the digits "A" to "F" are represented as ":" to "?" (in an ASCII table you will see that these characters follow the sequence "0" to "9"). Then, as a general rule, you can calculate the value with a formula like: NumValue := Sum((ascii(ByteN) - ascii("0")) * 16^N) {N = 0..NumBytes-1}, where Byte0 is the last (least significant) byte
0x02	Hex value. The value is the hex value of the byte (until 0x06)
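The codifications in this table can be tried out with the Python sketch below. It covers the store codes 0xFD, 0xFC, 0xFE and 0xFF plus the charHex digits, and leaves out 0x04 and 0x02 because their framing is still uncertain. It assumes that in the multi-byte forms any byte outside 0x07..0x0F stands for its own value. All function names are ours.

```python
def read_small(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one small-number code at pos; return (value, next position)."""
    b = data[pos]
    if b == 0x07:            # escape: the next byte minus 0x20 gives 8..15
        return data[pos + 1] - 0x20, pos + 2
    if 0x08 <= b <= 0x0F:    # 0..7 are shifted up, past the delimiter bytes
        return b - 0x08, pos + 1
    return b, pos + 1        # any other byte is its own binary value


def decode_number(data: bytes):
    """Decode a numeric constant that starts with its store code."""
    code = data[0]
    if code == 0xFD:
        return read_small(data, 1)[0]
    if code == 0xFC:         # ASCII digits, closed by 0x08
        return data[1:data.index(0x08, 1)].decode("ascii")
    if code in (0xFE, 0xFF):
        count = 2 if code == 0xFE else 4
        value, pos = 0, 1
        for i in range(count):  # least significant byte comes first
            byte, pos = read_small(data, pos)
            value += byte << (8 * i)
        return value
    raise ValueError(f"unsupported store code 0x{code:02X}")


def charhex_to_int(digits: bytes) -> int:
    """Decode charHex digits ("0".."9", ":".."?"), most significant first."""
    value = 0
    for b in digits:
        value = value * 16 + (b - 0x30)
    return value


print(decode_number(bytes([0xFE, 0xE8, 0x0B])))
print(decode_number(bytes([0xFF, 0x40, 0xE2, 0x09, 0x08])))
```

The 0xFC form is returned as text, since nothing here says whether it holds only integers.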

Data references

Global registers are codified using the codes for small numbers (see the "Numeric constant codification" section), without the 0xFD prefix

0x64: $status

0x63: $result

Indirection is codified by preceding the register code with the character 0xC6 (e.g. 0xC6 0x0A is @$2)

Component variables are codified as 0xC1 0x0C followed by the "variable address". This "address" is the order (first is 0) in the next BCB for data declaration.

Global variables are codified as 0xC1 and next token is the "address"

Fields are identified by code 0xDD
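The register part of these rules can be sketched in Python as follows. Only registers, $status, $result and indirection are handled, because the variable and field codes are still unclear. The code 0x63 would also be the small number 99, so giving $result precedence is our assumption.

```python
def describe_register(ref: bytes) -> str:
    """Turn a register reference into its proc notation."""
    prefix, pos = "", 0
    if ref[0] == 0xC6:       # indirection marker
        prefix, pos = "@", 1
    code = ref[pos]
    if code == 0x64:
        return prefix + "$status"
    if code == 0x63:         # assumption: $result wins over register 99
        return prefix + "$result"
    if code == 0x07:         # escaped small number, values 8..15
        number = ref[pos + 1] - 0x20
    elif 0x08 <= code <= 0x0F:
        number = code - 0x08
    else:
        number = code
    return f"{prefix}${number}"


print(describe_register(bytes([0xC6, 0x0A])))
```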

BCB types

The first BCB, which we always find right after the segment identification (the character "P"), is composed of several tokens:

First: the value 0x06 (our suspicion: it is "the classical 1 done line at the beginning of the proc listing", or maybe it is the BCB type).

The second piece is composed of 5 different tokens, but so far we have always found the hex string "0x07 0x2A 0x08 0x08 0x0C 0x08 0x08 0x80 0x0B". (State: we have no idea about the use of this piece.)

After this BCB, the next ones correspond either to a trigger/entry (sentences) or to a parameter/variable declaration (data definition). The BCB with the parameter/variable declarations of a trigger/entry is stored next to its "sentences BCB".

BCB for data definition

- Global and local declaration

Each trigger contains an automatically generated declaration of each local ($V$) and global ($$V) variable used in the proc code of this trigger. The declaration (enumeration may be a better word) consists of two tokens: name and "visibility". The visibility can be "0x0C" for global variables and "0x0D" for local ones.

Eg. 0x59 0x36 0x08 0x0C 0x08 0x59 0x4C 0x31 0x08 0x0D 0x08 is the bin code for a trigger which uses $$V1 and $VL1$

- Parameters codification

The parameter definition has this token sequence:

Direction and type. The first (and maybe second; see * in the tables) byte. This byte is calculated by adding a value that depends on the parameter type to a value that depends on the direction. See the tables below (values in hexadecimal):

BASE VALUE DEPENDING ON THE TYPE
Type	Value
Any	1
String	2
Numeric	3
Float	4
Date	5
Time	6
Datetime	7
Raw (*)	8
Boolean (*)	9
Lineardate (*)	A
Lineartime (*)	B
Lineardatetime (*)	C
Image (*)	D

VALUE TO ADD DEPENDING ON THE DIRECTION
Direction	Value
IN (*)	0
OUT	C8
INOUT	64

(*) The value is stored using the "small numbers" method explained in the "Numeric constant codification" section. So the types marked with "*" in the "IN" direction are codified as: 0x07 0x28; 0x07 0x29; 0x07 0x2A; 0x07 0x2B; 0x07 0x2C; 0x07 0x2D

Parameter name. Next bytes (until the end of the token, 0x08)

End of the parameter. Next token: 0x0E
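The direction and type arithmetic can be reversed with a small lookup. This Python sketch takes the numeric value after any small-number decoding and reads both tables as hexadecimal. The function name is ours.

```python
TYPES = {
    0x1: "Any", 0x2: "String", 0x3: "Numeric", 0x4: "Float", 0x5: "Date",
    0x6: "Time", 0x7: "Datetime", 0x8: "Raw", 0x9: "Boolean",
    0xA: "Lineardate", 0xB: "Lineartime", 0xC: "Lineardatetime", 0xD: "Image",
}
# Direction bases, largest first so the first match is the right one.
DIRECTIONS = [(0xC8, "OUT"), (0x64, "INOUT"), (0x00, "IN")]


def decode_param_code(value: int) -> tuple[str, str]:
    """Split a parameter code into (direction, type)."""
    for base, direction in DIRECTIONS:
        if value >= base:
            type_code = value - base
            if type_code not in TYPES:
                raise ValueError(f"unknown type code 0x{type_code:X}")
            return direction, TYPES[type_code]
    raise ValueError("negative parameter code")


print(decode_param_code(0x66))
```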

- Variable codification

Variables are codified exactly as "IN" parameters (including raw, boolean, lineardate, lineartime, lineardatetime and image). The only difference is that the next token, "end of variable", is 0x0F (instead of the 0x0E used for parameters).

T-segment. Window Properties.

This segment contains a list with the window properties (each delimited by 0x1B). These items are:

Item	Description
CAPTION	The items CAPTION to CANICONIZE are the "options" for the window. They can be set to "F" for false, "T" for true or null for "default"
CANRESIZE
CANZOOM
CANCLOSE
SYSMENU
CANICONIZE
IMAGE	Background image. This item contains the name preceded by a caret (^) or an at sign (@), depending on the image type (glyph or file)
MODAL	Modality (T, F or null)
ATTACHED	Attached (T, F or null) If UTYPE=TAB it is set to "PAGE"
DIRECTION	1 for RTL, 2 for LTR and null for default
SPLIT	This is a list of the schema of the "split bars" (if defined). The items are: (?, position?, orientation, style, Attached, lock, (left/up),(right/down)) Example: ( 58,16,v,r,t,n,(57,10,h,b,n,n,p,p),p)
UTYPE	Window type (PRIMARY, SECONDARY, DIALOG,TAB, null)
DIALOG	When UTYPE=DIALOG it is set to "T"

The remaining properties (those behind the More… command button) are added as further items at the end of this list.

U-segment. List of operations.

This segment contains the list of operations available in the component.

V-segment. Time stamp.

In this segment the compilation time is stamped in the format YYYYMMDDhhmmss00 preceded by the string "ID". Example: VID2000042408040200
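Reading the stamp back is straightforward. The Python sketch below parses the example above and ignores the trailing "00", whose meaning is not known. The segment is passed without its 0x01 terminator.

```python
from datetime import datetime


def parse_v_segment(segment: bytes) -> datetime:
    """Extract the compilation time from a V-segment."""
    text = segment.decode("ascii")
    if not text.startswith("VID"):
        raise ValueError("not a V-segment")
    # 14 digits: YYYYMMDDhhmmss (the final "00" is left out)
    return datetime.strptime(text[3:17], "%Y%m%d%H%M%S")


print(parse_v_segment(b"VID2000042408040200"))
```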

Z-segment. Filler

This segment contains the character 0x00 repeated until the component size reaches a multiple of 512.
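The amount of filler follows from simple modular arithmetic, as this Python sketch shows. It takes the size of everything that precedes the 0x00 bytes. Whether the segment letter and terminator count towards that size is something we have not checked.

```python
BLOCK = 512  # the component size is padded to a multiple of this


def filler_length(size_before_filler: int) -> int:
    """Number of 0x00 bytes needed to reach the next multiple of 512."""
    return -size_before_filler % BLOCK


def is_padded(data: bytes) -> bool:
    """True when a complete component image has a padded size."""
    return len(data) % BLOCK == 0


print(filler_length(1000))
```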

Appendix A. Entry name for triggers

The name of each trigger is "translated" to an entry name. This name is the "Yxxx" string that you can see when you generate a "proc listing".

In fact the triggers are converted to "entries", but they are managed internally. Anyway, you can call one using something like "call Y1" to execute the code in the accept trigger. This section explains how the number after the "Y" is calculated.

Remember, however, that a trigger may or may not appear in a BCB depending on its functionality (e.g. LPMX cannot generate a Yxx code because it is not itself a piece of code; it contains "entries", and those entries can be generated).

Table 1. Start Up Shell

Y-code	Trigger code	Trigger
1	APPL	Application execution
2	SWIT	<Switch keyboard>
3	MNUA	<Menu>
4	UKYA	<User key>
5	PULA	<Pulldown>
6	ASYN	Asynchronous Interrupt

Table 2. Components

The numbering of these triggers has a particularity: if the "Operations" trigger is defined, the Y-codes listed are correct, but if there is no OPER trigger, the codes of triggers 7 to 17 must be decreased by 1 (CLR becomes 6, FRLF becomes 10, and so on).

Y-code	Trigger code	Trigger
1	ACPT	<Accept>
2	QUIT	<Quit>
3	ASYS	Asynchronous Interrupt
4	LPMX	Local Proc Modules
5	EXEC	Execute
6	OPER	Operations
7	CLR	<Clear>
8	ERAS	<Erase>
9	PULS	<Pulldown>
10	FRGF	Form Gets Focus
11	FRLF	Form Leave Focus
12	UKYS	<User Key>
13	MNUS	<Menu>
14	RETS	<Retrieve Sequential>
15	RETR	<Retrieve>
16	PRNT	<Print>
17	STOR	<Store>
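The OPER rule for Table 2 can be written down as a short Python sketch (the function name and its has_oper parameter are ours):

```python
# Trigger codes of Table 2 in Y-code order (ACPT is 1, STOR is 17).
COMPONENT_TRIGGERS = [
    "ACPT", "QUIT", "ASYS", "LPMX", "EXEC", "OPER", "CLR", "ERAS", "PULS",
    "FRGF", "FRLF", "UKYS", "MNUS", "RETS", "RETR", "PRNT", "STOR",
]


def component_ycode(trigger: str, has_oper: bool) -> int:
    """Y-code of a component-level trigger, shifted when OPER is absent."""
    if trigger == "OPER" and not has_oper:
        raise ValueError("OPER has no code when it is not defined")
    code = COMPONENT_TRIGGERS.index(trigger) + 1
    if code > 6 and not has_oper:
        code -= 1  # triggers 7 to 17 move down by one
    return code


print(component_ycode("CLR", has_oper=False))
```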

Table 3. Entity

See the note and example about numbering entities and field triggers at the end of this appendix

Y-code (Offset)	Trigger code	Trigger
0	DTLE	<Detail>
1	AIO	<Add/Insert Occurrence>
2	HLPE	<Help>
3	OGF	Occurrence Gets Focus
4	RMO	<Remove occurrence>
5	LOCK	Lock
7	ERRE	Error
8	LPO	Leaves printed occurrence
9	LMK	Leaves modified key
10	MNUE	<Menu>
11	READ	Read
12	DLUP	Delete Up
13	WRIT	Write
14	DELE	Delete
15	WRUP	Write Up
16	VLDO	Validate Occurrence
17	VLDK	Validate Key
19	LPME	Local Proc Modules

Table 4. Field

See the note and example about numbering entities and field triggers at the end of this appendix

Y-code (Offset)	Trigger code	Trigger
0	DTLF	<Detail>
1	ERRF	Error
2	HLPF	<Help>
3	FMT	Format
4	DFMT	Deformat
5	SMOD	Start Modification
7	ENCR	Encrypt
8	DECR	Decrypt
9	NFLD	<Next Field>
10	PFLD	<Previous Field>
11	MNUF	<Menu>
12	FGF	Field Gets Focus
13	VALC	Value Change
16	VLDF	Validate Field
19	LPMF	Local Proc Modules

Note about numbering entities and field triggers.

For entities and fields, the specified code is the "offset" you must add to the "object number". Think of it this way: a gap of twenty codes is reserved for every object, and the codes 1 to 19 are for the component-level triggers.

Example: Consider a form with 3 entities (with the very original names ENTA, ENTB and ENTC). ENTA has two fields, FLDA_1 and FLDA_2; ENTB has only the FLDB_1 field; and FLDC_1 and FLDC_2 are ENTC fields.

Code	Trigger
Y20	<DTLE> from ENTA
Y25	<LOCK> from ENTA
Y40	<DTLF> from FLDA_1.ENTA
Y60	<DTLF> from FLDA_2.ENTA
Y82	<HLPE> from ENTB
Y112	<FGF> from FLDB_1.ENTB
Y129	<LMK> from ENTC
Y140	<DTLF> from FLDC_1.ENTC
Y160	<DTLF> from FLDC_2.ENTC
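The numbering in this example can be reproduced with the Python sketch below. It assumes that object numbers start at 20 and grow by 20 per object, with each entity followed by its own fields, which is what the example suggests. The function names are ours, and only some of the offsets from Tables 3 and 4 are included.

```python
ENTITY_OFFSETS = {"DTLE": 0, "AIO": 1, "HLPE": 2, "OGF": 3, "RMO": 4,
                  "LOCK": 5, "ERRE": 7, "LPO": 8, "LMK": 9, "MNUE": 10}
FIELD_OFFSETS = {"DTLF": 0, "ERRF": 1, "HLPF": 2, "FMT": 3, "DFMT": 4,
                 "SMOD": 5, "ENCR": 7, "DECR": 8, "FGF": 12, "VLDF": 16}


def object_numbers(layout) -> dict:
    """Number the objects: layout is a list of (entity, [fields]) in order."""
    numbers, current = {}, 20
    for entity, fields in layout:
        numbers[entity] = current
        current += 20
        for field in fields:
            numbers[f"{field}.{entity}"] = current
            current += 20
    return numbers


def ycode(numbers: dict, obj: str, trigger: str) -> str:
    """Entry name of a trigger: object number plus the trigger offset."""
    offsets = FIELD_OFFSETS if "." in obj else ENTITY_OFFSETS
    return f"Y{numbers[obj] + offsets[trigger]}"


layout = [("ENTA", ["FLDA_1", "FLDA_2"]), ("ENTB", ["FLDB_1"]),
          ("ENTC", ["FLDC_1", "FLDC_2"])]
numbers = object_numbers(layout)
print(ycode(numbers, "FLDB_1.ENTB", "FGF"))
```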

Appendix B. Proc instruction codes (pic)

	PIC Hex Code	Sentence
18	0x12	Debug
19	0x13	Nodebug
	0x24	Message. Two tokens: instruction code and message text. E.g. 0x24 0x08 0xC8 0x41 0x4E 0x59 is the binary translation of Message "ANY"
26	0x1A	Edit
36	0x21	Clear
54	0x36	Clear/e. The string containing the entity name follows the PIC
8	0x28	Assign (=). After the PIC, the next token is the data reference: a global register ($1 to $99; remember how global registers are referenced!); 0x64: $status; 0x63: $result; an indirection if the register code is preceded by 0xC6 (e.g. 0xC6 0x0A is @$2); 0xC1 for a component variable; 0xC1 0x08 for a global variable; 0xDD for a field

unome.zip for 7.2.5 (00Kb)

Download a segment browser.