Uniface Underground -> The.Uniface.geNOME.project. |
Introduction
Understanding the format of the Uniface binary files is useful for designing new tools and utilities that work with this kind of file. In fact, the $UUU site hosts a tool that lets you switch off the MUM (the mark set by the /nodebug idf switch).
The goal of the UNOME (Uniface + Genome) project is to provide a place where the Uniface community's knowledge about binary files is collected and ordered.
The information provided here is unofficial. It comes from a study of the binary files, so it may be inexact, wrong or imprecise in places.
State of this document: ALIVE!!!.
Index
Segments description
Appendix A. Entry name for triggers
Appendix B. Proc instruction codes
The binary files are divided into segments, each holding a different kind of information needed to execute the Uniface component.
Each segment is delimited by the character 0x01 at its end. Several segments contain different attributes delimited by the 0x02 character (or others…).
Unless stated otherwise, all the information here relates to NON ZIPPED components.
When a letter identifies a segment (as in the P-segment or the V-segment), that letter is the first character of the segment.
Segments description
The first three characters of a binary file specify the kind of file in which the component is stored. They are:
NOZ. For NON ZIPPED files
ZIP. For compressed files.
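As a rough illustration of the layout described so far, the following Python sketch reads the three-character kind marker and splits a NON ZIPPED file on the 0x01 delimiter. The function names and the sample bytes are made up for this example, and stripping the trailing 0x00 bytes is an assumption based on the padding mentioned for the last segment.

```python
def file_kind(data: bytes) -> str:
    """Return the three-character kind marker (NOZ or ZIP)."""
    return data[:3].decode("ascii", errors="replace")


def split_segments(data: bytes) -> list[bytes]:
    """Split a NON ZIPPED component into its 0x01-terminated segments."""
    body = data.rstrip(b"\x00")        # assumption: trailing 0x00 is padding
    parts = body.split(b"\x01")
    if parts and parts[-1] == b"":     # the last 0x01 closes the last segment
        parts.pop()
    return parts


# Made-up bytes, only to show the calls; not taken from a real component.
sample = b"NOZ...\x01Pcode\x01Ttail\x01" + b"\x00" * 4
print(file_kind(sample))               # NOZ
print(split_segments(sample))          # [b'NOZ...', b'Pcode', b'Ttail']
```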
This segment contains these attributes:
Attr.# | Description
1 | !17
2 |
3 | Component name
4 | Name of the attached library
5 |
6 |
Third segment. Mum and other stuff.
The use of this segment is not well understood, but there are some facts about it:
This segment contains the description of the properties for each item in the form. Notice that some properties need a "link" to other segments (e.g. "Initial value" to the "K-segment").
Initial values for each item are stored (if needed) in structures separated from one another by the character 0x03.
Each structure contains this information:
O-segment. Objects enumeration.
All objects and their properties are enumerated here.
The structure of this segment consists of a set of values enclosed by the character 0x03. Between two 0x03 characters we find the item length in charHex code (see Numeric constant codification), the character 0x06 and the item name. Notice that the length is not physical, because an "escaped" character such as %%^ (0x07 0x2D) counts as 1 char.
0x03 0x37 0x06 0x45 0x4E 0x54 0x49 0x54 0x41 0x54 0x03 is the definition for an object called "ENTITAT".
The objects are enumerated according to their position in the component (left to right and top to bottom).
The component variables are defined in this segment in the same way as other objects.
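The enumeration can be read with a few lines of Python. This is only a sketch: the function names are hypothetical, the value in front of 0x06 is decoded as charHex digits ("0" to "9" and ":" to "?" for 10 to 15), and the test input is built by hand from the name "ENTITAT".

```python
def charhex_to_int(digits: bytes) -> int:
    """Decode charHex digits: '0'..'9' are 0..9 and ':'..'?' are 10..15."""
    value = 0
    for b in digits:
        value = value * 16 + (b - 0x30)
    return value


def parse_o_segment(segment: bytes) -> list[tuple[int, str]]:
    """Return (charHex value, object name) for every 0x03-enclosed item."""
    objects = []
    for chunk in segment.split(b"\x03"):
        if b"\x06" not in chunk:
            continue                   # segment letter or an empty gap
        prefix, name = chunk.split(b"\x06", 1)
        objects.append((charhex_to_int(prefix), name.decode("latin-1")))
    return objects


print(parse_o_segment(b"O\x03" + b"7\x06ENTITAT" + b"\x03"))  # [(7, 'ENTITAT')]
```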
General structure
The P-segment (proc code segment) basically uses two characters as separators between its different structures. The first character of this segment is the character "P".
The main structure in this segment is the BCB (binary code block), which is delimited by the character 0x7F. A BCB contains a set of "fields" that we call "tokens", and each token is delimited by the character 0x08 (0x08 is the codification of the value 0 (zero) as a small number; see how small numbers are codified in the "Numeric constant codification" section). A sentence can be one or more tokens.
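A naive way to look at this structure is to cut the segment on 0x7F and each BCB on 0x08, as in the sketch below. The function name is hypothetical, and the approach is deliberately simplistic: since 0x08 is also the small number zero, a token that carries a zero value would be cut in two.

```python
def split_p_segment(segment: bytes) -> list[list[bytes]]:
    """Split a P-segment into BCBs (0x7F) and each BCB into tokens (0x08)."""
    if not segment.startswith(b"P"):
        raise ValueError("not a P-segment")
    blocks = []
    for bcb in segment[1:].split(b"\x7f"):
        if bcb:                        # skip the empty piece after the last 0x7F
            blocks.append(bcb.split(b"\x08"))
    return blocks


# Hand-made input: two BCBs, the first with two tokens.
print(split_p_segment(b"P\x24\x08\xc8ANY\x7f\x12\x7f"))
```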
String constant codification
The character 0xC8 indicates that the following characters, until the end of the token, are a constant string.
A substitution (something like "The current date is %$DATE$ ") is codified by placing, at the position where the substitution must be done, the character 0x09 followed by the variable address (2 bytes) and type (1 byte).
Notice that a LIST is stored as a normal string (the only difference is that it can contain <GOLD => and <GOLD ;>).
GOLD characters
0x10: Group wildcard (<GOLD *> or *)
0x11: Single wildcard (<GOLD ?> or ?)
0x12: Value and representation separator (<GOLD => or ?)
0x1B: List item separator (<GOLD ;> or ;)
Others
Carriage Return %%^ (0x07 0x2D)
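To make a string token readable, the special codes above can be replaced by their textual names. The sketch below does that in Python; the function name and the "<substitution>" placeholder are inventions of this example, and it assumes the three bytes after 0x09 are always stored unescaped.

```python
GOLD = {0x10: "<GOLD *>", 0x11: "<GOLD ?>", 0x12: "<GOLD =>", 0x1B: "<GOLD ;>"}


def decode_string_token(token: bytes) -> str:
    """Render a 0xC8 string token with GOLD characters spelled out."""
    if not token.startswith(b"\xc8"):
        raise ValueError("token is not a string constant")
    out = []
    i = 1
    while i < len(token):
        b = token[i]
        if b == 0x07 and token[i + 1:i + 2] == b"\x2d":
            out.append("%%^")              # escaped carriage return
            i += 2
        elif b == 0x09:
            out.append("<substitution>")   # 0x09 + 2 address bytes + 1 type byte
            i += 4
        elif b in GOLD:
            out.append(GOLD[b])
            i += 1
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)


print(decode_string_token(b"\xc8A\x12a\x1bB\x12b"))  # A<GOLD =>a<GOLD ;>B<GOLD =>b
```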
Numeric constant codification
Numeric constants are codified in different ways, depending on their values. The first char indicates how the value is stored, and it can be:
Store code | Codification
0xFD | Small numbers. The integers 0 to 99 are codified using this method. The value is the binary value of the code. The problem with this method is that the codes 0x01 to 0x07 are reserved as delimiters, so the values 0 to 15 get a special codification (the examples suggest 0x08 to 0x0F for 0 to 7, and 0x07 followed by 0x28 to 0x2F for 8 to 15). Using this method you will see that 0x63 = 99; 0x32 = 50; 0x09 = 1; 0x0B = 3; 0x08 = 0; 0x07 0x28 = 8; 0x07 0x2A = 10
0xFC | Characters. The value is stored as an ASCII string (e.g. 2: 0x32, 0: 0x30…). The end of this string is 0x08
0xFE | 2 bytes. Take the real value of each byte as if it were a small number, then calculate the value as: byte1 + byte2*256. Example: the number 1000 is stored as "0xE8; 0x0B" (after 0xFE), which corresponds to "232; 3", so 232 + 3*256 = 1000.
0xFF | 4 bytes. Take the real value of each byte as if it were a small number, then calculate the value as: byte1 + byte2*256 + byte3*256^2 + byte4*256^3. Example: the number 123456 is stored as "0x40; 0xE2; 0x09; 0x08" (after 0xFF), which corresponds to "64; 226; 1; 0", so 64 + 226*256 + 1*256^2 + 0*256^3 = 123456.
0x04 ? | CharHEX. The value is codified using several bytes (until 0x06?). These bytes contain the "character representation" of the hex digits. It means that "0x19" is codified as "0x31 0x39" (the ASCII values of char "1" and char "9"). Be careful, because the digits "A" to "F" are represented as ":" to "?" (in an ASCII table you will see that these characters follow the sequence "0" to "9"). As a general rule you can calculate the value with a formula like: NumValue := Sum((ascii(ByteN) - ascii("0")) * 16^N) {N = 0..NumBytes-1}, where Byte0 is the last (least significant) byte
0x02 | Hex Value. The value is the hex value of the byte (until 0x06)
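The small number, character, 2-byte and 4-byte methods can be put together in a short Python sketch. The function names are hypothetical. Two things are assumptions: the mapping of 0 to 7 onto 0x08 to 0x0F and of 8 to 15 onto 0x07 plus 0x28 to 0x2F (inferred from the examples in the table), and reading the 0xFC string as an integer.

```python
def small_number(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one small number; return (value, position after it)."""
    b = data[pos]
    if b == 0x07:                      # escape: next byte is 0x20 + value (8..15)
        return data[pos + 1] - 0x20, pos + 2
    if 0x08 <= b <= 0x0F:              # values 0..7 are shifted up by 8
        return b - 0x08, pos + 1
    return b, pos + 1                  # 16 and above: the byte is the value


def decode_number(data: bytes) -> int:
    """Decode a numeric constant that starts with its store code."""
    code, pos = data[0], 1
    if code == 0xFD:
        return small_number(data, pos)[0]
    if code == 0xFC:
        end = data.index(0x08, pos)    # the ASCII digits end at 0x08
        return int(data[pos:end].decode("ascii"))
    if code in (0xFE, 0xFF):
        total = 0
        for n in range(2 if code == 0xFE else 4):
            value, pos = small_number(data, pos)
            total += value * 256 ** n  # least significant byte first
        return total
    raise ValueError(f"unknown store code 0x{code:02X}")


print(decode_number(bytes([0xFE, 0xE8, 0x0B])))              # 1000
print(decode_number(bytes([0xFF, 0x40, 0xE2, 0x09, 0x08])))  # 123456
```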
Data references
Global registers are codified using codes for small numbers (see table below) without 0xFD
0x64: $status
0x63: $result
The indirection is codified preceding the register code by the character 0xC6 (eg. 0xC6 0x0A is @$2)
Component variables are codified as 0xC1 0x0C followed by the "variable address". This "address" is the order (first is 0) in the next BCB for data declaration.
Global variables are codified as 0xC1 and next token is the "address"
Fields are identified by code 0xDD
BCB types
The first BCB, which we always find right after the segment identification (the character "P"), is composed of several tokens:
First: the value 0x06 (suspicion: it is "the classical 1 done line at the beginning of the proc listing", or maybe it is the BCB type).
The second piece is composed of 5 different tokens, but so far we have always found the hex string "0x07 0x2A 0x08 0x08 0x0C 0x08 0x08 0x80 0x0B". (State: we have no idea what this piece is used for.)
After this BCB, each following one corresponds either to a trigger/entry (sentences) or to a parameter/variable declaration (data definition). The BCB with the parameter/variable declarations of a trigger/entry is stored right after its "sentences BCB".
BCB for data definition
- Global and local declaration
Each trigger contains an automatically generated declaration of every local ($V$) and global ($$V) variable used in the proc code of this trigger. The declaration (enumeration may be a better word) consists of two tokens: name and "visibility". The visibility can be "0x0C" for global variables and "0x0D" for locals.
Eg. 0x59 0x36 0x08 0x0C 0x08 0x59 0x4C 0x31 0x08 0x0D 0x08 is the bin code for a trigger which uses $$V1 and $VL1$
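The name/visibility pairs can be listed with a small Python sketch like this one. The function name is hypothetical and the input is hand-made (the names V1 and VL1 written as plain ASCII), so it only illustrates the token layout described above.

```python
VISIBILITY = {b"\x0c": "global", b"\x0d": "local"}


def parse_declarations(bcb: bytes) -> list[tuple[str, str]]:
    """Pair up the name and visibility tokens of a declaration BCB."""
    tokens = bcb.split(b"\x08")
    if tokens and tokens[-1] == b"":   # the last 0x08 closes the last token
        tokens.pop()
    pairs = []
    for name, vis in zip(tokens[0::2], tokens[1::2]):
        pairs.append((name.decode("latin-1"), VISIBILITY.get(vis, "unknown")))
    return pairs


# Hand-made input: V1 declared as global, VL1 as local.
print(parse_declarations(b"V1\x08\x0c\x08VL1\x08\x0d\x08"))
# [('V1', 'global'), ('VL1', 'local')]
```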
- Parameters codification
The parameter definition has this token sequence:
Direction and type. The first (and maybe second, see * in the table) byte. This byte is calculated by adding a value that depends on the parameter type and a value that depends on the direction. See the tables below:
BASE VALUE DEPENDING ON THE TYPE
Type | Value
Any | 1
String | 2
Numeric | 3
Float | 4
Date | 5
Time | 6
Datetime | 7
Raw (*) | 8
Boolean (*) | 9
Lineardate (*) | A
Lineartime (*) | B
Lineardatetime (*) | C
Image (*) | D
VALUE TO ADD DEPENDING ON THE DIRECTION
Direction | Value
IN (*) | 0
OUT | C8
INOUT | 64
(*) The value is stored using the "small numbers" method described in the "Numeric constant codification" section. So the types marked with "*" are codified in the "IN" direction as: 0x07 0x28; 0x07 0x29; 0x07 0x2A; 0x07 0x2B; 0x07 0x2C; 0x07 0x2D
Parameter name. The next bytes (until the end of the token, 0x08).
End of the parameter. The next token: 0x0E.
- Variable codification
Variables are codified exactly as "IN parameters" (including raw, boolean, lineardate, lineartime, lineardatetime and image). The only difference is that the next token, "end of variable", is 0x0F (instead of "0x0E" for parameters).
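The "type plus direction" byte can be taken apart as in the following Python sketch. The function name is hypothetical. It assumes that the IN types 1 to 7 follow the same small number codification as the types marked with "*", and that OUT and INOUT codes (100 and above) are stored as a plain byte.

```python
TYPES = {1: "Any", 2: "String", 3: "Numeric", 4: "Float", 5: "Date",
         6: "Time", 7: "Datetime", 8: "Raw", 9: "Boolean", 10: "Lineardate",
         11: "Lineartime", 12: "Lineardatetime", 13: "Image"}
DIRECTIONS = {0: "IN", 0x64: "INOUT", 0xC8: "OUT"}


def decode_param_code(data: bytes) -> tuple[str, str]:
    """Return (direction, type) from the first byte(s) of a definition."""
    b = data[0]
    if b == 0x07:                      # escaped small number (8..15)
        value = data[1] - 0x20
    elif 0x08 <= b <= 0x0F:            # small number 0..7 (assumption)
        value = b - 0x08
    else:
        value = b                      # plain byte
    direction = (value // 100) * 100   # 0, 100 (0x64) or 200 (0xC8)
    return DIRECTIONS[direction], TYPES[value % 100]


print(decode_param_code(b"\x07\x28"))  # ('IN', 'Raw')
print(decode_param_code(b"\xca"))      # ('OUT', 'String')
```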
T-segment. Window Properties.
This segment contains a list with the window properties (each delimited by 0x1B). These items are:
Item | Description
CAPTION | These items are the "options" for the window. They can be set to "F" for false, "T" for true or null for "default"
CANRESIZE |
CANZOOM |
CANCLOSE |
SYSMENU |
CANICONIZE |
IMAGE | Background image. This item contains the name preceded by a caret (^) or an at sign (@), depending on the image type (glyph or file)
MODAL | Modality (T, F or null)
ATTACHED | Attached (T, F or null). If UTYPE=TAB it is set to "PAGE"
DIRECTION | 1 for RTL, 2 for LTR and null for default
SPLIT | This is a list describing the layout of the "split bars" (if defined). The items are: (?, position?, orientation, style, attached, lock, (left/up), (right/down)). Example: ( 58,16,v,r,t,n,(57,10,h,b,n,n,p,p),p)
UTYPE | Window type (PRIMARY, SECONDARY, DIALOG, TAB, null)
DIALOG | When UTYPE=DIALOG it is set to "T"
The rest of the properties (More… command button) are added as further items at the end of this list.
U-segment. List of operations.
This segment contains the list of operations available in the component.
In this segment the compilation time is stamped in the format YYYYMMDDhhmmss00 preceded by the string "ID". Example: VID2000042408040200
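The stamp is easy to extract. The sketch below looks for "ID" followed by sixteen digits and turns the first fourteen into a date and time; the function name is hypothetical and the final "00" is simply ignored.

```python
import re
from datetime import datetime
from typing import Optional


def compile_time(segment: bytes) -> Optional[datetime]:
    """Find the "ID" + YYYYMMDDhhmmss00 stamp and return it as a datetime."""
    match = re.search(rb"ID(\d{14})\d{2}", segment)
    if match is None:
        return None
    return datetime.strptime(match.group(1).decode("ascii"), "%Y%m%d%H%M%S")


print(compile_time(b"VID2000042408040200"))  # 2000-04-24 08:04:02
```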
This segment contains the character 0x00, repeated until the component size reaches a multiple of 512.
Appendix A. Entry name for triggers
The name of each trigger is "translated" to an entry name. This name is the "Yxxx" string that you can see when you generate a "proc listing".
In fact the triggers are converted to "entries" that are managed internally. Even so, you can call one using something like "call Y1" to execute the code in the accept trigger. This section explains how the number after the "Y" is calculated.
Remember, however, that a trigger may or may not appear in a BCB depending on its functionality (e.g. an LPMX cannot generate a Yxx code because it is not a piece of code; it contains "entries", and these entries can be generated).
Table 1. Start Up Shell
Y-code | Trigger code | Trigger
1 | APPL | Application execution
2 | SWIT | <Switch keyboard>
3 | MNUA | <Menu>
4 | UKYA | <User key>
5 | PULA | <Pulldown>
6 | ASYN | Asynchronous Interrupt
Table 2. Components
The numbering of these triggers has a peculiarity: if the "Operations" trigger is defined, the Y-codes listed are correct, but if there is no OPER trigger, the codes for triggers 7 to 17 must be decreased by 1 (CLR becomes 6, FRLF 10, and so on).
Y-code | Trigger code | Trigger
1 | ACPT | <Accept>
2 | QUIT | <Quit>
3 | ASYS | Asynchronous Interrupt
4 | LPMX | Local Proc Modules
5 | EXEC | Execute
6 | OPER | Operations
7 | CLR | <Clear>
8 | ERAS | <Erase>
9 | PULS | <Pulldown>
10 | FRGF | Form Gets Focus
11 | FRLF | Form Leave Focus
12 | UKYS | <User Key>
13 | MNUS | <Menu>
14 | RETS | <Retrieve Sequential>
15 | RETR | <Retrieve>
16 | PRNT | <Print>
17 | STOR | <Store>
Table 3. Entity
See the note and example about numbering entities and field triggers at the end of this appendix
Y-code (Offset) | Trigger code | Trigger
0 | DTLE | <Detail>
1 | AIO | <Add/Insert Occurrence>
2 | HLPE | <Help>
3 | OGF | Occurrence Gets Focus
4 | RMO | <Remove occurrence>
5 | LOCK | Lock
7 | ERRE | Error
8 | LPO | Leaves printed occurrence
9 | LMK | Leaves modified key
10 | MNUE | <Menu>
11 | READ | Read
12 | DLUP | Delete Up
13 | WRIT | Write
14 | DELE | Delete
15 | WRUP | Write Up
16 | VLDO | Validate Occurrence
17 | VLDK | Validate Key
19 | LPME | Local Proc Modules
Table 4. Field
See the note and example about numbering entities and field triggers at the end of this appendix
Y-code (Offset) | Trigger code | Trigger
0 | DTLF | <Detail>
1 | ERRF | Error
2 | HLPF | <Help>
3 | FMT | Format
4 | DFMT | Deformat
5 | SMOD | Start Modification
7 | ENCR | Encrypt
8 | DECR | Decrypt
9 | NFLD | <Next Field>
10 | PFLD | <Previous Field>
11 | MNUF | <Menu>
12 | FGF | Field Gets Focus
13 | VALC | Value Change
16 | VLDF | Validate Field
19 | LPMF | Local Proc Modules
Note about numbering entities and field triggers.
For entities and fields the specified code is the "offset" you must add to the "object number". Think of it as a gap of twenty codes being reserved for every object, with the codes 1 to 19 reserved for the component-level triggers.
Example: consider a form with 3 entities (with the very original names ENTA, ENTB and ENTC). ENTA has two fields, FLDA_1 and FLDA_2; ENTB has only the FLDB_1 field; and FLDC_1 and FLDC_2 are ENTC fields.
Code | Trigger
Y20 | <DTLE> from ENTA
Y25 | <LOCK> from ENTA
Y40 | <DTLF> from FLDA_1.ENTA
Y60 | <DTLF> from FLDA_2.ENTA
Y82 | <HLPE> from ENTB
Y112 | <FGF> from FLDB_1.ENTB
Y129 | <LMK> from ENTC
Y140 | <DTLF> from FLDC_1.ENTC
Y160 | <DTLF> from FLDC_2.ENTC
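The rule behind this example can be written as a one-line calculation: twenty times the object number plus the trigger offset. In the Python sketch below the function name is hypothetical, and the object order (each entity followed by its own fields, numbered from 1) is what the example suggests rather than a confirmed rule.

```python
def y_code(objects: list[str], obj: str, offset: int) -> str:
    """Entry name of a trigger: 20 codes per object plus the trigger offset."""
    number = objects.index(obj) + 1    # the first object is number 1
    return f"Y{20 * number + offset}"


# Objects of the example form, in the order suggested by the example.
objects = ["ENTA", "FLDA_1.ENTA", "FLDA_2.ENTA", "ENTB", "FLDB_1.ENTB",
           "ENTC", "FLDC_1.ENTC", "FLDC_2.ENTC"]

print(y_code(objects, "ENTA", 5))          # Y25  (LOCK, offset 5)
print(y_code(objects, "FLDB_1.ENTB", 12))  # Y112 (FGF, offset 12)
print(y_code(objects, "ENTC", 9))          # Y129 (LMK, offset 9)
```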
Appendix B. Proc instruction codes (pic)
PIC | Hex Code | Sentence
18 | 0x12 | Debug
19 | 0x13 | Nodebug
| 0x24 | Message. Two tokens: instruction code and message text. E.g. 0x24 0x08 0xC8 0x41 0x4E 0x59 is the binary translation of Message "ANY"
26 | 0x1A | Edit
36 | 0x21 | Clear
54 | 0x36 | Clear/e. The string containing the entity name follows the PIC
8 | 0x28 | Assign (=). After the PIC, the next token is the data reference: a global register ($1 to $99; remember how global registers are referenced!), 0x64 for $status, 0x63 for $result, an indirection if the register is preceded by 0xC6 (e.g. 0xC6 0x0A is @$2), 0xC1 for a component variable, 0xC1 0x08 for a global variable, 0xDD for a field
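With the codes of this table, a first "disassembler" step could look like the Python sketch below. The dictionary only holds the codes listed here, the function name is hypothetical, and the sentence is cut into tokens with a plain split on 0x08.

```python
PIC_NAMES = {0x12: "Debug", 0x13: "Nodebug", 0x1A: "Edit", 0x21: "Clear",
             0x24: "Message", 0x28: "Assign", 0x36: "Clear/e"}


def describe_sentence(sentence: bytes) -> tuple[str, list[bytes]]:
    """Return the instruction name and the remaining tokens of a sentence."""
    tokens = sentence.split(b"\x08")
    if not tokens[0]:
        raise ValueError("sentence has no instruction code")
    pic = tokens[0][0]
    return PIC_NAMES.get(pic, f"unknown 0x{pic:02X}"), tokens[1:]


# The Message "ANY" example from the table.
print(describe_sentence(bytes([0x24, 0x08, 0xC8, 0x41, 0x4E, 0x59])))
# ('Message', [b'\xc8ANY'])
```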