PROCEDURES
Star4Win includes several standard procedures that can simplify many of your tasks. Many procedures supported by the DOS version (such as COLOR and ADDCHAR) are no longer included, but many others remain, often in an improved form.
- COLUMN
- This procedure rearranges your text file into a specified number of text columns, which is useful for printing indices etc. The output is rather rough, though; in a Windows environment we recommend using the standard facilities of a word processor such as Microsoft Word.
- COMPOSE
- Procedure COMPOSE transforms two (or more) records with partly identical content into one record: the coinciding parts of these records remain intact, while the content of the fields that do not coincide is COMPOSEd, i.e. enumerated inside one record and separated by the chosen delimiter.
As a rule, procedure COMPOSE should be applied only to files that have previously been SORTed on some field (or on a concatenation of several fields); this field (these fields) should be declared as "base", i.e. containing "basic information".
If the content of all "base" fields in two neighbouring records is identical, the procedure COMPOSE affects the content of the "composable" fields. The content of every "composable" field in all such neighbouring records will be COMPOSEd, i.e. replaced with the concatenation of all character data in these records, separated by the chosen delimiter (if the composable field is a character field), or with the arithmetical/logical sum of all data in these records (if the composable field is numeric or logical). As far as the composition of character fields is concerned, the procedure COMPOSE is the inverse of the procedure DECOMPOSE.
Suppose we have a database file germengl.dbf with the character fields GERM and ENGL, previously SORTed on the field GERM, and we choose the delimiter ";".
| Recno() | GERM | ENGL |
| 1 | der Monat | month |
| 2 | der Monat | moon |
| 3 | die Stunde | hour |
| 4 | die Stunde | lesson |
| 5 | die Uhr | hour |
| 6 | die Uhr | time |
| 7 | die Uhr | clock |
| 8 | die Uhr | watch |
Procedure COMPOSE will unite the following records: (2) will be joined to (1); (4) to (3); (6), (7) and (8) to (5). As a result, we get the following database file:
| Recno() | GERM | ENGL |
| 1 | der Monat | month; moon |
| 2 | die Stunde | hour; lesson |
| 3 | die Uhr | hour; time; clock; watch |
The default delimiter is a comma (",").
Another example: suppose we have a database payment.dbf with two character fields FIRSTN and LASTN and a numeric field PAID, previously SORTed on LASTN+FIRSTN:
| Recno() | FIRSTN | LASTN | PAID |
| 1 | John | Cook | 10 |
| 2 | John | Cook | 5 |
| 3 | Mary | Cook | 8 |
| 4 | Samuel | Harris | 20 |
| 5 | Samuel | Harris | 15 |
| 6 | George | Smith | 22 |
| 7 | George | Smith | 40 |
| 8 | Peter | Smith | 11 |
| 9 | Peter | Smith | 17 |
Assuming you have declared the field PAID as "composable", the procedure COMPOSE will unite the following records: (2) will be joined to (1); (5) to (4); (7) to (6); (9) to (8); and the corresponding PAIDs will be summed. Thus, you should get:
| Recno() | FIRSTN | LASTN | PAID |
| 1 | John | Cook | 15 |
| 2 | Mary | Cook | 8 |
| 3 | Samuel | Harris | 35 |
| 4 | George | Smith | 62 |
| 5 | Peter | Smith | 28 |
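The grouping rule described for COMPOSE can be illustrated with a short Python sketch. This is not Star4Win code: the function name `compose`, the list-of-dictionaries record layout and the space added after the delimiter (to match the sample output above) are all assumptions made for the illustration.

```python
from itertools import groupby


def compose(records, base_fields, composable_fields, delimiter=","):
    """Unite neighbouring records whose base fields are identical.

    `records` is assumed to be already sorted on the base fields.
    """
    def base_key(record):
        return tuple(record[name] for name in base_fields)

    result = []
    for _, group in groupby(records, key=base_key):
        group = list(group)
        merged = dict(group[0])  # base fields are taken from the first record
        for name in composable_fields:
            values = [record[name] for record in group]
            if isinstance(values[0], str):
                # Character field: enumerate the values.
                # Assumption: a space follows the delimiter, as in the sample tables.
                merged[name] = (delimiter + " ").join(values)
            elif isinstance(values[0], bool):
                # Logical field: logical sum (OR).
                merged[name] = any(values)
            else:
                # Numeric field: arithmetical sum.
                merged[name] = sum(values)
        result.append(merged)
    return result


payments = [
    {"FIRSTN": "John", "LASTN": "Cook", "PAID": 10},
    {"FIRSTN": "John", "LASTN": "Cook", "PAID": 5},
    {"FIRSTN": "Mary", "LASTN": "Cook", "PAID": 8},
]
totals = compose(payments, ["LASTN", "FIRSTN"], ["PAID"])
# totals holds two records: John Cook with PAID 15 and Mary Cook with PAID 8
```

Because only neighbouring records are compared, the input has to be sorted on the base fields first, which is why the procedure is recommended only for previously SORTed files.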
- CONVERT
- CONVERT includes a number of programs for converting database files to text files and vice versa. Also included are programs (which may be of some use to Sinologists and Japanologists) for converting STAR4WIN files into various Chinese formats and vice versa.
There is also an option for converting text files from the LEXICON text format into STAR4WIN, as well as for converting STAR4WIN files to HTML format.
You can also use conversion tables for converting. A conversion table is a DBF with two fields (source and target). You can use such files for any symbolic conversions in your text or DBF files.
As for exporting and importing files from other systems, STAR4WIN has a procedure for converting from FOXPRO format. Utilities supporting Microsoft Excel and Access databases are in preparation.
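The idea of a conversion table (pairs of source and target strings) can be sketched in Python as follows. The function name `apply_conversion_table` is hypothetical, and the sketch assumes the pairs are applied one after another in table order; how Star4Win itself orders or matches the entries is not described here.

```python
def apply_conversion_table(text, table):
    """Replace every occurrence of each source string with its target.

    `table` is a list of (source, target) pairs, applied in the given order
    (an assumption of this sketch).
    """
    for source, target in table:
        text = text.replace(source, target)
    return text


# A hypothetical table that transliterates German special characters.
umlaut_table = [("ä", "ae"), ("ö", "oe"), ("ü", "ue"), ("ß", "ss")]
converted = apply_conversion_table("Größe", umlaut_table)
# converted == "Groesse"
```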
- DECOMPOSE
- You may DECOMPOSE any character field that contains a list of enumerated items. Suppose you have a database germengl.dbf with two character fields GERM and ENGL, and the field ENGL contains some enumerated data (delimited with ";"):
| Recno() | GERM | ENGL |
| 1 | der Monat | month; moon |
| 2 | die Stunde | hour; lesson |
| 3 | die Uhr | hour; time; clock; watch |
Procedure DECOMPOSE transforms such a file into a "simpler" database file where no delimiters are needed, and all the corresponding data in the chosen fields stand in one-to-one correspondence:
| Recno() | GERM | ENGL |
| 1 | der Monat | month |
| 2 | der Monat | moon |
| 3 | die Stunde | hour |
| 4 | die Stunde | lesson |
| 5 | die Uhr | hour |
| 6 | die Uhr | time |
| 7 | die Uhr | clock |
| 8 | die Uhr | watch |
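The transformation shown in the two tables above can be sketched in Python. This is only an illustration: the function name `decompose` and the list-of-dictionaries record layout are assumptions, as is the stripping of blanks around each item.

```python
def decompose(records, field, delimiter=";"):
    """Split an enumerated character field into one record per item."""
    result = []
    for record in records:
        for item in record[field].split(delimiter):
            new_record = dict(record)         # copy all other fields unchanged
            new_record[field] = item.strip()  # assumption: surrounding blanks are dropped
            result.append(new_record)
    return result


entries = [
    {"GERM": "der Monat", "ENGL": "month; moon"},
    {"GERM": "die Stunde", "ENGL": "hour; lesson"},
]
single = decompose(entries, "ENGL")
# single holds four records: month, moon, hour, lesson
```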
- ITALICIZE
- This procedure processes a text file and marks unrecognized words with a specified style mark (Italic by default). You may specify any other delimiters (e.g., \B for Bold, etc.). The procedure is based on the StarLing morphological analysis for Russian and English. The output, of course, still needs to be processed manually, but a fairly high proportion of the words that need to be italicized are processed correctly.
The output is stored in a file with the extension .itc.
- LANGINDEX
- This procedure creates a word index to any paginated text file. You must specify the opening and closing delimiters that you use for marking the words you need. If you do not specify the closing delimiter, ' ' (space) is assumed.
The program needs a database file with three fields: LANGNAME, WORD and PAGES (all of them must be of type Character). If such a file exists, it is opened and updated; if not, it is created automatically.
The program searches your text file for all delimited sequences and places them into the field WORD. The field LANGNAME is filled with the preceding word in the text, and the field PAGES with the page where the word occurred. Thus, if your text on page 5 contains the sequence "Turk. \Ikitab\i" and you have specified "\I" and "\i" as delimiters, you will have a record in the .dbf file with "Turk." in the field LANGNAME, "kitab" in the field WORD and "5" in the field PAGES.
If, in addition, you have on page 10 the sequence "Arab., Turk. \Ikitab\i", you will obtain the following records in the .dbf file:
| LANGNAME | WORD | PAGES |
| Arab. | kitab | 10 |
| Turk. | kitab | 5,10 |
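The behaviour in the "kitab" example can be approximated with the Python sketch below. It is not the actual implementation: the function name `lang_index` is hypothetical, the pages are passed in as (page number, text) pairs because the way page breaks are marked in the text file is not described here, and the handling of a comma-separated run of language names is inferred from the example only.

```python
import re


def lang_index(pages, open_delim="\\I", close_delim="\\i"):
    """Collect (LANGNAME, WORD, PAGES) records from paginated text."""
    pattern = re.compile(re.escape(open_delim) + r"(.*?)" + re.escape(close_delim))
    index = {}  # (langname, word) -> list of page numbers as strings
    for page_no, text in pages:
        for match in pattern.finditer(text):
            word = match.group(1)
            before = text[:match.start()].split()
            # Take the preceding word; keep walking back while the word
            # before it ends with a comma (e.g. "Arab., Turk.").
            langs = []
            while before:
                langs.append(before.pop().rstrip(","))
                if not (before and before[-1].endswith(",")):
                    break
            for lang in langs:
                page_list = index.setdefault((lang, word), [])
                if str(page_no) not in page_list:
                    page_list.append(str(page_no))
    return sorted((lang, word, ",".join(p)) for (lang, word), p in index.items())


sample_pages = [
    (5, "Compare Turk. \\Ikitab\\i 'book'."),
    (10, "See Arab., Turk. \\Ikitab\\i again."),
]
records = lang_index(sample_pages)
# records == [("Arab.", "kitab", "10"), ("Turk.", "kitab", "5,10")]
```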
- MERGE
- This procedure merges the contents of two database files with identical structure but different records (e.g., a database edited by two or more people on different computers).
The database fields must have identical names and follow in the same order in both database files. The contents of the databases are compared and written to a third database (for which you must specify a name).
All fields in the resulting database file are of variable length. The contents of the fields / records are compared in the following way: if a record field is identical in both databases, nothing is done; if a record field in one database is contained in the same record field of the second database (i.e. forms a substring of the latter), the larger record field is copied to the resulting database, followed by the marker "||" (to draw your attention to it); if the record fields are simply different (neither is a substring of the other), they are concatenated with the separator "||".
Note that the procedure does not perform any other checking of the record contents. So if you process one database on remote computers, do not insert new records (append them to the bottom of the file instead), do not delete records and do not sort the file: all these operations will invalidate the MERGE procedure (it will work, but with unpredictable results).
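The three-way comparison rule that MERGE applies to each pair of fields can be written down as a small Python sketch. The function name `merge_field` is hypothetical, and the sketch assumes that an identical field is simply carried over unchanged to the resulting database.

```python
def merge_field(first, second, marker="||"):
    """Combine the same field taken from the same record of two databases."""
    if first == second:
        return first                  # identical: carried over unchanged (assumption)
    if first in second:
        return second + marker        # first is a substring: keep the larger, flag it
    if second in first:
        return first + marker
    return first + marker + second    # simply different: concatenate both versions


def merge_records(record_a, record_b):
    """Merge two records with identical field names, field by field."""
    return {name: merge_field(record_a[name], record_b[name]) for name in record_a}


merged = merge_records(
    {"WORD": "kitab", "MEANING": "book"},
    {"WORD": "kitab", "MEANING": "book, letter"},
)
# merged == {"WORD": "kitab", "MEANING": "book, letter||"}
```

Records are paired purely by position, which is why inserting, deleting or sorting records in only one copy of the database leads to unpredictable results.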
- NOTEFIELD
- This procedure will transfer the contents of a .stm file into a character field in the corresponding database file (you are free to choose the name of the field). Although STAR4WIN no longer supports .stm files, this procedure may be used to update old files.
- RECOVERY
- This procedure tries to recover a damaged database file (.dbf) in case the associated .var file (containing fields with variable length) is preserved. The result of the procedure is a file pair called temp.dbf and temp.var, with one field (called FIELD1; unfortunately, the field names are not stored in the .var file). The field entries within this field are delimited by the symbol "|".
- REFERENCE
- This procedure creates yet another type of index (reference) files. You may mark any words within some particular field with a special symbol (entered by pressing Alt+Q on the keyboard). After the REFERENCE procedure is applied, those words will be picked out, and the contents of another, correlated field will be enumerated next to those words within another field of a newly created database file.
Suppose you have a file named subjind.dbf with two character fields SUBJECT and PAGES:
| SUBJECT | PAGES |
| program | 53, 203, 222-237 |
| program body | 203 |
| program #development | 130, 151 |
| program #development system | 151 |
| public | 120, 254, 259-264 |
| public #data | 254 |
| public #data base | 260 |
| public #data network | 262 |
| public domain software | 259 |
The procedure works like this:
Enter file name: SUBJIND
Enter reference file name: SUBJIND2
Enter field name: SUBJECT
Enter reference field name: PAGES
After applying the procedure you obtain a new file called subjind2.dbf with the following contents:
| SUBJECT | PAGES |
| development | 130, 151 |
| data | 254, 260, 262 |
This procedure may be useful for creating different
kinds of glossaries and subject indices.
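How the marked words and their page references are gathered can be sketched in Python. This is an illustration only: the function name `reference` is hypothetical, the marker is assumed to appear as "#" (as in the sample table), and the output order (first appearance) and the removal of duplicate page entries are inferred from the sample result.

```python
def reference(records, marker="#"):
    """Build (word, pages) pairs for every word marked in the first field.

    `records` is a list of (subject, pages) pairs; pages are comma-separated.
    """
    index = {}  # marked word -> list of page entries, in order of first appearance
    for subject, pages in records:
        for token in subject.split():
            if token.startswith(marker):
                word = token[len(marker):]
                collected = index.setdefault(word, [])
                for page in (p.strip() for p in pages.split(",")):
                    if page and page not in collected:
                        collected.append(page)
    return [(word, ", ".join(page_list)) for word, page_list in index.items()]


subjects = [
    ("program", "53, 203, 222-237"),
    ("program #development", "130, 151"),
    ("program #development system", "151"),
    ("public #data", "254"),
    ("public #data base", "260"),
    ("public #data network", "262"),
]
glossary = reference(subjects)
# glossary == [("development", "130, 151"), ("data", "254, 260, 262")]
```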
- RENUMBER
- With this procedure you can renumber entries in interrelated files without having to update the references in linked files manually.
Suppose you have a file called germet.dbf with a subordinate file germ.dbf (Germanic 100 word list) and a superfile called ieet.dbf containing numeric references to germet.dbf. If you, e.g., sort germet.dbf, you might wish to change the numbers of the entries in the sorted file. Earlier this was possible, but you had to change the referencing numbers of the entries in ieet.dbf and germ.dbf manually, which is a very tedious process. Now all you have to do is call the RENUMBER procedure, specifying the file name (germet.dbf). The procedure will automatically renumber all the entries in germet.dbf and all the references in ieet.dbf and germ.dbf.
If you choose to renumber ieet.dbf, the procedure will do that, taking care of all the references to ieet.dbf in germet.dbf: even if such references did not exist before, they will now be inserted into germet.dbf.
RENUMBER will also renumber all crossreferences in nonetymological linked files. It takes into account all the alias data contained in .inf files.
NB: RENUMBER does not make any backups, and the changes introduced are permanent. As in all other cases, it is strongly recommended to make a backup of your current files before applying the RENUMBER procedure.
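The core idea of RENUMBER (assign new numbers, then rewrite every reference through an old-to-new mapping) can be sketched in Python. Everything concrete here is hypothetical: the function name `renumber`, the field names NUMBER and REF, and the in-memory record layout are chosen for the illustration and say nothing about how Star4Win stores its links or alias data.

```python
def renumber(main_records, linked_tables, number_field="NUMBER", ref_field="REF"):
    """Renumber `main_records` consecutively and update references to them.

    Records are changed in place, so (as with the real procedure) the
    original numbering is lost unless a copy was made beforehand.
    """
    mapping = {}  # old number -> new number
    for new_number, record in enumerate(main_records, start=1):
        mapping[record[number_field]] = new_number
        record[number_field] = new_number
    for table in linked_tables:
        for record in table:
            old = record[ref_field]
            if old in mapping:
                record[ref_field] = mapping[old]
    return mapping


# Hypothetical data: a sorted main file whose numbers are out of order,
# and one linked file that refers to it.
main_file = [{"NUMBER": 3, "WORD": "apple"}, {"NUMBER": 1, "WORD": "book"}]
linked_file = [{"REF": 1}, {"REF": 3}]
number_map = renumber(main_file, [linked_file])
# number_map == {3: 1, 1: 2}; linked_file now refers to 2 and 1
```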
- REVERT
- This procedure transposes any database file, converting records to fields and fields to records. The contents of the first field will be converted to field names in the resulting database. This procedure has the following restrictions:
- Since field names can only contain standard Roman characters, the record contents of the original database will be modified and sometimes significantly changed.
- A current restriction in Star4Win is a maximum of 2046 fields per database. This means that if the original database has more than 2046 records, the contents of the converted database will be truncated.
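The transposition and its two restrictions can be illustrated with the Python sketch below. It is a rough approximation under stated assumptions: the function names `sanitize` and `revert` are hypothetical, the exact way Star4Win rewrites non-Roman characters is not documented here (the sketch replaces them with "_" and upper-cases the name), and truncation is modelled by simply dropping every record after the 2046th.

```python
import re

MAX_FIELDS = 2046  # the field limit stated above


def sanitize(value):
    """Turn a record value into a plausible field name (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", value).upper()


def revert(rows, max_fields=MAX_FIELDS):
    """Transpose `rows` (lists of equal length, first item = first field).

    Returns the new field names and the new rows, one per remaining
    original field.
    """
    rows = rows[:max_fields]  # records beyond the limit are lost
    new_names = [sanitize(row[0]) for row in rows]
    width = len(rows[0]) if rows else 0
    new_rows = [[row[i] for row in rows] for i in range(1, width)]
    return new_names, new_rows


names, transposed = revert([
    ["die Uhr", "clock", "watch"],
    ["der Monat", "month", "moon"],
])
# names == ["DIE_UHR", "DER_MONAT"]
# transposed == [["clock", "month"], ["watch", "moon"]]
```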
- SUBJINDEX
- This procedure creates a subject index to any paginated text file. You must have a list ready (it should be a text file with each item on a new line). The program will check the occurrences of each word from the list in your paginated text and put the page numbers into the list file. Note that you may enter only stems into the list file: this will result in finding all occurrences of these stems within larger units (thus, if you have 'rhym' in your list file, the program will find all occurrences of 'rhym' within 'rhyme', 'rhymes', 'rhymed', 'rhyming' etc.).
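The stem matching described for SUBJINDEX amounts to a substring search on every page, as in the Python sketch below. The function name `subject_index` is hypothetical, pages are passed in as (page number, text) pairs because the pagination marks of the text file are not described here, and ignoring upper and lower case is an assumption.

```python
def subject_index(stems, pages):
    """Map each list item (word or stem) to the pages on which it occurs."""
    index = {}
    for stem in stems:
        hits = [
            str(page_no)
            for page_no, text in pages
            if stem.lower() in text.lower()  # substring match, so stems find longer words
        ]
        index[stem] = ", ".join(hits)
    return index


sample_text = [
    (1, "A rhyme is pleasant."),
    (2, "Nothing relevant here."),
    (3, "Rhyming couplets and rhymed verse."),
]
found = subject_index(["rhym", "metre"], sample_text)
# found == {"rhym": "1, 3", "metre": ""}
```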
- SUBSTITUTE
- The SUBSTITUTE procedure should be applied only to files that have previously been SORTed on some field (or on a concatenation of several fields). It lets you avoid repeating long strings by replacing the second (third etc.) occurrence of a string with the chosen symbol of repetition (e.g. "-" or "=").
Suppose you have a file named subjind.dbf with two character fields SUBJECT and PAGES:
| SUBJECT | PAGES |
| program | 53, 203, 222-237 |
| program body | 203 |
| program development | 130, 151 |
| program development system | 151 |
| public | 120, 254, 259-264 |
| public data | 254 |
| public data base | 260 |
| public data network | 262 |
| public domain software | 259 |
Enter file name: SUBJIND
Enter field name: SUBJECT
Enter replica: -
Then the field SUBJECT will be SUBSTITUTEd, and you get the following result:
| SUBJECT | PAGES |
| program | 53, 203, 222-237 |
| - body | 203 |
| - development | 130, 151 |
| - - system | 151 |
| public | 120, 254, 259-264 |
| - data | 254 |
| - - base | 260 |
| - - network | 262 |
| - domain software | 259 |
As a result, you get shorter strings in the field SUBJECT, while their meaning remains clear. This kind of abbreviation is widely used in subject indexes.
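The result table above is consistent with a simple word-by-word rule: every leading word that an entry shares with the preceding original entry is replaced by the replica. The Python sketch below implements that inferred rule; the function name `substitute` is hypothetical, and the rule itself is reconstructed from the example rather than taken from the program.

```python
def substitute(entries, replica="-"):
    """Replace leading words repeated from the previous entry with `replica`.

    `entries` must already be sorted, since only neighbours are compared.
    """
    result = []
    previous_words = []
    for entry in entries:
        words = entry.split()
        shared = 0
        # Count the leading words that coincide with the previous original entry.
        while (shared < len(words) and shared < len(previous_words)
               and words[shared] == previous_words[shared]):
            shared += 1
        result.append(" ".join([replica] * shared + words[shared:]))
        previous_words = words
    return result


shortened = substitute([
    "program",
    "program body",
    "program development",
    "program development system",
    "public",
    "public data",
])
# shortened == ["program", "- body", "- development", "- - system", "public", "- data"]
```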