Memory Management Extensions
to the SRI Micro Operating System
for PDP-11/23/34/35/40
IEN 136
1st MAY 1980
S.R. Wiseman
B.H. Davies
ROYAL SIGNALS & RADAR ESTABLISHMENT
North Site, Leigh Sinton Rd., Malvern, Worcs., UK.
CONTENTS
________
0. Introduction
PART 1 - SYSTEM DESIGN
1.1 Requirement
1.2 Overview of solution
1.3 Use of Memory Management Facility
PART 2 - TARGET MACHINE ASPECTS
2.1 Alterations to MOS
2.1.1 The Process Control Table
2.1.2 The Scheduler
2.1.3 The $CREAP Macro
2.1.4 Input and Output Routines
2.1.5 Memory Sizing
2.1.6 Protection Routines
2.1.7 System Configuration
2.2 The EMMOS Debugger - MUD
2.2.1 A Summary of MUD Commands
2.2.2 MUD Command Syntax
2.2.3 The Relocation Mechanism
2.2.4 Addressing Modes
2.2.5 Command Specifications
2.2.6 Using MUD
2.3 Linker and Loader Functions
2.3.1 The Linker
2.3.2 The Loader
PART 3 - HOST MACHINE ASPECTS
3.1 Linking Using the RSX-11M Task Builder
3.1.1 Use of the Task Builder's Task Image File
3.1.2 The Format of the Task Image File
3.1.3 The Overlay Description Language
3.1.4 The EMMOS ODL File
3.1.5 Installation of the Debugger
3.1.6 Creating the Task Image File
3.2 The EMMOS RSX Task Image Loader
3.2.1 Inputs and Outputs
3.2.2 The Console Load Map
3.2.3 The Configuration Table
3.2.4 The Overlay Description Table
3.2.5 The Process Description Table
3.2.6 The Load Map
3.2.7 The Physical Memory Allocation Algorithm
3.2.8 The Logical Operation of the Loader
3.2.9 Error Messages
3.2.10 Tracing
3.2.11 Changing the System
3.3 Process Creation
Appendix A: Pitfalls and How to Avoid Them
0. Introduction
____________________
This document describes the requirement, design and implementation
of memory management extensions to the Stanford Research Institute's
Micro Operating System (MOS) for PDP-11 minicomputers. The document is
divided into three parts. Part 1 covers the rationale behind the way in
which the memory management facility of the PDP-11 is used. The results
of the work fall naturally into Parts 2 and 3. Part 2 describes the
modifications to MOS itself and the debugger which are independent of
the operating system of the host machine on which the target system is
generated. Part 3 describes the Linking and Loading modules which are
dependent on the operating system of the host machine. The host machines
used at RSRE are PDP-11/34s and 40s running RSX-11M v3.2. Also we have
provision in our standard MOS for writing processes in a high level
language, CORAL 66, which requires the use of an auxiliary stack pointed
to by R0. This facility may be excluded from the extended MOS by the use
of a configuration switch.
PART ONE
________
SYSTEM DESIGN
1.1 Requirement
___________________
For a number of planned real-time software projects the 28k words
of code and data space accessible without memory management in the
PDP-11 is inadequate. Therefore the question arises: is it possible to
use the memory management registers to provide more code and/or data
space up to 124k words, with minimal overheads in the form of context
switching and with no modifications to presently written processes?
1.2 Overview of Solution
____________________________
The reason for the provision of a two state memory management
system in the PDP-11s is not only to allow access to memory above 28k
words but also, by the use of kernel and user modes, to provide
protection between processes and the core resident part of the operating
system. This usually means that the I/O page is non-resident when a user
process is running and that I/O is handled in kernel mode. Thus in a
real-time system there may be unacceptable overheads due to the
considerable amount of context switching involved in this protection
mechanism. However if one is willing to dispense with interprocess and
operating system protection a simple and elegant solution which produces
no detectable overheads is possible.
The basic concept of the solution is that while a process is
running there is no need to have in scope any other process so that a
snapshot of the 32k virtual memory at any one time will look like an
ordinary MOS system configured for one process. The only extra
executable code involved is in the scheduler for paging out the process
that has just finished running and paging in the process that is about
to be run. The I/O page, operating system, handlers and common buffer
area are always resident otherwise access to these entities would have
to be mediated by the operating system so that they could be paged in
with consequent increase in overheads. A typical MOS process runs for
something like 1 millisecond and the additional paging overhead is 4 MOV
instructions taking about 20 microseconds, a 2% overhead. A major
change in the MOS layout occurs in the positioning of the stacks.
Because we want the permanently resident part of MOS to be as small as
possible and never greater than 8k words the stacks have been removed
from the Process Control Tables and are paged in and out with their
processes.
1.3 Use of the Memory Management Facility
____________________________________________
At any one time only the running process will reside in virtual
memory, the others will be suspended and so have no need to be
accessible. Permanently resident in virtual memory will be the vector
area, global buffer area, operating system including all handlers, and
the I/O page. Because the maximum size of a page is 4K words and its
position is fixed in virtual memory, the allocation of these pages is
somewhat restricted. However a trade-off between global buffer space,
operating system size and process space is possible to some extent, and
the following configurable options are catered for:-
Options where code and stack space of any process does not exceed 4k words:-
Option 1
________
20k words of buffer space
<4k of operating system and handlers
|-----------------------| 32k
7 | I/O |
|-----------------------| 28k
6 |Running process & stack|
|-----------------------| 24k
5 | Operating System EMMOS|
|-----------------------| 20k
4 | Global Buffers |
|- -| 16k
3 | |
|- -| 12k
2 | |
|- -| 8k
1 | |
|- -| 4k
0 | |
page |-----------------------| 0k
Option 2
________
16k words of buffers
<8k operating system and handlers
|-----------------------| 32k
7 | I/O |
|-----------------------| 28k
6 |Running process & stack|
|-----------------------| 24k
5 | Operating System EMMOS|
|- -| 20k
4 | |
|-----------------------| 16k
3 | Global buffers |
|- -| 12k
2 | |
|- -| 8k
1 | |
|- -| 4k
0 | |
page |-----------------------| 0k
Options where code and stack space of any process does not exceed 8K words:-
Option 3
________
16k words of buffer space
< 4k words of operating system and handlers
|-----------------------| 32k
7 | I/O |
|-----------------------| 28k
6 |Running process & stack|
|- -| 24k
5 | |
|-----------------------| 20k
4 | Operating System EMMOS|
|-----------------------| 16k
3 | Global Buffers |
|- -| 12k
2 | |
|- -| 8k
1 | |
|- -| 4k
0 | |
page |-----------------------| 0k
Option 4
________
12k words of buffers
<8k words for operating system and handlers
|-----------------------| 32k
7 | I/O |
|-----------------------| 28k
6 |Running process & stack|
|- -| 24k
5 | |
|-----------------------| 20k
4 | Operating System EMMOS|
|- -| 16k
3 | |
|-----------------------| 12k
2 | Global Buffers |
|- -| 8k
1 | |
|- -| 4k
0 | |
page |-----------------------| 0k
Only kernel mode is used by EMMOS, even user processes run in
kernel mode because no attempt has been made to produce a secure system.
However with options 3 and 4 the process space and stack space are
allocated separate pages wherever possible. Thus the debugger or a
procedure call may be used to write-protect the code space if desired.
Holes appear in virtual memory from the end of the code or stack to the
end of the 4k boundary ensuring that a system stack cannot underflow and
the running process cannot access another process' code or stack.
However, a process can damage another process indirectly by, for
example, supplying a pointer to data on its stack to another process.
Because the debugger is bigger than 4k, options 1 and 2 do not
allow use of the debugger. This is not a serious drawback as it is
envisaged that debugging could take place with options 3 or 4 and that
options 1 or 2 could then be used to give another 4k words of buffer
space.
Using EMMOS requires a stricter discipline when writing processes
than was required with MOS. Strictly speaking the code of a process
should not contain any private data space. If a process wants private
data space it should use its stack. Any data space which is shared by
two or more processes should be obtained, via MOSMEM calls, from the
global buffer area. Exceptionally a single incarnation of an assembler
process could reserve space after its code for local tables but any
attempt by another process to access this area will result in disaster!
All high level language processes should be written as closed procedures
with private data area on their stacks.
Shared Library procedures must either be replicated so that they
are paged in with each process that requires to use them or they must be
linked with the EMMOS operating system so that they are permanently
resident.
A typical example of real and virtual memory allocations for
an EMMOS system is shown below:-
128k |-----------------------|
| I/O addresses |
124k |-----------------------|
| |
| |
| |
|-----------------------|
| stack (proc3) |
|-----------------------|
| stack (proc2) |
|-----------------------| 32k |_______________________|
7 | I/O | | Codebody2 (proc 2&3) |
|-----------------------| 28k |_______________________|
6 |Running process & stack| | stack (proc1) |
|- -| 24k |_______________________|
5 | | | Codebody1 (proc1) |
|-----------------------| 20k |-----------------------|
4 | Operating System EMMOS| | Operating System EMMOS|
|- -| 16k |- -|
3 | | | |
|-----------------------| 12k |-----------------------|
2 | Global Buffers | | GLOBAL buffers |
|- -| 8k |- -|
1 | | | |
|- -| 4k |- -|
0 | | | |
|-----------------------| 0k |-----------------------|
Page Virtual Memory Physical Memory
The linker/loader system used with EMMOS should be reasonably
intelligent! It is its job to reserve physical memory in accordance
with the stack size requested for each process. In particular it must
evaluate the total memory requirements of a process that has more than
one incarnation and determine whether or not some or all of the code has
to be replicated. This may arise when the stacks of all the incarnations
of a process plus the code will not fit into the paging window. The
linker/loader is also responsible for outputting error messages when the
process and stack requirements cannot be satisfied by that particular
configuration of EMMOS.
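The replication decision described above can be sketched as follows. This is only an illustration in Python, not the loader's actual algorithm; the function name, the use of words as the unit, and the rule "one code copy plus every incarnation's stack must fit in the process window" are assumptions drawn from the paragraph above.

```python
def needs_replication(code_words, stack_words, window_words):
    """Return True if one shared copy of the code cannot serve all incarnations.

    code_words   -- size of the process code, in words
    stack_words  -- list of total stack sizes (words), one per incarnation
    window_words -- size of the paging window given to processes, in words
    """
    # A single incarnation that does not fit at all is a configuration error.
    if any(code_words + s > window_words for s in stack_words):
        raise ValueError("process and stack do not fit this configuration")
    # Otherwise the code must be replicated when code plus all stacks overflow.
    return code_words + sum(stack_words) > window_words


K = 1024
print(needs_replication(3 * K, [512, 512], 8 * K))
print(needs_replication(6 * K, [1536, 1536], 8 * K))
```

With an 8k word window, a 3k code body and two 512 word stacks share one copy of the code, whereas a 6k code body with two 1536 word stacks does not fit as a single unit and at least part of the code would have to be replicated.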
PART TWO
________
TARGET MACHINE ASPECTS
2.1 Alterations to MOS
___________________________
Perhaps the most vital alteration is that made to the scheduler. A
context swap now involves saving the suspended process' window and
restoring the new process' window. This however, only involves about
four move instructions extra, for a typical configuration ( 8K processes
).
The major alteration is, however, to the Process Control Tables (
PCTs ) and consequently to the PCT initializer. In particular the stacks
for a process no longer reside here, hence a process' PCT is relatively
small. Extra fields in the PCTs give the memory management window
necessary to 'page-in' the process and the ends of the system and CORAL
stacks. The PCT initializer now has to cope with setting up these
fields, which it does by consulting the load map, supplied by the
linker/loader.
Additions have also been made to the $CREAP macro. The sizes of the
system and CORAL stacks are given to the $CREAP which in turn passes the
information to the linker/loader ( depending on the method used ).
It has also been necessary to offer modified synchronous I/O
routines 'SOUT' and 'SIN'. The original versions can be used if the I/O
is performed on buffers from the permanently resident buffer pool,
however they will not work if the I/O buffer is taken from the local
code or data space. This is because the local area may be paged-out when
the interrupt routine tries to get the next character, causing it to
pick up rubbish or trap out. The new versions will copy the local I/O
buffer into a global buffer before initiating the I/O, thus ensuring
that it is always paged-in. This is likely to be the main overhead of
the system.
Finally, the memory sizing performed in MOSMEM is no longer
required. This is because the position of the global buffer pool is
fixed by the page allocation scheme.
2.1.1 The Process Control Table
The following fields have been added to the PCT:
pct_mmr : ARRAY [ first_page..last_page ] OF window_type;
pct_r0e, pct_r6e : virtual_byte_addresses;
where
TYPE window_type = RECORD par, pdr : INTEGER END;
TYPE virtual_byte_addresses = 0..#177777;
The following field has been removed from the PCT:
pct_stk : ARRAY [ 1..stk_len ] OF BYTES
pct_mmr is an array containing the window required to 'page-in' the
process. The bounds are not 0..7 because it is known that most pages
remain constant, only pages first_page to last_page are changed since
this is the amount of virtual memory allocated to a process. Each
element of the array is four bytes long and contains the page address
register ( par ) and page descriptor register ( pdr ) values for that
page.
pct_r0e and pct_r6e contain the virtual byte addresses of the ends
( smallest numbered address ) of the R0 ( CORAL ) and R6 ( system )
stacks. These values are used to detect stack overflow.
Since the stacks no longer reside in the PCT the field pct_stk has
been removed.
The new fields in the PCT are set up directly from the Load Map
produced by the linker/loader.
2.1.2 The Scheduler
A context swap now involves saving and restoring the memory
mapping, as well as the general register values. We can actually avoid
saving the mapping when a process is suspended because normally it
remains constant. The protection on a page is the only thing likely to
change and if this is done using the supplied routines then we can still
avoid doing the saving.
When the process is made runnable it is first 'paged-in' by
restoring its memory mapping. To ensure that the stack pointer and its
stack are restored together, interrupts are disabled earlier than in the
old version. If an interrupt is allowed to occur between 'paging-in' the
stack and restoring the stack pointer then disaster will obviously
follow.
Stack overflow, on both the R0 ( CORAL ) and R6 ( system ) stacks,
is checked for when the process is suspended. The ends of the stacks are
marked with overflow detect words, the addresses of which are held in the
PCT. It is a simple matter to check if these words are still intact. If
the system has no CORAL processes or library routines then the checks
made on the R0 stack can be omitted by setting a switch in the EMMOS
configuration file.
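The overflow check can be modelled as below. This is a sketch in Python rather than the MACRO-11 of the scheduler; the marker value, the list model of a stack and all the names are hypothetical, since the document does not give them.

```python
SENTINEL = 0o123456  # hypothetical marker value; the document does not specify one


def make_stack(size_words):
    """Model a stack as a list of words; index 0 is the lowest address (its 'end')."""
    stack = [0] * size_words
    stack[0] = SENTINEL          # the overflow detect word
    return stack


def overflowed(stack):
    """A stack has overflowed if its detect word is no longer intact."""
    return stack[0] != SENTINEL


def check_on_suspend(r0_stack, r6_stack, check_r0=True):
    """Return the names of the stacks found to have overflowed.

    check_r0=False models a system configured without CORAL processes.
    """
    bad = []
    if check_r0 and overflowed(r0_stack):
        bad.append("R0")
    if overflowed(r6_stack):
        bad.append("R6")
    return bad
```

Because a PDP-11 stack grows towards lower addresses, a process that pushes too much eventually overwrites the word at the lowest address of its stack, which is why a single marker at that end is enough to detect the overflow after the event.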
2.1.3 The $CREAP Macro
This macro initializes parts of the PCT, the rest is done at run
time. Two extra parameters are now required, which specify the size, in
words, of the R0 ( CORAL ) and R6 ( system ) stacks. To avoid confusion
the parameters should be called by name, the default size is zero words:
eg: $CREAP G802,<UX25 >,,DV.IMP+0,DV.IMP+1,R0SIZE=200,R6SIZE=20
$CREAP G802,<UX25 >,,DV.IMP+2,DV.IMP+3,R6SIZE=30,R0SIZE=350
The macro converts the size into memory blocks ( 40 octal words )
and places the information, along with the process id and name, in the
.PSECT LDRCON. This information is used by the linker/loader in a way
that depends on the methods being used. Note that different incarnations
of the same process can have different sizes of stacks. In the above
example ( all numbers are octal ) the first process would have
ceiling( 200/40 ) = 4 memory blocks = 200 words for R0
ceiling( 20/40 ) = 1 memory block = 40 words for R6
and the second process would have
ceiling( 350/40 ) = 10 memory blocks = 400 words for R0
ceiling( 30/40 ) = 1 memory block = 40 words for R6
Note: "ceiling( r )" is a function that rounds up real numbers to
the next largest integer.
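The rounding in the example above can be reproduced with a few lines of Python. This is only a check of the arithmetic, not the macro itself; the function name is an assumption.

```python
BLOCK_WORDS = 0o40  # one memory block is 40 octal (32 decimal) words


def stack_blocks(size_words):
    """Round a stack size in words up to a whole number of memory blocks."""
    return -(-size_words // BLOCK_WORDS)  # integer ceiling division


# The four stack sizes from the two $CREAP calls above, printed in octal.
for name, size in [("R0", 0o200), ("R6", 0o20), ("R0", 0o350), ("R6", 0o30)]:
    blocks = stack_blocks(size)
    print(f"{name}SIZE={size:o}: {blocks:o} blocks = {blocks * BLOCK_WORDS:o} words")
```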
2.1.4 Input and Output Routines
The synchronous I/O routines, SIN and SOUT, have to be modified for
use with EMMOS. The user interface is unchanged, but the method of
operation is slightly different. SOUT now grabs a global buffer, copies
the string into it from the user's buffer and then performs the SIO on
the global buffer, to output the characters. SIN grabs a global buffer,
calls SIO to fill it and then copies the string into the user's buffer.
This is necessary because if the string to be output is in the
process' local area, it will be 'paged-out' with the process. SIO uses a
pointer to the string ( a virtual address ) to get the characters, so if
it does this when the initiating process is 'paged-out', it will either
pick up rubbish or cause a memory management error. Similarly for SIN.
The copying would not be required for a string already in a global
buffer, because the buffer will always be 'paged-in', regardless of
which process is running. In this case the MOS versions could be renamed
and used to avoid a loss of efficiency.
The same problems apply to the asynchronous I/O routine, SIO.
However, any copying from local data space to a global buffer that is
needed, is left up to the user. Checks have been included in the code
that ensure both the IORB and the data area are in the global buffer
pool or the resident EMMOS code. These checks can be omitted for a
program known to work by setting a switch in the EMMOS configuration
file.
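The kind of address check made by SIO might look like the following sketch. It is written in Python for illustration and is not taken from the EMMOS source; the function names and the default page ranges (buffers in pages 0 to 3, EMMOS in pages 4 to 5) are assumptions.

```python
PAGE_BYTES = 0o20000  # one page is 4K words, i.e. 8K bytes of virtual address space


def in_pages(address, first_page, last_page):
    """True if a virtual byte address lies within pages first_page..last_page."""
    return first_page * PAGE_BYTES <= address < (last_page + 1) * PAGE_BYTES


def sio_addresses_ok(iorb, data, buffers=(0, 3), emmos=(4, 5)):
    """Both the IORB and the data area must be permanently resident."""
    def resident(address):
        return in_pages(address, *buffers) or in_pages(address, *emmos)
    return resident(iorb) and resident(data)


print(sio_addresses_ok(0o1000, 0o2000))      # both in the global buffer pool
print(sio_addresses_ok(0o1000, 0o140000))    # data area in process page 6
```

An address in the process pages fails the check because those pages may be mapped to a different process by the time the interrupt routine uses the pointer.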
2.1.5 Memory Sizing
This is now not carried out. The limits of the global buffer area
are assumed to be the ends of the pages allocated to buffers ( less
space for the vectors if this includes page 0 ). It is assumed that the
hardware has been configured in a sensible manner.
2.1.6 Protection Routines
Two routines are supplied, one to inspect the access control field
( ACF ) of a page, the other to change the ACF. Under the memory
management system each page can be either
1. non-resident -all attempts at access will cause
a trap
2. read-only -attempts to write into a location
in the page will cause a trap
3. (unused) -same as non-resident
4. read/write -neither reads nor writes are trapped
The two routines will only work for one of the process pages. They
both return an integer result. This equals 'true' if the page specified
is a process page, else it equals 'false' and the ACF is not changed or
inspected. The pre-defined constants, 'non resident', 'read only', 'un
used' and 'read write', should be used to specify and compare the ACFs.
INTEGER PROCEDURE change page protection ( VALUE INTEGER page, access );
If the page specified is a process page then the ACF of that page
is changed to the given access. The copy of the memory management
registers held in the PCT table for this process, is also updated.
INTEGER PROCEDURE current protection ( VALUE INTEGER page;
LOCATION INTEGER pct, pdr );
If the page specified is a process page then pct and pdr are set to
the values of the ACF field of the memory management registers for that
page, taken from the PCT window area and active page registers
respectively. The two different versions are supplied in case one of
them has been corrupted. Both should always be the same.
CORAL Constant Values
true 1
false 0
non resident 0
read only 2
un used 4
read write 6
The assembler versions of these routines are completely re-entrant
and run to completion, so it is possible to have one copy of the code in
the resident part of the system which is shared between all processes.
The routine to change the page protection is called $CHAPP and the
routine to inspect the current protection is called $CURPP. They have
the same effect as the CORAL versions but are called as follows:
Change Page Protection :
called by JSR PC,$CHAPP
with R2 = number of the page to be changed
R3 = the new access of that page
returns with
R1 = TRUE or FALSE
R0 and R2...R6 unchanged
Inspect Current Page Protection :
called by JSR PC,$CURPP
with R2 = number of page to be examined
returns with
R1 = TRUE or FALSE
R3 = value from PCT
R4 = value from active PDR
R0, R5 and R6 unchanged
Assembler Constant Values
TRUE 1
FALSE 0
NONRES 0
READON 2
UNUSED 4
RWRITE 6
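The effect of the two routines on a single page descriptor register value can be sketched as below. This is an illustration in Python, not the $CHAPP or $CURPP code. It assumes the ACF occupies bits 1 and 2 of the PDR, which is consistent with the constant values 0, 2, 4 and 6 given above; the function names are hypothetical, and the check that the page is a process page is left out.

```python
NONRES, READON, UNUSED, RWRITE = 0, 2, 4, 6   # constant values from the tables above
ACF_MASK = 0o6                                 # assumed: ACF is bits 1-2 of the PDR


def current_protection(pdr):
    """Extract the access control field from a PDR value."""
    return pdr & ACF_MASK


def change_protection(pdr, access):
    """Return the PDR value with its ACF replaced by `access`."""
    if access not in (NONRES, READON, UNUSED, RWRITE):
        raise ValueError("not a valid ACF value")
    return (pdr & ~ACF_MASK) | access


pdr = 0o77406                       # example value: full-length page, read/write
print(f"{change_protection(pdr, READON):o}")
```

Only the two ACF bits are altered, so the page length and expansion direction fields of the register are left as they were.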
2.1.7 System Configuration
The page allocation scheme, operating system options and various
buffer addresses, are specified in the configuration file, EMMSYS.SML .
2.1.7i The Page Allocation Scheme
The virtual memory is divided up into 8 pages of 4K words, numbered
0..7. Page 7 is always allocated for the I/O page, the remaining seven
are split between the global buffers, the operating system and the
running process ( including its stacks ). The order is critical,
buffers should be at the low numbered end, then the system, then the
process and finally the I/O page. The exact split is given by specifying
the first and last page used by each part.
eg buffers 0 - 3 emmos 4 - 5 process 6 - 6
or buffers 0 - 2 emmos 3 - 4 process 5 - 6
When decreasing the space allocated to the global buffers, be sure
that the workspace buffers ( see 2.1.7iii ) are still located before
EMMOS, otherwise EMMOS may be loaded on top of one of them before it's
finished with. Note: different linking/loading schemes may lend
themselves to a different ordering of buffers, system and process, in
which case minor alterations to the PCT initializer may be necessary.
The I/O page must always be page 7.
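The ordering rules above can be expressed as a small consistency check. This is a sketch in Python, not part of EMMSYS.SML; the function name is hypothetical, and it assumes that the three areas are contiguous and together cover pages 0 to 6 exactly.

```python
IO_PAGE = 7  # the I/O page is always page 7


def check_allocation(buffers, emmos, process):
    """Each argument is a (first_page, last_page) pair. Returns a list of problems."""
    problems = []
    expected_first = 0
    # Required order: buffers at the low end, then the system, then the process.
    for name, (first, last) in [("buffers", buffers), ("emmos", emmos), ("process", process)]:
        if first > last:
            problems.append(f"{name}: first page after last page")
        if first != expected_first:
            problems.append(f"{name}: should start at page {expected_first}")
        expected_first = last + 1
    if expected_first != IO_PAGE:
        problems.append("process pages must end at page 6, just below the I/O page")
    return problems


print(check_allocation((0, 3), (4, 5), (6, 6)))   # first example split above
print(check_allocation((0, 2), (3, 4), (5, 6)))   # second example split above
```

Both example splits from the text pass the check, while a split that places EMMOS below the buffers is reported as a problem.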
2.1.7ii Operating System Options
EMMOS: EMMOS/MOS selection switch
= 0 for EMMOS
= 1 for MOS ( SAVMAP and SET250 have no meaning )
SAVMAP: save mapping when process suspended switch
= 0 when a process is suspended the window is saved as part
of the process' volatile environment. This allows a
process to change the window dynamically ( eg the
protection ). Note, only that part of the window for the
pages allocated to processes is saved and restored, so
changing other pages will affect all processes.
= 1 the window is not saved each time the process is
suspended. The original value is reloaded when the
process is run again, erasing any changes made.
It is recommended that SAVMAP = 1 and the supplied protection
routines are used to change the page protection, with the other
fields left well alone.
SET250: memory management error trap initialization on/off
= 0 when the system is initialized the trap is set to a local
handler.
= 1 the trap is left untouched, this option should only be
used when the debugger is used ( MUD will set up its
own handler ).
COPYIO: sin/sout version selection switch
= 0 ( OR EMMOS = 0 )
use the versions of sin and sout that copy to/from a
global buffer before initiating the I/O.
= 1 ( AND EMMOS = 1 )
the MOS versions will be used
NOCORL: CORAL process indicator
= 0 no processes are written in CORAL, or use CORAL libraries,
so the R0 stack is not checked for overflow each time a
process is suspended. The stack is still set up as
usual. The R6 stack is still checked.
= 1 There is a process using CORAL, so the R0 stack will
be used. Both stacks are checked for overflow each
time the process is suspended.
CHKSIO: SIO buffer address checking on/off
=0 Each time SIO is called the IORB and the data addresses
are checked to ensure that they are in the global buffer
pool or the resident EMMOS code. An attempt to use a
local iorb or data area will cause a BUGHLT.
=1 No checks are performed. The use of a local IORB or data
area is likely to cause the system to crash.
2.1.7iii Workspace Buffer Addresses
The linker/loader and system initializer must agree on the position
of the load map. Its address is given by MAPADD in the system
configuration file. The linker/loader is given this information in some
implementation dependent way.
Note that in the RSX linker/loader system, the area specified by
MAPADD is also used by the loader for other workspace buffers during
load time. Because of this, MAPADD must specify an area of store, lying
between the end of the loader and the beginning of where EMMOS will be
loaded, of about
25 * max_number_of_processes ( decimal ) words long
The loader ends around 032000, without removing redundant
libraries, so even for a max number of 30 processes ( 25 * 30 decimal is
about 1400 octal ), giving MAPADD = 32000 would mean the tables would
end way below 040000. This allows us to load a system with only 8K of
buffers.
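The sizing argument above can be checked with a short calculation. This is a sketch in Python; it assumes that MAPADD is a byte address and that the 25 words per process estimate is exact, and the function name is hypothetical.

```python
WORDS_PER_PROCESS = 25      # decimal, from the estimate above
BYTES_PER_WORD = 2          # PDP-11 addresses are byte addresses


def workspace_end(mapadd, max_processes):
    """Approximate byte address just past the loader's workspace tables."""
    return mapadd + WORDS_PER_PROCESS * max_processes * BYTES_PER_WORD


end = workspace_end(0o32000, 30)
print(f"tables end near {end:06o}, limit for 8K words of buffers is {0o40000:06o}")
```

Under these assumptions 30 processes need 750 decimal words, and tables starting at 032000 finish well short of 040000, the first address above 8K words.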
2.2 The EMMOS Debugger - MUD
__________________________________
MUD (Marvellous Universal Debugger) is split into two parts, RESMUD
and MUD proper. RESMUD is a small program written in MACRO-11 that is
permanently resident in virtual memory. It acts as the interface between
the operating system and the debugger, MUD, which is only brought into
virtual memory when it is needed.
MUD is loaded by the loader in the same way as any other overlay.
Normally, however, overlays that are not $CREAPed do not have entries
in the load map, but the loader creates a special entry for the
debugger. RESMUD uses this entry to find out where MUD has been loaded,
and transfers control to it. For this reason RESMUD must be the first
module loaded, even before MOSPM, if the debugger is to be used.
If RESMUD is not included in the system, then MUD will be ignored
if it is itself included, and the debugger cannot be used. EMMOS will
ignore MUD, unless a process is foolishly creaped with the name MUDSYS!
The debugger is able to reference the whole of the 18 bit physical
store without the user knowingly changing the memory management
registers.
2.2.1 A Summary of MUD Commands
The name of a command is one or two letters long, in upper or lower
case, as follows:
Print Commands:
pb print breakpoint locations
( see breakpoint commands )
pr print the value of a relocation register
( see relocation commands )
Spy Commands:
s spy on store
sr spy on the registers
sw spy on the memory management window
Breakpoint Commands:
b set a breakpoint
k kill a breakpoint
pb print breakpoint locations
( see print commands )
Enter User Program Commands:
g go to a specific address
gs go to a specific address single stepping
c continue from a breakpoint
cs continue from a breakpoint single stepping
Relocation Commands:
r set a relocation register
pr print the value of a relocation register
( see print commands )
Exit Command
x leaves MUD and halts
Special Commands
i change I/O channel
e evaluate an arithmetic expression
f find a memory location containing a given value
2.2.2 MUD Command Syntax
The syntax is fairly simple and is quite free with spaces. A
command consists of the command name, one or two letters ( upper or lower
case ), optionally followed by one or two operands.
<name>
<name> <op1>
<name> <op1> <op2>
The two operands and the name may be separated by commas or spaces.
The operands specify 18 bit addresses or numbers in octal. Their syntax
is:
<operand> ::= <repeat symbol>
or= <number> <relocation> <mode>
<repeat symbol> ::= *
<number> ::= <18 bit octal number>
or= null
<relocation> ::= . <relocation register>
or= null
<relocation register> ::= <octal digit>
or= null
<mode> ::= V
or= P
or= null
If the number is omitted, it is assumed to be zero. If the
relocation dot is not present, then the operand is assumed to be an
absolute physical address. If the dot is present but the octal digit is
omitted, then relocation register one is assumed. Hence the following
examples all mean the same:
s100.,150.
s100.1 150.
s 100. , 150.1
s 100.1 150.1
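The operand grammar can be illustrated by a small parser. This is a sketch in Python, not MUD's own code; the function name and the returned tuple are hypothetical. It assumes that a missing number is read as zero, that a dot without a digit selects relocation register one, and that the mode letter may be in either case.

```python
import re

# number (octal, may be empty), optional dot with optional digit, optional mode
OPERAND = re.compile(r"^([0-7]*)(?:(\.)([0-7]?))?([VvPp]?)$")


def parse_operand(text):
    """Parse one MUD operand into (number, relocation_register, mode).

    relocation_register is None when no dot is present; mode is 'P' or 'V'.
    The repeat symbol '*' is returned as the string '*'.
    """
    text = text.strip()
    if text == "*":
        return "*"
    m = OPERAND.match(text)
    if m is None:
        raise ValueError(f"bad operand: {text!r}")
    digits, dot, reg, mode = m.groups()
    number = int(digits, 8) if digits else 0       # assumed: null number is zero
    if number >= 1 << 18:
        raise ValueError("number exceeds 18 bits")
    register = (int(reg) if reg else 1) if dot else None   # bare dot: register 1
    return number, register, (mode.upper() or "P")


print(parse_operand("100.") == parse_operand("100.1"))
print(parse_operand("150.") == parse_operand("150.1"))
```

Both comparisons hold, which is why the four example commands above are equivalent.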
2.2.3 The Relocation Mechanism
There are seven relocation registers supplied ( r1..7 ) which can
be used to modify an address in a command. Values are put into these
registers by the 'r' command (qv). To specify that an address is to be
relocated it is followed by a dot and perhaps a digit. The digit is the
name of the relocation register to be used and is one by default. When
relocation is specified, the contents of the relocation register are
added to the address giving the actual address used. For example:
r2,100 - sets relocation register 2 to 100
s20.2 - spies on location 20+100=120
r350 - sets relocation register 1 to 350
s30. - spies on location 30+350=400
Many commands allow an asterisk to be used as an operand, this
usually means 'all' in some way, for example:
k100 - kill the breakpoint at location 100
k * - kill all breakpoints
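The relocation arithmetic can be modelled as follows. This is a sketch in Python, not MUD's implementation; the class and method names are hypothetical, and wrapping the sum to 18 bits is an assumption.

```python
class Relocation:
    """Seven relocation registers, r1..r7, all initially zero."""

    def __init__(self):
        self.regs = {n: 0 for n in range(1, 8)}

    def set_register(self, value, register=1):
        """Model 'r d n', or 'r n' for register 1."""
        self.regs[register] = value

    def set_all(self, value=0):
        """Model 'r * n'."""
        for n in self.regs:
            self.regs[n] = value

    def resolve(self, number, register=None):
        """Return the actual address; register=None means no relocation dot."""
        if register is None:
            return number
        return (number + self.regs[register]) & 0o777777  # assumed 18 bit wrap


r = Relocation()
r.set_register(0o100, 2)             # r2,100
print(f"{r.resolve(0o20, 2):o}")     # s20.2
r.set_register(0o350)                # r350
print(f"{r.resolve(0o30, 1):o}")     # s30.
```

The two printed addresses correspond to the 120 and 400 of the examples above, all values being octal.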
2.2.4 Addressing Modes
An operand can refer to a virtual location by following the
relocation indicator by a 'V' ( the default mode is 'P', so normally
addresses are physical ). The corresponding physical location will be
computed using the user's window. Page lengths are ignored.
For example, if the base address of page 5 ( virtual addresses
120000-137777 ) is 120100 and relocation register 2 contains 120000,
then the following are equivalent:
s120110 - physical address by default
s120110p - physical address specified
s120010v - pa = 010 + base(5) = 120110
s110.2 - pa = 110 + 120000 = 120110
s110.2p - . . . . physical specified
s10.2v - va = 10 + 120000 = 120010
- pa = 010 + base(5) = 120110
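The translation used in these examples can be sketched as below. This is an illustration in Python, not MUD's code; the function name and the table of page bases are hypothetical, and relocation is assumed to have been applied to the address already.

```python
PAGE_BYTES = 0o20000   # 4K words = 8K bytes per page


def to_physical(address, mode, page_base):
    """Translate an operand address to a physical address.

    mode      -- 'P' (address is already physical) or 'V' (virtual)
    page_base -- maps a page number 0..7 to the physical base address of that page
    """
    if mode == "P":
        return address
    page, offset = divmod(address, PAGE_BYTES)
    return page_base[page] + offset     # page lengths are ignored, as in MUD


base = {5: 0o120100}                     # base address of page 5 from the example
print(f"{to_physical(0o120010, 'V', base):o}")          # s120010v
print(f"{to_physical(0o10 + 0o120000, 'V', base):o}")   # s10.2v with r2 = 120000
print(f"{to_physical(0o110 + 0o120000, 'P', base):o}")  # s110.2
```

All three calls give the same physical address, matching the equivalence claimed for the example commands.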
2.2.5 Command Specification
pb print breakpoint locations
pb prints out a list of all the breakpoints
that are set, showing their location as an
absolute address and as an offset +
relocation.
pr print relocation registers
pr print out the value of relocation register 1
pr 'd' print out the value of relocation register 'd'
pr * print out the value of all the relocation
registers.
r set a relocation register
r 'n' set relocation register 1 to 'n'
r 'd' 'n' set relocation register 'd' to 'n'
r * 'n' set all relocation registers to 'n'
r * note that 'n' is zero by default so that
r* will set all relocation registers to
zero. Initially all are zero anyway.
x exit
x leaves MUD and halts. This instruction would only
be used when running under a monitor operating system.
s spy on store
s 'l' spy on the store starting at location 'l'.
The address is displayed followed by the
contents. The contents can be changed by
typing a number followed by <return>, in
which case the new value will be displayed.
To advance to the next location just type
space, to retreat to the previous location
type tab. To finish the spy type <return>.
Note that typing <return> with no number
does not change the value in store.
eg
mud> r100
mud> s20.
000120> 123456 23
000120> 000023 000122> 077177
mud>
this leaves location 122 unchanged, but puts
23 into location 120.
s 'l1' 'l2'
spies on locations 'l1' to 'l2', but does
not allow you to change any values.
Effectively this is just a display or dump
command. 'l1' should be less than or equal
to 'l2' or nothing will happen. To stop
the printing hit <escape>. This is handy if
you specify an enormous amount of printing
by mistake.
sr spy on registers
sr spies starting at register 0
sr 'd' spies starting at register 'd'
This is similar to spy on store, but you can
only progress forward one register at a
time, by typing <space>. After reaching
register 7, the program counter, you get to
look at the psw. The only way to spy on the
psw is to first spy on register 7. If you
space forward from the psw you get back to
register 0, to get out of the spy type
<return>.
sw spy on the window
sw spy on the memory management registers. At present
you are restricted to looking at the registers for
the pages allocated to processes, which under EMMOS
should be the only ones to change.
The registers are automatically split into their
composite fields for displaying
and changing. To go forward to the next field, type
<space>, to go to the next page, type <tab> and to
exit type <return>. Numeric fields are changed by
typing a number followed by <return>, other fields
are changed by typing one character.
the expansion direction, ed> is either up=u or down=d
the access control is either non resident = n
read only = r
or read/write = w
an access control of 'unused' indicates an error
in the memory management registers. The written into
bit cannot be examined or changed.
b set a breakpoint
b 'l' puts a breakpoint at location 'l'. Note
that you are not allowed to place a breakpoint
on top of another breakpoint. It
is possible to set breakpoints on top of emt
instructions, but, once they have been
reached, it is not possible to
continue or single step from them.
k kill a breakpoint
k 'l' kills the breakpoint at location 'l'. If
the program is halted at the breakpoint, it
is not possible to continue from there; use
'go' instead.
k * kills all breakpoints. If the program had
halted at a breakpoint it will not be
possible to continue; use 'go' instead.
g go to command
g 'l' jumps to location 'l' and starts
executing. 'l' must be a virtual address,
mapped according to the window. The
registers are loaded as set.
gs 'l' as above, but single stepping. Only one
instruction is executed before control is
returned to MUD. When single stepping all
breakpoints are temporarily removed, so no
breakpoints as such will be hit.
c continue command
c continue from the breakpoint last reached or at the
next instruction if previously single stepping.
cs as for continue but single stepping
( see 'gs' above )
Note that it is not possible to continue from a
breakpoint that has just been killed, but you can
use 'g' instead with 'l' as the contents of the program
counter, R7.
i change I/O command
This command has a special syntax, because it is
used to redirect the input and output of MUD from one
terminal to another.
i change I/O to the console
iv 'n' change I/O to vdu 'n'
is sink the output. All output will be thrown away,
including echoing. Input remains on the same channel.
f find a value
f 'v' searches the physical memory, starting at
location zero, for a word containing the value
'v'. Every so often a message will say
"still looking" if one hasn't been found
yet. To stop the search early hit <escape>
to break in.
f 'v' 'l'
As above but the search starts at 'l'
instead of zero.
e evaluate an expression
This command evaluates an 18 bit expression, using octal
or decimal numbers or the contents of a relocation register. There
is one accumulator, acc, which holds an 18 bit number. The operations
that can be performed on it are:
^ val Load the acc with 'val'
+ val Add 'val' into the acc
- val Subtract 'val' from the acc
P Treat the least significant 16 bits of acc as a virtual address and
convert it into an 18 bit physical address
V Treat the value in acc as an 18 bit physical address and
convert it into a 16 bit virtual address.
The values are given by:
# octal number - an 18 bit octal number, the # can be omitted
$ decimal number- a 15 bit decimal number, ie positive only
R num - the 18 bit value of relocation register 'num',
num is 1 by default
Several commands can appear on one line, separated by commas
or spaces if it is necessary to avoid ambiguity. The value in acc is
displayed when the end of the line is reached.
The value in acc is zero the first time the 'e' command is
used. Subsequently the value is that left in it after the last use
of 'e'.
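The accumulator behaviour described above can be sketched as follows. This is a minimal illustration, not MUD's code: the function name evaluate is hypothetical, wrap-around to 18 bits on overflow or underflow is an assumption, the 15 bit limit on decimal values is not enforced, and the P, V and R forms are left out because they depend on the current window and relocation registers.

```python
import re

MASK18 = 0o777777  # the accumulator holds an 18 bit number

# One command: an operator (^, + or -) followed by an octal value
# (optional leading #) or a decimal value (leading $).
_COMMAND = re.compile(r"\s*,?\s*([\^+\-])\s*(?:#?([0-7]+)|\$([0-9]+))")


def evaluate(line, acc=0):
    """Apply the commands on one line to acc and return the new value.

    Assumption: results are truncated to 18 bits (wrap-around).
    """
    line = line.rstrip()
    pos = 0
    while pos < len(line):
        match = _COMMAND.match(line, pos)
        if match is None:
            raise ValueError("cannot parse: " + line[pos:])
        op, octal, decimal = match.groups()
        val = int(octal, 8) if octal is not None else int(decimal)
        if op == "^":
            acc = val            # load
        elif op == "+":
            acc = acc + val      # add
        else:
            acc = acc - val      # subtract
        acc &= MASK18
        pos = match.end()
    return acc


if __name__ == "__main__":
    print(format(evaluate("^100, +$8"), "06o"))
```

Passing the previous result back in as acc models the way the value is retained between uses of 'e'.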
2.2.6 Using MUD
If RESMUD is linked in before EMMPM and MUD is present as an
overlay, then the debugger will be entered once the system has been
loaded. If not the debugger cannot be used.
When control is passed from MUD to the user program, by using the
go or continue commands, the system error traps are set so that any
error will be handled by MUD. This remains so until the user program
changes them. Note that while MUD is running the traps are set to
different handlers which detect whether the debugger itself is in error.
If so check that RESMUD has been assembled with the current page
configuration.
We can return to the debugger at any time via the console emulator.
Find the address of the global symbol $MUDIN. It is part of RESMUD and
so is permanently resident; its virtual and physical addresses are
therefore the same. Use the console emulator to jump to this location
and MUD will restart.
2.3 Linker and Loader Functions
___________________________________
In general a system's linker and loader perform two logically
distinct functions, though in practice the border between these
functions may be less clear. In a virtual memory system it is usually
the operating system that allocates physical memory, but in a monitor
system this may be done by the linker. There are a number of functions
that the linker and loader are required to perform on an EMMOS system.
Some of these functions require a degree of intelligence that is not
usually found in either linkers or loaders. These intelligent functions
may be incorporated in either the linker or the loader, or distributed
between them. When presenting these functions we will describe them all
as being part of the linker, for ease of understanding. The loader then
just copies an output image file from the linker, word for word, into the
128K physical memory. However, linkers are complex pieces of system
software that are difficult to write or modify. Therefore, in practice,
one usually takes the best of the functions that the linker has to offer
and writes a loader program which incorporates those remaining. The
functions that the linker/loader must perform are listed below:
1. Resolve all global references and relocate the
object code.
2. Allocate physical memory for the system, buffers,
process code and stacks. This takes into account
the page configuration scheme.
3. Construct a Load Map which specifies where processes
are in physical memory and where the ends of their
stacks are in virtual memory. This is passed on to the
loaded system.
2.3.1 The Linker
We can consider the input to the linker to be a set of Relocatable
Binary ( RLB ) modules, with some system of reference resolving that
need not concern us. A process' body is made up of one or more RLB
modules, with sharing allowed. EMMOS is similarly composed. To specify
how the system and process bodies are constructed we supply the linker
with a Link Control File, which specifies which modules make up the
system and process bodies, and the order in which they are to be linked
to form the body. We also need a way of uniquely identifying a process
body, so that when we create a process we can specify which code body to
use.
Here we give an example of what this file could look like. The line
headed 'SYSTEM' would define the construction of the resident operating
system part of EMMOS and lines headed 'BODY' would each describe a
process body, its name given in brackets:
SYSTEM: EMMPM-EMMTTY-..........-EMMMEM;
BODY(X25): X25CODE;
BODY(PRINTR): IOCODE-IOLIB;
END:
The EMMOS linker must also allocate physical memory for the process
code bodies and stacks. In order to do this it needs to know which code
bodies will be incarnated as which processes and how much stack to give
those processes. This information could go in the Link Control File. As
an example of how it could be represented, we consider the example body
description given above, and create some processes with those bodies:
PROC(X25-A): "X25" ,SYSTEM=200
PROC(X25-B): "X25" ,SYSTEM=200
PROC(PRINTR): "PRINTR",SYSTEM=100,CORAL=400
END:
Here we describe three processes, called X25-A, X25-B and PRINTR.
The first two are both incarnations of the body "X25" and both have 200
words for their system stack and no CORAL stack. It's not necessary for
them to have the same size stacks, though some implementations could
impose this as a restriction with no great hardship. The third process,
"PRINTR", is the one and only incarnation of the body "PRINTR". It has
100 words for its system stack and 400 words for its CORAL stack.
The linker also benefits from information on the page configuration
when it allocates physical memory, although this information is not
strictly necessary. We could allocate the code and stacks for each
process as one lump of physical store; if it were too large for the
configuration, the system would fail at run time. By knowing the page
allocation scheme, however, the linker can be quite subtle and avoid
unnecessary duplication of code, and could even arrange for the code and
stacks to be on separate pages if the sizes are right ( for more details
see 3.2.7 and 3.2.8 ). To complete our example we shall give an idea of
how the configuration could be specified in the Link Control File:
PROCS 5..6, BUFFERS 0..2, EMMOS 3..4;
Note that page 7 must always be allocated for the I/O page and that
the section using page 0 will lose the first 1000 octal bytes for the
vector area.
In our example, if the body "X25" was less than 4K words then it
would fit on one memory page. Since the stacks for "X25-A" and "X25-B"
are quite small, they too fit on one page. The linker should then only
put one copy of the body into physical memory and arrange the mapping of
both the processes such that page 5 maps onto it. Page 6 would map onto
the processes' stack areas, which would of course be different physical
memory locations.
Once the linker has finished allocating physical memory it needs to
be able to tell the EMMOS initializer what the memory management window
for each process is and where its stacks are. It does this by including
a table containing this information in the Load Image File. This table
must be loaded at a place in memory where the initializer can find it,
say somewhere in the global buffer pool.
2.3.2 The Loader
The Loader takes the Load Image file as input. This file is
produced by the linker and contains absolute binary data with an
indication of where in physical memory to load it. The loader need not
know anything about the processes or the system it is loading. However
there needs to be some way of telling where the system's entry point is.
Note that since the entry point will be a virtual address it is
necessary to include an initial memory mapping as part of its
definition.
PART THREE
__________
HOST MACHINE ASPECTS
3.1 Linking Using the RSX Task Builder
___________________________________________
The RSX task builder ( TKB ) is used as the linker. Because this
linker does not supply all the functions we require, our loader is more
complicated than usual. The Task Builder is used to
perform the relocating and global reference resolving parts of the
required actions. The physical memory allocation, process incarnation
and the production of the Load Map are all actions that are performed by
our loader. This does mean that the loader is complicated, but less
overall effort is required to write it than to completely re-write the
linker.
3.1.1 Use of the Task Builder
The task builder is able to handle overlays in several ways. It can
produce disk resident overlays, in which "paged-out" segments reside on
disk, or memory resident overlays, in which all segments reside in
physical memory but not all are "paged-in" into virtual memory. TKB will
also allow your task to be some combination of the two. Overlays are
paged in and out by special calls to RSX which TKB automatically inserts
into the object code. It is possible to instruct TKB to leave out this
code, allowing the user to handle his own paging. Overlays are
characterized by being relocated at the same place in memory. It is this
feature that EMMOS exploits to get all the process' bodies relocated at
the same base address. We do however require that the linker does not
put any auto loading code in for the overlays.
Amongst the various tables hidden in the task image is the window
description table. This does not appear for disk resident overlays, so
we chose to use memory resident overlays throughout. Also, when using
memory resident overlays, each layer of the overlay tree is relocated on
a page boundary, which is another requirement for EMMOS. There is an
option, when running a task under RSX, of allowing it to be swapped in
and out of physical memory ( checkpointed ). A checkpointable task has a
large space in its task image file, so as EMMOS has no need of it, the
task must be built not checkpointable ( /-CP switch ). We will assume
then that the task is not checkpointable, with manually loaded ( though
we shall not include the calls to do it ) memory resident overlays.
The first snag is met when we try to get information about the size
of the overlays and their position within the task image. The tables
that contain this information follow on directly after the code of the
root segment, but there is no indication as to how long the root is,
except in these tables. Normally their position is known because the
overlay loading mechanism references them, but we've done away with
this. Ideally the root segment would be the EMMOS system, but there is
no clean way of finding out how long it is, so we chose to not allocate
page 0 to the system. Instead the root segment does not contain any
code, only some tables of known length used by the loader.
Unfortunately, because TKB will relocate the next overlay layer on page
1, we have used up 4K of virtual space just for the interrupt vectors
etc. To avoid this we allocate the global buffer pool space to pages
0,1.... because no code is loaded for it. The problem remains to
persuade TKB to relocate EMMOS at the end of the buffers. We could do
this by extending the root segment to occupy all these pages and then
just ignore it all when loading, but this would mean the task image file
would be enormous. Instead we use dummy overlay layers. Although each
layer contains only one word, it is sufficient to persuade TKB to
relocate the next layer at the next page boundary.
It is because of such devious means of getting the overlay
relocated to the desired base, that all the calculations performed by
TKB to produce physical memory load addresses are useless. Instead of
wasting large areas of physical memory the loader does the calculations
again. This also means that we can handle the process' stacks nicely.
Finally there is the problem of uniquely identifying each overlay,
so a process can specify which is to be its code body. TKB gives each
overlay a name, which is the first six characters of the first object
file's name in the overlay; however, this may not be unique. We have then
two options: use the name given by TKB or use a position numbering
scheme. Despite its drawback of non-uniqueness, the former was chosen.
Names are easier to understand than numbers, and the possible inclusion
of the debugger may well upset the numbering, though it is a special
case. The use of six character names for the overlays led us to adopt
the <name> field of the $CREAP macro to specify which code body to use.
Other schemes may require another field in the macro call.
3.1.2 The Format of the Task Image File
The information required to complete the Overlay Description Table
is scattered across the first few blocks of the task image file. The
format is dependent on various switches and the form of the ODL file.
There follows a brief description of the parts relevant to the loader,
assuming it was built in the required manner.
BLOCK 0:
Contains no useful information, except perhaps the creation
date of the task image.
BLOCK 1:
No useful information.
BLOCK 2:
This is the start of the task header. The word at byte
offset 72 octal is the "header guard word pointer". This
contains the byte offset, within the next block, of the "guard
word", which immediately precedes the code of the root segment and
is how the loader finds the root.
BLOCK 3:
This contains the start of the root segment. The first word
of the root follows the "guard word". The root segment is used
to transmit configuration details and the process description
table to the loader. Immediately following these tables are 29
words of rubbish ( code used to load the overlays under RSX ),
the segment descriptor table, one word of rubbish and then the
window descriptor table. These two tables are used by RSX to
define the positions of the memory resident overlays in physical
memory; EMMOS uses them to construct the Overlay Description
Table, as follows:
segment descriptor table -
each entry is 9 words long, there is one entry for each overlay and
the root
word offset 1 the virtual base address
word offset 6&7 the segment's name in radix 50 format.
window descriptor table -
each entry is 10 words long, there is one entry for each overlay
word offset 2 overlay's size in memory blocks ( 64 bytes )
NEXT BLOCK BOUNDARY:
The code of the first overlay begins at the first word of
the next disk block. Subsequent overlays start on block
boundaries following on in order.
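As an illustration of how the segment descriptor entries above might be read, the sketch below decodes a radix 50 name and picks out the two fields the loader needs. The function names are hypothetical, word offsets are assumed to count from zero, and the character shown for radix 50 code 29 is an assumption; this is not the loader's own code.

```python
# Radix 50 packs three characters into one 16 bit word:
# word = c1 * 1600 + c2 * 40 + c3, each c being an index into this set.
# Assumption: code 29 is shown here as '%'.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_decode(word):
    """Return the three characters packed in one radix 50 word."""
    if not 0 <= word < 40 ** 3:
        raise ValueError("not a valid radix 50 word")
    c1, rest = divmod(word, 1600)
    c2, c3 = divmod(rest, 40)
    return RAD50_CHARS[c1] + RAD50_CHARS[c2] + RAD50_CHARS[c3]


def parse_segment_entry(words):
    """Extract base address and name from one 9 word segment descriptor.

    Assumption: 'word offset 1' and 'word offset 6&7' are zero-based
    indices into the entry.
    """
    if len(words) != 9:
        raise ValueError("a segment descriptor entry is 9 words long")
    return {
        "virtual_base": words[1],
        "name": rad50_decode(words[6]) + rad50_decode(words[7]),
    }


if __name__ == "__main__":
    entry = [0, 0o60000, 0, 0, 0, 0, 8533, 24760, 0]
    print(parse_segment_entry(entry))
```

The six character name recovered this way is what would be compared with the name field of each process description.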
3.1.3 The Overlay Description Language
The Task Builder takes a series of object files and links them
together to produce a Task Image. This is the exact image of what is to
be loaded at run time ( under RSX ). EMMOS exploits the memory resident
overlay capabilities of the Task Builder in order to relocate the system
kernel and processes' bodies in the desired place. The arrangement of
the overlays is described in the Overlay Description Language ( ODL )
file, the structure of which is as follows:
module1 - module2 - ............ - moduleN
specifies that the N modules are to be concatenated in the given
order
( module1, module2, ............ , moduleN )
specifies that the N modules will overlay each other in virtual
memory. The modules not paged-in will be resident on disk or, if they
are memory resident overlays ( indicated by a ! before the open bracket
), in physical memory.
These concatenation and overlay operations are combined to describe
the overlay tree structure, eg:
A - B - !( C - D - !( E , F ) , G - !( H - !( I , J ) , K ) , L )
this specifies the following tree
    E   F         I   J
    |___|         |___|
      D             H     K
      |             |_____|
      C                G           L
      |________________|___________|
                       B
                       |
                       A
At any one time during runtime, the memory will contain those
modules lying on one of the paths from the root to a leaf.
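To make the root to leaf rule concrete, the sketch below encodes the example tree as nested tuples and lists every path; each path is one possible set of modules resident in virtual memory at the same time. The data layout and the function name are illustrative assumptions only.

```python
# Each node is (name, [children]); this mirrors the example tree above.
TREE = ("A", [("B", [
    ("C", [("D", [("E", []), ("F", [])])]),
    ("G", [("H", [("I", []), ("J", [])]), ("K", [])]),
    ("L", []),
])])


def root_to_leaf_paths(node):
    """Return every path from this node down to a leaf, as lists of names."""
    name, children = node
    if not children:
        return [[name]]
    return [[name] + path
            for child in children
            for path in root_to_leaf_paths(child)]


if __name__ == "__main__":
    for path in root_to_leaf_paths(TREE):
        print(" - ".join(path))
```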
For ease of writing, it is possible to give parts of the overlay
tree a name, using the factor command ( .FCTR ). So a complete ODL file
for the above example could be:
.ROOT A - B - !( OV1 , OV2 , L )
OV1: .FCTR C - D - OV11
OV2: .FCTR G - !( OV22 , K )
OV11: .FCTR !( E , F )
OV22: .FCTR H - !( I , J )
.END
3.1.4 The EMMOS ODL File
.ROOT ROOT - !( OV1 - !( OV2 - !( EMMOS - !( PROCS ) ) ) )
EMMOS: .FCTR EMMPM - FIRST - ......................
PROCS: .FCTR BODY1 , BODY2 , ..............., BODYn
The factor EMMOS describes all the system modules, which are all
concatenated. The factor PROCS describes each of the different code
bodies for the processes. Normally only one code body is required at a
time so the modules overlay each other, but it is possible to
concatenate processes which are to access common routines.
For example, suppose we have four code bodies BODY1, ... , BODY4
which are each $CREAPed only once to give processes PROC1, ... , PROC4.
If these processes are completely independent then we write:
PROCS: .FCTR BODY1 , BODY2 , BODY3 , BODY4
Suppose we have a library module, LIB, that is shared by PROC1 and
PROC2, then we could write:
PROCS: .FCTR BODY1 - LIB , BODY2 - LIB , BODY3 , BODY4
This would cause two copies of LIB to be in physical memory. We
could however write:
PROCS: .FCTR BODY1 - BODY2 - LIB , BODY3 , BODY4
which would cause one copy of LIB to be put in the task image, but
PROC1 and PROC2 would now be paged in and out of virtual memory
together, which may not be desirable.
Note that if the total size of BODY1, BODY2, LIB and the stacks for
PROC1 and PROC2 exceeds the space allocated to processes, then the
loader will duplicate some or all of the code in an attempt to fit it
in. It may be then, that more physical memory would be used in trying to
share LIB than if we specified duplication in the first place.
The modules ROOT, OV1, OV2, ..... OVn are required by the system to
ensure that the operating system and the process bodies are relocated in
the correct place. ROOT also transmits information from the $CREAP macro
calls to the loader. The Task Builder relocates the first "layer" of the
overlay tree ( the root ) at location zero. Successive layers are
relocated at the start of the next page boundary ( a page is 4K words
long ). So, because each of these modules is quite small, ROOT is
relocated at 0, OV1 at 20000, OV2 at 40000 etc. If we require the
operating system to be relocated at 60000, we put it in the next layer (
as in the above example ). Suppose instead that we want EMMOS to start
on page 4 ( location 100000 ) and that it is over one page long, so that
the processes will be on page 6 ( location 140000 ). We would then need
to pack out pages 0..3 thus:
.ROOT ROOT - !( OV1 - !( OV2 - !( OV3 - !( EMMOS - !( PROCS ) ) ) ) )
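The address arithmetic behind this padding can be checked with a short sketch. It assumes, as described above, that each overlay layer starts at the page boundary following the end of the previous layer; the function names and the example EMMOS size of 24000 octal bytes are hypothetical.

```python
PAGE_BYTES = 0o20000  # one page is 4K words = 8K bytes


def next_layer_base(base, size_bytes):
    """Base of the layer that follows one at 'base' of 'size_bytes' bytes.

    Assumption: the next layer is relocated at the first page boundary
    at or beyond the end of this layer.
    """
    end = base + size_bytes
    return -(-end // PAGE_BYTES) * PAGE_BYTES


def dummy_layers_needed(first_system_page):
    """Number of one word dummy overlays ( OV1, OV2, ... ) needed so that
    the system lands on 'first_system_page'; ROOT itself covers page 0."""
    return first_system_page - 1


if __name__ == "__main__":
    base = 0
    for name in ("ROOT", "OV1", "OV2", "OV3"):
        print(name, format(base, "06o"))
        base = next_layer_base(base, 2)      # each holds about one word
    print("EMMOS", format(base, "06o"))
    print("PROCS", format(next_layer_base(base, 0o24000), "06o"))
```

With three dummy layers EMMOS is relocated at 100000, and an EMMOS of between one and two pages puts the process bodies at 140000.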
For technical reasons it was not possible to have the operating
system at the low numbered end of memory, so that is where the buffers
have been put. Dummy overlays are used to pad out the buffer area
because they cause the task image to be much smaller than if we had a
.BLKW . The latter would cause it to contain about 12K of zeroes!
3.1.5 Installation of the Debugger
The EMMOS debugger, MUD, is in two parts, a resident part RESMUD
and an overlay part MUD. To be able to use the debugger both parts must
be included in the ODL file. RESMUD must appear before EMMOS but MUD can
be put anywhere amongst the PROCS, eg:
.ROOT ROOT - !( OV1 - !( OV2 - !( RESMUD - EMMOS - !( PROCS , MUD ) ) ) )
3.1.6 Creating the Task Image File
Once the form of the ODL file has been worked out the task builder
is run under RSX-11 by:
TKB EMMOS , EMMOS /NOSP = EMMOS /MP
The first output file is for the Task Image ( .TSK ) file, the
second is the task map ( .MAP ) file, which is spooled by default. The
MP switch specifies that the input file is an .ODL file.
The .MAP shows the structure of the task image and gives the
virtual addresses of the modules, global symbols and .PSECTs. This link
map should be used in conjunction with the loader's load map, which
specifies physical addresses of code and stacks, when debugging. The
.TSK file now needs to be copied to the RT-11 format disk which contains
the loader using FLX. The system is now ready for loading and running.
3.2 The EMMOS RSX Task Image Loader
_______________________________________
This section describes the operation of the loader for the Extended
Memory MOS system. The loader itself is loaded from a floppy disk in
RT-11 format, using a fairly standard bootstrap loader. The EMMOS loader
then loads the system from an RSX-11 Task Builder Image file ( stored in
RT-11 format ) on the same floppy disk. Control is then passed to the
operating system or to the EMMOS debugger, MUD, if it is loaded.
3.2.1 Inputs and Outputs
The input to the loader is an RSX-11 Task Builder Image file stored
on a floppy disk in RT-11 format. This file contains:
1. The Configuration Table ( see 3.2.3 )
2. The Overlay Description Table ( see 3.2.4 )
3. The Process Description Table ( see 3.2.5 )
4. Absolute binary code for the operating system
and processes.
The loader produces a table, the load map, which it places in
memory and displays on the console, that describes the positioning of
all the processes in physical memory.
The address of the entry point to the loaded system must be stored
in the first word of the system. This is set by MOSPM or, if the
debugger is included, RESMUD.
3.2.2 The Console Load Map
This is a human readable form of the LOAD MAP that is supplied to
the loaded system. There is one line for each process describing the
position of its code and stacks.
The first column gives the process id ( pid ), the next two the
limits of the code and the fourth the limits of the two stacks combined.
If none of the pages containing code contains any stack then the
third column will read "unshared", that is, there is no page that
contains both stack and code. However, if there is only one page for
processes then this must contain all the code and all the stack, in this
case the third column will read "page is shared".
If the code is longer than one page, and it is necessary to have a
page containing both stack and code, then the third column will be the
limits of the piece of code that is sharing with the stacks. This may be
a copied piece of code.
The limits are given in terms of physical memory blocks, so a limit
of
001200 - 001250
for the code would mean that the code starts at physical memory
location 120000 and ends somewhere between 125000 and 125077.
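The conversion from memory block limits to byte addresses is a multiplication by the block size of 64 ( 100 octal ) bytes. A minimal sketch, with a hypothetical function name:

```python
BLOCK_BYTES = 0o100  # one memory block is 32 words = 64 bytes


def block_limits_to_bytes(first_block, last_block):
    """Return ( start, lowest possible end, highest possible end ) as
    physical byte addresses for limits given in memory blocks."""
    start = first_block * BLOCK_BYTES
    end_low = last_block * BLOCK_BYTES
    end_high = end_low + BLOCK_BYTES - 1
    return start, end_low, end_high


if __name__ == "__main__":
    print([format(a, "06o") for a in block_limits_to_bytes(0o1200, 0o1250)])
```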
If the task image does not contain an overlay for a process' body,
then the load map entry for that process will be of the form:
pid - ABCDEF has'nt got a body
If one or more processes lack bodies then the load will be
abandoned, once the load map has been printed, with the message:
"process requires body" (see ERRORS)
Note: the printing of the Load Map can be abandoned by hitting
<escape>.
3.2.3 The Configuration Table
This table resides in the root segment of the task image,
immediately before the Process Description Table. It is used to tell the
loader the page allocation scheme and the maximum number of processes.
The information is passed to it at run time so that the loader need not
be re-compiled for every change in the configuration.
The format is:
VAR config_table :RECORD
max_number_of_processes,
load_map_address,
first_mos_page, last_mos_page,
first_process_page, last_process_page
: INTEGER
END;
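As a rough illustration of reading this record from the root segment, the sketch below unpacks six consecutive words in the order given. It assumes that each INTEGER is one unsigned 16 bit word stored low byte first and that the fields are contiguous; the class and function names are hypothetical.

```python
import struct
from dataclasses import dataclass


@dataclass
class ConfigTable:
    max_number_of_processes: int
    load_map_address: int
    first_mos_page: int
    last_mos_page: int
    first_process_page: int
    last_process_page: int


def parse_config_table(raw):
    """Unpack the first six words of 'raw' ( bytes ) into a ConfigTable.

    Assumptions: one 16 bit little-endian word per field, no padding.
    """
    return ConfigTable(*struct.unpack_from("<6H", raw, 0))


if __name__ == "__main__":
    sample = struct.pack("<6H", 8, 0o1000, 3, 4, 5, 6)
    print(parse_config_table(sample))
```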
3.2.4 The Overlay Description Table
This table is constructed by the loader from the task image's
window and segment descriptor tables, that follow the root segment. Each
overlay named by the task image ( except the dummy ones and MOS ) has an
entry of the form:
TYPE overlay_description = RECORD
name : ascii;
size : mem_blocks;
disk_address : INTEGER
END;
TYPE ascii = ARRAY [ 1..6 ] OF CHAR;
TYPE mem_blocks = 0..number_of_process_pages * #200 - 1;
The table is given by
odt : ARRAY[ 1..number_of_overlays ] OF overlay_description;
odt[ ov ].size contains the size of the overlay segment ( that is of
the code ) in 32 word memory blocks.
odt[ ov ].name is the overlay's name, taken from the overlay description
language file at build time.
odt[ ov ].disk_address is the number of the disk block where the overlay starts.
Note that the disk address field is not used in the current version
of the loader. It remains as a left-over from a version that would not
support down line loading.
3.2.5 The Process Description Table
This table resides in the root segment of the task image, following
the Configuration Table. It is produced by the $CREAPs used to create
the processes. It is used to tell the loader which processes are
required, which overlay will supply the code and the size of their
stacks. There is one entry in the table per process, each of the form:
TYPE process_description = RECORD
pid : INTEGER;
name : ascii;
r0_size, r6_size : mem_blocks
END;
TYPE ascii = ARRAY [ 1..6 ] OF CHAR;
TYPE mem_blocks = 0..number_of_process_pages * #200 - 1;
The table is given by
VAR pdt : ARRAY [ 1..number_of_processes ] OF process_description;
pdt[ i ] is the entry for the ith process,
so pdt[ i ].pid = i for all i IN 1..number_of_processes.
The name field contains the six character name of the overlay which is the
code for the process. This is taken from the second parameter of the $CREAP
macro call.
pdt[ i ].r0_size and pdt[ i ].r6_size contain the required sizes, in 32
word memory blocks, of the stacks.
Note that the macro converts the sizes from words to memory blocks before
placing the values in the table, using
size_in_blocks := ceiling( size_in_words / 32 ).
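The rounding rule can be written with integer arithmetic only. A minimal sketch, with a hypothetical function name, of the conversion the macro is described as performing:

```python
WORDS_PER_BLOCK = 32


def words_to_blocks(size_in_words):
    """Return ceiling( size_in_words / 32 ) without floating point."""
    return (size_in_words + WORDS_PER_BLOCK - 1) // WORDS_PER_BLOCK


if __name__ == "__main__":
    for words in (0, 1, 32, 33, 100, 200, 400):
        print(words, "words ->", words_to_blocks(words), "blocks")
```

A stack request that is not a multiple of 32 words is therefore rounded up, never down.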
3.2.6 The Load Map
The load map describes the loaded system to the initializer. It is
placed in memory by the loader at an agreed address before it gives
control to the system. There is one entry in the table for each process,
and a special one for the debugger if it is included in the task image.
The form of a table entry is:
TYPE load_description = RECORD
pid,
r0_low, r0_hi,
r6_low, r6_hi : INTEGER;
window : ARRAY
[ first_page..last_page ]
OF window_type
END;
TYPE window_type = RECORD par, pdr : INTEGER END;
The map is given by:
map : ARRAY [ 1..number_of_processes + 1 ] OF load_description
map[i] is the entry for the ith process, so map[i].pid = i, for all
i IN 1..number_of_processes. map[i].pid = -1 for the debugger entry. Any
unused entries have pid = 0.
r0_low and r0_hi contain the virtual byte addresses of the lowest
and highest numbered words in the R0 ( CORAL ) stack for the process.
Similarly for r6_low, r6_hi.
window describes the mapping from virtual memory onto this process.
There are two words for each page allocated to a process, the page
address register and the page descriptor register.
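To show how a window entry is interpreted, the sketch below splits a page descriptor register into its fields and maps a virtual address through a page address register. It assumes the usual PDP-11/34 and 11/40 style layout ( length in bits 14..8, written-into in bit 6, expansion direction in bit 3, access control in bits 2..1 ) and a page address field counted in 64 byte blocks; the function names and the dictionary form of the window are hypothetical.

```python
ACCESS = {0: "non resident", 1: "read only", 2: "unused", 3: "read/write"}


def decode_pdr(pdr):
    """Split a page descriptor register into its fields ( assumed layout )."""
    return {
        "length": (pdr >> 8) & 0o177,           # page length field, in blocks
        "written_into": bool(pdr & 0o100),      # W bit
        "expansion": "down" if pdr & 0o10 else "up",
        "access": ACCESS[(pdr >> 1) & 0o3],
    }


def virtual_to_physical(vaddr, window):
    """Map a 16 bit virtual byte address to an 18 bit physical address.

    'window' maps a page number ( 0..7 ) to a ( par, pdr ) pair.
    """
    page = (vaddr >> 13) & 0o7
    offset = vaddr & 0o17777
    par, _pdr = window[page]
    return (par << 6) + offset


if __name__ == "__main__":
    window = {6: (0o1200, 0o77406)}
    print(decode_pdr(0o77406))
    print(format(virtual_to_physical(0o140010, window), "06o"))
```

The page address register value is in the same units as the memory block limits printed in the console load map.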
3.2.7 The Physical Memory Allocation Algorithm
The loader should attempt to arrange the loading of the Load Image
into physical memory in a way that will reduce the amount of physical
memory required. The total amount of virtual memory allocated to a
process for its code body and stacks will be fixed ( = num_pp memory
pages ), but for each process the sizes of its code and the total size
of its two stacks will vary.
If at all possible the loader will arrange for the stacks and the
code to be on different pages in virtual memory, allowing the process to
put read-only protection on its code. This is only possible if:
pages( code_size ) + pages( stack_size ) <= num_pp
where pages( i ) = the size i rounded up into a whole number of
memory pages. Of course if pages( code_size + stack_size ) > num_pp then
not enough virtual memory is available for this process. However if we
have:
pages( code_size + stack_size ) = num_pp
then some of the code and some of the stack space must share one of
the pages. The shared page cannot then have read-only protection placed
on it. If a process has already been created with this code body then
physical memory following the code body will have been allocated as a
stack. We cannot then have the stack for the next process following
directly on from the code, so it may be necessary to copy part of the
code body and arrange that the memory mapping maps the two parts of the
code onto consecutive virtual memory locations.
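The three cases above can be summarised in a few lines. This is a sketch under the assumption that sizes are held in 32 word memory blocks with 200 octal blocks to a page; the names pages and classify_layout are illustrative and not taken from the loader.

```python
BLOCKS_PER_PAGE = 0o200  # 4K words per page / 32 words per block


def pages(size_in_blocks):
    """Round a size in memory blocks up to a whole number of pages."""
    return -(-size_in_blocks // BLOCKS_PER_PAGE)


def classify_layout(code_size, stack_size, num_pp):
    """Decide how code and stacks share the num_pp process pages."""
    if pages(code_size) + pages(stack_size) <= num_pp:
        return "separate"     # code pages can be made read-only
    if pages(code_size + stack_size) <= num_pp:
        return "shared"       # one page holds both code and stack
    return "too large"        # not enough virtual memory


if __name__ == "__main__":
    for code, stack in ((100, 20), (150, 20), (250, 20)):
        print(code, stack, classify_layout(code, stack, 2))
```

Since pages( a ) + pages( b ) never exceeds pages( a + b ) by more than one, the second test can only succeed with equality, which matches the condition given in the text.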
E.G.
Suppose we have two pages for a process, num_pp = 2, and we have a
code body which is $CREAPed twice, as PROC1 with stacks S1 and as PROC2
with stacks S2. If the code body is larger than one page then the second
page must be shared between the stacks and some of the code.
First the loader copies the code body into physical memory, then it
allocates the stacks for PROC1, the window for PROC1 will be
(a..b,b..d).
--a---------------b----c------d----f---------------
  :     code size      :  S1  : ....physical memory
---------------------------------------------------
  |<- 4K words -->|<-- 4K words -->|
Now for PROC2. If pages( code size + S1 + S2 ) <= num_pp, then the
loader can allocate the stack following on directly from S1, thus the
window for PROC2 will be (a..b,b..e).
---a--------------b----c------d--e-f---------------
   :    code size      :  S1  :S2:
---------------------------------------------------
Note that PROC2's window allows it to ( illegally ) access S1, the
stacks of PROC1. If however pages( code size + S1 + S2 ) > num_pp then the
loader must take a copy of the piece of code that overlaps the second
page, (copy from b..c to d..g), thus the window for PROC2 will be (
a..b,d..h ), and the copy will be mapped into the same virtual locations
as the original.
---a--------------b----c------d--e-fg--h------------
   :    code size      :  S1  :copy :S2:
----------------------------------------------------
Now PROC1's stack is not in the window of PROC2 so neither process
can use the other's stack.
The loader will try and fit as many stacks as possible into the
page without copying a piece of code, but it won't try to optimize the
ordering. If the code size is smaller than a page and the stacks are
bigger then the whole of the code will be copied.
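The choice between appending a stack and copying code, for successive incarnations of one code body in the shared page case, might be sketched as below. This is a simplification and not the loader's algorithm in full: it works in memory blocks, assumes the first incarnation always fits, takes the stacks in the order given without reordering, and uses hypothetical names throughout.

```python
BLOCKS_PER_PAGE = 0o200  # memory blocks per 4K word page


def pages(size_in_blocks):
    """Round a size in memory blocks up to a whole number of pages."""
    return -(-size_in_blocks // BLOCKS_PER_PAGE)


def plan_stacks(code_size, stack_sizes, num_pp):
    """For each incarnation return 'append' if its stacks follow the
    previous stacks, or 'copy' if a piece of code must be copied first."""
    plan = []
    run = code_size  # blocks covered by the current window: code + stacks
    for index, stack in enumerate(stack_sizes):
        if index == 0 or pages(run + stack) <= num_pp:
            plan.append("append")
            run += stack
        else:
            plan.append("copy")
            run = code_size + stack  # a fresh run begins after the copy
    return plan


if __name__ == "__main__":
    print(plan_stacks(150, [20, 30], 2))
    print(plan_stacks(150, [20, 100], 2))
```

In the first call both stacks fit behind the original code; in the second the larger stack forces a copy, as in the last diagram above.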
3.2.8 The Logical Operation of the Loader
get process description table from task image;
construct overlay description table;
load EMMOS;
FOR each overlay
DO
   load the code;
   FOR each process
   DO
      IF name of process = name of overlay
      THEN
         IF size of code in pages + size of stacks in pages
               <= number of pages for processes
         THEN
            code and stacks are on completely different pages;
            set up the window for the process;
            leave room for the stacks in physical memory
         ELSF (size of code + stacks) in pages
               = number of pages for processes
         THEN
            one page contains stack and code;
            IF stack will fit on the end of the last
                  incarnation's window
            THEN
               leave room for the stacks after last stack;
               set up the process' window
            ELSE
               make a copy of the piece of code that must
                  share a page with some stack;
               leave room for the stacks after the copied code;
               set up the window for the process
            FI
         ELSE
            process is too large
         FI
      FI
   OD
OD
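The three-way test inside the inner loop can be sketched in Python as
follows. This is only an illustration under assumptions: sizes are in
words, a page is 4K words, the names pages and classify are hypothetical,
and the rounding of stacks to memory blocks and the setting up of windows
are left out.

```python
PAGE_WORDS = 4096  # one page is 4K words


def pages(words):
    """Number of whole pages needed to hold `words` words."""
    return -(-words // PAGE_WORDS)  # ceiling division


def classify(code_words, stack_words, num_pp):
    """Return which branch of the loader's test a process falls into."""
    if pages(code_words) + pages(stack_words) <= num_pp:
        return "separate pages"   # code and stacks never share a page
    if pages(code_words + stack_words) == num_pp:
        return "shared page"      # one page holds both code and stack
    return "too large"


# Hypothetical sizes with num_pp = 2.
for code, stacks in [(3000, 2000), (5000, 2000), (5000, 4000)]:
    print(code, stacks, classify(code, stacks, 2))
```

Because pages( code + stacks ) is never less than pages( code ) +
pages( stacks ) - 1, a process that fails the first test needs at least
the full number of process pages, which is why the second test can use
equality.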
3.2.9 Error Messages
i. System Traps
These indicate that something has gone drastically wrong with
the loader. Hopefully they will only occur after modifications
have been made and not thoroughly tested. They may also occur if
the loaded system goes wrong before it has set the trap
locations to its own trap handlers.
cpu error: pc = nnnnnn ps = mmmmmm
reserved instruction: pc = nnnnnn ps = mmmmmm
These messages occur on a trap to location #004 or #010
respectively. The contents of the program counter and processor
status word, as placed on the stack by the trap, are displayed.
memory management trap 250
nnnnnn at mmmmmm
This message occurs when something goes wrong with the
memory management. The contents of the two status registers are
displayed: nnnnnn = SR0, which is expanded in full following the
error message, and mmmmmm = SR2, the virtual address associated
with the error. Following the written expansion of SR0, the
contents of the kernel mode memory management registers are
displayed.
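To give an idea of what an expansion of SR0 involves, here is a small
Python sketch that picks the status word apart. The bit layout used
(abort flags in bits 15 to 13, processor mode in bits 6 and 5, page
number in bits 3 to 1, enable in bit 0) is our reading of the PDP-11
memory management unit and should be checked against the processor
handbook; the function name and the returned field names are invented
and do not reproduce the loader's actual output.

```python
def decode_sr0(sr0):
    """Split a 16-bit SR0 value into its fields (assumed layout)."""
    return {
        "abort_non_resident": bool(sr0 & 0o100000),  # bit 15
        "abort_page_length": bool(sr0 & 0o040000),   # bit 14
        "abort_read_only": bool(sr0 & 0o020000),     # bit 13
        "mode": (sr0 >> 5) & 0o3,                    # bits 6-5: 0 kernel, 3 user
        "page": (sr0 >> 1) & 0o7,                    # bits 3-1: page number
        "enabled": bool(sr0 & 0o1),                  # bit 0: relocation on
    }


# Hypothetical value: non-resident abort on page 1, management enabled.
for field, value in decode_sr0(0o100003).items():
    print(field, value)
```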
power failed, start again
The power to the processor failed; re-boot the system. NB
this has not been tested!
ii. Loader Errors and Messages
test: eof
The end of the task builder image file has been reached prematurely.
test: get has failed
An error occurred in the transfer of data from the floppy disk. Try
again.
test: rewind
The loader requires the input task image file to be rewound. This
is only an error if the file is not on a rewindable device, e.g. when
down line loading. It is caused by an error in the loader.
cant find 'task file name'
The file is not on the floppy disk. Check that the file name
specified in the loader is the same as the one you copied to the disk,
and that the disk is in the correct drive ( drive 0 ). Remember that
the name specified in the loader must include any blanks necessary to
pack the name to 10 chars, in the form ABCDEF.EXT
too short!
The file is less than 4 blocks long, so cannot possibly be the
correct task image.
first dummy overlay not found
The dummy overlay that is used to pack out the buffer area is not
present in the task image. Check that the file is a task image and that
the ODL file correctly specifies the dummy overlays.
mos is larger than nn pages long
The resident part of EMMOS is too large: more than the nn pages
specified in the configuration file. Make sure that the processes
have been put in as overlays and not in the resident part.
window and segment descriptors of unequal size
The loader has lost its place in the task image, or the ODL file
does not specify the overlays in the correct format.
ABCDEF is too large
The named overlay is too large for the configuration, even without
stacks. Check that the overlay names are separated by commas in the ODL
file.
process nn too large
The process with pid = nn has stack and code too large for the
configuration. Check that stack sizes have been given in words in
the $CREAP macro call.
ABCDEF has'nt been made a process
The named overlay doesn't appear in any of the $CREAPs. The overlay
is loaded but otherwise ignored. This is not a fatal error.
process nn: code duplicated
The loader was forced to make a copy of some of the process' code
because the code had already been $CREAPed and the stack and code
need to share a page. This is not an error.
debugger has been loaded nnnnnn - mmmmmm
The overlay part of the debugger has been loaded, starting at
memory block nnnnnn and finishing in block mmmmmm. This is not an
error.
ABCDEF has incorrect base address = nnnnnn
The named overlay has been relocated at nnnnnn by the Task Builder.
This does not agree with the virtual page allocation scheme given.
Check that the ODL file is correct, that the right number of dummy
packing overlays are present and that no overlay is too large for
the configuration.
debugger is too large
The debugger takes up more space than is allowed for processes. It
is still loaded into physical memory, but only those pages normally
allowed for processes will be paged into virtual memory. This will
almost certainly cause MUD to crash. Use the console emulator to
enter the system instead of MUD. Alternatively, remove the debugger
or allow more pages for processes. This is not a fatal error.
the debugger cannot be used
Either the resident part or the overlay part of the debugger is
missing and hence the debugger cannot be used. If the debugger is
needed then include the missing name in the ODL file. This is not a
fatal error.
non-unique overlay ABCDEF
The ODL file describes a configuration such that two overlays have
the same name. The loader requires all overlays to have unique
names so it can unambiguously select the code body for a process.
Try re-ordering the overlay description in the ODL file. If the
offending overlay is a common library, put its name after the code
body, otherwise the file may need to be copied to one with a
different name.
process requires body
At least one process has specified a body name that does not
appear in the Task Image as an overlay. An indication of which
process is causing the problem can be found from the Console Load
Map (qv). Make sure the name specified in the $CREAP macro call
is that of one of the object files; check in the task map for
the names actually used. If the debugger is included, make sure
it, and all other overlays, are separated by commas in the ODL
file.
3.2.10 Tracing
The loader and its libraries are liberally scattered with code
that produces vast amounts of output, describing in detail what is being
processed. These pieces of code are normally excluded by surrounding
them in comment brackets. The start of a piece of trace code is
indicated by {trace and the end is followed by ecart}. To include the
trace statements change {trace to {trace} and ecart} to {ecart}.
Changing the markers in this way makes it easy to find them and restore
them later.
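The change of markers can be made by hand in an editor, or mechanically.
The following Python sketch shows one way it might be done on the text of
a source file. The function names and the sample source line are
invented for the illustration; only the {trace and ecart} markers come
from the loader.

```python
import re


def enable_trace(source):
    """Close the comment brackets early so the trace code is compiled."""
    source = re.sub(r"\{trace\b(?!\})", "{trace}", source)
    return re.sub(r"(?<!\{)\becart\}", "{ecart}", source)


def disable_trace(source):
    """Restore the original markers so the trace code is a comment again."""
    return source.replace("{trace}", "{trace").replace("{ecart}", "ecart}")


# Hypothetical line of loader source.
sample = "x := 1; {trace writeln('x = ', x); ecart} y := 2;"
print(enable_trace(sample))
print(disable_trace(enable_trace(sample)))
```

Applying enable_trace twice leaves the text unchanged, since markers that
are already closed are skipped.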
The output produced should be self-explanatory, except for
PROCEDURE waiting; which displays the message 'waiting' and waits until
any character is typed in at the keyboard. All this does is slow down
the output, since <cntrl-s> and <cntrl-q> are not implemented.
3.2.11 Changing The System
The loader need not be re-compiled every time the system
configuration is changed. This is because the configuration information
is placed in the root segment of the task image file at link time.
3.3 Process Creation
________________________
Processes are created at compile time by calling the $CREAP macro
in the system configuration module. The $CREAP macro defines the
attributes of a process: its entry point, the name of the overlay that
contains its code body, the devices associated with it and the sizes of
the CORAL and System stacks.
Note that the name given is the name of the overlay in which the
code ( and therefore the entry point ) is found. This name is that of
the first module given in the overlay subtree. E.g. if we have a library
module LIB and a process body module BODY1, then in the ODL file we
could write:
     PROCS: .FCTR BODY1 - LIB , .........
in which case the $CREAP call would be
     $CREAP $ENTRY , <BODY1 > , ............
or alternatively
     PROCS: .FCTR LIB - BODY1 , .........
in which case the $CREAP call would need to be
     $CREAP $ENTRY , <LIB   > , ...........
For this reason all the overlay names must be unique, otherwise the
loader would not know which code body to give a process. Now suppose we
have another process body module, BODY2, which also uses LIB. We could
not write:
     PROCS: .FCTR LIB - BODY1 , LIB - BODY2 , .........
since both overlays would be called LIB. Instead write either
     PROCS: .FCTR BODY1 - LIB , BODY2 - LIB , .........
with
     $CREAP $ENT1 , <BODY1 > , ......
     $CREAP $ENT2 , <BODY2 > , ......
or
     PROCS: .FCTR LIB - BODY1 - BODY2 , ......
with
     $CREAP $ENT1 , <LIB   > , ......
     $CREAP $ENT2 , <LIB   > , ......
or some other combination.
Remember that the name must be of exactly six characters. Pad out
with blanks if necessary.
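The naming rule can be checked mechanically before linking. The Python
sketch below takes the right hand side of a .FCTR line in the simple
form used in the examples ( subtrees separated by commas, modules by
hyphens ), derives the six character overlay names and reports any that
clash. It is not a parser for the full Overlay Description Language, and
the function names are invented for the illustration.

```python
def overlay_names(factor_body):
    """Overlay name of each subtree: its first module, padded to six chars."""
    names = []
    for subtree in factor_body.split(","):
        first_module = subtree.split("-")[0].strip()
        names.append(first_module.ljust(6))  # pad with blanks, as $CREAP needs
    return names


def duplicates(names):
    """Names that occur more than once, in order of first repetition."""
    seen, repeated = set(), []
    for name in names:
        if name in seen and name not in repeated:
            repeated.append(name)
        seen.add(name)
    return repeated


good = overlay_names("BODY1 - LIB , BODY2 - LIB")
bad = overlay_names("LIB - BODY1 , LIB - BODY2")
print(good, duplicates(good))
print(bad, duplicates(bad))
```

Module names longer than six characters are passed through unchanged by
this sketch, so they would still need checking by hand.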
The stack sizes are specified using named parameters; these are the
only extra parameters of $CREAP due to the EMMOS extensions. The sizes
are given in words, but are automatically rounded up into memory blocks
( 32 words ). A zero length R0 stack is acceptable for a non-CORAL
process, but the R6 stack should always have some space. If the size is
not specified it is set to 32 words by default. Note that the stack
sizes are given separately for each process, not for each code body, so
two incarnations of the same body could have different stack sizes,
though it would be a bit odd.
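The rounding is simple ceiling arithmetic, sketched here in Python. The
function name is hypothetical; the block size of 32 words and the
default of 32 words are those stated above.

```python
BLOCK_WORDS = 32          # one memory block
DEFAULT_STACK_WORDS = 32  # used when no size is given


def stack_blocks(words=None):
    """Memory blocks occupied by a stack whose size is given in words."""
    if words is None:
        words = DEFAULT_STACK_WORDS
    return -(-words // BLOCK_WORDS)  # round up to whole blocks


for size in (None, 0, 32, 33, 100):
    print(size, stack_blocks(size))
```

So a stack requested as 100 words actually occupies 4 blocks, that is
128 words.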
APPENDICES
__________
APPENDIX A Pitfalls.
A.1 Summary
Processes must ensure that all communication is performed via the
global buffer pool.
Shared overlaid library procedures must not use the CORAL global
vector for their entry.
CORAL code bodies incarnated more than once should not preset
variables.
All local variables must be taken from the stack, in code bodies
incarnated more than once.
All IORBs and their data buffers must be located in permanently
resident memory (ref 2.1.4).
A.2.1 Use of local space for inter-process communication
Obviously processes with different code bodies cannot use their
local static space ( ie not the stack ) for communication, but at first
sight it would appear that different incarnations of the same body could
communicate using variables in the body. Although sometimes possible,
this is extremely configuration ( and even loader ) dependent. The
problem arises when there is not enough room to fit all the code body (
which includes local static data ) onto process pages of its own. The
loader may need to take a copy of some of the code, which may include
local data. In this case there would be more than one copy of the local
data, so if two processes use it for communicating, they would actually
be using two different locations.
Therefore, processes must ensure that all communication is
performed via the global buffer pool.
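The effect can be imitated with a toy model in Python. Everything in it
( the names, the single variable called flag, the two-process
configuration ) is invented for the illustration; it models only the
fact that each process reaches static data through its own window.

```python
def make_system(code_copied):
    """Two incarnations of one body; PROC2 may be given a copy of the code."""
    memory = {"original": {"flag": 0}}          # static data lives in the body
    windows = {"PROC1": "original", "PROC2": "original"}
    if code_copied:
        memory["copy"] = dict(memory["original"])  # loader duplicates the data
        windows["PROC2"] = "copy"
    return memory, windows


def write(memory, windows, process, name, value):
    memory[windows[process]][name] = value


def read(memory, windows, process, name):
    return memory[windows[process]][name]


for copied in (False, True):
    memory, windows = make_system(copied)
    write(memory, windows, "PROC1", "flag", 1)
    print("copied" if copied else "shared", read(memory, windows, "PROC2", "flag"))
```

When the loader has had to copy the code, PROC2 never sees the value
written by PROC1, although both use the same virtual address.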
A.2.2 Use of shared, overlaid CORAL libraries.
All CORAL procedures, common variables etc., must use unique
indices to the global vector, to define themselves ( unless some trick
is being performed ). This remains true even if the procedure is defined
in an overlay, because there is only one global vector. If a library
module containing some CORAL procedures only appears in one overlay,
then the global vector will contain the virtual addresses of the entry
points. However, the procedures will only be in virtual memory when a
process which is an incarnation of that overlay is running, so only
these processes can use the procedures.
This is acceptable. The problem arises when two or more overlays
use the library module. If the overlays are such that, when relocated,
the entry points of the procedures are at different virtual locations,
the global vector mechanism will not work. The entry point recorded will
probably be that for the last overlay processed by the linker, so the
first overlay will crash when it tries to call the procedure.
To prevent this, all copies of the library module must be relocated
at the same virtual locations, either by placing it on permanently
resident EMMOS pages or by putting it first in the overlay, though this
may cause problems with the overlay naming convention. A more
satisfactory solution, in many ways, would be not to use the global
vector for the procedure entry. The offending procedures would use
genuine names for their entry points, not vector indices. This is
achieved in CORAL by using a large index, outside the bounds of the
global vector.
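The failure of the global vector mechanism can be imitated with a few
lines of Python. The model assumes that the linker fills one shared
table in the order in which it processes the overlays, so that a later
entry overwrites an earlier one; the overlay names, the index 7 and the
addresses are all made up.

```python
def link(overlays):
    """Build the single global vector; later overlays overwrite earlier ones."""
    vector = {}
    for _name, entry_points in overlays:
        vector.update(entry_points)
    return vector


def broken_callers(overlays):
    """Overlays whose own copy of a procedure is not where the vector points."""
    vector = link(overlays)
    return [name for name, entry_points in overlays
            if any(vector[index] != address
                   for index, address in entry_points.items())]


# The same library procedure ( index 7 ) relocated at two different addresses.
clash = [("BODY1 ", {7: 0o60100}), ("BODY2 ", {7: 0o64200})]
# The library relocated at the same virtual address in both overlays.
safe = [("BODY1 ", {7: 0o60100}), ("BODY2 ", {7: 0o60100})]
print(broken_callers(clash))
print(broken_callers(safe))
```

In the first configuration BODY1 would jump to an address that is only
correct for BODY2; in the second both overlays agree with the vector.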
A.2.3 Use of static variables
Any data space allocated in a code body which is incarnated more
than once will be used by more than one incarnation, unless the loader
causes a copy to be made. The processes cannot then use this space for
private local variables, because another may use the same physical
memory for its private local variables. Neither can it be used for
shared variables, because the loader may have copied them ( see A.2.1,
'Use of local space for inter-process communication' ). Constants are
ok, since all incarnations will require them to remain constant. Beware
of string constants though: although they really should remain constant,
some unhealthy procedures actually change some parts of them. Such
practices should be banned.
In particular, note that local variables can be created in
devious ways: in CORAL a preset variable becomes an OWN variable, being
allocated not on the stack but in the local static space. Presetting,
and variables in the outermost block, must be avoided for
multi-incarnate processes.
Always take local variables from the stack.
Don't Panic !