IP Pascal is an implementation of the Pascal programming language built on the IP portability platform, a system for implementing multiple languages across multiple machines and operating systems.
Overview
IP Pascal implements a superset of ISO 7185 Pascal. It adds modularity (including the parallel tasking monitor concept), dynamic arrays, overloads, and a number of minor extensions to the language. IP implements a porting platform, including a widget toolkit, a TCP/IP library, a MIDI and sound library, and other functions, that allows both programs written in IP Pascal and IP Pascal itself to move to multiple operating systems and machines.
Language
IP Pascal starts with ISO 7185 Pascal (which standardized Niklaus Wirth's original language), and adds:
- modules, including parallel task constructs process, monitor and share.
module mymod(input, output);
const one = 1;
type string = packed array of char;
procedure wrtstr(view s: string);
private
var s: string;
procedure wrtstr(view s: string);
var i: integer;
begin
for i := 1 to max(s) do write(s[i])
end;
begin { initialize monitor }
end;
begin { shutdown monitor }
end.
Modules have entry and exit sections. Declarations in modules form their own interface specifications, and it is not necessary to have both interface and implementation sections. If a separate interface declaration file is needed, it is created by stripping the code out of a module and creating a "skeleton" of the module. This is typically done only if the object for a module is to be sent out without the source.
A program from ISO 7185 Pascal is directly analogous to a module, and is effectively a module without an exit section. Because all modules in the system are "daisy chained" such that each are executed in order, a program assumes "command" of the program simply because it does not exit its initialization until its full function is complete, unlike a module which does. In fact, it is possible to have multiple program sections, which would execute in sequence.
A process module, like a program module, has only an initialization section, and runs its start, full function and completion in that section. However, it gets its own thread for execution aside from the main thread that runs program modules. As such, it can only call monitor and share modules.
A monitor is a module that includes task locking on each call to an externally accessible procedure or function, and implements communication between tasks by semaphores.
A share module, because it has no global data at all, can be used by any other module in the system, and is used to place library code in.
Because the module system directly implements multitasking/multithreading using the monitor concept, it solves the majority of multithreading access problems. Data for a module is bound to the code with mutexes or Mutually Exclusive Sections. Subtasks/subthreads are started transparently with the process module. Multiple subtasks/subthreads can access monitors or share modules. A share module is a module without data, which does not need the locking mechanisms of a monitor.
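The locking behaviour described above can be modelled outside IP Pascal. The following is a minimal Python sketch, not IP Pascal code: the class `CounterMonitor`, its methods and the helper `run_workers` are hypothetical, and a single `threading.Lock` stands in for whatever mechanism the IP runtime actually uses. Every externally callable method takes the lock, so the monitor's data is touched by only one task at a time.

```python
import threading


class CounterMonitor:
    """Hypothetical monitor: private data plus one lock taken on every public call."""

    def __init__(self):
        self._lock = threading.Lock()  # guards all data owned by the monitor
        self._count = 0                # the monitor's private data

    def increment(self):
        # Public entry point: runs with the monitor lock held.
        with self._lock:
            self._count += 1

    def value(self):
        with self._lock:
            return self._count


def run_workers(monitor, workers=4, calls_per_worker=1000):
    """Start several threads (standing in for process modules) that all call the monitor."""
    def work():
        for _ in range(calls_per_worker):
            monitor.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return monitor.value()


if __name__ == "__main__":
    print(run_workers(CounterMonitor()))
```

Because every update happens under the lock, the final count always equals `workers * calls_per_worker`. A share module would correspond to a set of functions with no state of their own, which is why it needs no lock at all.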
- Dynamic arrays. In IP Pascal, dynamics are considered "containers" for static arrays. The result is that IP Pascal is perhaps the only Pascal where dynamic arrays are fully compatible with the ISO 7185 static arrays from the original language. A static array can be passed into a dynamic array parameter to a procedure or function, or created with new.
program test(output);
type string = packed array of char;
var s: ^string; { pointer to a dynamic array, allocated with new }
procedure wrtstr(view s: string);
var i: integer;
begin
   for i := 1 to max(s) do write(s[i])
end;
begin
   new(s, 12); { allocate a 12 character array }
   s^ := 'Hello, world';
   wrtstr(s^);
   wrtstr('Thats all folks')
end.
- Constant expressions. A constant declaration can contain expressions of other constants.
const b = a+10;
- Radix for numbers.
$ff, &76, %011000
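Assuming that `$` introduces a hexadecimal number, `&` an octal number and `%` a binary number (the usual reading of these prefixes; the text does not spell it out), the three literals above can be checked with a short Python sketch. The helper `parse_radix_literal` is hypothetical and only illustrates the notation.

```python
# Assumed mapping of IP Pascal radix prefixes to number bases.
RADIX_PREFIXES = {"$": 16, "&": 8, "%": 2}


def parse_radix_literal(text):
    """Convert a prefixed literal such as '$ff' to an integer.

    Text without a known prefix is treated as ordinary decimal.
    """
    base = RADIX_PREFIXES.get(text[0])
    if base is None:
        return int(text, 10)
    return int(text[1:], base)


if __name__ == "__main__":
    for literal in ("$ff", "&76", "%011000"):
        print(literal, "=", parse_radix_literal(literal))
```

Under that assumption the three literals denote 255, 62 and 24 respectively.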
- Alphanumeric goto labels.
label exit; goto exit;
- '_' allowed in all identifiers.
var my_number: integer;
- Special character sequences that can be embedded in constant strings:
const str = 'the rain in spain\cr\lf';
The sequences use standard ASCII mnemonics.
- Duplication of forwarded headers.
procedure x(i: integer); forward;
...
procedure x(i: integer);
begin
...
end;
This makes it easier to declare a forward by cut-and-paste, and keeps the parameters of the procedure or function in the actual header where you can see them.
- 'halt' procedure.
procedure error(view s: string);
begin
writeln('*** Error: ', s:0); halt { terminate program }
end;
- Special predefined header files.
program myprog(input, output, list);
begin
writeln(list, 'Start of listing:'); ...
program echo(output, command);
var c: char;
begin
while not eoln(command) do begin
read(command, c); write(c)
end; writeln
end.
program newprog(input, output, error);
begin
... writeln(error, 'Bad parameter'); halt ...
'command' is a file that connects to the command line, so that it can be read using normal file read operations.
- Automatic connection of program header files to command line names.
program copy(source, destination);
var source, destination: text; c: char;
begin
   reset(source);
   rewrite(destination);
   while not eof(source) do begin
      while not eoln(source) do begin
         read(source, c); write(destination, c)
      end;
      readln(source); writeln(destination)
   end
end.
- File naming and handling operations.
program extfile(output);
var f: file of integer;
begin
   assign(f, 'myfile'); { set name of external file }
   update(f); { keep existing file, and set to write mode }
   position(f, length(f)); { position to end of file to append to it }
   writeln('The end of the file is: ', location(f)); { tell user location of new element }
   write(f, 54321); { write new last element }
   close(f) { close the file }
end.
- 'fixed' declarations which declare structured constant types.
{ the field name b is assumed; the original declaration omits it }
fixed table: array [1..5] of record a: integer; b: packed array [1..10] of char end = array
   record 1, 'data1     ' end,
   record 2, 'data2     ' end,
   record 3, 'data3     ' end,
   record 4, 'data4     ' end,
   record 5, 'data5     ' end
end;
- Boolean bit operators.
program test;
var a, b: integer;
begin
a := a and b; b := b or $a5; a := not b; b := a xor b
end.
Modular structure
IP Pascal uses a stacking concept for modules. The modules are stacked one atop the other in memory, and execution begins at the bottom of the stack. The bottom module calls the next module up, that module calls the one above it, and so on.
wrapper, serlib, program, cap (listed from the bottom of the stack to the top)
The cap module (sometimes called a "cell" in IP Pascal terminology, after a concept in integrated circuit design) terminates the stack, and begins a return process that ripples back down until the program terminates. Each module has its startup or entry section performed on the way up the stack, and its finalization or exit section performed on the way back down.
This matches the natural dependencies in a program. The most primitive modules, such as the basic I/O support in "serlib", perform their initialization first, and their finalization last, before and after the higher level modules in the stack.
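The entry and exit ordering described above behaves like a chain of nested calls. The following Python sketch models it under that assumption; the function `run_stack` and the event log are hypothetical illustrations, not part of IP Pascal, and the module names are taken from the example stack.

```python
def run_stack(names, log):
    """Run entry sections bottom-to-top, then exit sections top-to-bottom.

    `names` lists the modules from the bottom of the stack upward.
    """
    if not names:
        return                      # past the top of the stack: start returning
    current, above = names[0], names[1:]
    log.append(f"enter {current}")  # entry section, performed on the way up
    run_stack(above, log)           # "call" the next module up
    log.append(f"exit {current}")   # exit section, performed on the way back down


if __name__ == "__main__":
    events = []
    run_stack(["wrapper", "serlib", "program", "cap"], events)
    print("\n".join(events))
```

In this model a low-level module such as serlib is entered before the program and exited after it, which is the dependency order the text describes.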
IP Pascal has a series of modules (or "libraries") that form a "porting platform". These libraries present an idealized API for each applicable function, such as files and extended operating system functions, graphics, MIDI and sound. The whole collection forms the basis for an implementation on each operating system and machine that IP Pascal runs on.
The two important differences between IP Pascal and many other languages that have simply been mated with portable graphics libraries are that:
1. IP Pascal uses its own porting platform for its own low level code, so that once the platform is created for a particular operating system and machine, both the IP system and the programs it compiles can run on that. This is similar to the way Java and the UCSD Pascal systems work, but with true compiled code, not interpreted code.
2. Since modules can override lower level functions such as Pascal's "write" statement, normal, unmodified ISO 7185 Pascal programs can also use advanced aspects of the porting platform. This is unlike most portable graphics libraries, which force the user to adopt a completely different I/O methodology to access a windowed graphics system, as is the case with C, other Pascals, and Visual Basic.
IP modules can also be created that are system independent and rely only on the porting platform modules. The result is that IP Pascal is highly portable.
History
The Z80 Implementation
The compiler started out in 1980 on the Micropolis Disk Operating System, but was moved rapidly to CP/M running on the Z80. The original system was coded in Z80 assembly language and output direct machine code for the Z80. It was a single pass compiler without a linker. It carried its system support library within the compiler itself, and relocated and copied that library into the generated code in the runnable disk file.
After the compiler was operational, almost exactly at the new year of 1980, a companion assembler for the compiler was written, in Pascal, followed by a linker, in Z80 assembly language. This was then used to move the compiler and linker Z80 source code off the Micropolis assembler (which was a linkerless assembler that created a single output binary) to the new assembler linker system.
After this, the compiler was retooled to output to the linker format, and the support library moved into a separate file and linked in.
In 1981, the compiler was extensively reworked to add optimizations such as register allocation, boolean to jump, dead code elimination and constant folding. This created a Pascal implementation that benchmarked better than any existing Z80 compilers, as well as most 8086 compilers. Unfortunately, at 46 KB it was also difficult to use, being able to compile only a few pages of source code before overflowing its tables.
Despite this, the original IP Pascal implementation ran until 1987 as a general purpose compiler. In this phase, IP Pascal was C-like in its modular layout. Each source file was a unit, and consisted of some combination of a 'program' module, types, constants, variables, procedures or functions. These were in "free format": procedures, functions, types, constants and variables could be outside of any block, and in any order. Procedures, functions, and variables in other files were referenced by 'external' declarations, and procedures, functions, and variables in the current file were declared 'global'. Each file was compiled to an object file, and the object files were then linked together. There was no type checking across object files.
As part of the original compiler, a device independent terminal I/O module was created to allow use of any serial terminal (similar to Turbo Pascal's CRT unit), which remains to this day.
In 1985, an effort was begun to rewrite the compiler in Pascal. The new compiler would be two pass with intermediate, which was designed to solve the memory problems associated with the first compiler. The front end of the compiler was created and tested without intermediate code generation capabilities (parse only).
In 1987, the Z80 system used for IP was exchanged for an 80386 IBM-PC, and work on it stopped. From that time on, several other ISO 7185 standard compilers were used.
The 80386 Implementation
By 1993, ISO 7185 compatible compilers that delivered high quality 32 bit code were dying off. The choice was either to stop using Pascal, or to revive the former IP Pascal project and modernize it as an 80386 compiler. At that point, a Pascal parser and an assembler (for the Z80) were all that existed that was usable on the IBM-PC. From 1993 to 1994, the assembler was made modular to target multiple CPUs including the 80386, a linker was created to replace the Z80 assembly language linker, and the Pascal compiler front end was finished so that it output intermediate code. Finally, an intermediate code simulator was constructed, in Pascal, to prove the system out.
In 1994, the simulator was used to extend the ISO 7185 IP Pascal "core" language to include features such as dynamic arrays.
In 1995, a "check encoder" was created to target 80386 machine code, and a converter program created to take the output object files and create a "Portable Executable" file for Windows. The system support library was created for IP Pascal, itself in IP Pascal. This was an unusual step taken to prevent having to later recode the library from assembly or another Pascal to IP Pascal, but with the problem that both the 80386 code generator and the library would have to be debugged together.
At the beginning of 1996, the original target of Windows NT was switched to Windows 95, and IP Pascal became fully operational as an 80386 compiler under Windows. The system bootstrapped itself, and the remaining Pascal code was ported from SVS Pascal to IP Pascal to complete the bootstrap.
The Linux Implementation
In 2000, a Linux (Red Hat) version was created for text mode only. The plan is to create a version of the text library that uses termcap info, and the graphical library under X Windows.
Steps to "write once, run anywhere"
In 1997, a version of the terminal library from the original 1980 IP Pascal was ported to Windows, and a final encoder was started for the 80386. However, the main reason for needing an improved encoder, execution speed, was largely made irrelevant by increases in processor speed in the IBM-PC. As a result, the new encoder was not finished until 2003.
In 2001, a companion program to IP Pascal was created to translate C header files to Pascal header files. This was meant to replace the manual method of creating operating system interfaces for IP Pascal.
In 2003, a fully graphical, operating system independent module was created for IP Pascal.
Lessons
In retrospect, the biggest error in the Z80 version was its single pass structure. There was no real reason for it: the author's preceding (Basic) compiler was multiple pass with intermediate code. The only argument for it was that single pass compilation was supposed to be faster. However, single pass compiling turns out to be a bad match for small machines, and is not likely to help the advanced optimizations common on large machines.
Further, the single pass aspect slowed or prevented getting the compiler bootstrapped out of Z80 assembly language and onto Pascal itself. Since the compiler was monolithic, the conversion to Pascal could not be done one section at a time, but had to proceed as a wholesale replacement. When the replacement was started, the project lasted longer than the machine did. The biggest help that two pass compiling gave the 80386 implementation was the maintenance of a standard book of intermediate instructions which communicated between the front and back ends of the compiler. This well understood "stage" of compilation reduced overall complexity. Intuitively, when two programs of equal size are mated intimately, the complexity is not additive but multiplicative, because the connections between the program halves multiply out of control.
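The role of an intermediate instruction set as the only contract between the two halves can be illustrated with a toy example. This is a Python sketch under stated assumptions, not the IP Pascal intermediate code: the instruction names (`push`, `add`, `sub`, `mul`), the tuple-based expression format and both functions are hypothetical. The back end here is a small simulator, loosely in the spirit of the intermediate code simulator mentioned in the history above.

```python
def front_end(expr):
    """Front end: translate a nested tuple expression into intermediate instructions.

    An expression is either an int or a tuple (op, left, right) with op in '+', '-', '*'.
    """
    if isinstance(expr, int):
        return [("push", expr)]
    op, left, right = expr
    names = {"+": "add", "-": "sub", "*": "mul"}
    return front_end(left) + front_end(right) + [(names[op], None)]


def back_end(code):
    """Back end: a simple simulator that executes the intermediate code on a stack."""
    stack = []
    for opcode, operand in code:
        if opcode == "push":
            stack.append(operand)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "add":
            stack.append(left + right)
        elif opcode == "sub":
            stack.append(left - right)
        elif opcode == "mul":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


if __name__ == "__main__":
    program = front_end(("+", 2, ("*", 3, 4)))
    print(program)
    print(back_end(program))
```

Neither function refers to the other's internals; each depends only on the instruction list. Replacing the simulator with a machine code generator would therefore leave the front end unchanged.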
Another lesson from the Z80 days, which was corrected in the 80386 compiler, was to write as much of the code as possible in Pascal itself, even the support library. Having the 80386 support code all written in Pascal has made it so modular and portable that most of it was moved out of the operating system specific area and into the "common code" library section, a section reserved for code that never changes between machines and operating systems. Even the "system specific" code needs only slight modification from implementation to implementation. The result is a great amount of implementation work saved while porting the system.
Further reading
- Kathleen Jensen and Niklaus Wirth: PASCAL - User Manual and Report. Springer-Verlag, 1974, 1985, 1991, ISBN 0-387-97649-3, ISBN 3-540-97649-3, ISBN 0-387-90144-2, and ISBN 3-540-90144-2
- Niklaus Wirth: The Programming Language Pascal. Acta Informatica, 1, (Jun 1971) 35-63
- ISO/IEC 7185: Programming Languages - PASCAL.