Proposal for an ALGOL language for a new millennium

Theses

State of the art has changed since Algol68:
- stabilization
  -- ASCII character set embedded in a much larger international character set
  -- Unix semantics for the operating system
  -- IEEE 754 floating point
- very successful language C / C++ / Java
- functional (and lazy) language Haskell

Why is C+ successful?
(let me call C+ a language enhanced with features from GNU C and C++)
- preprocessor
- support of common procedural paradigm
- versatile idioms
- low level access
- easy editing

Why not go functional?
- functional paradigm not well suited to actual computer systems
  -- computer memory is read/write optimized, not write once
  -- very dependent on run time system (GC)
- some things easily expressible procedurally are very intricate in functional style
  -- 'tying the knot'

Where is an ecological niche for Algol?

A closer look at GNU C shows that this enhanced language has indeed most of the expressive power of Algol68. Most constructs have been obtained by orthogonal extension of the C heritage.
(From gcc.info:
* Statement Exprs:: Putting statements and declarations inside expressions.
* Nested Functions:: As in Algol and Pascal, lexical scoping of functions.
* Constructing Calls:: Dispatching a call to another function.
* Lvalues:: Using `?:', `,' and casts in lvalues.
* Variable Length:: Arrays whose length is computed at run time.
* Subscripting:: Any array can be subscripted, even if not an lvalue.
* Initializers:: Non-constant initializers.
* Constructors:: Constructor expressions give structures, unions or arrays as values.)

Missing (or annoying) in C+ are:
- Top-down declarations (C bottom-up declarations are sometimes nearly incomprehensible because prefix, postfix and infix operators are intermixed in a non-intuitive manner).
- Full 'row' semantics
  -- conflicts caused by pointer equivalence
  -- only one-dimensional semantics
  -- thus no slicing or trimming of arrays
- bad practices in operator use (* as deref)
- unnecessary conflicts caused by postfix operators
- use of equals sign for assignation
- need to put a semicolon after last statement in grouping
- dangling else

I think it should not be too hard an exercise to change the GNU C parser so that it compiles a considerable A68 subset.

Things annoying in A68
- selection by OF vs. dot symbol or ->
- collateral semantics in the case of AND and OR
- need to cast NIL for almost any real application
- use of 4 keywords for different aspects of absence (SKIP, EMPTY, NIL, VOID)
- overlapping use of the equals symbol and isdefinedas symbol (complicates parsing, I prefer C == in this case)
- there is no brief do / od symbol
- the loopclause is the single statement in A68 (syntactically strong void)
  -- though sometimes the last value of the loop count is needed outside
- formats
- need to double parentheses for print
- no empty statement (why?)
- no nested comments
- environment enquiries (max int etc.) don't
  -- min int /= - max int
- LONG and SHORT don't
  -- it is nonsense that INT is widened to REAL and LONG INT to LONG REAL; for 32bit, int had better be widened to double IEEE type
- there should be modes which make the type system a lattice
  -- to define the straightening and dereferencing as operators
- no short exit from a function (e.g. when searching a linear list)
- there is a gap between REF and PROC, shown by this example:
    MODE FLISTAMODE = REF STRUCT( FLISTAMODE tail, AMODE head );
    MODE LLISTAMODE = PROC STRUCT( LLISTAMODE tail, AMODE head );
  Using the 1st only finite lists can be constructed, using the 2nd only infinite ones, as there is no analogue to NIL for PROCs. As is well known, there is no way to construct interesting functions in A68, anyway. But the modes are legal.
- unnecessary redundancy in identity definition:
  -- REF REAL x = LOC REAL; better: x = LOC REAL
  -- INT ten = 10; better: ten = 10
  -- LONG INT lten = 10; better: lten = LONG INT(10)
- PROC, OP, MODE are likewise redundant. The only practical use, making parsing easier, has been killed by the merging in RR411a, 2nd alt.

Things that probably have to be changed from Algol68
(mostly known from implementation difficulties)
- remove transput (not congruent with OS, anyway)
- make 1-pass parseable (implies need for forward declarations)
- remove 2nd alternative of RR411a (comma between different COMMON declarations) (why? it breaks (LA)LR1 parsing for love)
- remove EXIT (needs GOTO anyway)
- some form of return statement (does what EXIT does not)
- insertion of white space into identifiers (wrong sort of polymorphism)
- add some form of nested comments
- proceduring has to come back (not as coercion, of course). In parameter positions requiring PROC AMODE, the semantics should behave as if 'AMODE:' had been prepended, turning a syntactical unit into a routinetext (call by name; and solution of the ANDIF ORELSE syndrome)

Nice things to be imported
- 2-dimensional programming (Haskell style layout)
- function currying (as already proposed in AB39.3.1)
- postfixed selection
- more general typing and typeclassing
  I would like to introduce a generalization of the widening coercion. If a type (REAL, e.g.) is embedded into another (COMPL), definition of an operator OP(REAL)COMPL WIDEN (naming to be improved) may replace the widening coercion. Perhaps it is possible to restrict the creation of overloadable operators in a way that these coercions can be automatically applied to a significantly larger set of modes than in A68.
- infinite precision integer arithmetic as standard
- simple polymorphism to construct TREE AMODE from AMODE etc.
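
The proceduring item above (a parameter of mode PROC AMODE receiving an unevaluated unit) can be approximated in standard C with an explicit thunk. This is only a sketch under that reading, not part of the proposal: the names proc_bool, andif, head_is_positive and starts_positive are hypothetical, and the env pointer stands in for the lexical scope that a real routinetext would capture by itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mode PROC BOOL: a function plus the
   environment it needs, since C has no closures. */
typedef bool (*proc_bool)(const void *env);

/* ANDIF written as an ordinary function. The right operand arrives as a
   thunk and is evaluated only when the left operand is true. */
static bool andif(bool left, proc_bool right, const void *env) {
    return left ? right(env) : false;
}

struct node { int head; const struct node *tail; };

/* The deferred right operand. It dereferences the node, so it must not
   be evaluated when the list pointer is NULL. */
static bool head_is_positive(const void *env) {
    const struct node *n = env;
    return n->head > 0;
}

/* Safe on the empty list because the thunk is never called in that case. */
static bool starts_positive(const struct node *list) {
    return andif(list != NULL, head_is_positive, list);
}
```

With proceduring in parameter positions, the caller would simply write the unit itself; the explicit helper function and the env argument are the boilerplate that the proposed rule is meant to remove.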
More cleanup
- unification of function currying and row slicing
- unification of PROC and OP
- unification of PROC (without parameters) and REF
  -- so dereferencing and deproceduring coincide
- no SORT::strong;firm;meek;weak;soft anymore
- clean up usage of different styles of bracketing pairs
  -- () only for making associativity explicit
     --- (||) for the choice constructs {sh,c}ould remain
  -- [] for constructing composite objects
  -- {} for nested comments
- adjust the routinetext parentheses to match the lexical scope
    (REAL x)REAL: ( C phrases(x) C ) => (REAL x,REAL: C phrases(x) C )
- move modes REAL and COMPL from the kernel to standard environment
- remove PRAGMAT PR (or make a sensible definition of a set of pragmats)
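
The first cleanup item, unifying currying and row slicing, can be read against what C already offers. The sketch below is an illustration only, with hypothetical names (fill, row_of, column_of) and sizes: fixing the first subscript of a two-dimensional array behaves like partial application and yields a row, while fixing the second subscript has no built-in form and must be spelled out as a copy. Algol68 slicing covers both cases.

```c
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };   /* hypothetical sizes for the illustration */

/* Fill table[i][j] = 10*i + j so each element records its own indices. */
static void fill(int table[ROWS][COLS]) {
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            table[i][j] = (int)(10 * i + j);
}

/* Fixing the first index yields a row: the analogue of applying a
   two-argument function to its first argument only. */
static const int *row_of(int table[ROWS][COLS], size_t i) {
    return table[i];
}

/* Fixing the second index has no built-in form in C; the column has to
   be copied out element by element. */
static void column_of(int table[ROWS][COLS], size_t j, int out[ROWS]) {
    for (size_t i = 0; i < ROWS; i++)
        out[i] = table[i][j];
}
```

The asymmetry between row_of and column_of is the "only one-dimensional semantics" complaint from the list of things missing in C+; a unified currying and slicing rule would treat both index positions alike.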