[AGS4] Syntax overhaul

Started by Monsieur OUXX, Mon 06/09/2021 17:39:39


Monsieur OUXX

Fact: AGS script is VERY heavily inspired by C. It's the archaic side of AGS.

Don't get me wrong. C is a fine language, and it works.

But nowadays, who declares a class as "struct"? Who uses so-called "pointers" in a high-level scripting language?
The only reason AGS still offers both managed and unmanaged structs is that, until the latest versions, dynamic structures were not fully implemented. And this is on the verge of being fixed.

Here is my suggestion :
Quote
1. Accept breaking (a tiny bit of) script backward compatibility when upgrading from AGS 3 to AGS 4.
2. Replace keyword "struct" with "class"
3. Make *all* structs managed.
4. Get rid of the "asterisk" ( * ) when dealing with managed objects.
This would effectively make AGS move to the age of References rather than pointers.

So INSTEAD of this :
Code: ags

managed struct A {
  int i;
};

function Foo(A* a) 
{
   a.i = 777;
}

void game_start()
{
   A* a = new A;
   Foo(a);
}


...the new syntax would be this. And would look like most mainstream languages.
Code: ags

class A {
  int i;
};

function Foo(A a) 
{
   a.i = 777;
}

void game_start()
{
   A a = new A;
   Foo(a);
}



The beauty of this is that almost everything is already there. It's only a matter of substituting some syntactic elements with others.
 

Crimson Wizard

#1
The necessity to write "*" has already been removed in the ags4 branch's new script compiler (written by fernewelten).

Frankly I've been worrying that this may lead to confusion... but time will show.

There have been quite a few changes and additions to the new compiler, but we don't have a proper list of these posted anywhere yet; it's all spread around pull requests.


Quote from: Monsieur OUXX on Mon 06/09/2021 17:39:39
The only reason AGS still offers both managed and unmanaged structs is that, until the latest versions, dynamic structures were not fully implemented.

I never thought it that way...

Quote from: Monsieur OUXX on Mon 06/09/2021 17:39:39
3. Make *all* structs managed.

This is not only a matter of syntax.

The AGS interpreter was written to emulate an x86 processor and is optimized to work with raw memory, so the engine works with non-managed structs faster than with managed ones.
Also, AGS script is generally slower than most modern scripting languages out there, and the number of changes made since 3.2.1 for compatibility reasons made it seriously slower than before, especially when it comes to dynamic arrays and managed structs.
(There has even been a proposal to introduce real unmanaged pointers, so that you could pass unmanaged structs or regular variables by address and write faster low-level code.)

If non-managed structs are abandoned in favor of all-managed objects, that would be a different language based on different concepts. It will likely be necessary to redesign the compiled bytecode and/or the interpreter to optimize it for working with managed memory instead.

Side note, even C# has managed objects (declared as class) and structs which work in a memory-optimized way.
https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/choosing-between-class-and-struct

Monsieur OUXX

#2
I understand everything you wrote. For argument's sake, I'd simply like to say "yeah, ok, structs still exist out there but no one uses them in modern languages except for very specific, optimization-related reasons". And in any case their syntax is not different from classes.
The removal of the star is good enough for me.

That being said, the reasons you mentioned related to the byte code are solid. Always that same glass ceiling. Damn.
 

fernewelten

If there's clamour for it, I can introduce a 'class' keyword to the compiler. OTOH, we can achieve a similar effect as far as this is concerned by simply saying #define class struct.

But that's only half the fun with classes.

What you'd expect from classes is constructors and destructors and stuff. Also, unless you have grown up with 'C' style languages, True Inheritance for functions and attributes: For instance let Car inherit from Vehicle: When a Vehicle pointer v actually points to a car at runtime then the expression 'v.Accelerate()' should by default call 'Car.Accelerate()', not 'Vehicle.Accelerate()'.

We can't get that sort of thing from a simple renaming.
On the contrary, renaming 'struct' to 'class' might raise expectations that we can't yet fulfill.
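To make the dispatch difference concrete, here is a minimal sketch in Python rather than AGS script, since the point is precisely what AGS script does not offer yet. The `Vehicle`/`Car` names follow the example above; the helpers `dynamic_call` and `static_call` are hypothetical names for illustration only. `static_call` mimics a call resolved from the declared type alone, which is what a mere keyword rename would leave in place.

```python
class Vehicle:
    def accelerate(self) -> str:
        return "Vehicle.accelerate"


class Car(Vehicle):
    def accelerate(self) -> str:
        return "Car.accelerate"


def dynamic_call(v: Vehicle) -> str:
    # "True Inheritance": the method is looked up on the runtime type of v.
    return v.accelerate()


def static_call(v: Vehicle) -> str:
    # The method is taken from the declared type (Vehicle), no matter
    # what v actually is at runtime.
    return Vehicle.accelerate(v)


v: Vehicle = Car()
print(dynamic_call(v))
print(static_call(v))
```

With a `Car` behind a `Vehicle` reference, the first call reaches the `Car` version and the second one does not; that gap is what people would expect a 'class' keyword to close.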

fernewelten

That we emulate the technological limits of an x86 processor is a bit of make-believe.

For instance, we compile for exactly 8 registers. In the x86 processor, there might indeed be exactly 8 memory addresses that are particularly fast to access. But the virtual machine that we provide isn't fundamentally limited in that way. We only pretend that. Our registers are just a constant-size array. We could equally provide 64 or 2048 "register" addresses without losing any efficiency whatsoever.

In the same vein, we pretend that we can only do conditional jumps that are dependent on the AX register but not on the BX register. But our virtual machine doesn't have that limit really. It's about just as efficient to do (if REGISTERS[0] == 0) as it is to do (if REGISTERS[1] == 0).
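A rough sketch of that point, in Python and not taken from the AGS engine: the "registers" are just a list whose size is a parameter, and the conditional jump takes the register index as an operand. The opcode names `LOAD`, `JZ` and `HALT` and the function `run` are hypothetical, invented for this illustration; they are not AGS bytecode.

```python
NUM_REGISTERS = 8  # could just as well be 64 or 2048


def run(program, num_registers=NUM_REGISTERS):
    """Interpret a toy program and return the final register contents."""
    registers = [0] * num_registers  # "registers" are only an array
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":       # LOAD reg, literal
            registers[args[0]] = args[1]
        elif op == "JZ":       # JZ reg, target: jump if registers[reg] == 0
            if registers[args[0]] == 0:
                pc = args[1]
                continue
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return registers


# The jump tests register 1, not register 0; the interpreter does not care.
PROGRAM = [
    ("LOAD", 1, 0),
    ("JZ", 1, 3),       # taken, so the next instruction is skipped
    ("LOAD", 0, 111),
    ("LOAD", 2, 222),
    ("HALT",),
]
```

Calling `run(PROGRAM)` and `run(PROGRAM, 2048)` executes the same steps; only the length of the list changes.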

If the x86 instruction set is indeed a bottleneck, we could gradually extend it and gradually make the compiler prefer the added instructions to the inefficient ones. But perhaps it may be the other way round: The original x86 instructions might be emulated efficiently, and the instructions that have been added to this set might be the inefficient culprits. Or perhaps neither is very inefficient, and we're barking up the wrong tree entirely.


Crimson Wizard

#5
Quote from: fernewelten on Fri 24/09/2021 03:08:23
If the x86 instruction set is indeed a bottleneck, we could gradually extend it and gradually make the compiler prefer the added instructions to the inefficient ones. But perhaps it may be the other way round: The original x86 instructions might be emulated efficiently, and the instructions that have been added to this set might be the inefficient culprits. Or perhaps neither is very inefficient, and we're barking up the wrong tree entirely.

I don't know if these are a "bottleneck"; in any case, they are definitely not the only bottleneck: as I have mentioned multiple times here, other things were added on top of the VM after 2012 that made it slower.

But it's not as if this inefficiency is a random baseless claim. It has been noted before by several people, including me when I was studying how the AGS VM works. Even though I am not an expert in asm or x86 architecture, I noticed that it has unnecessary repetition of operations which move data around instead of performing an actual task. This is described here: https://www.adventuregamestudio.co.uk/forums/index.php?topic=47320.0

Such operations could make sense for hardware built in a certain way, but I doubt this is the best way to approach a scripting language in software.

rofl0r once posted test results on GitHub, where he compared the bare AGS script interpreter (extracted from the engine and free from additional burden) with other scripting languages running the same algorithms. I might find his post later, but it was clear that AGS script is really slower than the others. Also, I recall he was doing some experiments in "squashing" bytecode instructions without adding any new ones, and achieved perhaps not major but noticeable speed improvements in script-heavy games.

fernewelten

#6
By the bye, concerning the second example in the post you reference: Allocating an integer variable.
Code: ags
LITTOREG ax, arg2  <--- copy the literal number (initial value) to Ax register
REGTOREG sp, mar   <--- copy stack ptr address to memory address register (MAR)
MEMWRITE ax, mar   <--- write value from Ax register to where MAR points to (here - stack ptr)
ADD      sp, arg2  <--- advance stack ptr by N bytes (4, if that was integer), thus completing new local variable


The fastest way to do this with the current instruction set is to simply PUSH the initial value onto the stack (and keep track offline that the memory block on stack has increased by 4 bytes, which you need to do in any case). I think that's what the new compiler does right now.
Code: ags
LITTOREG ax, arg2 
PUSHREG ax

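As a sanity check that the two sequences really do the same job, here is a toy model in Python. It is a sketch under assumptions, not the engine's code: the opcode semantics are taken only from the comments in the listings above, `PUSHREG` is assumed to write at the stack pointer and then advance it by 4 bytes, and the literal `777`, the starting stack pointer and the function `run` are made up for the illustration.

```python
def run(program, sp=0):
    """Execute a toy instruction list; return (final sp, memory contents)."""
    regs = {"ax": 0, "mar": 0, "sp": sp}
    memory = {}  # address -> value
    for op, *args in program:
        if op == "LITTOREG":      # LITTOREG reg, literal
            reg, literal = args
            regs[reg] = literal
        elif op == "REGTOREG":    # REGTOREG src, dst
            src, dst = args
            regs[dst] = regs[src]
        elif op == "MEMWRITE":    # MEMWRITE reg, addr_reg
            reg, addr_reg = args
            memory[regs[addr_reg]] = regs[reg]
        elif op == "ADD":         # ADD reg, literal
            reg, literal = args
            regs[reg] += literal
        elif op == "PUSHREG":     # assumed: write at sp, then sp += 4
            memory[regs["sp"]] = regs[args[0]]
            regs["sp"] += 4
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs["sp"], memory


FOUR_STEP = [
    ("LITTOREG", "ax", 777),
    ("REGTOREG", "sp", "mar"),
    ("MEMWRITE", "ax", "mar"),
    ("ADD", "sp", 4),
]
PUSH_ONLY = [
    ("LITTOREG", "ax", 777),
    ("PUSHREG", "ax"),
]

print(run(FOUR_STEP, sp=100) == run(PUSH_ONLY, sp=100))
```

In this model both sequences leave the same value at the same address and the same stack pointer; the only difference is that the longer one also clobbers MAR.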

fernewelten

#7
Part of the problem isn't in the register architecture itself but in the compiling concept, which is anything but optimal.

Crimson's first example is comparing two values:
Code: ags
PUSHREG ax         <--- copy first value from Ax register to stack
LITTOREG ax, arg2  <--- copy second (literal) value to Ax register
POPREG bx          <--- copy first value from stack to Bx register
LESSTHAN bx, ax    <--- compare first value to second and store boolean result in Bx register
REGTOREG bx, ax



  • Lines 1 and 3 are clumsy because they are equivalent to moving AX to BX, for which an instruction already is in the instruction set.
  • Line 5  would be superfluous if the compiler could simply keep track of the fact that the expression result happens to be in BX this time instead of enforcing that it must always end up in AX no matter what.
  • And the whole construct could be simplified if we simply kept the first value in AX, where it is fine already, and loaded BX with the second value.
Code: ags
LITTOREG bx, arg2
LESSTHAN ax, bx


The instructions are all in the instruction set already. You can't blame that on the instruction set, that's all on the compiling concept.
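The equivalence of the long and the short comparison sequence can be checked with a small Python model. Again, this is only a sketch: the opcode semantics are inferred from the comments in the listing above (`LESSTHAN a, b` stores the boolean result in `a`), and the helper names `run`, `original` and `simplified` are hypothetical.

```python
def run(program, first_value):
    """Run a toy program with the first operand preloaded in AX; return AX."""
    regs = {"ax": first_value, "bx": 0}
    stack = []
    for op, *args in program:
        if op == "PUSHREG":       # PUSHREG reg
            stack.append(regs[args[0]])
        elif op == "POPREG":      # POPREG reg
            regs[args[0]] = stack.pop()
        elif op == "LITTOREG":    # LITTOREG reg, literal
            reg, literal = args
            regs[reg] = literal
        elif op == "REGTOREG":    # REGTOREG src, dst
            src, dst = args
            regs[dst] = regs[src]
        elif op == "LESSTHAN":    # LESSTHAN a, b: a = (a < b)
            a, b = args
            regs[a] = int(regs[a] < regs[b])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs["ax"]


def original(second):
    # The five-instruction sequence from the first listing.
    return [
        ("PUSHREG", "ax"),
        ("LITTOREG", "ax", second),
        ("POPREG", "bx"),
        ("LESSTHAN", "bx", "ax"),
        ("REGTOREG", "bx", "ax"),
    ]


def simplified(second):
    # The two-instruction sequence: AX keeps the first value throughout.
    return [
        ("LITTOREG", "bx", second),
        ("LESSTHAN", "ax", "bx"),
    ]


for first, second in [(3, 5), (5, 3), (4, 4)]:
    print(run(original(second), first) == run(simplified(second), first))
```

Both variants leave the comparison result in AX in this model, so the shorter one would not even require the compiler to track results in BX.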

The compiler makes no attempt to keep track of what registers are currently loaded with values we will need and what registers may safely be clobbered.
The compiler insists that all expression results must always end up in AX,

  • even though BX to DX are available, too, and especially CX and DX are idling most of the time
  • even though a good portion of "expression results" are in the memory pointed to by MAR, with MAR already loaded correctly, which is good enough for keeping track of them, and
  • even though a portion of "expression results" are literals or constants known at compile-time and so really needn't be stored anywhere before they are used.

Crimson Wizard

#8
@fernewelten,

I think all the above belongs to the "slowness" topic, or the bytecode optimization topic (the one I linked above). This topic was about syntax change proposals, and it feels like the conversation starts going offtopic.

In my opinion your earlier comment about AGS not supporting constructors, destructors and virtual inheritance is the main reason to not rename AGS structs into "classes".

I might have made a mistake talking about bytecode optimization here again. I cannot formulate this properly, but my concern is simply that the language focused on the work with the managed memory should be structured differently. Maybe my assumption is incorrect.
Too many things went wrong while making AGS 64-bit compatible: a significant overhead was added to achieve that, and every time there's a discussion about managed objects, I keep returning to this problem in my mind. Perhaps if it were resolved one day, the situation would become easier.

After a thought, my personal argument against removing non-managed structs is that regular structs are simply easier to use for newcomers. "Null pointer" errors may be confusing to people who do not know programming very well, and may make their learning of AGS scripting slower and more frustrating. Removing common structs would achieve probably nothing, but will make programs more difficult to write for non-tech people, and generally slower due to how AGS works now.

edmundito

Is this a general thread about improving the syntax in AGS 4 or specifically about changing struct to class? I got a huge list of things I'd like to see improved.

Should I start a new thread or reply here?

fernewelten

For a general discussion, have a go at your pet peeves.

But in order to act on specific suggestions, we'd need them in the AGS bug tracker so that they don't get lost.

Also have a look at
to see what we've already got with the new compiler so far.

I'll see what I can do for you :)
