Welcome to part two of the Shogun tutorial series! If you haven’t read it yet, go look at part one.
The Stack
Let’s get into some of the internals of Shogun, and find out how they affect us. First up: the stack!
Shogun is a stack-based virtual machine, as opposed to a register-based one. What does that mean? Well, almost all CPUs are register machines (including the one in your computer). Register machines have, in addition to memory space to store data, a few select areas where they can keep data for fast access, called registers. Most operations in register machines operate on these registers. Shogun, however, uses the stack approach.
Instead of a limited set of spaces to store data, Shogun has a ‘stack’ of space to store data in addition to memory. Internally, it is represented as a dynamic list (actually a deque as of this writing), but only two functions are exposed to scripts: pushing and popping.
First, think of the stack like a stack of papers on your desk (if you don’t have a desk, then tough luck). If I push onto the stack, I am placing a new piece of paper (data) on top of the stack. If I pop the stack, then I am taking a piece of paper off of the stack.
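If the paper analogy is easier to see in code, here is a minimal Python sketch of the same idea. It is only a model: the names `stack`, `push`, and `pop` are made up for illustration and are not Shogun's actual internals (though it does borrow the deque mentioned above).

```python
from collections import deque

# Hypothetical model of a stack; these names are illustrative, not Shogun's.
stack = deque()

def push(value):
    """Place a new sheet of paper (value) on top of the pile."""
    stack.append(value)

def pop():
    """Take the top sheet off the pile and hand it back."""
    return stack.pop()

push("first sheet")
push("second sheet")

top = pop()          # the most recently pushed item comes off first
print(top)           # second sheet
print(list(stack))   # ['first sheet']
```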
In fact, push and pop are the names of the commands you use! Let’s try them out. Throw this together:
main:
push 1
push 2
add
call print
What do you think that will output? Go ahead and run it through shoasm and then shogun. You should get an answer of 3. How did that work? Well, the 1 and 2 obviously were added together, but let’s talk about exactly how that add command works.
First, you push 1 and 2 onto the stack. The stack now looks like this:
2
1
Where the 2 is on top (and hence if we run pop, the 2 will be removed). After the two pushes run, the add operation runs. It pops twice to get two operands, adds them together, and pushes the sum back onto the stack. Finally, the call operation runs print, which then pops the result of add and prints it.
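To make that walkthrough concrete, here is a rough Python sketch of a toy stack machine running the same four steps. This is an assumption-laden model, not Shogun's real implementation: the `run` function and the tuple format are invented for the example, and `"print"` simply stands in for `call print`.

```python
def run(program):
    """Run a list of (opcode, argument) pairs on a toy stack machine."""
    stack = []
    output = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)        # put the value on top of the stack
        elif op == "add":
            a = stack.pop()          # first operand comes off the top
            b = stack.pop()          # second operand is the one underneath
            stack.append(a + b)      # the sum goes back onto the stack
        elif op == "print":          # stands in for `call print`
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown op: {op}")
    return output, stack

program = [("push", 1), ("push", 2), ("add", None), ("print", None)]
output, leftover = run(program)
print(output)    # [3]
print(leftover)  # []
```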
Now try running this:
main:
add 2 1
call print
You should get the same answer of 3! What happened to the push statements? Well, this is where we talk about some syntactical sugar that the assembler gives us.
Sugar of the Syntax
Writing push statements all the time is fairly annoying, so the assembler takes away the trouble by allowing you to specify arguments inline. You’ve already been doing that for the call operation, even if you haven’t realized it. In reality, the assembler actually converts these inline arguments and inserts the push operations for us!
Try running this code:
main:
dump
add 2 1
call print
You should get something like so:
-------
ShogunVM 0.1.6 (binary 9) dump
Constant Table:
[0:12345678] = "print"
Instruction List (pointer @ 0):
[0] = DUMP
[1] = PUSH 1
[2] = PUSH 2
[3] = ADD
[4] = LOADK 0
[5] = CALL
Stack:
Memory:
End of Dump
-------
3
Whoa! A dump has appeared in front of us! If you haven’t guessed already, dump will have Shogun print the current state of the virtual machine to the console. Let’s see what each part means.
Those first few lines give us Shogun’s version (and the binary version) and tell us what is in the constant table. What is this constant table, you ask? Well, it holds any value that isn’t a number. I’ll talk about why we need that later.
Take a look at the section in the middle, the one labeled “Instruction List”. That first line tells us that the pointer is at 0. That means that the instruction at 0 is about to be executed. Now, look at the list of instructions and compare it to your code. The first instruction is right, but after that it isn’t!
The assembler pulled a fast one on us. It converted add 2 1 into two push statements! Not only that, but it reversed the operands! Something must be wrong! No, that is correct. Remember, whatever is pushed last onto the stack is popped first, so you have to push arguments in reverse order (don’t believe me? replace add with sub and try it yourself).
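Here is a small Python sketch of what that rewriting step might look like, plus a tiny evaluator to show why the order matters. Everything in it is hypothetical: `desugar` and `evaluate` are names made up for this example, and the sketch assumes that sub computes the first written argument minus the second, which you should check against real Shogun output.

```python
def desugar(line):
    """Expand inline arguments into push instructions, last argument first."""
    op, *args = line.split()
    return [f"push {arg}" for arg in reversed(args)] + [op]

def evaluate(instructions):
    """Evaluate push/add/sub instructions on a toy stack."""
    stack = []
    for instruction in instructions:
        op, *args = instruction.split()
        if op == "push":
            stack.append(int(args[0]))
        elif op in ("add", "sub"):
            first = stack.pop()    # pushed last, so it was written first
            second = stack.pop()
            # Assumption: sub means first - second.
            stack.append(first + second if op == "add" else first - second)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack

expanded = desugar("sub 2 1")
print(expanded)                                # ['push 1', 'push 2', 'sub']
print(evaluate(expanded))                      # [1]
print(evaluate(["push 2", "push 1", "sub"]))   # [-1]
```

With add you would never notice the difference, since 2 + 1 and 1 + 2 are the same. With sub, pushing in the written order gives the wrong sign under this sketch's assumption.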
List v. Stack
Now, someone who looks carefully may notice that push breaks the rule of taking arguments from the stack: its operand is written right there in the instruction! Well, yeah. There has to be some way to get items onto the stack in the first place, but that isn’t the only reason. There are actually two types of instructions: list-based and stack-based.
Stack-based instructions are what we’ve been talking about; they pop arguments from the stack. List-based instructions instead take their arguments from an argument list that is written with the instruction in the binary. Arguments for list-based instructions can’t be taken from the stack, and as such must be specified at compile time. Side-note: this also prevents variables from being used in list-based instructions, but we’ll get to that in a later article.
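One way to picture the difference is the following Python sketch. It is a guess at the general shape, not Shogun's actual binary format: the `Instruction` class, its `args` field, and the `step` function are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Instruction:
    opcode: str
    # Hypothetical argument list, fixed when the program is assembled.
    # Stack-based instructions leave it empty.
    args: list = field(default_factory=list)

def step(instruction, stack):
    """Execute one instruction against the given stack."""
    if instruction.opcode == "PUSH":
        # List-based: the operand is read from the instruction itself.
        stack.append(instruction.args[0])
    elif instruction.opcode == "ADD":
        # Stack-based: both operands are popped at run time.
        a = stack.pop()
        b = stack.pop()
        stack.append(a + b)
    else:
        raise ValueError(f"unknown opcode: {instruction.opcode}")

program = [Instruction("PUSH", [1]), Instruction("PUSH", [2]), Instruction("ADD")]
stack = []
for instruction in program:
    step(instruction, stack)
print(stack)  # [3]
```

Notice that the 1 and 2 are baked into the PUSH instructions before anything runs, while ADD only finds out its operands once it reaches into the stack.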
Conclusion
That’s it for part two. If you want an exercise to work on yourself, try making a simple adding calculator that asks for each operand, then adds them. Hint: To read from the console, use call readline. It will push whatever is typed on a line onto the stack.
Click here for part 3.