A Scheme Interpreter for ARM Microcontrollers: Implementation (050)

Overview:

Armpit Scheme is written in ARM assembly using the unified syntax (ARM and THUMB-2) for the ARM7TDMI, ARM920T, ARM966E, ARM926EJ, Cortex-M3 and Cortex-A8 cores (ARMv4T, ARMv5TEJ and ARMv7M architectures). The ARM part of the relevant assembly language is summarized in Quick Reference Cards here and here, and the ARM7TDMI core's technical reference (eg. operating modes, stack usage, ...) is available here.

The Armpit Scheme source code is organized into 1 main configuration file, 13 common files and up to 6 mcu-specific files:

    Main Configuration File:    armpit_050.s

    Common Files:               armpit_as_constants.s
                                armpit_as_macros.s
                                armpit_reset_ARM.s
                                armpit_reset_CM3.s
                                armpit_init.s
                                armpit_core.s
                                armpit_port.s
                                armpit_scheme_base.s
                                armpit_scheme_base_6.2.Integers.s
                                armpit_scheme_base_6.2.Numbers.s
                                armpit_scheme_base_library.s
                                armpit_scheme_base_r6rs_library.s
                                armpit_scheme_read_write.s

    MCU-Specific Files:         [BOARD].h
                                [FAMILY].h
                                [FAMILY]_startup.s
                                [FAMILY]_init_io.s
                                [FAMILY]_usb.s
                                [FAMILY]_system_0.s

The main configuration file is used to specify the board for which Armpit Scheme is assembled, its ID (for I2C multiprocessing, where available), and the major options to use: 1) type of garbage collection, 2) obarray storage location, 3) r6rs extensions (eg. fx+, fx-), 4) inclusion of system 0, 5) inclusion of I2C subsystem, 6) inlining of memory allocation, 7) top-level environment type, 8) exclusion of non-core functions, 9) exclusion of non-integer math, 10) small vs normal eval-apply, 11) exclusion of pack function, and 12) exclusion of r5rs macros (r3rs option). The configuration file then uses conditional statements and .include directives to combine the needed components of Armpit Scheme for assembly. It further stitches together the built-in environment for scheme (label scmenv:) as a scheme vector that combines sub-environments defined in the common files (see below), defines the table of common pre-entry functions (label paptbl:, see below) and also contains code for turning board LEDs on and off (used to indicate status, including errors).

The common files represent the bulk of the implementation that is mostly independent of MCU characteristics. armpit_as_constants.s defines the tags used to encode data types, basic constants (#t, #f, #null, scheme_inf, backspace_char, ...), data type indices, function entry indices, register renaming, location and/or sizes and/or indices of system stack, buffer(s) and heap(s), addresses of main program sections for assembly, and processor run modes. armpit_as_macros.s defines macros used throughout the source code to make it easier to read and, at times, more schemey (eg. the set macro is aliased to the ARM mov instruction). armpit_reset_ARM.s and armpit_reset_CM3.s contain the code that is typically stored at address 0x00 in the ARM code space for ARM and Cortex-M3 MCUs, respectively (either one, or the other, is included in the assembly, based on MCU type). The corresponding code executes on system reset; it sets up system stacks and defines the location of interrupt service routines. armpit_init.s finalizes hardware initialization and initializes the scheme system. It calls the mcu-specific hwinit: code in [FAMILY]_init_io.s to set up needed system clocks, power-up peripherals, configure pins and interrupts (eg. uart). It then sets up the scheme global vector, buffers, heap, environment, interrupts and starts the Read-Eval-Print (rep) loop which is a string at label prgstr: in armpit_init.s.

armpit_core.s contains the inner functionality of the scheme system: string-symbol comparison and copying (labels: stsyeq: and subcpy:), interrupt service routine (label genisr:), garbage collection (labels gc: and gc_bgn:), memory allocation (labels zmaloc:, cons:, save:, ...), common function exits (labels trufxt: ...), type-checking (label typchk:), eval-apply functionality (labels eval:, apply:), environment frame extension and variable lookup (labels mkfrm:, bndchk:, vrnsrt:), error handling (labels catch:, throw:, error4:), version and global vector access (labels versn_:, _GLV:), object address, packing and unpacking (labels padrof:, punpak:, ppack:), and, user library management and file system cleaning (labels plibra:, pexpor:, pimpor:, pfcln:). The top of the file defines the sub-environment that it exports to the scheme user space. The exported scheme objects and functions (or forms) provided in that sub-environment are: _winders, _prg, (throw ...), (gc), version, (_GLV), (address-of ...), (packed-data-set! ...), (unpack ...), (pack ...), (library ...), (export ...) and (import ...). Hooks to machine-code functions are also exported to support in-system assembly and compilation of scheme programs from user space: _catch, _lkp, _mkc, _apl, _dfv, _alo, _cns, _sav, _isx, _ism, _gc, _err.

armpit_ports.s contains intermediate-level input/output port functionality that sits between the scheme level and the hardware level. Up to 6 port types are defined (based on main configuration and board options): file, memory, uart, USB, SD-card and I2C. The code contains functions for opening and closing specific ports and for reading from and writing to them. It also contains interrupt-service routines (ISRs), branched to from genisr: (in armpit_core.s), for those ports (uart, i2c, usb) whose operation is interrupt-driven (labels: puaisr:, pi2isr:, usbisr:). Port models (explained in a later section) are exported to scheme user space via the ports sub-environment, at the top of the file, with corresponding symbols: FILE, MEM, UAR0, UAR1, USB, SDFT, I2C0 and I2C1.

The remaining common files: armpit_scheme_base.s, armpit_scheme_base_6.2.Integers.s, armpit_scheme_base_6.2.Numbers.s, armpit_scheme_base_library.s, armpit_scheme_base_r6rs_library.s, and armpit_scheme_read_write.s implement Scheme functions and macros, following R5RS with some R6RS extensions and some extensions specific to Armpit Scheme. Each file starts with the sub-environment vector that specifies the symbol-value bindings contributed by the file. This is optionally followed by constants, such as the variable IDs of scheme objects, and the symbols and code or values of implemented objects. armpit_scheme_base.s contains most of the Scheme functions specified as non-library in R5RS, except for the read/write subsystem implemented in armpit_scheme_read_write.s and numeric functions that are implemented in armpit_scheme_base_6.2.Integers.s and armpit_scheme_base_6.2.Numbers.s. The read-write subsystem is included conditionally in the assembly based on an option selected in the main configuration file (normally included). Similarly, either the Integers or the Numbers file is included in assembly based on an option in the main configuration file (Integers is used for those MCUs that have small FLASH and RAM spaces, eg. LPC-2103, LPC-2131 and LPC-1343). The armpit_scheme_base_library.s file contains functions defined as being of library type in R5RS and the armpit_scheme_base_r6rs_library.s file adds bytevectors, bitwise operations and, optionally, fixnum operations, defined in R6RS.

The mcu-specific files contain constants and code that are specific to a given board or to an MCU family. The [BOARD].h files (eg. TCT_Hammer.h) contain board-specific configuration data (some of which is MCU related) such as LED pin ports, PLL parameters, whether the board has USB, SD-card port, pins and interface type, whether the board should operate in Live-SD mode (no FLASH use), uart parameters, RAM addresses, buffer addresses, and FLASH addresses for user files and libraries. The [FAMILY].h files (eg. S3C24xx.h) contain MCU family-specific constants such as number of interrupts and interrupt numbers for specific peripherals, masks for interrupts to be enabled, base addresses and offsets of peripheral registers, and type of GPIO pin-setting/clearing (combined or separate registers). The [FAMILY]_startup.s files (eg. S3C24xx_startup.s) contain system startup code for those MCUs without executable FLASH that need the Armpit Scheme machine code to be copied to RAM for execution (EP-9302, LPC-2888, S3C24xx, OMAP3530 and DM3730). The [FAMILY]_init_io.s files (eg. S3C24xx_init_io.s) contain: 1) the machine code ISR vector for the MCU, 2) hardware initialization code (eg. clocks, FLASH, LED pins, uart, usb) executed at system reset (label: hwinit:), 3) a routine to check whether to load the scheme "boot" file at startup based on GPIO pin status (label FlashInitCheck:), 4) low-level functions to write user files to FLASH or SD card and for storing user libraries in FLASH (if available) (labels: wrtfla:, ersfla:, libwrt:, sd_cfg:, ...), 5) the file and library FLASH sector maps (labels: flashsectors:, lib_sectors:), and 6) low-level i2c isr sub-functions (labels: hwi2cr: ...). The [FAMILY]_usb.s files (eg. S3C24xx_usb.s) contain sub-functions used by the USB subsystem. The [FAMILY]_system_0.s files (eg. S3C24xx_system_0.s) contain useful scheme symbols and functions for interacting with hardware. These files start with a sub-environment vector containing exported bindings representing register names / base addresses and scheme functions (eg. config-power, config-pin, ...).

Armpit Scheme runs the MCU in User Mode (not Privileged Mode). All interrupts processed by the Interrupt Service Routines (ISRs) are categorized as IRQ (not FIQ). The implemented scheme language (especially its extensions) is presented on the Armpit Scheme Language web page.

Register Usage:

ARM registers are renamed within the Armpit Scheme source to better reflect their purpose during normal program execution (outside of interrupts). The 16 ARM User Mode registers, their name within the source code and their use outside of interrupts is as follows (file: armpit_as_constants.s):

   ARM  Armpit
   Name  Name   Usage
    r0    fre   pointer to the next free memory cell (+ mem reservation status)
    r1    cnt   scheme continuation register
    r2    rva   raw value work register a (not garbage collected)
    r3    rvb   raw value work register b (not garbage collected)
    r4    sv1   scheme value work register 1
    r5    sv2   scheme value work register 2
    r6    sv3   scheme value work register 3
    r7    sv4   scheme value work register 4
    r8    sv5   scheme value work register 5
    r9    env   pointer to current environment
    r10   dts   pointer to data/return stack
    r11   glv   pointer to global scheme vector
    r12   rvc   raw value work register c (not garbage collected)
    r13         system stack pointer (ARM sp)
    r14   lnk   system link return (ARM lr)
    r15         system program counter (ARM pc)

The value in fre (r0) is updated by the memory management code in armpit_core.s, each time new memory is allocated on the heap, during garbage collection and possibly during interrupts. The memory is allocated in 2-word (8-byte) chunks such that the binary value in fre would normally end with 3 zeros in its lsbs (least-significant bits). In Armpit Scheme, the 2 lsbs in fre are used to indicate the reservation status of memory to support post-interrupt restarting of memory allocation (Software Transactional Memory - STM). Outside of memory allocation, the memory is de-reserved and fre ends in #b10. During a memory allocation operation (cons, list, save, zmaloc), the memory is reserved at level 1 and fre ends in #b01. During garbage collection, the memory is reserved at level 2 and fre ends in #b00. If a memory-allocating interrupt (eg. context-switch) occurs while memory is reserved at level 1, the return address of the interrupted code is modified so that its interrupted memory allocation is restarted when the code is resumed. If a memory-allocating interrupt occurs while memory is reserved at level 2 (i.e. during gc), the interrupted gc is continued with interrupts disabled prior to continuing with the interrupt's own memory allocation process. Garbage collection is triggered when memory allocation would cause fre to be greater than or equal to the current value of heaptop, in which case the allocation is restarted after the gc completes.
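
The status encoding described above can be modeled in a few lines of Python. This is only a sketch for illustration: the bit patterns and the gc trigger condition come from the text, while the constant and function names are hypothetical and do not appear in the Armpit Scheme source.

```python
# Reservation status lives in the 2 least-significant bits of fre.
# Bit patterns follow the text; all names are illustrative only.
STATUS_MASK = 0b11
FREE = 0b10      # memory de-reserved (outside of memory allocation)
LEVEL_1 = 0b01   # reserved during cons, list, save, zmaloc
LEVEL_2 = 0b00   # reserved during garbage collection


def status(fre):
    """Return the reservation status bits of a fre value."""
    return fre & STATUS_MASK


def address(fre):
    """Return the 8-byte-aligned free-cell address held in fre."""
    return fre & ~0b111


def with_status(fre, new_status):
    """Return fre with its status bits replaced by new_status."""
    return (fre & ~STATUS_MASK) | new_status


def needs_gc(fre, nbytes, heaptop):
    """True if allocating nbytes would bring the free pointer to or past heaptop."""
    return address(fre) + nbytes >= heaptop
```

Because allocation is done in 8-byte chunks, the three lowest address bits are always zero, which is what leaves the two status bits available.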

The scheme continuation register, cnt (r1), stores the address at which an Armpit scheme function should return upon completion. On ARMv4T MCUs and Cortex-A8, cnt ends in #b00 (32-bit instructions for ARM assembly) and cnt (an address) always points to code memory that is outside of the heap (it is gc-safe). On ARMv7M MCUs (Cortex-M3), cnt may end in #b00 or #b10 (mix of 16- and 32-bit Thumb-2 instructions) and may be interpreted as a pointer into code space or a floating point number, both of which are gc-safe. The ARM lnk register (r14, lr) behaves slightly differently on ARMv7M as discussed below.

The rva, rvb and rvc registers (r2, r3 and r12) are used to manipulate non-scheme entities in the code, for example the raw (un-tagged) value of an integer or ASCII character. The content of these registers is not garbage collected (they are actually modified by the gc). Their content is side-effected by memory allocation functions (zmaloc, save, list, cons) and is (generally) not preserved across such operations. They are however preserved during context-switching (saved in a special vector) as required for code resumption. The register rvb is also used to specify how many bytes to allocate when zmaloc is called and how many bytes were to be allocated when a gc was triggered. The gap in ARM registers between rvb and rvc (r3 and r12) is used to support the automatic register-stacking on interrupt performed by the Cortex-M3 cores (eight words are automatically stacked on that core: r0-r3, r12, lr, pc and the program status register -- all of which are non-gceable in Armpit Scheme, such that the auto-stacking is properly gc-safe).

Registers sv1 to sv5 (r4 to r8) are used to temporarily store scheme values within the code. The content of these registers is garbage collected and must always be interpretable as a proper scheme object (scheme integer, character, list, vector, ...). On function entry, sv1 holds the 1st function argument, sv2 holds the second argument and so on, unless a function is specified as having 0 input arguments in which case sv1 holds a list (possibly empty) of all its input arguments. The unpacking of input arguments into registers is done by apply (label prmapp: in armpit_core.s) just before it calls a scheme function. On function exit, sv1 holds the function's return value.
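
The argument-passing convention can be illustrated with a small Python model. It covers only the two cases described above; what prmapp: does when the number of supplied arguments differs from the declared count is not stated here, so the sketch simply assigns whatever arguments are available. The function name unpack_args and the dictionary representation of registers are assumptions made for illustration.

```python
# Hypothetical model of how apply distributes arguments to sv1..sv5.
SV_REGS = ["sv1", "sv2", "sv3", "sv4", "sv5"]


def unpack_args(args, declared_count):
    """Map a list of arguments onto scheme value registers."""
    if declared_count == 0:
        # Function declared with 0 input arguments:
        # sv1 receives the whole (possibly empty) argument list.
        return {"sv1": list(args)}
    # Otherwise the 1st argument goes to sv1, the 2nd to sv2, and so on.
    return dict(zip(SV_REGS[:declared_count], args))
```

For example, unpack_args([10, 20], 2) places 10 in sv1 and 20 in sv2, while unpack_args([1, 2, 3], 0) places the list of all three arguments in sv1.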

The env register (r9) stores the address of the current environment (a scheme list on the heap or, optionally, a binary tree) that eval uses to evaluate an expression. In Armpit Scheme, a list (such as env) consists of cons cells that each occupy 8 contiguous bytes of RAM (2 contiguous words). The first word of the cons cell contains the car and the second word contains the cdr of the cell. For example, in ARM assembly language, the car of the environment list (eg. 1st binding frame) is referred to by [env] and the cdr of the environment list is referred to by [env, #4].
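
The two-word cell layout can be mimicked in Python with a byte array standing in for RAM. This is a sketch under stated assumptions: the toy heap size, the little-endian word order and the function names are chosen for illustration and are not taken from the Armpit Scheme source.

```python
import struct

heap = bytearray(64)  # toy heap; real addresses and sizes differ


def cons(addr, car_word, cdr_word):
    """Store a cons cell at addr: car in the first word, cdr in the second."""
    struct.pack_into("<II", heap, addr, car_word, cdr_word)
    return addr


def car(addr):
    """Read the first word of the cell, like [env] in assembly."""
    return struct.unpack_from("<I", heap, addr)[0]


def cdr(addr):
    """Read the second word of the cell, like [env, #4] in assembly."""
    return struct.unpack_from("<I", heap, addr + 4)[0]
```

A cell stored with cons(8, 0x11, 0x22) occupies bytes 8 to 15 of the toy heap, with car(8) reading the word at offset 0 of the cell and cdr(8) the word at offset 4.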

The address (in heap) of the data/return stack is stored in dts (r10). This stack is implemented as a scheme list and hence all of its components need to be gc-safe (scheme integer, character, list, vector, ...). This (gc-safety) includes the contents of cnt, env, sv1-sv5, and sometimes rva-rvc (when they store an immediate, eg. a scheme int or char). The scheme stack in dts is used to store both scheme functions' return addresses (cnt) and scheme data (sv1-sv5) when these values need to be preserved against upcoming computations. The Armpit Scheme assembly macros save and restor (code file armpit_as_macros.s) add and remove items from the stack. Unlike the system stack (r13, sp) this scheme stack is functional (i.e. not side-effected) such that captured continuations can extend it and resume from it independently of one another. The bottom of this stack consists of a null circular cons cell (a cell whose car is null and whose cdr points to its car), stored in the source code (normally in FLASH memory) at label stkbtm: in the file armpit_core.s.
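
The functional nature of this stack can be shown with a short Python model. It is a sketch only: the Cell class is a stand-in for a heap cons cell, and the save and restor functions here merely borrow the names of the assembly macros without reproducing their actual behavior or register handling.

```python
class Cell:
    """Stand-in for a cons cell on the heap."""

    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


# Stack bottom: car is null and cdr points back to the cell itself.
STKBTM = Cell(None, None)
STKBTM.cdr = STKBTM


def save(dts, value):
    """Push: build a new cell in front of dts; dts itself is not modified."""
    return Cell(value, dts)


def restor(dts):
    """Pop: return the top value and the remaining stack."""
    return dts.car, dts.cdr
```

Two continuations that captured the same stack can each extend it with save without disturbing the other, since pushing never mutates existing cells. Popping the circular bottom cell yields null and the bottom cell again.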

The glv (r11) register contains the heap address of the global scheme vector that stores a variety of scheme objects used by Armpit Scheme as follows:

       index    object
         0      scheme interrupt callback function
         1      address of top of current heap (as pseudo scheme integer)
         2      scheme data object received from i2c0 port
         3      scheme data object received from i2c1 port
         4      scheme default input/output port
         5      scheme main system buffer address
         6      scheme open file list
         7      scheme global environment (top-level)
         8      scheme obarray
         9      address of top of lower heap (pseudo scheme integer)
        10      address of top of upper heap (stop-copy) or grey set (mark-sweep)
        11      file flash end page address
        12      library flash start page address
        13      address of built-in scheme environment
        14      library mode1 indicator, for reader
        15      library mode2 indicator, for reader
        16      primitive pre-entry function table bytevector (paptbl:)

It is returned by the _GLV function. The scheme interrupt callback is a user-defined function accessed via vector-ref or vector-set! with (_GLV) and 0 (index) as arguments. The address of the top of the current heap is updated by garbage collection or when non-moving objects are installed and the heap is shrunk. The top of the lower and upper heap (or grey set) are equal when mark-and-sweep gc is used but differ when stop-and-copy gc is used (they are also updated when the heap shrinks). The i2c0 and i2c1 data objects are used for storing packed gc-eable objects received via i2c. The scheme main system buffer (address stored at index 5) is a bytevector in non-heap RAM (described further in the next section). The open file list (index 6) is updated by file manipulation functions such as (open-input-file ...) and (close-output-port ...). The global environment (a list of binding frames) at index 7 is the user environment used by the rep and (load ...) and returned by (interaction-environment), that sits on top of the built-in implementation environment. The built-in environment is a vector of environment vectors. Its start address is stored at index 13 and can change when libraries are imported (to include the library's exported environment). The file flash end page and library flash start page (indices 11 and 12) keep track of flash space utilization when in-flash libraries are instantiated and deleted. The library flash space grows downwards into file flash space. The scheme obarray (index 8) is an association list of bindings between symbol names and variable IDs (object array) and is updated by the parser as well as functions like (string->symbol ...). The primitive pre-entry function table (index 16) is a bytevector that stores the start addresses of built-in pre-entry functions (explained later).

The system stack (r13, sp) is used (exclusively) for interrupts and during some file-oriented operations (rather than to stack return addresses during ordinary Scheme function calls). It points to a small region at the top of RAM that is not part of the heap (not garbage collected). It is a standard, fast, raw stack, rather than a more flexible, slower, scheme object (eg. list).

The system link register, lnk (r14, lr), is used to provide a return address when bl is used to branch to a non-scheme function (that returns via: set pc, lnk). On ARMv4T and Cortex-A8 MCUs, lnk ends in #b00 and is gc-safe (except in special cases of writing to FLASH where it may temporarily point to heap RAM in which case the code is protected against interrupts) and can be temporarily stored in a scheme value register (sv1-sv5, and stack) if it needs to be preserved against a second bl to another non-scheme function. On ARMv7M MCUs (Cortex-M3), the value of lnk produced by bl may end in #b01 or #b11 (the lsb is automatically set to indicate Thumb mode) and the lsb needs to be cleared to make lnk gc-safe if it needs to be saved in a scheme value register, and then the lsb needs to be restored prior to returning via lnk. To help with this process, the constant lnkbit0 (#b00 on ARMv4T and Cortex-A8 but #b01 on ARMv7M) is defined and used where necessary in the Armpit Scheme code (file armpit_as_constants.s).
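
The clear-then-restore handling of the lsb can be sketched in Python as follows. The two lnkbit0 values come from the text; the function names, and the assumption that the source clears and sets the bit with a simple mask, are illustrative only.

```python
# lnkbit0 values per the text; the helper names are hypothetical.
LNKBIT0_ARM = 0b00  # ARMv4T and Cortex-A8
LNKBIT0_CM3 = 0b01  # ARMv7M (Cortex-M3)


def make_gc_safe(lnk, lnkbit0):
    """Clear the Thumb bit before saving lnk in a scheme value register."""
    return lnk & ~lnkbit0


def restore_lnk(saved, lnkbit0):
    """Set the Thumb bit again before returning via lnk."""
    return saved | lnkbit0
```

With lnkbit0 equal to #b00 both operations leave the value unchanged, so the same code path can be assembled for either architecture.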

The system program counter (pc, r15) keeps track of which instruction the ARM core is executing. It has the same gc-safety characteristics as the lnk register discussed above (can be preserved in scheme value registers). During a scheme function call (cf. call macro in file armpit_as_macros.s), the pc is stored in cnt to provide a return address. In the case of Cortex MCUs, the return code is padded with nops to account for the variable instruction length (16- and 32-bit, while pc is 8 bytes ahead when captured).

Memory Usage:

Armpit Scheme runs the Scheme core from either FLASH or RAM. For MCUs with executable FLASH, the core runs from FLASH while for others it runs from external RAM, remapped by the MMU to start at 0x00 in most cases. User files (created with open-output-file) are stored in on-board FLASH (either on-chip or off-chip) except on Live-SD OMAP3530/DM3730 where they are stored on an external SD-card. User libraries (executable) are installed in on-chip flash where available and in above-heap RAM where executable FLASH is not available.

Most of the RAM is defined as heap space. Some RAM below the heap is typically used for Armpit Scheme I/O buffers, while RAM above the heap is used by the system stack and above-heap objects. The top of the heap moves down in RAM as above-heap objects are installed (unpacked) in a running system. On MCUs and boards with two or more RAM spaces, the below-heap buffers are commonly moved to the smaller RAM space while the heap, above-heap objects and system stack are placed in the larger RAM space. Typical memory maps are depicted below. The first map (used on most MCUs) is for the case where all RAM and FLASH are on-chip. The second map is for the case of on-chip and off-chip RAM with only on-chip FLASH (LPC2478-STK) and the third map adds off-chip FLASH (LPC-H2214, LPC-H2294). The 4th map is for cases without executable FLASH (LPC-H2888, CS-EP9302, TCT-Hammer S3C2410A, TI-BeagleBoard B7) and the 5th map is for cases without any FLASH at all (BeagleBoard Live-SD option, Overo TIDE, BeagleBoard-XM). In cases without executable FLASH, user libraries defined by (library ...) forms are installed in RAM and do not survive system resets.


          --------------------------------------------------------
                       ARMPIT SCHEME COMMON MEMORY MAP
           (system buffers in LPC2148/2158 are in on-chip USB RAM)
          --------------------------------------------------------

Address space top        --------------------------
                                    empty
Top of peripherals       --------------------------
                             PERIPHERAL REGISTERS
Bottom of peripherals    --------------------------
                                    empty
Top of on-chip RAM       --------------------------
                                SYSTEM STACK          <- 160 or 288 bytes
                              grey/black sets         <- if mark-and-sweep gc
          (dynamic size)   above-heap unpack space    <- GROWS downwards
          (dynamic size)           HEAP 1             <- if stop-and-copy gc
          (dynamic size)           HEAP 0
                                write buffer          <- if native USB
                                read buffer
                             MAIN SYSTEM BUFFER
Bottom of on-chip RAM    --------------------------
                                    empty
Top of on-chip FLASH     --------------------------
          (dynamic size)        USER LIBRARIES        <- GROWS downwards, min. 0
          (dynamic size)      FILE CRUNCH SPACE       <- 1 FLASH sector
          (dynamic size)         USER FILES           <- min. 1 FLASH sector
                         ARMPIT SCHEME MACHINE CODE   <- min. 24 KB, max. 64 KB
Bottom of on-chip FLASH  --------------------------



          --------------------------------------------------------
                   ARMPIT SCHEME MEMORY MAP for LPC-2478
          --------------------------------------------------------

Address space top        --------------------------
                                    empty
Top of peripherals       --------------------------
                             PERIPHERAL REGISTERS
Bottom of peripherals    --------------------------
                                    empty
Top of off-chip RAM      --------------------------
                                SYSTEM STACK          <- 288 bytes
                              grey/black sets         <- if mark-and-sweep gc
          (dynamic size)   above-heap unpack space    <- GROWS downwards
          (dynamic size)           HEAP 1             <- if stop-and-copy gc
          (dynamic size)           HEAP 0
Bottom of off-chip RAM   --------------------------
                                    empty
Top of on-chip RAM       --------------------------
                                write buffer          <- if native USB
                                read buffer
                             MAIN SYSTEM BUFFER
Bottom of on-chip RAM    --------------------------
                                    empty
Top of on-chip FLASH     --------------------------
          (dynamic size)        USER LIBRARIES        <- GROWS downwards, min. 0
          (dynamic size)      FILE CRUNCH SPACE       <- 1 FLASH sector
          (dynamic size)         USER FILES           <- min. 1 FLASH sector
                         ARMPIT SCHEME MACHINE CODE   <- min. 24 KB, max. 64 KB
Bottom of on-chip FLASH  --------------------------



          --------------------------------------------------------
                  ARMPIT SCHEME MEMORY MAP for LPC-2214/2294
          --------------------------------------------------------

Address space top        --------------------------
                                    empty
Top of peripherals       --------------------------
                             PERIPHERAL REGISTERS
Bottom of peripherals    --------------------------
                                    empty
Top of off-chip RAM      --------------------------
                                SYSTEM STACK          <- 288 bytes
                              grey/black sets         <- if mark-and-sweep gc
          (dynamic size)   above-heap unpack space    <- GROWS downwards
          (dynamic size)           HEAP 1             <- if stop-and-copy gc
          (dynamic size)           HEAP 0
Bottom of off-chip RAM   --------------------------
                                    empty
Top of off-chip FLASH    --------------------------
                              FILE CRUNCH SPACE       <- 1 FLASH sector
                                 USER FILES
Bottom of off-chip FLASH --------------------------
                                    empty
Top of on-chip RAM       --------------------------
                                write buffer          <- if native USB
                                read buffer
                             MAIN SYSTEM BUFFER
Bottom of on-chip RAM    --------------------------
                                    empty
Top of on-chip FLASH     --------------------------
          (dynamic size)        USER LIBRARIES        <- GROWS downwards, min. 0
                         ARMPIT SCHEME MACHINE CODE   <- max. 64 KB
Bottom of on-chip FLASH  --------------------------



          ---------------------------------------------------------
               ARMPIT SCHEME MEMORY MAP for LPC-2888, CS-EP9302,
                 S3C2410A and OMAP3530 (non-Live-SD option)
          ---------------------------------------------------------

Address space top         --------------------------
                                    empty
Top of peripherals        --------------------------
                             PERIPHERAL REGISTERS
Bottom of peripherals     --------------------------
                                    empty
Top of off-chip FLASH     --------------------------
                              FILE CRUNCH SPACE       <- 1 FLASH sector
                                 USER FILES           
                          ARMPIT SCHEME MACHINE CODE  <- not on OMAP3530
Bottom of off-chip FLASH  --------------------------
                                    empty
Top of off-chip RAM       --------------------------
                                SYSTEM STACK          <- 160 bytes
                              grey/black sets         <- if mark-and-sweep gc
          (dynamic size)   above-heap unpack space    <- GROWS downwards
                             and USER LIBRARIES
          (dynamic size)           HEAP 1             <- if stop-and-copy gc
          (dynamic size)           HEAP 0
                                write buffer          <- if native USB
                                read buffer
                             MAIN SYSTEM BUFFER
                          ARMPIT SCHEME MACHINE CODE   <- max. 64 KB
Bottom of off-chip RAM    --------------------------



          ---------------------------------------------------------
           ARMPIT SCHEME MEMORY MAP for OMAP3530, DM3730 (Live-SD)
          ---------------------------------------------------------

Address space top         --------------------------
                                    empty
Top of peripherals        --------------------------
                             PERIPHERAL REGISTERS
Bottom of peripherals     --------------------------
                                    empty
Top of off-chip RAM       --------------------------
                                SYSTEM STACK          <- 160 bytes
                              grey/black sets         <- if mark-and-sweep gc
          (dynamic size)   above-heap unpack space    <- GROWS downwards
                             and USER LIBRARIES
          (dynamic size)           HEAP 1             <- if stop-and-copy gc
          (dynamic size)           HEAP 0
                                write buffer          <- if native USB
                                read buffer
                             MAIN SYSTEM BUFFER
                          ARMPIT SCHEME MACHINE CODE   <- max. 60 KB (approx)
Bottom of off-chip RAM    --------------------------

The scheme MAIN SYSTEM BUFFER in the above maps is a bytevector stored in non-heap RAM (contents are not subject to garbage collection). The address of this bytevector is stored at index 5 in the global scheme vector (_GLV) described earlier to make it available from scheme in user space. Its contents can be accessed via byte-indexing, or, if meaningful, vector indexing (eg. (vector-ref (vector-ref (_GLV) 5) 1) for machine-code ISR vector):

       byte   vector
       index  index   item
         0      0     file lock (file system locked if non-zero)
         4      1     address of vector of machine-code ISRs
         8      2     address of read buffer
        12      3     MCU's I2C address (if MCU has no dedicated register)
        16      4     I2C0 state buffer (5 words)
        36      9     I2C1 state buffer (5 words)
        56     14     address of write buffer
        60     15     USB state buffer (13 words)
       112     28     Cortex-M3 enabled interrupts (64 or 96 bits)

The file lock at byte-index 0 stores the state of the file system (non-zero if reserved, zero if free). The machine code ISR address at vector-index 1 points to the vector of interrupt service routines, normally in FLASH at startup (described in the next section). The address of the read buffer that stores incoming characters from the uart or usb is at vector-index 2 while that of the write buffer (used for outgoing characters on usb only) is at vector-index 14. These buffers are bytevectors (fixed size) stored in non-heap RAM. Their first word stores the number of characters in the buffer and the remaining bytes store the actual characters. The MCU's I2C address at byte-index 12 is used on MCUs that do not otherwise have a peripheral register to store it (eg. SAM7, EP9302). The I2C buffers at byte indices 16 and 36 are used to store the state of I2C transactions and a single word (4 bytes) of I2C data. The USB buffer at byte-index 60 stores the state of USB transactions and partial data used during enumeration. The Cortex-M3 enabled-interrupts area at byte-index 112 is a bit map used by the Scheme core when disabling and re-enabling interrupts on Cortex-M3 MCUs (described further in the next section).

The heap is the dynamic memory area where user Scheme data are stored, including the user environment, user obarray, scheme stack, global scheme vector, user-defined functions (closures) and user data (strings, vectors, lists, ...). It also stores temporary data used to evaluate functions and to process files. Space on the heap is allocated 8 bytes (2 words) at a time by primitives _alo:, _cns: and _sav: (file armpit_core.s). These primitives are designed such that memory allocation can be transactional. They reserve memory at level 1 when they start and de-reserve it upon completion. A memory-allocating interrupt occurring while memory is reserved at level 1 will cause the interrupted allocation to be restarted when user code resumes after the interrupt. This eliminates the need for disabling and re-enabling interrupts around memory allocation, which potentially speeds up code execution and reduces interrupt latency. Within the Armpit Scheme source code, cons, list and save are typically called via macros defined in file armpit_as_macros.s. In the case of cons and list, the macro performs the de-reservation of memory. In the case of _alo (zmaloc) de-reservation is to be performed explicitly after it is safe to do so (for example, after the items of a newly allocated vector have been stored in the vector -- because vector contents are subject to gc). The memory allocation functions call the garbage collector prior to storing an object if the memory to be allocated for this object would extend past the top of the heap.

Two garbage collection algorithms are implemented in Armpit Scheme: stop-and-copy and mark-and-sweep (label gc: or gc_bgn in file armpit_core.s). The stop-and-copy algorithm (default) is the faster of the two but uses two semi-heaps (HEAP 0 and HEAP 1) leading to half the useable RAM for Scheme at any time. The mark-and-sweep algorithm is slower but places all available RAM in a single heap which maximizes the space available to Scheme. Mark-and-sweep also uses more code FLASH space than stop-and-copy and uses a small amount of RAM above the heap to keep track of grey and black sets (tri-color marking scheme). Both algorithms are fully compacting and the user selects between them at assembly time by commenting or uncommenting a constant definition (for mark-and-sweep) in file armpit_050.s. Neither algorithm uses the system stack.

The above-heap unpack space in the memory maps is a resizeable zone of RAM where the user can install (unpack) non-gceable objects such as fixed data items (fonts) or machine code (drivers, compiled scheme code). It grows downwards and the heap is resized automatically to account for the size of installed objects (the unpack function is at label punpak: in file armpit_core.s). In MCUs without executable FLASH, this space is also used to store user libraries.

The SYSTEM STACK is that pointed to by the MCU's sp register and is used during interrupts (and possibly file writing) only.

FLASH space (where available) is occupied by user files, file crunch space and user libraries, all stored above the ArmPit Scheme machine code. User files are stored in a page-based format (typ. 256 bytes) where each page contains a file ID, block number, number of bytes and data. The first page of a file also contains the file name. Deleted files are marked by modifying the file header without erasing file contents. Actual erasing of files occurs when a new file is written to FLASH and no free space is available. At that time, the space used by deleted files is reclaimed as all non-deleted file pages are moved to the bottom of flash and the freed flash is erased to #xffffffff (label fsc: in armpit_core.s). On MCUs with executable FLASH, user libraries are stored above user files, in a space that grows downwards, each time a (library ...) form is evaluated (label plibra: in armpit_core.s). The global vector (glv, (_GLV)) keeps track of the resulting end of file FLASH space and start of library FLASH space and the fsc (file system cleaning) function is called as needed to free up deleted file space. On MCUs without executable FLASH space, user libraries are stored in the above-heap unpack space.

Interrupt Service:

Armpit Scheme defines a single entry point for all interrupts as the genisr: routine in the file armpit_core.s. The interrupts that are enabled at startup are defined in the mcu-specific file [FAMILY].h (variable names: scheme_ints_enb or scheme_ints_en1, scheme_ints_en2, ...) and are enabled in armpit_init.s using the macro enable_VIC_IRQ from armpit_as_macros.s. In most cases, the uart, usb (if available), i2c (if assembled) and timer interrupts are enabled. Interrupts may be temporarily disabled by the core during critical code sections. To support this on Cortex-M3 MCUs (this is not needed on other MCUs) a bit mask representing the enabled interrupts is stored in the MAIN SYSTEM BUFFER at byte indices starting with 112. On Cortex-M3 MCUs, this bit mask has to be updated by the user/programmer when new ISRs are added to the system (the update may be performed using bytevector and bitwise functions) such that the new interrupts are automatically re-enabled at the end of the critical code section. A function that does this, taking the interrupt number as input, could be:

    ;; add an enabled interrupt on Cortex-M3
    (define (CM3-add-int int) ;; for Cortex-M3 only
      (let* ((i   (+ 112 (quotient int 8)))
             (b   (remainder int 8))
             (MSB (vector-ref (_GLV) 5))
             (v   (bytevector-u8-ref MSB i)))
        (bytevector-u8-set! MSB i
           (bitwise-ior v (bitwise-arithmetic-shift 1 b)))))
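
The same bit arithmetic can be tried off-target. The sketch below is a model only: it takes the buffer and the base byte index as parameters so that an ordinary bytevector can stand in for the MAIN SYSTEM BUFFER on a desktop R6RS Scheme. The names int-enabled? and add-int! and the (import (rnrs)) line are assumptions made for this illustration and are not part of Armpit Scheme.

```scheme
(import (rnrs))

;; Model of the enabled-interrupt bit mask: interrupt number int maps to
;; bit (int mod 8) of the byte at index base + (int div 8).
;; On a Cortex-M3 Armpit system, buf would be (vector-ref (_GLV) 5)
;; and base would be 112.
(define (int-enabled? buf base int)
  (let ((v (bytevector-u8-ref buf (+ base (quotient int 8))))
        (b (remainder int 8)))
    (odd? (bitwise-arithmetic-shift v (- b)))))

;; Same update as CM3-add-int, applied to an explicit buffer.
(define (add-int! buf base int)
  (let* ((i (+ base (quotient int 8)))
         (b (remainder int 8)))
    (bytevector-u8-set! buf i
      (bitwise-ior (bytevector-u8-ref buf i)
                   (bitwise-arithmetic-shift 1 b)))))

(define demo (make-bytevector 124 0))   ; stand-in buffer, 96 mask bits
(add-int! demo 112 19)
(display (int-enabled? demo 112 19))    ; #t
(newline)
(display (int-enabled? demo 112 18))    ; #f
(newline)
```

Interrupt 19 lands in byte 114 (112 + 2) at bit 3, which is why the neighbouring interrupt 18 still reads as disabled.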

The generic Interrupt Service Routine genisr: (in armpit_core.s) processes all interrupts. On interrupt entry, the 8 registers fre, cnt, rva, rvb, rvc, lnk_usr, pc_usr and spsr are pushed onto the IRQ mode stack (for compatibility with Cortex cores). The number (ID) of the interrupt source is then identified and stored in rvb. The code then checks if an executable machine code interrupt service routine is available for this interrupt ID at that ID's index in the machine code ISR vector (vector index 1 in the main system buffer: (vector-ref (vector-ref (_GLV) 5) 1)), and, if not, exits the generic ISR, clearing the interrupt in the core (but not in the peripheral) and popping the stack to return to the interrupted user mode process (label gnisxt: in armpit_core.s). Otherwise, it executes the ISR from the ISR vector. The machine code ISR may simply clear the interrupt at the peripheral and return to genisr (via lnk) for further processing, or it may fully process the interrupt. At startup, the machine code ISR vector contains full ISRs for uart, usb (if available) and i2c (if assembled) and a partial ISR for timer interrupts. The full ISRs perform all needed ISR operations, including clearing the interrupt in the peripheral and in the core, and exiting the ISR properly, returning to user mode upon completion. The partial ISRs (eg. timer), which return to genisr via lnk, must not modify rvb, sv1-sv5, env, dts, glv or sp, so that genisr can be properly resumed. The return to genisr makes it possible for the ISR to eventually call a user-specified interrupt service routine written in Scheme that executes in the top-level as described below.

The uart interrupt (character received) is enabled when the MCU is not connected to a valid USB line that has undergone full enumeration (for USB-enabled MCUs). This interrupt is set to fire for each character received by the interface. The uart full ISR (label puaisr: in armpit_ports.s) is branched to from genisr via the machine ISR vector and reads the received character from the uart's receive-hold register which it then stores at the next available location in Armpit's READBUFFER, or otherwise updates this buffer appropriately when a backspace, "Enter" or ctrl-c is received. The ISR also echoes received characters to the uart's Tx line, clears the uart interrupt and eventually exits back to resume the interrupted user code through gnisxt:.

The USB ISR with label usbisr: (armpit_ports.s) is enabled when the MCU is connected to a USB line and has undergone full enumeration (for USB-enabled MCUs). During character reception, the USB ISR operates similarly to the uart ISR described above but obtains its input characters from the USB line. The USB system also uses a WRITEBUFFER to send characters out through USB. Scheme "write" functions targeted to the USB port write the relevant external representations to the USB WRITEBUFFER and the ISR sends the contents of this buffer out through the USB line when the appropriate request (which generates an interrupt in Armpit Scheme) is received from the USB Host. The echo of characters received via USB is handled similarly. The ISR further handles standard, interrupt-driven, USB management tasks, such as device enumeration. Upon successful enumeration and configuration, the ISR disables UART interrupts and sets USB as the default i/o port.

The i2c full ISR is found at label pi2isr in file armpit_ports.s. It is used to transmit simple scheme objects and packed expressions between MCUs or between MCU and I2C peripherals. The ISR manages the state of the I2C state buffer to indicate busy status, data ready, and to store small transmitted objects before returning them to the user as scheme-tagged items. For the reception of packed expressions (large gc-able objects), the ISR allocates heap memory for the object being received after checking whether memory was reserved when the interrupt arose, taking appropriate action in that case (completing or suspending (for later restart) the user-space allocation, and collecting garbage).

The partial timer ISR is found at label ptmisr: in file armpit_core.s. It is branched to (using the bl operation) by genisr, using the ISR vector. It clears the timer interrupt in the timer peripheral block and then uses lnk to return to the generic ISR. This partial ISR is archetypical of partial ISRs that a user may want to add to the ISR vector such that the targeted system interrupt can be processed by a Scheme ISR at top level. It is a pseudo-primitive, that is, a block of code that starts with the primitive function tag (.word proc | 0x00) and ends with a return instruction such as set pc, lnk or set pc, cnt. Such blocks of machine code are not gc-eable and must be stored in FLASH or above-heap RAM. In the latter case, the address of the pseudo-primitive code block (to be stored in the machine ISR vector) is obtained as output of the Armpit Scheme unpack function.

When the generic ISR is resumed by returning from a partial ISR, it first checks to see if a memory allocation operation or garbage collection was preempted by the interrupt and, if so, updates the saved pc_usr to either restart or complete the memory allocation once the interrupt completes, or resumes the gc, with interrupts disabled, prior to going forward with the remainder of interrupt processing. These operations are performed by the code at label genism:. The identification of situations where a memory allocation operation was interrupted is based on the lower 2 bits stored in the free-memory pointer (fre) that indicate the memory reservation status. The decision to restart memory allocation or to complete it is based on the uniform use, within memory allocation code, of the instruction:

     orr	fre, rva, #0x02

to de-reserve memory right after committing the memory allocation process (the commit instruction must always immediately precede this instruction). Interruptions prior to this instruction lead to a restart of the memory allocation whereas interruptions right at this critical instruction lead to completion (in genism:) of the allocation.

After dealing with memory management issues, the generic ISR checks the global vector (glv, accessed in Scheme via the function (_GLV)) at index 0 for a user-specified Scheme ISR. If no such ISR is found, the generic ISR exits via gnisxt: (clear interrupt in core, resume interrupted user process). If a user Scheme ISR is found, the generic ISR proceeds to save the context of the interrupted Scheme execution in two vectors: a gc-eable vector (scheme vector) for sv1-sv5, env and dts, and a non-gceable vector (a bytevector really) for cnt, rva-rvc, lnk_usr, pc_usr, psr_usr (spsr) and FPU registers (if a FPU is used). It then clears the interrupt in the core, stores the context vectors on a new Scheme stack, sets the Scheme continuation (cnt) to restore the interrupted context from that stack (rsrctx:), sets the IRQ stack content to resume the built-in apply function with sv1 set to the user Scheme ISR and sv2 set to the interrupt ID and then exits the interrupt via rsrxit:. This process returns the system to user mode, making it execute (apply scheme_ISR interrupt_ID) with resumption of the saved context of the interrupted scheme process as continuation. In other words, the system executes the user-specified Scheme ISR with the interrupt ID as input argument and, when this Scheme ISR is done, the system resumes the interrupted scheme process.

The user-specified Scheme ISR is a function of one argument (the interrupt ID). It is executed in user mode with interrupts enabled like the rest of user Scheme code. It can use a conditional statement to perform different actions based on which interrupt triggered its execution. Its continuation is the resumption of the interrupted Scheme process and can be captured with call/cc and stored on a queue for preemptive (or otherwise interrupt-driven) multitasking. The input argument to that continuation is a dummy argument. The user-defined Scheme ISR is stored on the glv at index position zero from user-mode, by writing it to the vector returned by the function (_GLV), at index 0. It can also be read from that same location. It is worth remembering at this stage that this Scheme callback (ISR) can be executed only if there is a partial ISR for the desired interrupt source in the machine code ISR vector (and if that source is enabled in the system, set to branch to genisr: on interrupt, and, in the case of Cortex-M3 MCUs, the interrupt enabled bit has been added in the main system buffer at byte index 112+ as discussed earlier). Additionally, because the machine code ISR vector is stored in FLASH at startup (except on MCUs with no executable FLASH), this vector has to be copied to above-heap RAM, so that it is modifiable and not moved by garbage-collection, before adding new ISRs to it. For example:

      ;; copy machine code ISR vector to above-heap RAM
      (let* ((misr (vector-ref (vector-ref (_GLV) 5) 1))
             (n    (vector-length misr))
             (mcpy (unpack (pack (make-vector n)) 1)))     ;; above-heap
         (let loop ((m 0))
            (if (= m n)
                (vector-set! (vector-ref (_GLV) 5) 1 mcpy) ;; ISR vector <- copy
                (begin
                   (vector-set! mcpy m (vector-ref misr m))
                   (loop (+ m 1))))))
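
A user Scheme ISR that dispatches on the interrupt ID, as described above, might be sketched as follows. The helper make-scheme-isr, the interrupt IDs 4 and 5 and the handlers are all hypothetical, chosen for illustration; the (import (rnrs)) line is only needed to try the code on a desktop R6RS Scheme.

```scheme
(import (rnrs))

;; Build a one-argument Scheme ISR from an association list of
;; (interrupt-id . handler) pairs. Unknown IDs are ignored (result #f).
(define (make-scheme-isr handlers)
  (lambda (int-id)
    (let ((entry (assv int-id handlers)))
      (if entry
          ((cdr entry) int-id)
          #f))))

;; Hypothetical interrupt IDs and handlers, for illustration only.
(define tick-count 0)
(define my-isr
  (make-scheme-isr
    (list (cons 4 (lambda (id) (set! tick-count (+ tick-count 1))))
          (cons 5 (lambda (id) 'other-timer-fired)))))

;; On a running Armpit system the ISR would be installed with:
;;   (vector-set! (_GLV) 0 my-isr)
;; Here it is called by hand to simulate two interrupts from source 4.
(my-isr 4)
(my-isr 4)
(display tick-count)   ; 2
(newline)
```

Keeping the dispatch table separate from the ISR closure means a new interrupt source only needs a new table entry, in addition to the partial machine code ISR and (on Cortex-M3) the enabled-interrupt bit discussed earlier.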

Internal Representations:

Armpit Scheme uses the lower bits of data objects to identify their type. The tagging is adaptive with 2, 4 or 8 bits giving full type. Two-bit tags are used to differentiate between addresses, integers, floats and other objects. The type of an object can be recognized by inspecting its two least significant bits as follows:

      00 -> address
      01 -> integer
      10 -> float
      11 -> other object

Addresses are aligned to word boundaries and hence natively have their lowest two bits as 00 in the 32-bit Armpit system. Integers are stored as a raw value in the 30-bits above their type tag, using two's complement. A raw integer is shifted left by 2 bits and orred with the #b01 type tag to make a scheme integer. The converse operation is performed by an arithmetic shift, two bits to the right, to preserve the sign and value of the raw integer. Floats are represented using a shortened form of the IEEE-754 32-bit standard (single precision), namely the lower two bits of the mantissa are replaced by the type tag #b10. The following illustrates these internal representations, with uppercase letters representing hexadecimal values (X if arbitrary), lower case letters representing bits, and numerical digits representing actual bit values (s, e and m stand for sign, exponent and mantissa):

      XXXXXXXbb00 -> address, aligned to word boundary
      XXXXXXXbb01 -> integer
      seeeeeeeemmmmmmmmmmmmmmmmmmmmm10 -> float
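
The two-bit tagging rules can be modelled with a few lines of Scheme. This is a sketch for a desktop R6RS Scheme, not code from the Armpit sources: the function names are made up for illustration, and because Scheme integers are unbounded, negative values are not truncated to 32 bits as they would be in a register.

```scheme
(import (rnrs))

;; Two-bit type tag of a word: 0 address, 1 integer, 2 float, 3 other.
(define (tag2 w) (bitwise-and w 3))

;; Raw integer -> scheme integer: shift left by 2, or in the tag #b01.
(define (tag-int n)
  (bitwise-ior (bitwise-arithmetic-shift n 2) 1))

;; Scheme integer -> raw integer: the arithmetic shift keeps the sign.
(define (untag-int w)
  (bitwise-arithmetic-shift w -2))

;; IEEE-754 single-precision bit pattern -> tagged float word:
;; the two lowest mantissa bits are replaced by the tag #b10.
(define (tag-float-bits bits)
  (bitwise-ior (bitwise-and bits #xFFFFFFFC) 2))

(display (tag-int 5))                                ; 21
(newline)
(display (untag-int (tag-int -7)))                   ; -7
(newline)
(display (= (tag-float-bits #x3F800000) #x3F800002)) ; #t (1.0 tagged)
(newline)
```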

Atoms, other than integers and floats, are identified by an 8-bit tag in the object's least significant byte. The full internal representations of these 8-bit tagged single-word objects are (in hex):

      #x0000000F -> '()
      #x0000001F -> #t
      #x0000002F -> #f
      #x0000CC3F -> character, CC = character's ASCII code
      MC-VRID-AF -> variable, MC = MCU ID or 0, VRID = 16-bit var. ID
      #x0000009F -> broken-heart (used during garbage collection)

Scheme variables have the 8-bit tag #xAF. Bytes at offsets 1 and 2 represent the unique numerical ID of the variable, assigned to it by string->symbol (eg. as used by read/parse), and stored as a raw 16-bit integer. The upper byte of the variable stores the ID of the MCU on which the variable was created, or zero if it is a built-in (implementation) or library variable. The MCU ID is included in the variable to differentiate user-defined variables from built-in (implementation) variables and to enable multiprocessing where the MCU may receive variables defined on another MCU that must not be confused with variables that have the same 16-bit numerical IDs but that were defined on the local MCU. The MCU ID used in the internal representation of variables is that defined for I2C communication over the i2c0 interface and is user-defined at the top of the Armpit Scheme main configuration file (armpit_050.s). It can be modified by writing to the appropriate I2C register of the MCU (I2C0ADR on LPC2000 devices).
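
As an illustration of the field layout just described, the following sketch builds and takes apart character and variable words. It is a model for a desktop R6RS Scheme; the function names are assumptions for this example and do not correspond to labels in the Armpit sources.

```scheme
(import (rnrs))

;; Character word: ASCII code in byte 1, tag #x3F in byte 0.
(define (char-word c)
  (bitwise-ior (bitwise-arithmetic-shift (char->integer c) 8) #x3F))

;; Variable word: MCU ID in byte 3, 16-bit VRID in bytes 1-2, tag #xAF.
(define (var-word mcu-id vrid)
  (bitwise-ior (bitwise-arithmetic-shift mcu-id 24)
               (bitwise-arithmetic-shift vrid 8)
               #xAF))

(define (var-word? w)
  (= (bitwise-and w #xFF) #xAF))
(define (var-vrid w)
  (bitwise-and (bitwise-arithmetic-shift w -8) #xFFFF))
(define (var-mcu-id w)
  (bitwise-and (bitwise-arithmetic-shift w -24) #xFF))

;; A user variable with VRID #x0B created on the MCU with ID #x64.
(define x-word (var-word #x64 #x0B))
(display (= x-word #x64000BAF))   ; #t
(newline)
(display (var-mcu-id x-word))     ; 100
(newline)
```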

The VRID and MCU-ID of variables can be obtained at top-level in a running Armpit Scheme system using (for example):

      (define ash bitwise-arithmetic-shift)
      (number->string (ash 'quote  -6) 16) ; -> "00000103", quote  has VRID #x103
      (number->string (ash 'lambda -6) 16) ; -> "00000203", lambda has VRID #x203
      (number->string (ash 'UAR0   -6) 16) ; -> "00000304", UAR0   has VRID #x304
      (define x #t)
      (number->string (ash 'x -6) 16) ; -> "0064000B", VRID #x0B, MCU-ID #x64

For built-in variables (eg. quote, lambda and UAR0 in the above examples), and for variables defined in user libraries, the VRID is built by combining the variable's position in the sub-environment where it is defined with the position of that sub-environment in the built-in environment constructed in armpit_050.s (label scmenv:). For example, the #x203 VRID for lambda means that it is the second variable defined in the scheme base sub-environment (the 3rd sub-environment in scmenv:) as can be seen at the top of the armpit_scheme_base.s file.

Multi-word objects (objects pointed to by an address identified by #b00 as LSbs) are classified based on 2-, 4- and 8-bit tags stored in the word of memory pointed to by their address. A 3-bit mask (#b111) can be used to identify whether the object is a compound number (rational or complex) or something else as these compound numbers have the unique combination #b011 as Least Significant bits (LSbs). The full tags of compound numbers have 4 bits and are #b0011 for rationals and #b1011 for complex numbers. These objects occupy 2 successive words of memory (8 bytes) with 30-bits for each value (numerator and denominator, both 30-bit integers, or real and imaginary part, both 30-bit floats) and 4 bits for the tag (total of 64 bits). The encoding is as follows (N and n represent the numerator, D and d the denominator, R and r the real part and I and i the imaginary part, upper-case for 4-bit nibbles, lower case for bits, and numbers 0/1 are bits):

        address        contents
      -----------    -----------
      XXXXXXXbb00 -> NNNNNNN0011     <- rational
                     DDDDDDDddnn

      XXXXXXXbb00 -> RRRRRRR1011     <- complex
                     IIIIIIIiirr

The macros numerat, denom, real and imag in armpit_as_macros.s can be used to split these compound numbers into their components, represented as tagged integers or floats.
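
The rational layout in the diagram above can be modelled as follows. This is a sketch for a desktop R6RS Scheme that simply follows the diagram (upper 28 numerator bits plus tag in the first word, denominator plus the 2 lowest numerator bits in the second); it is not a transcription of the numerat and denom macros, and all names are illustrative assumptions.

```scheme
(import (rnrs))

(define mask30 #x3FFFFFFF)

;; Pack a rational into two 32-bit words, returned as (word0 . word1):
;;   word0 = upper 28 bits of the 30-bit numerator, then tag #b0011
;;   word1 = 30-bit denominator, then the 2 lowest numerator bits
(define (pack-rational num den)
  (let ((n (bitwise-and num mask30))
        (d (bitwise-and den mask30)))
    (cons (bitwise-ior
            (bitwise-arithmetic-shift (bitwise-arithmetic-shift n -2) 4)
            #b0011)
          (bitwise-ior (bitwise-arithmetic-shift d 2)
                       (bitwise-and n 3)))))

;; Interpret a 30-bit field as a two's complement value.
(define (signed30 x)
  (if (>= x #x20000000) (- x #x40000000) x))

(define (rational-numerator w0 w1)
  (signed30 (bitwise-ior
              (bitwise-arithmetic-shift (bitwise-arithmetic-shift w0 -4) 2)
              (bitwise-and w1 3))))

(define (rational-denominator w1)
  (bitwise-arithmetic-shift w1 -2))

(define r (pack-rational -1 2))
(display (rational-numerator (car r) (cdr r)))   ; -1
(newline)
(display (rational-denominator (cdr r)))         ; 2
(newline)
```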

Other multi-word objects can be identified as compound objects (symbol, string, bytevector, vector, macro or procedure) by using the 8-bit mask #x47 on the LSB of the word of memory pointed to by the address. A logical-and between #x47 and that byte produces #x47 if the object is compound, otherwise (unless the object is a rational or complex) the object is a list. Fixed size compound objects (symbol, string, bytevector and vector) are stored in a set of contiguous memory locations starting with a word that combines its size (upper 24-bits) and tag (LSB). The raw (size 0) 8-bit tags are:

      0111-1111 = #x7F -> symbol
      0101-1111 = #x5F -> string
      0110-1111 = #x6F -> bytevector
      0100-1111 = #x4F -> vector

The 24-bits above the 8-bit tag indicate the number of items in the object (bytes for symbols, strings and bytevectors, words for vectors) as a raw unsigned integer. Considering the foregoing, the full internal representation of implemented fixed size compound objects becomes:

      -------------- ----------------- ----------------- ----------- ----------------
      object:        symbol            string            bytevector  vector
      -------------- ----------------- ----------------- ----------- ----------------
      word offset 0: #xnnnnnn7F        #xnnnnnn5F        #xnnnnnn6F  #xNNNNNN4F
      word offset 1: ascii-chars-0123  ascii-chars-0123  bytes-0123  scheme-object-1
      word offset 2: ascii-chars-4567  ascii-chars-4567  bytes-4567  scheme-object-2
      ...            ...               ...               ...         ...
      -------------- ----------------- ----------------- ----------- ----------------

where nnn... represents object size in number of bytes and NNN... in number of words. Note that the 2 msbs of the 8-bit tag are #b01 and correspond to the 2-bit integer tag. Hence, by shifting the 32-bit tag word of a fixed size object to the right by 6 bits, one obtains a scheme integer that gives the size of the symbol, string, bytevector or vector (as in (string-length ...), (vector-length ...), ...). Also, within the heap, these fixed size objects are followed by an appropriate amount of empty space to satisfy the 8-byte alignment of the implementation.
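
The header arithmetic described above can be checked with a small model. The sketch below is meant for a desktop R6RS Scheme; the function names are illustrative, and padded-bytes assumes a 4-byte header word followed by the data, rounded up to a multiple of 8 bytes.

```scheme
(import (rnrs))

(define symbol-tag     #x7F)
(define string-tag     #x5F)
(define bytevector-tag #x6F)
(define vector-tag     #x4F)

;; Header word: item count in the upper 24 bits, tag in the low byte.
(define (make-header count tag)
  (bitwise-ior (bitwise-arithmetic-shift count 8) tag))

;; Compound-object test: (word AND #x47) = #x47.
(define (compound-header? w)
  (= (bitwise-and w #x47) #x47))

;; Shifting right by 6 leaves the count above the #b01 integer tag,
;; that is, the size as a tagged scheme integer.
(define (header->scheme-int w)
  (bitwise-arithmetic-shift w -6))

;; Heap bytes used by a byte-counted object (symbol, string, bytevector),
;; assuming a 4-byte header and padding to the next multiple of 8.
(define (padded-bytes count)
  (* 8 (quotient (+ 4 count 7) 8)))

(define h (make-header 5 string-tag))   ; 5-character string
(display (header->scheme-int h))        ; 21, the tagged integer 5
(newline)
(display (padded-bytes 5))              ; 16
(newline)
```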

Macros and procedures (including continuations) are mostly implemented as tagged listed objects consisting of a sequence of linked 8-byte memory cells (car-cdr combinations). The exception is compiled (or ARMSchembled or built-in) procedures that include a non-listed component consisting of a sequence of machine code instructions stored in consecutive memory cells. The 8-bit tag of these objects, whether listed or sequential, (eg. as masked with #x47) differentiates them from normal lists (so as to not confuse the evaluator). These 8-bit tags are #xDF for a procedure and #xD7 for a macro. For procedures, 4 bits at bit-index 12 in the tag specify whether it is a built-in (or ARMSchembled) primitive (sequential), a compound (listed) procedure, a continuation (listed) or a compiled procedure (partly listed but with a sequential machine code component as well). Additionally, for built-in procedures (aka primitives), the 3 bits at index 8 specify the number of input arguments (0 to 4, 0 if input arguments should be listed on entry), the bit at index 11, if set, specifies that the primitive is of syntax type, the 8 bits at index 16 may specify the index of a common pre-entry subroutine (index into paptbl: in file armpit_core.s) and the 8 bits at position 24 may specify a startup value for register sv4 (this value must be a proper scheme object, represented by a maximum of 8 bits, eg. '(), #t, #f and 6-bit unsigned integers).

Macros are constructed by the scheme syntax procedure 'syntax-rules (label sntxrl: in armpit_scheme_base.s), compound procedures (closures) are built by the scheme syntax procedure 'lambda (label plmbda: in armpit_scheme_base.s), and continuations are captured by the scheme function 'call/cc or 'call-with-current-continuation (label callcc: in armpit_scheme_base.s). Syntax and procedural primitives can be built using an appropriate ARMSchembler and linker (both written in scheme) and compiled procedures are made using an appropriate compiler (written in scheme) and finalized using the function _mkc (label _mkcpl: in armpit_core.s). The structures of the resulting tagged compound objects are (where parentheses indicate lists or cons-cells and square brackets indicate sequential memory words):


      macro                 <-  (#x000000D7 . macro-body)
      macro-body            <-  (literals transformer1 transformer2 ...)
      literals              <-  (variable1 variable2 ...)
      transformer           <-  (pattern template)

      primitive procedure   <-  [ #xSSEE0-0nnn-DF ]  <- SS  = opt. startup value for sv4
                                [   machine code  ]     EE  = opt. common entry sub-proc. idx
                                [   machine code  ]     nnn (0-4) = num. input args (3-bits)
                                [       ...       ]

      primitive syntax      <-  [ #x00000-1nnn-DF ]  <- nnn (0-4) = num. input args (3-bits)
                                [   machine code  ]
                                [   machine code  ]
                                [       ...       ]

      compound procedure    <-  (#x000040DF . proc-content)
      proc-content          <-  (env vars-list . body)
      vars-list             <-  (variable1 variable2 ...)
      body                  <-  (expr1 expr2 ...)

      continuation          <-  (#x000080DF . cont-content)
      cont-content          <-  (cnt-winders cnt env . dts)

      compiled procedure    <-  (#x0000C0DF . compiled-proc-content)
      compiled-proc-content <-  (env vars-list . body-address)
      vars-list             <-  (variable1 variable2 ...)
      body-address          <-  [   machine code  ]
                                [   machine code  ]
                                [       ...       ]
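
The bit fields of the procedure tag word shown in these structures can be decoded as in the sketch below. It is a model for a desktop R6RS Scheme; the accessor names are illustrative, and the example primitive word #x0F0902DF is a made-up value (sv4 startup #x0F, pre-entry index 9, two arguments), not one taken from the Armpit sources.

```scheme
(import (rnrs))

;; Extract a bit field: shift right, then mask.
(define (field w shift mask)
  (bitwise-and (bitwise-arithmetic-shift w (- shift)) mask))

;; Kind of procedure, from the 4 bits at bit index 12.
(define (proc-kind w)
  (case (field w 12 #xF)
    ((#x0) 'primitive)
    ((#x4) 'compound)
    ((#x8) 'continuation)
    ((#xC) 'compiled)
    (else  'unknown)))

;; Fields that are meaningful for primitives only.
(define (prim-nargs w)     (field w 8 7))         ; 0 to 4
(define (prim-syntax? w)   (= (field w 11 1) 1))  ; syntax-type primitive
(define (prim-pre-entry w) (field w 16 #xFF))     ; index into paptbl:
(define (prim-sv4-init w)  (field w 24 #xFF))     ; startup value for sv4

(display (proc-kind #x000040DF))    ; compound
(newline)
(display (prim-nargs #x0F0902DF))   ; 2
(newline)
```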

The number of input arguments specified in the tag for primitives is used by (apply ...) (label prmapp: in armpit_core.s) to unpack the input argument list into scheme value registers sv1 to sv5 prior to calling the primitive, which can then start manipulating its inputs directly from the contents of these registers. If the specified number of input arguments (n) is smaller than that in the input list provided to the function then the list of remaining arguments is stored in register sv_n+1. In particular, if n=0, the whole list of input arguments (possibly null) is found in sv1 on function entry.
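
The register assignment performed on entry to a primitive can be pictured with a list-based model. The function spread-args below is a hypothetical name for this sketch (desktop R6RS Scheme); it assumes the argument list holds at least n items and does not model what the real code does otherwise.

```scheme
(import (rnrs))

;; Return the list (sv1 ... sv_n rest): the first n arguments, followed
;; by the list of remaining arguments that would be placed in sv_n+1.
;; Assumes args holds at least n items.
(define (spread-args n args)
  (if (= n 0)
      (list args)
      (cons (car args)
            (spread-args (- n 1) (cdr args)))))

(display (spread-args 2 '(a b c d)))   ; (a b (c d))
(newline)
(display (spread-args 0 '(a b c d)))   ; ((a b c d))
(newline)
```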

For primitive procedures, an optional 8-bit startup value can be given to sv4. This has to be a fully tagged scheme object (fitting within 8 bits), for example null, #t, #f, the empty-character and small integers (0, 1, 2), or an 8-bit tag. This is used (among other uses) by type-checking functions, for example bytevector? in armpit_scheme_r6rs_library.s, where sv4 receives an index (scheme integer) into the type table (label typtbl: in armpit_core.s) on entry and the primitive jumps to a common type-checking function defined as its pre-entry procedure (otypchk).

An optional pre-entry sub-procedure can be specified for primitive procedures. The pre-entry function is identified by its positional index in the primitive pre-entry function table (label paptbl: in armpit_050.s). This is used to save code space when several functions have a similar entry pattern or when their entire code is similar. An example is type-checking functions (as discussed in the previous paragraph) where the common process at label typchk: in armpit_core.s is defined as the pre-entry sub-procedure. Other examples are the functions caar to cddddr in armpit_scheme_library.s whose common code is at label cxxxxr: in that file and linked through index 9 (ocxxxxr) of the pre-entry function table (paptbl: in armpit_050.s). The primitive pre-entry function table is made available, as a bytevector, at index 16 on the (_GLV) such that it may be extended (after being copied to above-heap RAM) with additional, user-defined, ARMSchembled, pre-entry functions, if desired.

User environments (in register env during evaluation and on the glv at index 7 for the top-level, (vector-ref (_GLV) 7) to view) and the user obarray (at index 8 on the glv, use: (vector-ref (_GLV) 8) to view) are dynamically extended by user interaction and represented as scheme lists. The user environment is stored on the heap and the user obarray may be stored on the heap or optionally in above-heap space. The dynamic obarray is a list of bindings between variables and their symbols where each binding is a cons cell with the variable as cdr and its symbol's address as car:

      obarray      <-  (obinding1 obinding2 ...)
      obinding     <-  (address-of-var's-symbol . variable)

It is extended each time the reader parses a new symbol that is not already in the obarray.
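
A list-based model of the dynamic obarray is sketched below for a desktop R6RS Scheme. Strings stand in for the symbol addresses held in the real bindings, integers stand in for variables, and the next-id argument is a hypothetical source of fresh variable IDs; how Armpit Scheme actually assigns IDs and where it links new bindings is not modelled here.

```scheme
(import (rnrs))

;; obarray model: list of (symbol-name . variable-id) bindings.
(define (obarray-find obarray name)
  (assoc name obarray))

;; Reverse lookup: name of the symbol bound to a variable ID, or #f.
(define (obarray-name obarray id)
  (let ((b (find (lambda (b) (= (cdr b) id)) obarray)))
    (and b (car b))))

;; Return (variable-id . obarray), extending the obarray only when
;; the name is not already present. next-id is a hypothetical fresh ID.
(define (intern obarray name next-id)
  (let ((b (obarray-find obarray name)))
    (if b
        (cons (cdr b) obarray)
        (cons next-id
              (cons (cons name next-id) obarray)))))

(define r1 (intern '() "x" 11))        ; new symbol: obarray grows
(define r2 (intern (cdr r1) "x" 12))   ; known symbol: ID 11 is reused
(display (car r2))                     ; 11
(newline)
```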

Dynamic environments are represented as lists of frames where each frame is a list of bindings between variables and their values. As with the obarray, the bindings are implemented as cons cells whose car is the variable and the cdr is its value (a scheme object). The bindings within a frame are sorted in ascending order of variable IDs to provide for faster search (see bndchk: and mkfrm: in file armpit_core.s).

     environment   <- (frame1 frame2 ...)
     frame         <- (ebinding1 ebinding2 ...)   <- sorted by variable ID
     ebinding      <- (variable . value)
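
The benefit of keeping frames sorted can be seen in a small lookup model. This sketch (desktop R6RS Scheme) uses plain integers as variable IDs and illustrative function names; it is not a transcription of bndchk: or mkfrm:.

```scheme
(import (rnrs))

;; Search one frame (bindings sorted by ascending variable ID).
;; The scan stops as soon as a larger ID is seen.
(define (frame-lookup frame var)
  (cond ((null? frame) #f)
        ((= (caar frame) var) (car frame))
        ((> (caar frame) var) #f)
        (else (frame-lookup (cdr frame) var))))

;; Search frames from first to last; return the binding or #f.
(define (env-lookup env var)
  (cond ((null? env) #f)
        ((frame-lookup (car env) var))
        (else (env-lookup (cdr env) var))))

;; Insert or replace a binding, keeping the frame sorted.
(define (frame-bind frame var val)
  (cond ((or (null? frame) (> (caar frame) var))
         (cons (cons var val) frame))
        ((= (caar frame) var)
         (cons (cons var val) (cdr frame)))
        (else
         (cons (car frame) (frame-bind (cdr frame) var val)))))

(define inner (frame-bind (frame-bind (frame-bind '() 7 'c) 3 'a) 5 'b))
(define env (list inner (list (cons 5 'outer) (cons 9 'z))))
(display inner)                 ; ((3 . a) (5 . b) (7 . c))
(newline)
(display (env-lookup env 5))    ; (5 . b)
(newline)
```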

In contrast to the above dynamic objects, the built-in implementation environment and obarray are rather static and defined internally by vectors in the code rather than lists. The built-in scheme environment (and associated obarray) is constructed at label scmenv: in the main configuration file armpit_050.s. It is made available from the glv at index 13 (eg. (vector-ref (_GLV) 13) to view). The built-in environment is a vector of sub-environment vectors and it is extended with user-library sub-environments each time a user-library is imported. The extended result is placed back on the glv, at index 13, such that objects exported by the imported user-library are now part of that environment and available to the ArmPit Scheme user.

The sub-environments included in the built-in environment vector are themselves vectors. They are defined at the top of ArmPit Scheme source code files (eg. armpit_core.s, armpit_ports.s, armpit_scheme_base.s). Even indices in these vectors contain symbols and the following odd indices contain the scheme object to which each symbol is bound. The position of symbols in the sub-environment, together with the position of the sub-environment in the built-in environment vector defines the variable ID (VRID in the discussion of variables, above) of the symbol and hence implicitly defines the built-in obarray (which essentially associates symbols with their VRID). Accordingly, the structure of the built-in environment is:

     (vector-ref (_GLV) 13) 
       ->
         #( #()                                   <- empty vector
            #(sym1 val1 sym2 val2 sym3 val3 ...)  <- sub-env-1 (core)  VRID #xNN02
            #(sym1 val1 sym2 val2 sym3 val3 ...)  <- sub-env-2 (base)  VRID #xNN03
            #(sym1 val1 sym2 val2 sym3 val3 ...)  <- sub-env-3 (ports) VRID #xNN04
            ... )

The empty vector at index zero is used for compatibility with user-libraries that store their private (non-exported) bindings at that index.
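The layout above can be modelled in a few lines of Python. This sketch is not ArmPit Scheme code: the sub-environment contents are hypothetical and the pair (sub-environment index, symbol index) merely stands in for the VRID, whose actual bit encoding is not reproduced here.

```python
# Model only: the built-in environment as a list of sub-environment
# lists, with symbols at even indices and bound objects at odd indices.
builtin_env = [
    [],                                           # index 0: empty vector
    ["car", "<car code>", "cdr", "<cdr code>"],   # hypothetical contents
    ["+", "<add code>"],                          # hypothetical contents
]

def find_builtin(env, symbol):
    """Return (sub_env_index, symbol_index, value) or None."""
    for sub_index, sub_env in enumerate(env):
        for i in range(0, len(sub_env), 2):       # symbols sit at even indices
            if sub_env[i] == symbol:
                return (sub_index, i, sub_env[i + 1])
    return None
```

The position returned for a symbol is fixed once the system is assembled, which is why no separate built-in obarray needs to be stored.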

User-libraries are built when a (library ...) form is read-in, parsed and then evaluated (labels parslb: in armpit_scheme_read_write.s and plibra: in armpit_core.s). During parsing a special mode flag is set in the glv such that library expressions are evaluated in a copy of the built-in environment, extended with a private library sub-environment at index 0 and a public (exported) library sub-environment at an index equal to the size of the built-in environment + the number of libraries previously built (installed) in the system. Additionally, if the library being parsed and evaluated imports other libraries, the public (exported) sub-environments of these libraries are included in the library's extended environment before evaluating library expressions. The special mode flag further directs the string->symbol function (that normally updates the user obarray when new symbols are encountered) to instead update the private and public sub-environments of the library with the newly encountered symbols (that become essentially library-specific built-in symbols, with no MCUID -- label strsy6: in armpit_scheme_base.s).

Once the evaluation process is complete, the extended library environment (which contains all bindings for evaluated library expressions) is linked to (essentially: consed to the front of) the existing list of libraries on the glv (index 12) and is packed and unpacked to the top of executable FLASH or into above-heap space (for MCUs without executable FLASH). The result is visible using (vector-ref (_GLV) 12) which lists the full environments of all installed libraries (a long list), or with the function (libs) that lists only installed library names.

Using library functions then follows the standard r6rs syntax: (import (library-name)), followed by library function use. The (import ...) form extends the environment in glv 13 (initially the built-in environment) with the public (exported) sub-environment of the specified library and sets glv 13 to that extended environment, which, in effect, becomes the new built-in environment (and contains bindings for exported library variables). Meanwhile, all closures defined in the library were closed over the extended environment used during their evaluation, which is a superset of the built-in environment that includes the library's private sub-environment as well as the exported sub-environments of libraries that the library itself imported.

The name of the library is stored at index 0 of its private sub-environment and the index at which to place its public sub-environment, within the extended built-in environment vector constructed when the library is imported, is stored at index 1 in its private sub-environment. The internal structure of a user library, as defined by its environment vector (extended from the built-in environment) is accordingly:

     library
      -> 
        #( #(lib-name lib-index priv-sym2 priv-val2 ...)  <- private sub-env of this lib
           #(sym1 val1 sym2 val2 sym3 val3 ...)        <- built-in sub-env-1 (core)  (VRID #xNN02)
           #(sym1 val1 sym2 val2 sym3 val3 ...)        <- built-in sub-env-2 (base)  (VRID #xNN03)
           #(sym1 val1 sym2 val2 sym3 val3 ...)        <- built-in sub-env-3 (ports) (VRID #xNN04)
                                  ...
           #()                                         <- possible empty vector(s)
                                  ...
           #(isym1 ival1 isym2 ival2 isym3 ival3 ...)  <- exported sub-env of a lib imported by this lib
                                  ...
           #()                                         <- possible empty vector(s)
                                  ...
           #(esym1 eval1 esym2 eval2 esym3 eval3 ...)) <- public sub-env of this lib, at lib-index
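The effect of (import ...) on the environment vector can be sketched in Python. This is a model, not the actual import code: vectors are Python lists, the library contents are hypothetical, and padding unused slots with empty vectors is inferred from the "possible empty vector(s)" entries in the diagram above.

```python
# Model only: import copies the library's public sub-environment into the
# slot recorded at index 1 of the library's private sub-environment.

def import_library(current_env, library_env):
    """Return current_env extended with the library's public sub-env."""
    private = library_env[0]
    lib_index = private[1]                 # where the public sub-env goes
    public = library_env[lib_index]
    extended = list(current_env)           # leave the original untouched
    while len(extended) <= lib_index:      # assumption: pad with empty vectors
        extended.append([])
    extended[lib_index] = public
    return extended

builtin = [[], ["car", "<car>"], ["+", "<add>"], ["read", "<read>"]]

# Hypothetical library built second in the system: its public sub-env
# sits at index 4 + 1 = 5, and index 4 belongs to an earlier library.
lib_b = ([["lib-b", 5, "helper", "<helper>"]]
         + builtin[1:]
         + [[]]
         + [["twice", "<twice>"]])

extended = import_library(builtin, lib_b)
```

Since every library owns a distinct index, importing one library never displaces the exported bindings of another.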

Input and Output Ports:

Armpit Scheme input and output ports consist of 2 parts aimed at maintaining the simplicity of use of prior versions while also enabling re-use of port code and user-space extensions towards new peripherals. The first part is (typically) the port argument with which a top-level port function is called (eg. port-base-address, register offset, file-id, ...) and the second part is a vector of port data and pseudo-primitives used to operate on the port (pseudo-primitives are machine code that is tagged as a primitive but is not branched to directly by the evaluator but rather by internal port-dispatch code). The two parts are joined in a cons cell to form a "full" input or output port. For the built-in ports (uart, usb, file, memory, i2c, SD-card file), these "full" ports are built by the top-level input/output functions (read, write ...) via calls to internal functions setipr and setopr (labels setipr: and setopr: in armpit_ports.s). For user-defined ports, the "full" ports should be built by the user and passed to the top-level I/O functions as input (or set as return values for the current-input/output-port functions). The top-level I/O functions call specific port-vector pseudo-primitives to accomplish their work and do so through prtfun and associated dispatchers (labels prtcli: to prtfun: in armpit_ports.s). The functions (current-input-port) and (current-output-port) commonly return full ports (eg. to use as illustration while reading this document).

Built-in input and output port vectors are made available at Top-Level as vectors of 3 elements (port models): 1) base address (if any, shifted right by 4 bits), 2) input port vector, 3) output port vector. The corresponding variables (availability depends on configuration options) are (cf. top of armpit_ports.s):

       Variable
       Name        Description
       ----------  ------------------------------------------------
       FILE        port model for on-/off-chip FLASH files
       MEM         port model for memory (eg. peripheral registers)
       UAR0        port model for uart0
       UAR1        port model for uart1
       USB         port model for USB
       SDFT        port model for SD-card, with FAT-16 format
       I2C0        port model for i2c0
       I2C1        port model for i2c1

An input port vector is a Scheme vector that consists of at least 5 elements (eg. label memipr: in armpit_ports.s):

        Position   Item                   Type
        --------   ---------------------  ---------------------------------
	index 0:   port-type	          scheme integer   (1 for input port)
	index 1:   close-input-port       pseudo-primitive (returns via cnt)
	index 2:   read-char / peek-char  pseudo-primitive (returns via cnt)
	index 3:   char-ready?            pseudo-primitive (returns via cnt)
	index 4:   read                   pseudo-primitive (returns via cnt)

The port-type is 1 for an input port (2 for output, 3 for input-output). The 4 pseudo-primitives are machine code functions that receive the full port as input in sv1, perform appropriate processing and return via either lnk or cnt as indicated above. These blocks of code are meant to be used only internally by the system. Additionally, they are called by top-level port functions (read-char, peek-char, ...) through a dispatcher rather than through apply and therefore their number of input arguments (1) does not need to be specified in the 32-bit primitive tag that precedes their code.
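The dispatch through a port vector can be illustrated with a Python model of a user-defined input port. This is a sketch only: the "string port" is hypothetical, Python None stands in for the end-of-file object, and the read entry returns a space-delimited token rather than a parsed datum.

```python
# Model only: indices follow the input port vector table above.
PORT_TYPE, CLOSE, READ_CHAR, CHAR_READY, READ = range(5)

def _close(full_port):
    return True                          # nothing to release for this port

def _read_char(full_port):
    state, _vector = full_port
    if state["pos"] >= len(state["text"]):
        return None                      # stand-in for the end-of-file object
    ch = state["text"][state["pos"]]
    state["pos"] += 1
    return ch

def _char_ready(full_port):
    state, _vector = full_port
    return state["pos"] < len(state["text"])

def _read(full_port):
    # Stand-in for datum parsing: return the next space-delimited token.
    token = ""
    while _char_ready(full_port):
        ch = _read_char(full_port)
        if ch == " ":
            if token:
                break
            continue
        token += ch
    return token or None

STRING_INPUT_VECTOR = [1, _close, _read_char, _char_ready, _read]

def make_full_port(text):
    """Join the port argument and the port vector into a 'full' port."""
    return ({"text": text, "pos": 0}, STRING_INPUT_VECTOR)

def dispatch(full_port, index):
    """Call the entry at the given index, passing it the full port."""
    _port_arg, vector = full_port
    return vector[index](full_port)

def read_char(full_port):
    return dispatch(full_port, READ_CHAR)

def char_ready(full_port):
    return dispatch(full_port, CHAR_READY)

def read(full_port):
    return dispatch(full_port, READ)
```

The top-level functions never name a particular port. They only index into whichever vector the full port carries, which is what lets a user-defined port reuse them.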

For most input ports, close-input-port does nothing but set the return value to #t or the non-printing-object, #npo, (in sv1) and return, while for file ports it removes the requested file descriptor from the open-file-list stored in glv (index 6, (vector-ref (_GLV) 6) to view) prior to returning #t (or 0 if the file descriptor was not there).

The read-char/peek-char function returns the next character available from the port, possibly using additional port-vector data and pseudo-primitives stored at indices above 4. Read-char is differentiated from peek-char internally by temporarily modifying the port-base-address from a scheme integer to a scheme float. For uarts and usb the character is obtained from the read buffer (address identified from the main system buffer) that is then updated or not (read vs peek); for files it is obtained from FLASH and the file descriptor is updated or not accordingly (read vs peek); for memory the character is obtained from the byte at the specified offset above the port-base-address. For character input ports (uart, usb, file) a helper datum at index 5 in the port vector is used to determine whether the absence of an available character should cause read/peek to hang or to return the end-of-file character.

The char-ready? pseudo-primitive returns a boolean indicating whether a character can be read or peeked from the port without hanging. The read pseudo-primitive at index 4 returns a parsed full datum (internal representation of the acquired datum) from a character input port or a 32-bit integer from a memory port (a bytevector if the offset is negative). For character input ports, the helper datum at index 5 determines whether the input datum is carriage-return (cr) terminated (uart, usb) or if it may also be terminated by a space (file). The additional port-vector items for a typical character input port are as follows:

        Position   Item                   Type
        --------   ---------------------  ---------------------------------
        index 5:   wait-for-cr?           scheme boolean
	index 6:   init                   pseudo-primitive (returns via lnk)
	index 7:   getc                   pseudo-primitive (returns via lnk)
	index 8:   finish-up              pseudo-primitive (returns via lnk)

The init function sets the offset from which to read the port's input buffer (READBUFFER or file descriptor), the getc function returns a raw character from the port's buffer and the finish-up function extracts a full datum from the port's buffer. Three more items are added to the input port vector for file ports: 1) a file-info pseudo-primitive that returns location information about the file, 2) a file-list pseudo-primitive that returns names of all files available through the hardware that the file port connects to, and 3) the size of the block-read-buffer (possibly 0) to use when acquiring blocks of characters from the port (eg. for an SD-card port, 512 characters are obtained per transaction). The additional file input port vector items are:

        Position   Item                   Type
        ---------  ---------------------  ---------------------------------
	index  9:  file-info              pseudo-primitive (returns via lnk)
	index 10:  file-list              pseudo-primitive (returns via cnt)
	index 11:  input file bufr. size  scheme integer

An output port vector is a Scheme vector that consists of at least 4 elements (eg. label memopr: in armpit_ports.s):

        Position   Item                    Type
        --------   ---------------------   ---------------------------------
	index 0:   port-type	           scheme integer   (2 for output port)
	index 1:   close-output-port       pseudo-primitive (returns via cnt)
	index 2:   write-char/write-string pseudo-primitive (returns via lnk)
	index 3:   write / display         pseudo-primitive (returns via cnt)

The port-type is 2 for an output port (1 for input, 3 for input-output). The 3 pseudo-primitives receive as input a scheme object (to output) in sv1 (or a mode with which to close an output port) and the full port in sv2. They then perform appropriate processing and return via either lnk or cnt as indicated above. They are used internally by the system and called by top-level port functions (write-char, write, ...) through a dispatcher, like the input port pseudo-primitives, and therefore their number of input arguments (2) does not need to be specified in the tag word (primitive) that precedes their code.

For most output ports, close-output-port does nothing but return. For file output ports, it checks the file closing mode in sv1 to identify whether file data should be written out to FLASH or not and, if so, writes the data out to FLASH. In either case it also removes the file descriptor from the open-file-list. The write-char/write-string function writes the single scheme character, or the set of characters from the scheme string, that it receives as input in sv1 to the output port that it receives in sv2. For memory ports, this writes a single byte of data at the specified base-address + offset (or a bytevector if the offset is negative). For character ports (uart, usb, file) write-char/string uses a helper function at index 4 to perform character-wise output either directly or via a buffer (port-specific). The write/display port pseudo-primitive writes a word to a memory location for a memory port or writes the external representation of the datum it receives in sv1 to the port it receives in sv2 using the helper function at index 4. The additional port-vector item (helper function) for a typical character output port is as follows:

        Position   Item                    Type
        --------   ---------------------   ---------------------------------
	index 4:   putc   	           pseudo-primitive (returns via lnk)

The putc function outputs the scheme character it receives as input in sv1 either directly to the port it receives as input in sv2 and sv4 (uart) or to the scheme write buffer (address identified from the main system buffer) for usb, or to the port's file descriptor for a file (the file descriptor includes a buffer with size described below). For file ports, 3 additional items are included in the vector: 1) a file-info pseudo-primitive (the same as for the input port vector), 2) a file-erase pseudo-primitive to invalidate (pseudo-erase) an existing file prior to writing the first data to the storage medium for a new file with the same name, and 3) the size of the block-write-buffer to use when writing blocks of characters (eg. pages) to the medium connected to the port. The additional file output port vector items are:

        Position   Item                   Type
        ---------  ---------------------  ---------------------------------
	index 5:   file-info              pseudo-primitive (returns via lnk)
	index 6:   file-erase             pseudo-primitive (returns via lnk)
	index 7:   output file buf. size  scheme integer
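The role of putc as the single character-level helper of a character output port can be sketched in Python. This is a model, not ArmPit Scheme code: the list-backed port is hypothetical and Python's str() stands in for the external representation produced by write/display.

```python
# Model only: indices follow the output port vector tables above.
PORT_TYPE, CLOSE, WRITE_CHAR, WRITE, PUTC = range(5)

def _close(mode, full_port):
    return True                          # nothing to flush for this port

def _putc(ch, full_port):
    sink, _vector = full_port
    sink.append(ch)                      # character-wise output

def _write_char(obj, full_port):
    # obj is a single character or a string; both go through putc.
    _sink, vector = full_port
    for ch in obj:
        vector[PUTC](ch, full_port)

def _write(obj, full_port):
    # str() stands in for the external representation of the datum.
    _write_char(str(obj), full_port)

LIST_OUTPUT_VECTOR = [2, _close, _write_char, _write, _putc]

def dispatch(obj, full_port, index):
    """Call the entry at index with the object first and the full port second."""
    _port_arg, vector = full_port
    return vector[index](obj, full_port)

def write_char(obj, full_port):
    return dispatch(obj, full_port, WRITE_CHAR)

def write(obj, full_port):
    return dispatch(obj, full_port, WRITE)

port = ([], LIST_OUTPUT_VECTOR)
write_char("a", port)
write_char("bc", port)
write(42, port)
```

Routing every character through the entry at index 4 means a new character device only needs its own putc, while write-char/write-string and write/display code can be shared.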

It is expected that this implementation of ports is sufficient to enable the user to define a keyboard input port and a display output port which could turn an Armpit Scheme system into a standalone (yet still rudimentary) Scheme Machine. A keyboard (raw matrix) input port might re-use most components of the uart input port-vector for example, along with an interrupt service routine (in machine code) that would decode the key input and then store it in the read buffer (eg. extended from puaisr: in armpit_ports.s). This implementation may or may not be sufficient to implement networking ports. Indubitably, the port model of version 00.0160 had to be extended to the current model, to support both FLASH files (on- or off-chip) and FAT-16 SD/MMC cards.

Exported Internal Objects:

Armpit Scheme supports installation of non-gceable objects and code above the heap of a running system. A few internal objects that are normally not easily accessible from the rep are exported (made available at Top-Level) to ease this process when code is assembled or compiled on the system itself (eg. via an ARMSchembler, compiler and/or linker). The exported objects (from armpit_core.s) are as follows:

  Top-Level
  Symbol     Object
  ---------  ------
  _mkc       make-compiled: glue env, vars, tag and mach. code into compiled proc
  _lkp       lookup:   find binding for a variable in env
  _apl       apply:    the internal apply function
  _dfv       define:   extend environment with empty binding for the supplied var.
  _alo       zmaloc:   memory allocation primitive
  _cns       cons:     cons-cell building primitive
  _sav       save:     primitive for saving object onto scheme stack (dts)
  _isx       ISR exit: common exit from ISR
  _ism       ISR mem:  process memory reservation during interrupt
  _gc        gc:       perform garbage collection
  _err       error:    error notification

Extending the System with New Functions:

Adding new functions to the Armpit Scheme built-ins consists of 3 or 5 steps: 1) add a primitive with assembly language code for the function; 2) add an external representation symbol for the function; 3) add the function's symbol and code address to an appropriate sub-environment; 4) if the function is in a new file (recommended), add the sub-environment defined in this new file to the main built-in environment at label scmenv: in armpit_050.s, and; 5) include the new file at the end of the Functions section of armpit_050.s. These steps are exemplified below for the addition of a function named "revenu" (a kind of "backwards", or reverse, enumeration used here for simplicity of code, and, despite the times, hopefully completely unrelated to the concept of revenue).

The target new function, revenu, takes either a positive integer count and an ending integer or a positive integer count, an integer step and an ending integer as input values, and returns a list of "count" integers that ends with the ending integer, and decrements by the step (or 1 as default). Once the function has been added to the code and the code has been reassembled and uploaded to the MCU, it will be possible to use this new function at top-level and perform operations such as:

ap> revenu
#proc

ap> (revenu 3 7)
(9 8 7)

ap> (revenu 5 3 2)
(14 11 8 5 2)

ap> (revenu 4 -10 100)
(70 80 90 100)
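For reference, the intended behavior of revenu can be stated as a short Python model. This is a specification sketch rather than the implementation: it builds the list back to front, as the description above suggests, and uses Python lists in place of scheme lists.

```python
def revenu(count, *args):
    """Model of (revenu count end) and (revenu count step end)."""
    if len(args) == 1:
        step, end = 1, args[0]           # default step is 1
    elif len(args) == 2:
        step, end = args
    else:
        raise TypeError("revenu takes 2 or 3 arguments")
    result = [end]                       # the list ends with the end value
    value = end
    for _ in range(count - 1):           # count - 1 further items
        value += step
        result.insert(0, value)          # stands in for cons at the front
    return result
```

Calling revenu(3, 7), revenu(5, 3, 2) and revenu(4, -10, 100) gives the same sequences as the three top-level examples above.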

Much like the revenu function, the implementation example is easiest to present "backwards", starting with including the new code file for the function in armpit_050.s (step 5). We'll assume that the code of the function is to be placed in a file in the common/ sub-directory and the file will be named mycode.s. Accordingly, in armpit_050.s we add 1 line in the Functions section, after the lines for including the _system_0.s file (the new line is the last one below):


    .ifdef	OMAP_35xx
      .include "mcu_specific/OMAP_35xx/OMAP_35xx_system_0.s"
    .endif
	
    .endif
	
    .include "common/mycode.s"          @      <--- include new code file

Next (step 4) we add the sub-environment in which revenu will be defined to the built-in environment. Scrolling up in armpit_050.s we find the scmenv: label where the built-in environment vector is specified. To the list of specified sub-environments we add mycode_env which is the assembler name we'll give to our new sub-environment. We place it before the end_of_scmenv: label as follows (the new line is the one before-last):


    sys0_env:	.word	s0_env		@	system 0

                .word   mycode_env      @       <--- new sub-environment

    end_of_scmenv:	@ end of scmenv

Now, we make a new (empty) file, named mycode.s, and place it in the common/ directory. At the top of that file we build the sub-environment vector that includes the binding for the revenu function. The vector is at label mycode_env: which is the name we used earlier for it in armpit_050.s. It starts with the full vector tag (type tag and number of items) and this is followed by the vector's content: two 32-bit words, the first is the address (label) for the external representation symbol for the function (we choose sreven) and the second is the address (label) for the function's code (we choose preven). At this point (step 3) the full content of mycode.s is as follows (5 lines, including one blank line):


.balign 4                               @ ensure that vector start is 32-bit aligned

mycode_env:	@	mycode's sub-environment
	.word	(2 << 8) | vector_tag   @ (2 << 8) means 2 items in the sub-env vector
	.word	sreven,	preven	        @ binding between symbol and function code

Next (step 2) we specify the external representation symbol for the revenu function. It is the symbol "revenu" and we place it at label sreven:, below the sub-environment vector. The contents of mycode.s are now (4 new lines, including a blank line):


.balign 4                               @ ensure that vector start is 32-bit aligned

mycode_env:	@	mycode's sub-environment
	.word	(2 << 8) | vector_tag   @ (2 << 8) means 2 items in the sub-env vector
	.word	sreven,	preven	        @ binding between symbol and function code

sreven: .word	(6 << 8) | symbol_tag	@ (6 << 8) means 6 characters in the symbol
	.ascii	"revenu"                @ these are the actual characters in the symbol
	.balign 4                       @ align the rest of the file on 32-bit boundary
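The size-plus-tag packing used by the .word directives above, and the effect of .balign 4 after the .ascii data, can be checked with a small Python sketch. The real values of vector_tag and symbol_tag are not given in this section, so the tag constant below is a placeholder. Little-endian byte order and zero fill for the alignment padding are also assumptions.

```python
# Model only: HYPOTHETICAL_SYMBOL_TAG is a placeholder, not the real
# symbol_tag value used by ArmPit Scheme.
HYPOTHETICAL_SYMBOL_TAG = 0x7F

def tag_word(count, type_tag):
    """Pack an item count above an 8-bit type tag, as in (count << 8) | tag."""
    return (count << 8) | type_tag

def symbol_bytes(name, symbol_tag):
    """Tag word, then the characters, then padding to a 4-byte boundary."""
    data = tag_word(len(name), symbol_tag).to_bytes(4, "little")
    data += name.encode("ascii")
    padding = (-len(data)) % 4           # what .balign 4 would insert
    return data + bytes(padding)

revenu_symbol = symbol_bytes("revenu", HYPOTHETICAL_SYMBOL_TAG)
```

For the 6-character symbol "revenu" the tag word and characters occupy 10 bytes, so 2 bytes of padding are needed before the next 32-bit aligned item.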

Finally, (step 1), we add the code of the revenu function, at label preven: below the symbol. The resulting content of the complete (ready for assembly of ArmPit Scheme) mycode.s is then:


.balign 4                               @ ensure that vector start is 32-bit aligned

mycode_env:	@	mycode's sub-environment
	.word	(2 << 8) | vector_tag   @ (2 << 8) means 2 items in the sub-env vector
	.word	sreven,	preven	        @ binding between symbol and function code

sreven: .word	(6 << 8) | symbol_tag	@ (6 << 8) means 6 characters in the symbol
	.ascii	"revenu"                @ these are the actual characters in the symbol
	.balign 4                       @ align the rest of the file on 32-bit boundary
	
preven:	@ (revenu count end) or (revenu count step end)
	@ on entry:	sv1 <- count
	@ on entry:	sv2 <- end  or step
	@ on entry:	sv3 <- null or (end)
	@ on exit:	sv1 <- result (list of numbers)
	.word	(2 << 8) | proc		@ revenu is a primitive with 2 input args (3rd optional)
        @ start of code                                                                     Line:
	sub	sv5, sv1, #4		@ sv5 <- count - 1	                 (scheme int)   1
	nullp   sv3                     @ was step not specified (i.e. {end} = '()) ?           2
        itEE    eq                      @ If-Then instruction (for Cortex)                      3
        seteq   sv4, #5                 @	if so,  sv4  <- 1 = default step (scheme int)   4
	setne	sv4, sv2		@	if not, sv4  <- step             (scheme int)   5
	carne	sv2, sv3	        @	if not, sv2  <- end              (scheme int)   6
	set     sv1, sv2                @ sv1 <- end, 1st value to cons to result(scheme list)  7
	list    sv2, sv1                @ sv2 <- (end) = initial result          (scheme list)  8
rvnulp:	@ loop over values to cons
        eq	sv5, #i0		@ is count = 0 (done) ?                  (#i0 is 0 int) 9
        beq     rvnuxt		        @       if so,  jump to exit                           10
	int2raw	rva, sv1		@ rva <- latest value consed to result   (raw int)     11
	int2raw	rvb, sv4		@ rvb <- step                            (raw int)     12
	add	rva, rva, rvb	        @ rva <- next value to cons to result    (raw int)     13
	raw2int	sv1, rva        	@ sv1 <- next value to cons to result    (scheme int)  14
	cons    sv2, sv1, sv2           @ sv2 <- (... end) == updated result     (scheme list) 15
        sub	sv5, sv5, #4		@ sv5 <- updated count                   (scheme int)  16
	b       rvnulp                  @ jump to add next item                                17
rvnuxt: @ exit
        set     sv1, sv2                @ sv1 <- result                          (scheme list) 18
        set     pc,  cnt		@ return with result in sv1                            19

The function receives its input arguments (scheme objects) in scheme value registers sv1 to sv3, its environment is in the register env (used, for example, if the function calls eval or bndchk) and its continuation (return address) is in the register cnt. The function code can use registers sv1 to sv5 (gc-ed) to manipulate scheme values and rva to rvc (not gc-ed) to manipulate raw values. Scheme values can be temporarily saved on the scheme stack, dts (gc-ed), if needed, and, if so, that stack needs to be popped back to its entry state prior to returning from the function. When ready to return, the function will need to store its return value in sv1 and will then set the program counter (pc) to its continuation (cnt). If the function needs to call another scheme function to perform its work (eg. eval or apply) it can save its return address (cnt) on the dts prior to that call, then use the macro call (armpit_as_macros.s) to call that other scheme function (the macro sets cnt for the appropriate return) and then restore its own continuation from the dts upon return from the call. The revenu function is a simple code example that does not use the env and dts registers and does not call other scheme functions.

In Line 1 of the revenu code, the number of items to cons onto the result list, in addition to the end value, is computed from count (scheme integer in sv1) and stored in sv5 for later use (scheme ints are shifted left by 2 bits relative to raw ints and therefore adding 4 to them is equivalent to adding 1 to a raw int. The same holds for subtraction if numbers are and remain positive or 0). Line 2 tests to see if 2 or 3 input arguments were provided to the function using the nullp macro that checks if the content of sv3 is null. The next 4 lines store the end value (scheme integer) in sv2 (for later use) and store the step in sv4 (scheme integer) based on whether 2 or 3 input arguments were provided. The IF-THEN instruction on line 3 is included for compatibility with Cortex-M3 cores (unified syntax). In this code, set (seteq, ...) is an alias (macro) for ARM's mov instruction and car (carne ...) is a macro for: ldr reg1, [reg2], which obtains the car of the scheme list in reg2 and stores it in reg1. Line 7 copies the end value to sv1 and Line 8 builds the initial result list, using the list assembler macro, and stores it in sv2. The statement: list sv2, sv1, builds a cons between the contents of sv1 and null, and stores the result in sv2 (Note: the list macro side-effects raw value registers rva to rvc but does not modify sv1-sv5 except the destination register for the list).

The main code loop starts on Line 9 by using eq (alias to ARM's teq) to test whether more integers should be consed to the front of the result list (i.e. if count, stored in sv5 as a scheme integer, is scheme zero = #0x01 = #i0). If no more numbers need to be consed the code jumps to rvnuxt: for function exit. If more numbers are to be consed, the last number added (in sv1) is converted to a raw integer by the macro int2raw and stored in raw value register a (rva) on Line 11. Similarly, the step (in sv4) is converted to a raw integer and stored in raw value register b (rvb) on Line 12. The sum of the last number (rva) and step (rvb) is then stored in rva (raw integer) and then converted back to a scheme integer with the raw2int macro and stored in scheme value register 1 (sv1) on Lines 13 and 14. The sum (in sv1) is consed to the front of the result list (in sv2) using the cons macro and the resulting list is stored back in sv2 on Line 15 (Note: the cons macro side-effects rva-rvc but preserves sv1-sv5 except the destination register for the cons which is updated. This is why raw values in rva and rvb need to be re-computed from sv1 and sv4 at each pass through the loop. Also, one could potentially replace Lines 11-14 with just 2 lines: (1) bic rva, sv4, #int_tag (2) add sv1, sv1, rva). The count of numbers remaining to be consed (in sv5) is decreased by 1 (as scheme int) on Line 16 and the code jumps back to repeat the loop on Line 17.
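The integer tagging that Lines 1, 4, 9 and 11-14 rely on can be checked with a Python model of 32-bit registers. This is a sketch under stated assumptions: the low two tag bits are taken to be 01 (consistent with scheme 1 being 5 and scheme 0 being 1), and int2raw is modelled as an arithmetic shift right by 2 bits, which is an assumption about the macro.

```python
MASK = 0xFFFFFFFF                        # model 32-bit registers
INT_TAG = 0b01                           # low two bits of a scheme integer

def raw2int(raw):
    """Raw integer -> tagged scheme integer (32-bit word)."""
    return ((raw << 2) | INT_TAG) & MASK

def int2raw(word):
    """Tagged scheme integer -> raw integer (assumed arithmetic shift)."""
    if word & 0x80000000:                # reinterpret the word as signed
        word -= 1 << 32
    return word >> 2

def step_long(sv1, sv4):
    """Lines 11-14: untag both values, add, retag."""
    return raw2int(int2raw(sv1) + int2raw(sv4))

def step_short(sv1, sv4):
    """The 2-line alternative: bic rva, sv4, #int_tag ; add sv1, sv1, rva."""
    rva = sv4 & ~INT_TAG & MASK
    return (sv1 + rva) & MASK
```

Both versions give the same tagged result, including for a negative step such as -10, because clearing the tag of one operand leaves exactly one tag in the sum.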

When the cons-loop is complete the function's result list is in sv2 and the code jumps to rvnuxt:. There, the result is moved to sv1 which is where it needs to be for proper return (Line 18). The function then returns by setting the program counter (pc) to the function's continuation (cnt).

At this stage, adding additional functions and variables to the system can be done by extending mycode.s only (i.e. steps 1 to 3 above -- armpit_050.s needs no further change). The main items to pay attention to are: 1) be sure to update the number of items in the sub-environment vector in accordance with its contents, and 2) be sure to align the start of each data and code item on a 32-bit boundary where needed (eg. after .ascii directives and after Thumb2/Cortex-M3 code). An example extension of mycode.s, with 2 new symbols, the function zig that returns its input argument and the variable zag that is bound to #t, is given below. The extension uses the macros VECSIZE, SYMSIZE and PFUNC, defined in armpit_as_macros.s, to replace some of the .word directives and enhance code readability.


.balign 4                               @ ensure that vector start is 32-bit aligned

mycode_env:	@	mycode's sub-environment
	VECSIZE	6			@ 6 items (3 symbol-value pairs) in the sub-env vector
	.word	sreven,	preven	        @ binding between symbol and function code
        .word   szig,   pzig
        .word   szag,   scheme_true

sreven: SYMSIZE	6			@ 6 characters in the symbol
	.ascii	"revenu"                @ these are the actual characters in the symbol
	.balign 4                       @ align the rest of the file on 32-bit boundary
	
preven:	@ (revenu count end) or (revenu count step end)
	@ on entry:	sv1 <- count
	@ on entry:	sv2 <- end  or step
	@ on entry:	sv3 <- null or (end)
	@ on exit:	sv1 <- result (list of numbers)
	PFUNC	2			@ revenu is primitive with 2 input args (3rd optional)
        @ start of code
	sub	sv5, sv1, #4		@ sv5 <- count - 1	                 (scheme int)
	nullp   sv3                     @ was step not specified (i.e. {end} = '()) ?        
        itEE    eq                      @ If-Then instruction (for Cortex)                   
        seteq   sv4, #5                 @	if so,  sv4  <- 1 = default step (scheme int)
	setne	sv4, sv2		@	if not, sv4  <- step             (scheme int)
	carne	sv2, sv3	        @	if not, sv2  <- end              (scheme int)
	set     sv1, sv2                @ sv1 <- end, 1st value to cons to result(scheme list)
	list    sv2, sv1                @ sv2 <- (end) = initial result          (scheme list)
rvnulp:	@ loop over values to cons
        eq	sv5, #i0		@ is count = 0 (done) ?                  (#i0 is 0 int)
        beq     rvnuxt		        @       if so,  jump to exit                      
	int2raw	rva, sv1		@ rva <- latest value consed to result   (raw int)
	int2raw	rvb, sv4		@ rvb <- step                            (raw int)
	add	rva, rva, rvb	        @ rva <- next value to cons to result    (raw int)
	raw2int	sv1, rva        	@ sv1 <- next value to cons to result    (scheme int)
	cons    sv2, sv1, sv2           @ sv2 <- (... end) == updated result     (scheme list)
        sub	sv5, sv5, #4		@ sv5 <- updated count                   (scheme int)
	b       rvnulp                  @ jump to add next item                              
rvnuxt: @ exit
        set     sv1, sv2                @ sv1 <- result                          (scheme list)
        set     pc,  cnt		@ return with result in sv1                           

.balign 4

szig:   SYMSIZE	3
	.ascii	"zig"
	.balign 4
	
pzig:	@ (zig object)
	@ on entry:	sv1 <- object
	@ on exit:	sv1 <- object
	PFUNC	1			@ 1 input arg
	set	pc,  cnt		@ return (do nothing, just return input arg)

.balign 4

szag:   SYMSIZE	3
	.ascii	"zag"
	.balign 4

Last updated January 7, 2012

bioe-hubert-at-sourceforge.net