A Scheme Interpreter for ARM Microcontrollers:
ChangeLog for Version 060
Changes from release 050:
-
This release supports 5 new Cortex-M4 boards, 4 of which have a Floating Point Unit
(FPU), which is enabled:
the SAM4S-eXplained (no FPU) from Embest and Atmel,
the EK-LM4F120 Launchpad and EK-LM4F232 from Texas Instruments,
the LPC4330-Xplorer from NGX, and
the STM32F4-Discovery from ST.
Support for the STM32F4-Discovery is a contribution from Petr Cermak.
Two additional boards (or MCUs) are supported, albeit preliminarily:
the Fractal MCU32-1.12, which is a contribution from Tzirechnoy (from the forums),
and the AT91-SAM3U4C MCU on the SAM4S-eXplained (the board has 2 MCUs on it).
For the AT91-SAM3U4C, however, only UART communication is implemented, which
makes it of little practical use at this time.
-
The source code of this release has been adjusted to assemble with GNU binutils
version 2.22.
-
The mcu_specific directory in the source code tree has a new structure designed
to reduce the use of .if statements in the armpit_060.s file.
Generic names (e.g. init_io_code.s) are used within family-specific directories.
Some device-specific macros are now included in the device.h files.
Each board has a separate sub-sub-directory containing a file (board.h)
where board configuration parameters and options are specified.
The buildarmpit script file (top directory) provides examples of how to
assemble Armpit Scheme with this new structure.
-
The source code has been separated into pairs of ..._code.s and ..._data.s files
in an attempt to take advantage of the Harvard architecture of Cortex-M3 and M4
MCUs.
The ..._code.s files contain machine instructions and the ..._data.s files contain
the corresponding data structures (e.g. sub-environments, function name strings,
flash sector tables, function jump tables).
This splitting is used in the STM32F4-Discovery board, where code is stored
in flash and data is copied to core-coupled memory (CCM) at startup, while
the heap is stored in the main RAM of the chip.
It is not clear, however, that this improves performance, which may be bound
by heap-access times through the System bus (addresses at or above 0x20000000)
rather than the ICode and DCode buses (addresses below 0x20000000).
For example, the LPC4330-Xplorer, where heap RAM is below 0x20000000,
has better performance (but note that code runs from RAM on this chip).
The separation of instructions and data may still be beneficial where
separate instruction and data caches are present, as on Cortex-A8 chips,
where it reduces overlap and synchronization issues between caches.
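The address split mentioned above can be sketched in C as follows. This is
only an illustration, not code from the Armpit Scheme source: the names
bus_kind, bus_for_address and check_bus_examples are hypothetical, the
Private Peripheral Bus region is ignored, and the example addresses are
assumptions chosen to fall on either side of the 0x20000000 boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a Cortex-M3/M4 address by the bus
   that serves it, following the 0x20000000 boundary described above.
   The Private Peripheral Bus region is deliberately ignored here. */
typedef enum { BUS_CODE, BUS_SYSTEM } bus_kind;

static bus_kind bus_for_address(uint32_t addr)
{
    /* Below 0x20000000: Code region (ICode/DCode buses).
       At or above 0x20000000: System bus. */
    return (addr < 0x20000000u) ? BUS_CODE : BUS_SYSTEM;
}

/* Small self-check with assumed example addresses. */
static void check_bus_examples(void)
{
    assert(bus_for_address(0x10000000u) == BUS_CODE);   /* RAM mapped in the Code region */
    assert(bus_for_address(0x1FFFFFFFu) == BUS_CODE);   /* last Code-region address */
    assert(bus_for_address(0x20000000u) == BUS_SYSTEM); /* start of the SRAM region */
}
```

A heap placed at an address for which bus_for_address returns BUS_SYSTEM
is reached over the System bus, which is the case suspected above of
limiting performance.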
-
The USB subsystem has been reworked to make better use of the write buffer and
eliminate jumbled echo issues observed on some MCUs.
USB (device) support has also been added to the STM32-H107 and TI-EvalBot boards.
-
The Memory Protection Unit (MPU) of Cortex-M3 and M4 MCUs (where available)
is now enabled and used to trigger garbage collection on those chips.
MPU use (where applicable) can be disabled by commenting out the enable_MPU
option in a board's board.h file (sub-sub directory of the mcu_specific directory).
-
The internal representation of pairs, sized objects, procedures and rational/complex numbers
has been modified to simplify and speed up the system.
Pairs and lists are now referenced by an 8-byte aligned pointer directed to their car.
Sized objects (strings, vectors and bytevectors) are also 8-byte aligned but are
now referenced by a pointer directed 4 bytes into the object (which is where object
data starts, right after the 4-byte tag).
User procedures and continuations are now represented as implicitly sized (16 bytes),
gc-able objects, referenced by a pointer directed 4 bytes into the object.
Built-in procedures are represented either as a 32-bit word containing a tag and the
address of the function's code (16-bit) or using a pointer directed at the start of
the function's code, which is preceded by an 8-byte aligned tag.
Rational and complex numbers also remain 8-byte aligned but with a pointer directed 4 bytes
into them.
With this representation, differentiating between a pair (8-byte aligned pointer)
and a sized object, procedure or rational/complex number (4-byte aligned with bit 2 set)
is more efficient than in prior versions, which simplifies the evaluator (e.g. evalsv1 macro
in file armpit_as_macros.s).
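The pair versus non-pair test described above comes down to inspecting the
low bits of a reference. Below is a minimal sketch in C, assuming references
are held as 32-bit words. The names is_pair_ref, is_offset_ref, tag_address
and check_ref_examples are hypothetical and do not come from the Armpit
Scheme source (the actual test is done in assembly), and immediate values
are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: classify a 32-bit heap reference by its low
   address bits, following the layout described in the text above. */

/* A pair is referenced by an 8-byte aligned pointer to its car,
   so the three low bits of the reference are zero. */
static int is_pair_ref(uint32_t ref)
{
    return (ref & 0x7u) == 0u;
}

/* Sized objects, user procedures and rational/complex numbers start on
   an 8-byte boundary but are referenced 4 bytes in, so the reference is
   4-byte aligned with bit 2 set (low bits 100 in binary). */
static int is_offset_ref(uint32_t ref)
{
    return (ref & 0x7u) == 0x4u;
}

/* For an offset reference, the 4-byte tag sits just before the data. */
static uint32_t tag_address(uint32_t ref)
{
    return ref - 4u;
}

/* Small self-check using made-up heap addresses. */
static void check_ref_examples(void)
{
    uint32_t pair_ref   = 0x20001000u;      /* 8-byte aligned cell */
    uint32_t string_ref = 0x20001008u + 4u; /* object at ...08, data at ...0C */

    assert(is_pair_ref(pair_ref) && !is_offset_ref(pair_ref));
    assert(is_offset_ref(string_ref) && !is_pair_ref(string_ref));
    assert(tag_address(string_ref) == 0x20001008u);
}
```

In this sketch a single mask and compare separates the two families of
objects, which is the kind of saving the new representation aims for.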
-
Two new performance-oriented assembly options have been introduced: fast_eval_apply
and fast_lambda_lkp.
The first option speeds up the evaluation of functions with 1 to 3 input variables
that are called with an equal number of arguments by reducing the amount of temporary
data stored on the heap.
The second option speeds up variable lookup performed inside closures by storing an index
for the expected binding-location of variables (within the closure's environment) in the
closure's code, where applicable.
The index is computed when the closure is first used.
For example, (define (g x) (+ x x)) is initially stored as:
  (((#> . +) (#> . x) (#> . x)))
which can be seen, in this version's new internal representation, with (vector-ref g 1).
After g is applied to some argument, its code is modified to:
  (((#> + . #) (#> 0 . x) (#> 0 . x)))
which speeds up subsequent lookups of the variables + and x.
The use of the fast_lambda_lkp option increases the memory footprint of closures,
and, while it speeds up small programs, it may also slow down larger programs running
in limited RAM (see the performance web page for performance results).
Both the fast_eval_apply and fast_lambda_lkp options are enabled by default on MCUs
that are not classified as small_memory (8 KB RAM) and can be disabled by commenting out
the corresponding statements in file armpit_060.s.
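The idea behind fast_lambda_lkp, caching the expected position of a binding
inside the code that refers to it, can be illustrated with a small C model.
This is a sketch under stated assumptions, not the interpreter's actual
mechanism: the types binding and var_ref, the functions lookup and
check_lookup_example, the flat array environment and the scans counter are
all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NO_INDEX (-1)

/* Hypothetical model: an environment is a flat array of bindings, and
   each variable reference in a closure's code carries a cached index. */
typedef struct { const char *name; int value; } binding;
typedef struct { const char *name; int index; } var_ref;

/* Look up ref in env (n bindings). *scans counts the bindings examined
   on the slow path, so the effect of the cache can be observed. */
static binding *lookup(var_ref *ref, binding *env, int n, int *scans)
{
    /* Fast path: the cached index is in range and still names the
       expected variable, so no search is needed. */
    if (ref->index >= 0 && ref->index < n &&
        strcmp(env[ref->index].name, ref->name) == 0)
        return &env[ref->index];

    /* Slow path: linear search, then record where the binding was
       found so that later lookups can take the fast path. */
    for (int i = 0; i < n; i++) {
        (*scans)++;
        if (strcmp(env[i].name, ref->name) == 0) {
            ref->index = i;
            return &env[i];
        }
    }
    return NULL; /* unbound variable */
}

/* Small self-check: the first lookup searches, the second does not. */
static void check_lookup_example(void)
{
    binding env[] = { { "y", 1 }, { "x", 2 } };
    var_ref x = { "x", NO_INDEX };
    int scans = 0;

    assert(lookup(&x, env, 2, &scans)->value == 2);
    assert(scans == 2 && x.index == 1);
    assert(lookup(&x, env, 2, &scans)->value == 2);
    assert(scans == 2); /* cache hit: no further bindings examined */
}
```

The extra index stored with every variable reference is also what makes
closures larger, which matches the memory trade-off noted above.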
-
The ability of the evaluator to expand macros (cf. evlmac: in armpit_core_code.s)
has been removed in this version.
Macros are now automatically expanded in the reader only, but can also still be
manually expanded using the expand syntax-procedure (cf. updated Expert System example).
-
The number used in Armpit Scheme to refer to the Cortex-M3 and M4 Systick Timer interrupt
(cf. tick_hndlr: in armpit_reset_CM3.s) has been changed to 255 (from 64 in 050).
Last updated February 02, 2013
bioe-hubert-at-sourceforge.net