Category Archives: stm32

Updating an NXP MCU-Link to work with OpenOCD

Not just OpenOCD, but just.. get it updated.

MCU-Link is a USB-HS “CMSIS-DAP” compatible debug interface, cheaply available. It has a 2x5x1.27mm cortex debug connector, a bonus usb2uart TTL on 2.54mm pins, and even has the 1.27mm 10pin cable in the little case. Supports SWO trace at ~9M they say. Excellent!

As shipped it was running a very old firmware, older than they even have in their release notes. MCU-LINK r0FF CMSIS-DAP V0.078 OpenOCD didn’t even recognise this.

NXP Offers a “MCU-Link installer for Linux” which at the time of writing was v3.128. which certainly sounds more up to date. I downloaded that, found a shell script that wanted to be run as root…. So, with less commentary, here’s how I jumped all the hoops to upgrade the MCU-Link to recent, and get it working properly…

$ sh MCU-LINK_installer_3.128.x86_64.deb.bin --noexec --target .
$ ar x MCU-LINK_installer_3.128.x86_64.deb
$ tar -xf data.tar.gz
$ cd usr/local/MCU-LINK_installer_3.128/
# Now, plug in the "firmware update" jumper and replug the dongle
# This flashes the firmware itself...
$ bin/blhost --usb 0x1fc9,0x0021 flash-image probe_firmware/MCU-LINK_CMSIS-DAP_V3_128.s19 erase 0

# and here we "configure" it, leaving it as unlocked as we can...
$ bin/blhost --usb 0x1fc9,0x0021 read-memory 0x9dc00  512 myoutdump.bin 0
$ bin/mculink_cfg myoutdump.bin --clear --no-device_vendor --no-device_name --no-board_vendor --no-board_name
$ bin/blhost --usb 0x1fc9,0x0021 flash-erase-region 0x9dc00  512
$ bin/blhost --usb 0x1fc9,0x0021 write-memory 0x9dc00  myoutdump.bin  0

Now, unplug, take off the jumper, and replug. You now have a much more standard looking USB descriptors, and running Product: MCU-LINK (r0FF) CMSIS-DAP V3.128

And… now it happily supports non-NXP parts too.

$ openocd -f interface/cmsis-dap.cfg  -f target/stm32g4x.cfg 
Open On-Chip Debugger 0.12.0+dev-00469-g1b4afd13f (2024-01-08-22:49)
Licensed under GNU GPL v2
For bug reports, read
	http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "swd". To override use 'transport select <transport>'.
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : Using CMSIS-DAPv2 interface with VID:PID=0x1fc9:0x0143, serial=X1JALQM40K3MS
Info : CMSIS-DAP: SWD supported
Info : CMSIS-DAP: JTAG supported
Info : CMSIS-DAP: SWO-UART supported
Info : CMSIS-DAP: Atomic commands supported
Info : CMSIS-DAP: Test domain timer supported
Info : CMSIS-DAP: UART via USB COM port supported
Info : CMSIS-DAP: FW Version = 2.1.1
Info : CMSIS-DAP: Serial# = X1JALQM40K3MS
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : SWCLK/TCK = 0 SWDIO/TMS = 1 TDI = 0 TDO = 0 nTRST = 0 nRESET = 1
Info : CMSIS-DAP: Interface ready
Info : clock speed 2000 kHz
Info : SWD DPIDR 0x2ba01477
Info : [stm32g4x.cpu] Cortex-M4 r0p1 processor detected
Info : [stm32g4x.cpu] target has 6 breakpoints, 4 watchpoints
Info : [stm32g4x.cpu] Examination succeed
Info : starting gdb server for stm32g4x.cpu on 3333
Info : Listening on port 3333 for gdb connections

Later, I’ll test the actual trace, just wanted to write these down while I still had it all.

FreeRTOS with STM32 Cube

Cube Sucks. But, if you’ve gotta use it, at least it has demos and things right? And you can use FreeRTOS, that’s integrated, right?

Not really. At the time of writing, working with STM32WB, the out of the box FreeRTOS demos will have problems with printf. Or, well, anything that uses a raw malloc() call.

Out of the box, the cube demo code uses FreeRTOS’s heap_4.c for memory management. This is fine. It means that you select a heap size in your FreeRTOSConfig.h file, and the port provides pvMalloc and pvFree that do the right thing. But, if any code, (anywhere!) uses malloc() itself, directly, then…. where does that come from? Well, it ends up in _sbrk() and the out of box implementation in the cube demos is completely busted. They provide you with the same implementation as for the non FreeRTOS demo.

So, looking at a “classic” memory arrangement, with heap starting at the end of BSS/DATA, and stack growing downwards, you have something like this. (_estack and _end are in the default out of box linker scripts from ST) FreeRTOS’s heap_4.c declares “ucHEAP” (you can statically provide it to, if you like) based on your user provided configTOTAL_HEAP_SIZE which then becomes another (somewhat large) chunk of “DATA” space.

The _sbrk() implementation basically checks that malloc() is asking for space that’s above _end and below the current stack pointer. This is fine when you’re not using an RTOS, but! The RTOS tasks have their own stacks, allocated from the RTOS heap! This is good and right, but it means that the out of box sbrk implementation is now absolutely garbage. It will never succeed, as every single tasks’ stack pointer will be below _end.

You have options here. https://nadler.com/embedded/newlibAndFreeRTOS.html has written extensively, though I think they make it all sound really complicated.

Option 1 – Just don’t call malloc (or… call it early…)

This is a garbage option. You either need to make sure you never call malloc, or make sure that you only ever call malloc before you start the FreeRTOS task scheduler. This is safe, as the memory you allocate here will be in safely in the gap above DATA on the diagram, but you must make all malloc() calls before starting the task scheduler. This requires no changes to the _sbrk() system call implementation

Option 2 – Use newlib completely

This is Nadler’s approach. Switch heap_4 out for heap_3, rewrite your system calls (including _sbrk() to make all of FreeRTOS use standard newlib malloc/free natively. This isn’t a bad option, but it’s fairly messy, isn’t really something the FreeRTOS people like much, and puts a lot of reliance on how well newlib behaves. The upside here is that you only have one memory management mechanism for the entire system here. (I failed to get this working at the time of writing)

Option 3 – Fix _sbrk() only

You can also just fix _sbrk. The upside here is the change is trivial. The downside is that you now have two heaps. One that you statically give to FreeRTOS via the config options for heap_4.c (ucHEAP in the DATA section), and a new one, growing upwards on demand, classic style, above DATA, show in red on this diagram. Remember, that in the FreeRTOS context, all of the space marked in purple and red is untouched, or “wasted”. You want to size your ucHEAP that you leave “enough” for untracked raw malloc users, but still big enough for your polite, well behaved FreeRTOS consumers.

The change itself is straightforward, just stop looking at the stack pointer, and look at the linker script provided pointer to the end of ram.

--- /out-of-box/application/sysmem.c	2021-03-26 10:02:54.045633733 +0000
+++ /fixed/application/sysmem.c	2021-04-07 13:58:27.610330156 +0000
@@ -37,18 +37,19 @@
 **/
 caddr_t _sbrk(int incr)
 {
 	extern char end asm("end");
 	static char *heap_end;
 	char *prev_heap_end;
+	extern char _estack;
 
 	if (heap_end == 0)
 		heap_end = &end;
 
 	prev_heap_end = heap_end;
-	if (heap_end + incr > stack_ptr)
+	if (heap_end + incr > &_estack)
 	{
 		errno = ENOMEM;
 		return (caddr_t) -1;
 	}
 
 	heap_end += incr;

And that’s it. The _estack symbol is already provided in the out of box linker scripts. Risks here are that you are only checking against end of ram. Conceivably, if you ran lots of deeply nested code with lots of malloc() calls before you start the scheduler, you might end up allocating heap space up over your running stack. You’d have to be very tight on memory usage for that to happen though, and if that’s the case, you probably want to be re-evaluating other choices.

Option 4 – Make newlib use FreeRTOS memory

There is another option, which is to make newlib internally use FreeRTOS pvMalloc/pvFree() instead of their own malloc/free. This is the logical inverse of Option 2, and has the same sorts of tradeoffs. It’s probably complicated to get right, but you end up with only one memory management system and one pool.

Using NetBeans for STM32 development with OpenOCD

This post supersedes http://false.ekta.is/2012/05/using-netbeans-for-stm32-development-with-stlink-texane/ it has been updated for 2016 and current best software tools.  Again, this is only focussed on a linux desktop environment.

Required pieces haven’t changed much, you still need:

  1. A tool chain.
  2. GDB Server middleware (OR just a tool flashing if you want to live in the dark ages)
  3. A “sexy” IDE (If you disagree on wanting an IDE, you’re reading the wrong post)

Getting a toolchain

New good advice

Advice here hasn’t changed.  The best toolchain is still gcc-arm-embedded  They steadily roll out updates, it’s the blessed upstream of all ARM Cortex GCC work, and it has proper functional multilib support and proper functional documentation and bug reporting.  It even has proper multi platform binaries for windows and macs.  Some distros are packaging “arm-none-eabi-XXXXX” packages, but they’re often old, repackages of old, poorly packaged or otherwise broken.  As of November 2015 for instance, ArchLinux was packaging a gcc 5.2 binary explicitly for arm-cortex, that did not support the -mmcu=cortex-m7 option added in gcc 5.x series.  Just say no.

I like untarring the binaries to ~/tools and then symlinking to ~/.local/bin, it avoids having to relogin or start new terminals like editing .bashrc and .profile does.
~/.local/bin$ ln -s ~/tools/gcc-arm-none-eabi-5_2-2015q4/bin/arm-none-eabi-* .

Old bad advice

The internet is (now) full of old articles recommending things like “summon-arm-toolchain” (Deprecated by the author even) “code sourcery (lite)” (CodeSourcery was bought by Mentor, and this has been slowly killed off.  Years ago, this was a good choice, but all the work they did has long since been usptreamed)  You can even find advice saying you need to compile your own.  Pay no attention to any of this.  It’s well out of date.

GDB Server middleware

New good advice

Get OpenOCD. Make sure it’s version 0.9 or better. 0.8 and 0.7 will work, but 0.9 is a _good_ release for Cortex-M and STM32 parts. If your distro provides this packaged, just use it. Fedora 22 has OpenOCD 0.8, Fedora 23 has OpenOCD 0.9. Otherwise, build it from source

Old bad advice

Don’t use texane/stlink.  Just don’t.  It’s poorly maintained, regularly breaks things when new targets are introduced and not nearly as flexible as OpenOCD.  It did move a lot faster than OpenOCD in the early days, and if you want a simpler code base to go and hack to pieces for this, go knock yourself out, but don’t ask for help when it breaks.

Netbeans

No major changes here, just some updates and dropping out old warnings.  You should still setup your toolchain in netbeans first, it makes the autodetection for code completion much more reliable.  I’ve updated and created new screenshots for Netbeans 8.1 the latest current release.

First, go to Tools->Options->C/C++->Build Tools and Add a new Toolchain…

Adding a new toolchain to netbeans

Adding a new toolchain to netbeans

Put in the “Base directory” of where you extracted the toolchain.  In theory netbeans uses the base directory and the “family” to autodetect here, but it doesn’t seem to understand cross tools very well.

Base path for new toolchain and name

Base path for new toolchain and name

Which means you’ll have to fill in the names of the tools yourself, as shown below.  Click on “Versions” afterwards to make sure it all works.

Adding explicit paths to netbeans toolchains

Adding explicit paths to netbeans toolchains

“Versions” should show you the right thing already, as shown

Versions from our new toolchain (and an error from gdb we must fix)

Versions from our new toolchain (and an error from gdb we must fix)

If you’re getting the error about ncurses from gdb, this because newer gdb builds include the curses “tui” interface to gdb in the standard build.  (Yay! this is a good thing!)  However, as the g-a-e toolchains are all provided as 32bit, you may be missing the 32bit ncurses lib on your system.  On Fedora, this is provided in the ncurses-libs.i686 package.

Ok, now time to build something.  This is your problem, I’m going to use one of the libopencm3 test programs right now, specifically, https://github.com/libopencm3/libopencm3/tree/master/tests/gadget-zero (the stm32l1 version)

Programming your device

Netbeans doesn’t really have any great way of doing this that I know of.  You can use build configurations to have “run” run something else, which works, but it’s a little fiddly.  I should spend more time on that though.  (See later for a way of doing it iteratively via the debugger console in netbeans)

In the meantime, if you just want to straight out program your device:

$ openocd -f board/stm32ldiscovery.cfg -c "program usb-gadget0-stm32l1-generic.elf verify reset exit"

No need for bin files or anything, there’s a time and a place for those, and if you don’t know and can explain why you need bin files, then elf files are just better in every way.

Debugging your part

GDB is always going to be a big part of this, but, assuming you’ve got it flashed, either by programming as above, then you can debug in netbeans directly.  First, make sure OpenOCD is running again, and just leave it running.


$ openocd -f board/stm32ldiscovery.cfg

First, install the gdbserver plugin, then choose Debug->Attach Debugger from the menu.

Attach to a gdbserver

Attach to a gdbserver

  • Make sure that you have “gdbserver” as the debugger type.  (Plugin must be installed)
  • Make sure that you have “ext :3333” for the target.  By default it will show “remote host:port” but we want (need) to use “extended-remote” to be able to restart the process. (See the GDB manual for more details)
  • Make sure that you have the right project selected.  There’s a bug in the gdbserver plugin that always forgets this.

At this point, “nothing” will happen.  If you look at the console where OpenOCD is running, you’ll see that a connection was received, but that’s it.
Press the “Pause” button, and you will stop execution wherever the device happened to be, and netbeans will jump to the line of code. In this example, it’s blocked trying to turn on HSE, as there’s no HSE fitted on this board:

Source debugging in netbeans via OpenOCD

Source debugging in netbeans via OpenOCD

If you now set a breakpoint on “main” or anywhere early and press the “Restart” icon in the top, OpenOCD will restart your process from the top and stop at the first breakpoint. Yay!  If restart fails, make sure you used “extended-remote” for the target!

Bonus

If you click the “Debugger console” window, you can actually flash your code here too.  Leave the debugger running!  (No need to stop the debugger to rebuild) Make a change to your code, rebuild it, and then, on the “Debugger console” just enter “load” and press enter.  You’ll see the OpenOCD output as it reflashes your device.  As soon as that’s done, just hit the “Restart” icon in netbeans and start debugging your new code!

Using SWO/SWV streaming data with STLink under linux – Part 2

In Part 1, we set up some (very) basic code that writes out data via Stimulus Channel 0 of the ITM to be streamed otu over SWO/SWV, but we used the existing ST provided windows tool, “STLink” to be view the stream. Now let’s do it in linux.

OpenOCD has some very draft support for collecting this data, but it’s very rough around the edges. [1]

I wrote a tool based on my own decoding of USB traffic to be a little more flexible. You connect to the STLink hardware, and can start/stop logging, change trace files, and change which stimulus ports are enabled. It is quite rough, but functional. It should not be underestimated how important being able to start/stop tracing is. In the ARM debug docs, turning on or reconfiguring trace is undefined as far as having the output bitstream be properly synced. (Section D4.4 of “Flush of trace data at the end of operation” in the Coresight Architecture spec, and most importantly, “C1.10.4 Asynchronous Clock Prescaler Register, TPIU_ACPR” in the ARMv7M architecture reference manual)

Don’t get me wrong, although my tool works substantially better than OpenOCD does, it’s still very rough around the edges. Just for starters, you don’t have debug or flash at the same time! Having it integrated well into OpenOCD (or pyOCD?) is definitely the desired end goal here.

Oh yeah, and if your cpu clock isn’t 24MHz, like the example code from Part 1, then you must edit DEFAULT_CPU_HZ in the top of hack.py!

So, how do you use it?

First, get the source from github: https://github.com/karlp/swopy. You need pyusb 1.x, then run it, and type connect

karlp@tera:~/src/swopy (master)$ python hack.py 
:( lame py required :(
(Cmd) connect
STLINK v2 JTAG v14 API v2 SWIM v0, VID 0x483 PID 0x3748
DEBUG:root:Get mode returned: 1
DEBUG:root:CUrrent saved mode is 1
DEBUG:root:Ignoring mode we don't know how to leave/or need to leave
(1682, 2053)
('Voltage: ', 2.9293697978596906)
DEBUG:root:enter debug state returned: array('B', [128, 0])
('status returned', array('B', [128, 0]))
('status is: ', 'RUNNING')
(Cmd) 

Yes, there’s lots of debug. This is not for small children. You have been warned, but there is some help!

(Cmd) help

Documented commands (type help ):
========================================
connect     raw_read_mem32   run       swo_read_raw
magic_sync  raw_write_mem32  swo_file  swo_start   

Undocumented commands:
======================
EOF          exit  leave_state  raw_read_debug_reg   swo_stop
enter_debug  help  mode         raw_write_debug_reg  version 

(Cmd) 

The commands of interest are swo_file, swo_start and swo_stop. So, enter a file name, and start it up…

(Cmd) swo_file blog.bin
(Cmd) swo_start 0xff
INFO:root:Enabling trace for stimbits 0xff (0b11111111)
DEBUG:root:READ DEBUG: 0xe000edf0 ==> 16842752 (0x1010000) status=0x80, unknown=0x0
DEBUG:root:WRITE DEBUG 0xe000edfc ==> 16777216 (0x1000000) (res=array('B', [128, 0]))
DEBUG:root:READMEM32 0xe0042004/4 returned: ['0x0']
DEBUG:root:WRITEMEM32 0xe0042004/4 ==> ['0x27']
DEBUG:root:WRITEMEM32 0xe0040004/4 ==> ['0x1']
DEBUG:root:WRITEMEM32 0xe0040010/4 ==> ['0xb']
DEBUG:root:STOP TRACE
DEBUG:root:START TRACE (buffer= 4096, hz= 2000000)
DEBUG:root:WRITEMEM32 0xe00400f0/4 ==> ['0x2']
DEBUG:root:WRITEMEM32 0xe0040304/4 ==> ['0x0']
DEBUG:root:WRITEMEM32 0xe0000fb0/4 ==> ['0xc5acce55']
DEBUG:root:WRITEMEM32 0xe0000e80/4 ==> ['0x10005']
DEBUG:root:WRITEMEM32 0xe0000e00/4 ==> ['0xff']
DEBUG:root:WRITEMEM32 0xe0000e40/4 ==> ['0xff']
DEBUG:root:READMEM32 0xe0001000/4 returned: ['0x40000000']
DEBUG:root:WRITEMEM32 0xe0001000/4 ==> ['0x40000400']
DEBUG:root:READMEM32 0xe000edf0/4 returned: ['0x1010000']
DCB_DHCSR == 0x1010000
(Cmd) rDEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
unDEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 16 bytes of trace buffer
DEBUG:root:Wrote 16 trace bytes to file: blog.bin
DEBUG:root:reading 18 bytes of trace buffer
DEBUG:root:Wrote 18 trace bytes to file: blog.bin

Ok, great. but… where’d it go? Well. It’s in the native binary ARM CoreSight trace format, like so…

karlp@tera:~/src/swopy (master *)$ tail -f blog.bin | hexdump -C 
00000000  01 54 01 49 01 43 01 4b  01 20 01 37 01 31 01 38  |.T.I.C.K. .7.1.8|
00000010  01 0d 01 0a 01 54 01 49  01 43 01 4b 01 20 01 37  |.....T.I.C.K. .7|
00000020  01 31 01 39 01 0d 01 0a  01 54 01 49 01 43 01 4b  |.1.9.....T.I.C.K|
00000030  01 20 01 37 01 32 01 30  01 0d 01 0a 01 54 01 49  |. .7.2.0.....T.I|
00000040  01 43 01 4b 01 20 01 37  01 32 01 31 01 0d 01 0a  |.C.K. .7.2.1....|
00000050  01 54 01 49 01 43 01 4b  01 20 01 37 01 32 01 32  |.T.I.C.K. .7.2.2|
00000060  01 0d 01 0a 01 54 01 49  01 43 01 4b 01 20 01 37  |.....T.I.C.K. .7|
00000070  01 32 01 33 01 0d 01 0a  01 54 01 49 01 43 01 4b  |.2.3.....T.I.C.K|
00000080  01 20 01 37 01 32 01 34  01 0d 01 0a 01 54 01 49  |. .7.2.4.....T.I|
00000090  01 43 01 4b 01 20 01 37  01 32 01 35 01 0d 01 0a  |.C.K. .7.2.5....|
000000a0  01 54 01 49 01 43 01 4b  01 20 01 37 01 32 01 36  |.T.I.C.K. .7.2.6|
000000b0  01 0d 01 0a 01 54 01 49  01 43 01 4b 01 20 01 37  |.....T.I.C.K. .7|
000000c0  01 32 01 37 01 0d 01 0a  01 54 01 49 01 43 01 4b  |.2.7.....T.I.C.K|
000000d0  01 20 01 37 01 32 01 38  01 0d 01 0a 01 50 01 75  |. .7.2.8.....P.u|
000000e0  01 73 01 68 01 65 01 64  01 20 01 64 01 6f 01 77  |.s.h.e.d. .d.o.w|
000000f0  01 6e 01 21 01 0d 01 0a  01 54 01 49 01 43 01 4b  |.n.!.....T.I.C.K|
00000100  01 20 01 37 01 32 01 39  01 0d 01 0a 01 54 01 49  |. .7.2.9.....T.I|
00000110  01 43 01 4b 01 20 01 37  01 33 01 30 01 0d 01 0a  |.C.K. .7.3.0....|
00000120  01 68 01 65 01 6c 01 64  01 3a 01 20 01 32 01 34  |.h.e.l.d.:. .2.4|
00000130  01 35 01 32 01 20 01 6d  01 73 01 0d 01 0a 01 54  |.5.2. .m.s.....T|
00000140  01 49 01 43 01 4b 01 20  01 37 01 33 01 31 01 0d  |.I.C.K. .7.3.1..|
00000150  01 0a 01 54 01 49 01 43  01 4b 01 20 01 37 01 33  |...T.I.C.K. .7.3|

Which is ugly, but you get the idea.

This is where swodecoder.py comes in. The author of the original SWO support in OpenOCD has some code to do this too, it’s more forgiving of decoding, but more likely to make mistakes. Mine is somewhat strict on the decoding, but probably still has some bugs.

Usage is pretty simple

$ python swodecoder.py blog.bin -f
Jumping to the near the end
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 73
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 67
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 75
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 32
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 50
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 49
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 53
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 51
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 13
Not in sync: invalid byte for sync frame: 1
Not in sync: invalid byte for sync frame: 10
TICK 2154
TICK 2155
TICK 2156
TICK 2157
---snip---
TICK 2194
TICK 2195
TICK 2196
Pushed down!
held: 301 ms
Pushed down!
TICK 2197
held: 340 ms
TICK 2198
TICK 2199
Pushed down!
TICK 2200
TICK 2201
---etc---

You (hopefully) get the idea. When the writes to the stimulus ports are 8bit, swodecoder.py simply prints it to the screen. So here we have a linux implementation of the SWV viewer from the windows STLink tool. It’s got a lot of debug, and a few steps, but the pieces are all here for you to go further.

In part three, we’ll go a bit further with this, and demonstrate how SWO lets you interleave multiple streams of data, and demux it on the host side. That’s where it starts getting fun. (Hint, look at the other arguments of swodecoder.py and make 16/32bit writes to the stimulus registers)

To stop SWO capture, type “swo_stop” and press enter, or just ctrl-d, to stop trace and exit the tool.

[1] Most importantly, you can not stop/start the collection, you can only set a single file at config time, which isn’t very helpful for running a long demon. Perhaps even worse, OpenOCD is hardcoded to only enable stimulus port 0, which is a bit restrictive when you can have 32 of them, and being able to turn them on and off is one of the nice things.

Using SWO/SWV streaming data with STLink under linux – Part 1

This is part 1 in a short series about using the SWO/SWV features of ARM Cortex-M3 devices [1]
This post will not explain what SWO/SWV is, (but trust me, it’s cool, and you might work it out by the end of this post anyway) but will focus on how to use it.

First, so you have a little idea of where we’re going, let’s start at the end…

enum { STIMULUS_PRINTF }; // We'll have more one day
 
static void trace_send_blocking8(int stimulus_port, char c) {
        while (!(ITM_STIM8(stimulus_port) & ITM_STIM_FIFOREADY))
                ;
        ITM_STIM8(stimulus_port) = c;
}
 
int _write(int file, char *ptr, int len)
{
        int i;
        if (file == STDOUT_FILENO || file == STDERR_FILENO) {
                for (i = 0; i < len; i++) {
                        if (ptr[i] == '\n') {
                                trace_send_blocking8(STIMULUS_PRINTF, '\r');
                        }
                        trace_send_blocking8(STIMULUS_PRINTF, ptr[i]);
                }
                return i;
        }
        errno = EIO;
        return -1;
}

You can get this code from either:

  • My github repository
  • The swo-1-printf directory in swo-stlink-linux-1

    That’s all[2] you need to have printf redirected to an ITM stimulus port. It’s virtually free, doing nothing if you don’t have debugger connected. [3]

    Groovy. If you have the Windows STLink Utility, you can use this right now. Enter the correct clock speed of your main app, and choose stimulus 0 (or all) and watch your lovely console output.

    stlink-windows-swv

    Ok, that’s cool, but weren’t we going to do this in linux? We were, and we will, but let’s stop here with a good working base, so we can focus on just the extra stuff later.

    [1] Cortex M4 too, but not M0, that’s another day altogether. Specifically, STM32L1 parts, but the concepts and code are the same
    [2] Expects you have your general makefiles all set up to do “the right thing” for newlib stubs and so on.
    [3] Except for generating the formatted string of course, that’s not free. And it does take a _little_ bit of time to write the characters out without overflowing, but that’s a story for another day.

STM32 Unique ID Register on L1 parts

Last week I mentioned I was seeing duplicate “Unique” ID registers on some STM32L151 parts [1], and included some memory dumps of the three unique ID registers as proof.

However, I had foolishly assumed that on the L1 the Unique ID register was three contiguous 32 bit registers, as it is on the F0, F1, F2, F3 and F4. (The base address changes, but that’s normal and expected)

On the L1, the registers are actually at offset 0, offset 4, and offset 0x14. Thanks for nothing ST! :(
(Oh, and L1 Medium+ and High Density devices use a different base address too, good job)

Here’s some more complete memory dumps for the same three parts I was looking at last week.

UID31:0 UID63:32 UID95:64
Part A 0x0e473233 0x30343433 0x00290031
Part B 0x0e473233 0x30343433 0x00320032
Part C 0x0e473233 0x30343433 0x00380030

Reading other reference manuals, and seeing that the UID registers often have 8 bits of unsigned “Wafer number”, 7 bytes of ASCII Lot number, and 4 bytes of X/Y wafer coordinates in BCD, I would interpret my part “A” above as

Wafer Number Lot Number X/Y coords
Hex 0x0e 0x47323330343433 0x00290031
Natural 0x0e G230443 X=0029, Y=0031

For reference, here are some full dumps of that section of memory. “0x7b747800” is what I had been looking at as UID bits 95:64, note that there are other bits in this section with fixed values, no idea what they mean :)

Part A

(gdb) x /20x 0x1FF80050
0x1ff80050: 0x0e473233  0x30343433  0x7b747800  0x50505091
0x1ff80060: 0x00000011  0x00290031  0x11000000  0x11000011
0x1ff80070: 0x00000000  0x00000000  0x029f067e  0x035a0000
0x1ff80080: 0x035a0000  0x035a0000  0x035a0000  0x035a0000
0x1ff80090: 0x035a0000  0x035a0000  0x035a0000  0x035a0000
(gdb) 

Part B

(gdb) x /20x 0x1FF80050
0x1ff80050: 0x0e473233  0x30343433  0x7b747800  0x50505091
0x1ff80060: 0x00000011  0x00320032  0x11000000  0x11000011
0x1ff80070: 0x00000000  0x00000000  0x02a50685  0x035e0000
0x1ff80080: 0x035e0000  0x035e0000  0x035e0000  0x035e0000
0x1ff80090: 0x035e0000  0x035e0000  0x035e0000  0x035e0000
(gdb)

Part C

(gdb) x /20x 0x1FF80050
0x1ff80050: 0x0e473233  0x30343433  0x7b747800  0x50505091
0x1ff80060: 0x00000011  0x00380030  0x11000000  0x11000011
0x1ff80070: 0x00000000  0x00000000  0x02a50689  0x035e0000
0x1ff80080: 0x035e0000  0x035e0000  0x035e0000  0x035e0000
0x1ff80090: 0x035e0000  0x035e0000  0x035e0000  0x035e0000
(gdb)

[1] Again, these are STM32L151C6T6 parts, revision V, package markings “GH254 VG” and “CHN309”

STM32 Unique ID register not so unique (Or, how to read docs properly)

UPDATE: This post is WRONG! See updated information here

The findings below were based on expecting the UID register to be contiguous as it is on all other STM32 parts. This is not true on the L1 family, and I hadn’t taken enough care with reading the reference manual.

Original post below


Following up from when I wrote about it earlier, it turns out that the “unique” id isn’t as unique as it is meant to be.

On my desk I have three different STM32L151C6T6 revision “V” parts, with exactly the same 96bit unique id. The parts all have package labels “GH254 VG” and CHN309

UID[32:0] (0x1FF80050) UID[63:32] (0x1FF80054) UID[96:64] (0x1FF80058)
Hex 0x0e473233 0x30343433 0x7b747800
Decimal 239546931 808727603 2071230464

According to reports on the irc channel ##stm32, this has also been seen (at least) on stm32f407vet6 parts.

Not fun :(

Sniffing 802.15.4 with wireshark and MRF24J40 modules

Wireshark screenshot showing my test data

This is what I did the last two evenings or so. The data being sent is from my example app for my MRF24J40 module library, simrf.

The streaming into wireshark is all thanks to one of the Contiki devs, George Oikonomou’s sensniff project. The peripheral code is just another app using my library.

Neat! This should make some things a bit easier to understand as I move into getting the full networking stack working in Contiki. (You can ignore the warnings from wireshark about the packets being invalid 6LoWPAN frames, all the frames are just raw 802.15.4, with “abcd” being sent as the frame payload.

Porting Contiki on STM32 (libopencm3) continued – timers and clocks and uarts

When I fixed my stupid error earlier, I got stdout/printf working. I then poked a lot of simple examples and started to understand how to write simple contiki apps using protothreads. It’s really pretty simple, as long as you remember that it’s all done with switch/case magic, so you can’t use stack variables the way you might expect. This is ok. This makes you think more about where you data lives and what you’re passing around. With stdout working though, I started to go through what else was needed.

Contiki offers ~4 different apis for doing things at some point in time. ctimers, etimers, stimers and rtimers. (And also with the clock_* api, which are for when you reallllllly want to do some busy waiting. Don’t do that!) This wiki page was very helpful in understanding the differences, and even having some “porting guide” information. This is an area I feel contiki is very weak in, the almost complete lack (to my eyes) of a porting guide. There’s quite a few different ports in the tree, but they’re often implemented in quite radically different ways, and some of them appear to be unmaintained.

But, the good thing about all these timers, is that’s all core. Your port only needs to implement clock.c, and it’s basically all taken care of! Neat! Except rtimers. rtimers are the realtime timers, used for turning radios on and off at the right times to synchronize. When I get to the radio (soon) I’ll be looking at other implementations and seeing what’s best.

As it stands though, ctimers, etimers, timers and the clock routines are all working in my port, and I’d played with some simple play applications to test out how they worked. I then tried to get the shell working. So, this is an app. So you need to put APPS+=serial-shell into your makefile. (Remember, this is the makefile for your final application) Then, because the AUTOSTART_PROCESSES() macro can only be used once in an application, the apps you can include can’t use that macro. You need to start them yourself. Here’s what my “demo” playground main application process looked like:

PROCESS_THREAD(foo_process, ev, data)
{
  PROCESS_BEGIN();
 
  printf("Hello foo world\n");
  leds_blink();  // oh yeah, the leds api works for the stm32l discovery board too ;)
  serial_shell_init();  // starts the serial shell process
 
  // these are all commands the shell can run.  
  shell_ps_init();
  shell_blink_init();
  shell_powertrace_init();
 
  // these are experiments with my own applications
  static struct blipper_info bl1 = { CLOCK_SECOND * 2, 'a' };
  static struct blipper_info bl2 = { CLOCK_SECOND * 5, 'b' };
  process_start(&blipper_process, (void*)&bl1);
  process_start(&blipper2_process, (void*)&bl2);
 
  PROCESS_END();
}

Note that serial-shell is actually just a wrapper around the shell app.
Note that you need to call a shell_XXXXX routine for each command you need to add. Note well that that is not documented anywhere. You actually have to look in [contiki-root]/apps/shell for all the shell-xxx.c files that call shell_register_command() Or, have a look at [contiki-root]/examples/example-shell/example-shell.c

Ok, so you got a shell application built and flashed. But… it doesn’t work! I ran into two problems here. One of them was very well documented, I just didn’t read it. I was used to having to implement newlib syscalls like _read() and _write(), but you actually need to not implement _read(), and make sure you just follow the directions!.

Unfortunately, in my case, this wasn’t enough. I could run my application with the native target, but not on my stm32. Again, reading the documentation, contiki processes serial input line by line, looking for a line feed (hex 0xa) to mark the end of line, and completely ignoring carriage return characters (hex 0xd) With gdb I could see that on the native platform, entering a command and pressing “enter” sent only a LF character, and it worked. With picocom and a USB-serial adapter to my board, I was seeing only a CR character, which was ignored. Wikipedia has a lot to say about this, and some of the helpful people on ##stm32 pointed out that this was because my serial device was in “cooked” mode, and I could put it back to raw mode to send the “right” characters. Turns out lots of terminal software has to deal with this, and miniterm.py (Something I just happened to have installed anyway) by default does the “right thing” (or the other thing) There were some suggestions that contiki should probably be more flexible in it’s input, and deal with both either, or, and both, but not require a specific one. That’s a future debate, but not one I’m battling now.

With a different serial terminal program, all of a sudden I had a console, and a shell. Here’s a full log of my app running, and me typing, “ps” and “help”

--- Miniterm on /dev/ttyUSB2: 115200,8,N,1 ---
--- Quit: Ctrl+]  |  Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
Platform init complete
Hello foo world
Hello blipper id: a!
Started timer
Hello blipper2 id: b!
Started timer
Should get printed after any event, even if it wasn't ours....
callback about to be set...
Should get printed after any event, even if it wasn't ours....
0.0: Contiki> 
Should get printed after any event, even if it wasn't ours....
Processes:
ps
periodic blipper2 process
periodic blipper process
Shell server
Shell
Contiki serial shell
Ctimer process
Event timer
Serial driver
Should get printed after any event, even if it wasn't ours....
0.0: Contiki> 
a hit timer expiry tick: 0 at clock time: 2
Should get printed after any event, even if it wasn't ours....
In the callback!
Should get printed after any event, even if it wasn't ours....
Available commands:
?: shows this help
blink [num]: blink LEDs ([num] times)
exit: exit shell
help: shows this help
kill : stop a specific command
killall: stop all running commands
null: discard input
powertrace [interval]: turn powertracing on or off, with reporting interval 
ps: list all running processes
quit: exit shell
Should get printed after any event, even if it wasn't ours....
0.0: Contiki> 
a hit timer expiry tick: 1 at clock time: 4
Should get printed after any event, even if it wasn't ours....
b hit timer expiry tick: 0 at clock time: 5
b Should only print after our event....

--- exit ---

Note that there is no local echo of commands! (Again, this can be turned on by the terminal program, but often it’s handled by the far side’s shell implementation)

But, all’s well that ends well. At this point I have what’s a pretty complete port of the core of contiki to the STM32L discovery board, and all the code that is common for any stm32 (and mostly, any chip using libopencm3) is in the cpu section of contiki, rather than the platform code.

Now, it’s just time to start the radio drivers, and work out the best way of implementing the rtimers!

My Platform port: https://github.com/karlp/contiki-outoftree This is all you need to target the STM32L discovery, it includes my changes to Contiki, as well as libopencm3 as git submodules.
Contiki port: https://github.com/karlp/contiki-locm3/tree/locm3 You’ll need this if you want to make another platform port based on libopencm3

libopencm3 with Contiki on STM32L part 2 – it’s alive, and a stupid error

In a previous post I got to a compiling and linking build for the STM32L Discovery board, but it didn’t actually print anything. Turns out I’d made one of the classically common mistakes with STM32 development, and one of the weakpoints in libopencm3’s api for the RCC module.

rcc_peripheral_enable_clock(&RCC_APB1ENR, RCC_APB2ENR_USART1EN);

The API unfortunately needs both the register, and the bit, and if you can’t see the problem, don’t worry, you’re not alone. The problem is that I was turning on a feature in APB1 (ONE) with a bit definition designed for APB2 (TWO)

When I actually turned on the USART peripheral, everything started working as I expected. I can’t believe how often I’ve done something like this. Or how often I’ve simply not turned on what I needed. Later, I added code to support USART2 and USART3, but…. didn’t turn on those peripherals. Silly me.

But, that does mean that my Contiki port to the STM32L Discovery board is alive: https://github.com/karlp/contiki-outoftree

Much more to come! Radio drivers and more examples and onwards!