Tag Archives: gcc

version control of tools

Ahh, it’s always the right thing to do, but normally it’s so heavy to check in things like GCC/Binutils versions into every project. It’s the sort of thing you do if you’re a big serious company, working on big serious things, with big serious support. Besides, I’m not relying on anything weird or esoteric right?

Wrong.


../../common/pjrc/usb_debug_only.c:96:45: error: variable 'device_descriptor' must be const in order to be put into read-only section by means of '__attribute__((progmem))'
static uint8_t PROGMEM device_descriptor[] = {

Thanks for nothing avr-gcc updates. I understand what you’re getting at, but still, this used to compile, and put things in PROGMEM. Now it fails the compile. GCC bug in question is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44643, or at least, that’s where it came from. A nice case of, “everybody’s using this code, but it was never actually meant to work, so let’s make it fail for them all, like it was meant to anyway.”

Compilers are awesome.

Unaligned memory access fault on Cortex-M3

AKA A surprising thing that happened to me while porting Contiki to the STM32F1.
AKA Some steps to take when diagnosing an unexpected hard fault on ARM Cortex M3

I already have a STM32L1 port working (for the basic uses of Contiki) and the major difference with this port is that it should support pretty much any target that libopencm3 supports. So I made a new platform and tweaked the GPIO settings for the STM32F1, and flashed it to my STM32VL Discovery board, and…. it started, but then it crashed.

Program received signal SIGINT, Interrupt.
blocking_handler () at ../../cm3/vector.c:86
86	{
(gdb) bt
#0  blocking_handler () at ../../cm3/vector.c:86
#1  
#2  update_time () at contiki/core/sys/etimer.c:72
#3  

Now, I don’t see unhandled exceptions much these days. I consulted the Configurable Fault Status Register (CFSR) at 0xE000ED28 and compared that to the definitions in ARM’s “Cortex M3 Devices Generic User Guide” (link will google search to the current location of that doc)

(gdb) x /wx 0xE000ED28
0xe000ed28:	0x01000000
(gdb) 

Ok, some bit in the top 16bits. That’s the Usage Fault Status Register(UFSR). Let’s look at it a little closer because I can’t count hex digits in my head as well as some people.

(gdb) x /hx 0xE000ED2a
0xe000ed2a:	0x0100
(gdb)

Ok. That bit means, Unaligned access UsageFault. Awesome. One of the big selling points of ARM Cortex-M is that it doesn’t care about alignment. It all “just works”. Well, except for this footnote: "Unaligned LDM, STM, LDRD, and STRD instructions always fault irrespective of the setting of UNALIGN_TRP" Ok, so let’s see what caused that. GDB “up” two times to get to the stack frame before the signal handler. x /i $pc is some magic to decode the memory at the address pointed to by $pc.

(gdb) up
#1  
(gdb) up
#2  update_time () at contiki/core/sys/etimer.c:72
72	      if(t->timer.start + t->timer.interval - now < tdist) {
(gdb) x /i $pc
=> 0x80005c6 :	ldmia.w	r3, {r1, r4}
(gdb) info reg
r0             0x7d2	2002
r1             0x393821d9	959979993
r2             0x39381a07	959977991
r3             0x29d0fb29	701561641
r4             0x20000dc4	536874436
r5             0x2000004c	536870988
r6             0x0	0
r7             0x14	20
r8             0x20001f74	536878964
r9             0x20000270	536871536
r10            0x800c004	134266884
r11            0xced318f5	-825026315
r12            0x0	0
sp             0x20001fb8	0x20001fb8
lr             0x80005b9	134219193
pc             0x80005c6	0x80005c6 
xpsr           0x21000000	553648128
(gdb) 

Check it out. There’s an ldm instruction. And r3 is clearly not aligned. (It doesn’t even look like a valid pointer to SRAM, but we’ll ignore that for now) Ok, so we got an unaligned access, and we know where. But what the hell?! Let’s look at the C code again. That t->timer is all struct stuff. Perhaps there’s some packed uint8_ts or something, maybe some “optimizations” for 8bit micros. Following the chain, struct etimer contains a struct process, which contains a struct pt which contains a lc_t. And only the lc_t. Which is an unsigned short. I guess there’s some delicious C rules here about promotion and types and packing. There’s always a rule.

Changing the type of lc_t to an unsigned int, instead of a short and rebuilding stops it from crashing. Excellent. Not. It does make the code a little bigger though.

karlp@tera:~/src/kcontiki (master *+)$ cat karl-size-short 
   text	   data	    bss	    dec	    hex	filename
  51196	   2836	   3952	  57984	   e280	foo.stm32vldiscovery
karlp@tera:~/src/kcontiki (master *+)$ cat karl-size-uint 
   text	   data	    bss	    dec	    hex	filename
  51196	   2916	   3952	  58064	   e2d0	foo.stm32vldiscovery
karlp@tera:~/src/kcontiki (master *+)$

I’m not the first to hit this, but it certainly doesn’t seem to be very common. Apparently you should be able to use -mnounaligned-access with gcc to force it to do everything bytewise, but that’s a pretty crap option, and it doesn’t seem to work for me anyway. Some people feel this is a gcc bug, some people feel it’s “undefined behaviour”. I say it’s “unexpected behaviour” :) In this particular case, there’s no casting of pointers, and use (or lack thereof) of any sort of “packed” attributes on any of the structs, so I’d lean towards saying this is a compiler problem, but, as they say, it’s almost never a compiler problem :)

Here are some links to other discussion about this. (complete with “MORON! COMPILERS ARE NEVER WRONG” type of helpful commentary :)

I’m still not entirely sure of the best way of proceeding from here. I’m currently using GCC version arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.7.3 20121207 (release) [ARM/embedded-4_7-branch revision 194305], and I should probably try the 4.7-2013-q1-update release, but if this is deemed to be “user error” then it’s trying to work out other ways of modifying the code to stay small for everyone where possible, but still work for everyone.

Not entirely what I’d planned on doing this evening, but someone enlightening at least.

Code Size changes with “int” on 8bit and 32bit platforms

I was looking for a few bytes extra flash today, and realized that some old AVR code I had, which used uint8_t extensively for loop counters and indexes (dealing with small arrays) might not be all that efficient on the STM32 Cortex-M3.

So, I went over the code and replaced all places where the size of the counter wasn’t really actually important, and made some comparisons. I was compiling the exact same c file in both cases, with only a type def changing between runs.

Compiler versions and flags

platform gcc version cflags
AVR avr-gcc (GCC) 4.3.5 -DNDEBUG -Wall -Os -g -ffunction-sections -fdata-sections -Wstrict-prototypes -mmcu=atmega168 -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums
STM32 arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.6.2 20120316 (release) [ARM/embedded-4_6-branch revision 185452] -DNDEBUG -Wall -Os -g -ffunction-sections -fdata-sections -Wstrict-prototypes -fno-common -mcpu=cortex-m3 -mthumb

And… here’s the results

counter type avr-size arm-none-eabi-size
unsigned int 1318 844
uint8_t (original) 1160 856
uint_least8_t 1160 856
uint_fast8_t 1160 844
int 1330 820
int8_t 1212 872
int_least8_t 1212 872
int_fast8_t 1212 820

I would personally say that it looks like ARM still has some work to go on optimizations. If _least8 and _fast8 take up more space than int it’s not really as polished as the avr-gcc code yet. For me personally, as this code no longer has to run on both AVR and STM32, I’ll just use int.

So, after extending this a bit, my original conclusion about the fast_ types not being fully optimized with arm-gcc were wrong. It’s more that, on AVR, your “don’t care” counters should be unsigned for smaller size, while on STM32, they should be signed (Though I still think it’s dodgy that int_least8_t resulted in bigger code than int_fast8_t) Also, even if signed is better in the best case, the wrong signed is also the worst case. Awesome.