Racing to Sleep – Jason Sachs

Today we’re going to talk about low-power design.

Suppose I’m an electrical engineer working with wildlife biologists who are gathering field data on the Saskatchewan ringed-neck mountain goat. My team has designed a device called the BigBrotherBear 2000 (BBB2000) with a trip cable and a motor and a camera and a temperature sensor and a hot-wire anemometer and a real-time clock and an SD card and a battery and a LoRa transceiver. The idea is something like this:

  • Every 15 minutes, the BBB2000 measures temperature and wind speed, and stores them to the SD card
  • If the trip cable gets run over, the following sequence of events occurs:

    • the camera takes some number of pictures in rapid succession and stores them to the SD card
    • 5 minutes later, the motor rewinds the trip cable so it is ready to go again

  • Every 24 hours, the BBB2000 activates the LoRa transceiver and transmits its data to a base station a few kilometers away

There are anywhere between 5 and 20 BBB2000 units per base station. The base stations then relay that data via satellite modem to the Internet, accessible at SaskatchewanRingedNeckMountainGoat.information by wildlife biologists throughout the world, so that everyone can see pictures of surprised mountain goats when they trip over a cable.

The biologists actually have a ton of 12V 5Ah batteries (625 of them at 1.6kg apiece) stacked up in a closet, thanks to a generous donor, so that’s the power source for the BBB2000.

Here’s the question: how often does the newest grad student have to head out into the backcountry schlepping batteries to the BBB2000s to swap them out with freshly-charged replacements?

Power Budgets

To answer this sort of question, we need to go through the exercise of a power budget.

There are a number of factors here, but basically it boils down to treating our system as having a quiescent low-power draw, and a number of events; for each event we need to know:

  • how much energy it uses (or charge, as referenced to battery voltage)
  • how frequently it occurs

That’s it!

So let’s go through our system and figure this stuff out. (Note to the pedantic reader: since this is a fictional situation, I’m just making a lot of these numbers up from reasonable estimates.)

  1. Temperature and wind speed: 15 minute intervals; the temperature measurement is very low power (<100μA) and can be sampled in a few milliseconds, but the hot-wire anemometer draws up to 20mA and takes 2 seconds to stabilize before we can take a reading, so that’s 40mC.

  2. Camera and motor: typical trip events occur anywhere from 2-10 times per day

    • Camera: some measurements show that the camera uses about 0.32mAh = 1.15C from the battery per shot. The biologists want to get 5 pictures per trip event over 2 seconds.
    • Motor: the trip cable typically loosens by about 1.2m from taut to tripped, and the motor draws 100mA from the battery for about 20 seconds to rewind it, so that’s 2C of charge.

  3. LoRa transceiver: once per day. Using an RN2903, depending on the transmit power level needed, it takes anywhere from 50mA to 125mA at 3.3V to communicate with the base station at a 12500bps rate. There’s typically 10MB of data each day (mostly from the camera pictures); the BBB2000 team estimated 80% protocol efficiency, or 10000bps for data payloads after protocol overhead is subtracted. This takes 8000 seconds, or about 2 hours and 13 minutes, to transmit, and was a major source of contention with the wildlife biologists, which we’ll talk about in a bit. The BBB2000 uses a DC/DC converter from 12V to 3.3V with about 90% efficiency at 50-250mA output current, 80% at 10mA output current, and 70% efficiency at 1mA current. Lots of “it depends” answers around here, but if we assume 100mA at 3.3V during transmission, that is 100mA × (3.3/12) / 0.9 = 30.6mA drawn from the 12V battery; 30.6mA × 8000s = 245C of charge.
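The transmission estimate chains together a data rate, a transmit time, and a voltage conversion, so it is easy to slip a factor. Here is a minimal Python sketch of the same arithmetic, using only the made-up numbers above; the helper name `battery_current` is hypothetical, and 1MB is taken as 10^6 bytes.

```python
# Sketch of the LoRa transmission estimate; every value is one of the
# article's assumed numbers, not a measurement.

def battery_current(i_load_a, v_load=3.3, v_batt=12.0, efficiency=0.9):
    """Current drawn from the battery for a load behind a DC/DC converter."""
    return i_load_a * (v_load / v_batt) / efficiency

payload_bits = 10e6 * 8            # 10MB of data per day (1MB = 10^6 bytes)
payload_rate_bps = 12500 * 0.8     # 80% protocol efficiency
tx_time_s = payload_bits / payload_rate_bps

i_batt_a = battery_current(100e-3)  # 100mA at 3.3V while transmitting
q_tx_c = i_batt_a * tx_time_s       # charge = current * time

print(f"transmit time:   {tx_time_s:.0f} s")
print(f"battery current: {i_batt_a * 1e3:.1f} mA")
print(f"charge per day:  {q_tx_c:.1f} C")
```

Without intermediate rounding the charge comes out a little under 245C; the text rounds the battery current to 30.6mA first, and the difference is well within the accuracy of these estimates.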

If we work out this stuff on a per-day basis, we get:

  • temperature and wind speed: 40mC × 1440 minutes/day / 15 minutes (96 times per day) = 3.84C
  • camera: max of 50/day × 1.15C = 57.5C
  • motor: max of 10/day × 2C = 20C
  • transceiver: 245C

Did we forget anything?

Oh right, we need some sort of microcontroller to run this thing, with power switches to turn things on and off so they don’t suck up battery power when they’re not running. Let’s just say your typical microcontroller needs 50mA from 3.3V while running at full speed and doing lots of calculations (about 15mA from the 12V battery), but it can be set in a low-power mode to draw 14mA from 3.3V (about 4mA from 12V) when operating at 10% of full speed, and some very small current \( I_{pd} \) when sleeping. Let’s say the SD card also needs about 50mA from 3.3V (15mA from 12V) for 0.5 second to write data, whether it’s 10 bytes or 1 megabyte. These aren’t necessarily great estimates, but we’ll come back and revise them if we need to.

  • SD card: 15mA × 0.5s = 7.5mC; frequency = 96 times for the temperature/wind speed measurements, up to 50 times for pictures, for a total of 1.1C (0.72C for temp/wind speed, 0.38C for pictures)
  • microcontroller:

    • full-speed mode (15mA) for data transmission and camera use:

      • 8000s = 120C for data transmission
      • 50/day × 2s = 1.5C for camera use

    • low-power mode (4mA) for motor and temperature/wind speed measurements (these don’t need lots of calculations from the CPU, so we don’t need full-speed mode)

      • 10/day × 20s = 0.8C for motor use
      • 96/day × 2s = 0.77C for temperature/wind speed measurements

Is the quiescent current \( I_{pd} \) important here? Well, it depends on how we get the power. A DC/DC converter might have awful efficiency at sub-1mA current, since it has its own quiescent current; maybe we switch off the DC/DC converter and use a low-quiescent-current linear regulator when the microcontroller is in its sleep mode. Or we could power our microcontroller off of a supercapacitor when it’s in sleep mode. Remember, the longest it would be in sleep mode is the 15 minutes (900s) between temperature/wind speed measurements. That’s 900μC per μA of sleep current, and it’s easy to make a 3.3V 1F capacitor bank to hold 3.3C. A 100μA current draw would use 90mC, which is small enough that if we use a 3.3C supercapacitor energy reservoir, there wouldn’t be much voltage droop, so a supercapacitor might be a good idea during sleep. And then we could take advantage of the DC/DC converter’s 90% efficiency to recharge it. Anyway, if we have 100μA running 24 hours a day, that’s 8.64C at 3.3V, converted from about 2.6C at 12V. This is so much lower than the camera and transceiver that we don’t need to worry:

  • searching for a microcontroller with ultra-low current doesn’t make sense; we don’t care whether it is 10μA or 100μA
  • but we do care if its quiescent current is, say, 10mA.
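To put numbers on “not much voltage droop”, here is a small sketch of the sleep-current arithmetic. It assumes an ideal 1F capacitor and a constant 100μA load, so the droop is simply ΔV = ΔQ / C; the variable names are illustrative only.

```python
# Sleep-current estimate, assuming an ideal capacitor and constant current.
i_sleep_a = 100e-6        # assumed sleep current at 3.3V
t_sleep_s = 15 * 60       # longest sleep interval: 15 minutes
c_bank_f = 1.0            # 1F supercapacitor bank

q_sleep_c = i_sleep_a * t_sleep_s     # charge used in one sleep interval
droop_v = q_sleep_c / c_bank_f        # dV = dQ / C

# Same current running all day, referred back to the 12V battery
q_day_3v3_c = i_sleep_a * 86400
q_day_12v_c = q_day_3v3_c * (3.3 / 12.0) / 0.9

print(f"charge per sleep interval: {q_sleep_c * 1e3:.0f} mC")
print(f"voltage droop on 1F:       {droop_v * 1e3:.0f} mV")
print(f"per day: {q_day_3v3_c:.2f} C at 3.3V, {q_day_12v_c:.2f} C at 12V")
```

A droop of under 0.1V on a 3.3V rail is the reason the supercapacitor looks workable here; a real capacitor’s leakage and series resistance would need checking separately.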

Maybe there are a few other analog components, but they take only a small fraction of the microcontroller’s current since they’re turned off when not in use, and we can ignore them.

Anyway, the net total charge per day drawn from the 12V battery is:

  • temperature and wind speed: 3.84C + 0.72C for SD card + 0.77C from microcontroller = 5.33C
  • camera: 57.5C + 0.38C for SD card + 1.5C from microcontroller = 59.4C
  • motor: 20C + 0.8C from microcontroller = 20.8C
  • data transmission: 245C + 120C from microcontroller = 365C

Total: 450.5C / day = 125mAh (1000mAh = 1A × 3600s = 3600C)

So we can expect about 40 days of operation (5000mAh / 125mAh/day) from our 12V battery before having to replace it with a freshly-charged battery.
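A budget like this is just a small table, and keeping it in a script makes it easy to revise when an estimate changes. The sketch below tallies the per-day figures listed above and prints each item’s share of the total; the dictionary layout and names are one possible arrangement, not a prescribed format.

```python
# Daily charge drawn from the 12V battery, in coulombs (the estimates above).
budget_c_per_day = {
    "temperature and wind speed": 3.84 + 0.72 + 0.77,
    "camera": 57.5 + 0.38 + 1.5,
    "motor": 20.0 + 0.8,
    "data transmission": 245.0 + 120.0,
}

COULOMBS_PER_MAH = 3.6      # 1mAh = 1mA * 3600s = 3.6C
BATTERY_MAH = 5000.0        # 12V 5Ah battery

total_c = sum(budget_c_per_day.values())
total_mah = total_c / COULOMBS_PER_MAH
days = BATTERY_MAH / total_mah

# Largest consumers first, with their share of the total
for name, q in sorted(budget_c_per_day.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} {q:7.2f} C  {100 * q / total_c:5.1f}%")
print(f"total: {total_c:.1f} C/day = {total_mah:.0f} mAh/day, "
      f"about {days:.0f} days per battery")
```

Sorting by size puts the dominant item at the top, which is the first thing to look at when hunting for savings.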

And that’s a power budget. This one was somewhat handwavy; a real power budget for a device in production would have more careful accounting, but it’s the same idea. There are a few key concepts here:

The Protruding Stake Is Hammered Down

(出る杭は打たれる, Deru kui wa utareru, Japanese proverb)

To find ways of reducing power, go after the largest energy usages, not the smallest. In the BBB2000 example, the biggest culprit is data transmission, using 365/450 = 81% of the energy drawn from the battery. One way around this might be to download thumbnails of images rather than the images themselves, and then download the more interesting pictures later. (Or just replace the SD cards and bring them back when the battery is replaced, which doesn’t require any network traffic, sort of like El Paquete Semanal.)

For example, the thumbnail below of a jaguar is 43kB, while the larger picture I converted it from was 280kB.

If we could cut down the data transmitted by 80%, then we could reduce the daily data transmission costs from 365C to 73C and the total cost from 450.5C/day to 158.5C/day = 44mAh/day, extending battery life from 40 days to 113 days between replacements. (Perhaps you can imagine the arguments between junior grad students who have to go off into the field during winter, because the batteries don’t last as long, and senior biologists who want the high-resolution images without having to be picky about what they can download. Maybe we should just try to find some solar panels for the BBB2000 and hope the sun keeps shining.)
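The same what-if can be run for any amount of data reduction. This sketch assumes, as the estimate above does, that both the transceiver charge and the microcontroller charge during transmission scale linearly with the amount of data sent; `battery_days` is a hypothetical helper name.

```python
# What-if: battery life versus fraction of data still transmitted.
# Assumes the 365C/day transmission cost scales linearly with data volume.

def battery_days(transmit_c, other_c=85.5, battery_mah=5000.0):
    """Days per battery, given daily transmission charge and everything else."""
    total_mah_per_day = (transmit_c + other_c) / 3.6   # 3.6C per mAh
    return battery_mah / total_mah_per_day

baseline_days = battery_days(365.0)
reduced_days = battery_days(365.0 * 0.2)   # 80% less data

for fraction_kept in (1.0, 0.5, 0.2, 0.0):
    d = battery_days(365.0 * fraction_kept)
    print(f"{100 * fraction_kept:5.0f}% of data sent: {d:6.1f} days")
```

The last row, with no data sent at all, is the ceiling that the SD-card-swap approach would reach with these numbers.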

On the other hand, cutting down quiescent microcontroller current from, say, 100μA to 10μA at 3.3V gets us from 2.6C/day to 0.26C/day, which is a drop in the bucket of the total 450.5C/day cost. That is why I said for this application that we shouldn’t bother trying to reduce quiescent current that’s only a small fraction of the total energy use.

If you’ve been reading my articles before, and this point sounds vaguely familiar, maybe you’re thinking of the one on Amdahl’s Law / Gustafson’s Law, which covers the same principle.

It’s the Energy, Not the Power, Stupid!

The topic of low-power design should really be called low-energy design, because in the end it’s the energy that matters. I could have a laser that draws 1kW of power and fires for a total of 10μs a day, along with a 1μW continuous load. Which one uses less energy? 1kW × 10μs = 0.01J, while 1μW × 86400s = 0.086J, so the kilowatt pulsed laser uses only about 1/8 as much as the microwatt continuous load.

As always, whether or not the quiescent power draw dominates depends on the circumstances, so do the math.

Racing to Sleep (What’s the Frequency, Kenneth?)

There’s a corollary to the previous point: suppose the bulk of the work I’m doing with my microcontroller is number-crunching. I have to run some algorithm (maybe it’s taking wildlife images and converting them to smaller thumbnails) and then go to sleep, until I wake up and do it all over again. What CPU clock frequency should I use? Should I run in a low-power mode, or at full speed?

If the CPU work is computation-bound (meaning I’m waiting for a certain number of instructions to run, rather than waiting for some external task to complete, like an Ethernet packet arriving or a bunch of ADC samples finishing), then the best answer is almost always to run in the full-power mode. We “race to sleep”; this sounds somewhat counterintuitive, but usually the relationship between clock speed and power draw depends on two things:

  • static power consumption: there are things in the microcontroller that draw the same amount of power, independent of clock frequency, like voltage regulators or analog circuitry
  • dynamic power consumption: modern CMOS logic uses most of its power (and energy!) charging and discharging all the parasitic capacitance of the gates of transistors. It’s all about \( CV^2f \); doubling the clock frequency doubles the power requirement.

If we look at one of the characterization graphs (Fig 32-6) of the dsPIC33EP256MC506, we can see this behavior:

The power draw is mostly linear with clock frequency, but intercepts the y-axis above zero: even with 0 MIPS, we’ll still draw some static power. I’m going to eyeball the data in Fig 32-6 and say that it fits a curve for worst-case \( I_{dd} \approx \) 10mA + 0.75mA/MIPS.

When we look at the energy usage as a function of the number of cycles, and all we change is the CPU clock frequency, then the dynamic energy consumption doesn’t change. 1000 cycles at, say, 0.75mA for 1MIPS operation takes us 1ms and uses 0.75μC of charge; 1000 cycles at 7.5mA for 10MIPS operation takes us 0.1ms and also uses 0.75μC of charge. What does change is the charge we need to support static power consumption during these periods. The faster clock frequency lets us get the same job done faster, so we need less energy for static power consumption. So race to sleep!
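Splitting the charge into its static and dynamic parts makes the argument concrete. The sketch below uses the eyeballed model of 10mA static plus 0.75mA/MIPS dynamic, and it deliberately ignores any cost of waking up or switching clocks; `charge_for_task` is a hypothetical helper.

```python
# Static vs. dynamic charge for a fixed number of cycles, using the
# eyeballed model Idd = 10mA + 0.75mA/MIPS. Wake-up costs are ignored.

def charge_for_task(n_cycles, mips, i_static_a=10e-3, i_dyn_a_per_mips=0.75e-3):
    """Return (static, dynamic) charge in coulombs to run n_cycles at a given speed."""
    t_s = n_cycles / (mips * 1e6)              # execution time
    q_static = i_static_a * t_s                # shrinks as the clock speeds up
    q_dynamic = i_dyn_a_per_mips * mips * t_s  # independent of clock speed
    return q_static, q_dynamic

results = {mips: charge_for_task(1000, mips) for mips in (1, 10, 70)}
for mips, (q_static, q_dynamic) in results.items():
    print(f"{mips:3d} MIPS: static {q_static * 1e6:6.3f} uC, "
          f"dynamic {q_dynamic * 1e6:5.3f} uC")
```

The dynamic column stays at 0.75μC for every clock speed, while the static column falls in proportion to execution time, which is the whole case for racing to sleep.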

It gets more complicated if the CPU core voltage is allowed/required to change (again, it’s all about \( CV^2f \)); in your case, figure out your options and do the math.

External Circuitry: Waiting for Godot

Not every task in low-power design is computation-bound. If you’re working with external hardware like motors or an RF radio, or even internal peripherals like an ADC or a UART, the CPU might be waiting around a lot for these circuits to do their work, and in that case, running in a lower-power mode can use less energy for the same required tasks. In the extreme, where the CPU needs to wait for hundreds of milliseconds or even seconds for something else to do its thing, the optimal strategy may be going to sleep and having some external circuit wake up the processor when it’s done.

The tough situations are when there are a LOT of very short time delays; for example, if we have to wait 100 cycles for some ADC conversion to finish, and it happens 1000 times a second, amid a bunch of other computations, then there’s no way we can take advantage of a low-power mode. The overhead of trying to get to sleep or switch in and out of power modes is too high. It’s sort of like when a three-year-old child keeps pestering you every few minutes while you’re trying to get something done; sometimes you wish you could just clonk them over the head and make them take a nap for a few hours. The best we can hope for is to try to schedule some computations in parallel with the short time delay, rather than sit there waiting for 100 cycles. This gets messy from a good software design standpoint (your algorithms are implemented in a way that’s more closely coupled to the behavior of the thing you’re waiting for) but sometimes it’s the only way to save energy.

Powering Up Is Hard to Do

One thing that’s easy to overlook is the amount of energy it takes for a microcontroller to come out of sleep. It’s not an instant transition from sleep to running at full speed. This, of course, depends on a number of things, but the two dominant ones, in my experience, are the following:

  • time for the clock to get up and running again
  • time for the program to get over any “disorientation” from sleep and continue what it was doing.

Consider the dsPIC33EP256MC506 again. (It’s definitely not one of the lowest-power 16-bit microcontrollers, but you’re stuck with it as an example because that’s the one I’m most familiar with.) It has a spec (OS52) of 3.1ms max for the PLL oscillator to lock properly. This is going to dictate the time, and the energy, it takes each time we want to come out of a sleep mode and into full-power operation. If I just need to check a few things and go back to sleep, then maybe I don’t want to run at full speed after all; this part has a fast 7.37MHz RC oscillator that takes a maximum of 54μs (SY37) to start up.

I estimated earlier from the graph in Fig 32-6 that worst-case \( I_{dd} \approx \) 10mA + 0.75mA/MIPS, so let’s say our alternatives are:

  • full speed (70 MIPS, fosc = 140MHz): draws 60mA, takes 3.1ms ≈ 186μC to start up
  • fast RC at low speed (3.685 MIPS, fosc = 7.37MHz): draws 13mA, takes 54μs ≈ 0.7μC to start up

Which of these we would want to choose depends on the number of instructions we need to run. At the extremes, if we execute no instructions before going back to sleep, the fast RC oscillator makes more sense (0.7μC vs. 186μC), whereas if we have hundreds of thousands of instructions, the full-speed oscillator is more energy-efficient, with 19 times the CPU speed at about 4.5 times the current.

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# Charge in coulombs = current * (startup time + execution time)
n_instructions = np.arange(0, 160000, 100)
q_fullspeed = 60e-3 * (3.1e-3 + n_instructions / 70.0e6)   # 60mA, 3.1ms PLL lock, 70 MIPS
q_fastrc = 13e-3 * (54e-6 + n_instructions / 3.685e6)      # 13mA, 54us startup, 3.685 MIPS

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(n_instructions, q_fullspeed*1e6, color="red", label="fullspeed")
ax.plot(n_instructions, q_fastrc*1e6, color="blue", label="fast RC")
ax.legend(loc="best", fontsize=10)
ax.set_ylabel('charge used, uC')
ax.set_xlabel('# of instruction cycles')

This analysis is somewhat surprising: the breakeven point isn’t until about 70,000 cycles! So if you’re using this part and you have, say, 40,000 instruction cycles to perform before going back to sleep for 10 minutes, you’ll use less energy if you just stay in FRC mode.
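The breakeven point can also be found without a plot. Each option costs q = I × (t_start + n / f), so setting the two charges equal and solving for n gives a closed form. This sketch uses the same rounded currents and startup times as above; `breakeven_cycles` is a hypothetical helper name.

```python
# Closed-form breakeven between two clock options, where each one costs
# q(n) = i * (t_start + n / f) for n instruction cycles.

def breakeven_cycles(i_a, t_start_a, mips_a, i_b, t_start_b, mips_b):
    """Cycle count at which option A (expensive startup, cheap cycles) matches option B."""
    startup_diff = i_a * t_start_a - i_b * t_start_b            # extra startup charge of A
    per_cycle_diff = i_b / (mips_b * 1e6) - i_a / (mips_a * 1e6)  # per-cycle saving of A
    return startup_diff / per_cycle_diff

n_star = breakeven_cycles(
    i_a=60e-3, t_start_a=3.1e-3, mips_a=70.0,     # full speed via PLL
    i_b=13e-3, t_start_b=54e-6, mips_b=3.685,     # fast RC oscillator
)
print(f"breakeven at about {n_star:.0f} instruction cycles")
```

Because the result depends on the difference of two per-cycle costs, it is fairly sensitive to the current estimates, which is one more reason to confirm them with measurements.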

The other aspect of the powering up “cost” is that some parts offer sleep modes that cause the firmware program to lose context. The PIC24FJ256GA412 family has an ultra-low 80nA typical (800nA max) current draw when in “deep sleep” at Vdd = 2V, but the price you pay for this low-power draw is that the RAM is left unpowered and the device wakes up from sleep in a reset state rather than resuming from the program address where it entered deep sleep. (You get two 16-bit context registers DSGPR0 and DSGPR1 which do retain their value during deep sleep, so important flags or state machine states could be saved there.) There are a number of different low-power modes in this device:

The question is, when should we use deep sleep vs. “ordinary” sleep? (DC70 = 80nA typ for deep sleep at 2V, 25°C; DC61 = 630nA typ for low-voltage sleep at 2V, 25°C) If we’re using the 8MHz (4MIPS) internal FRC oscillator on these parts, typical operating current DC23 = 1mA.

Suppose our firmware takes 1000 cycles (250μsec) to reconstruct the state it was in, after coming out of deep sleep, whereas with the other low-power modes we can restart immediately. Those 1000 cycles at 4MIPS operation are worth 1mA × 250μsec = 0.25μC of charge, which we get back after 0.25μC / (630nA - 80nA) = 450ms of time in sleep; any longer and deep sleep uses less energy.

If our firmware takes 100 cycles to reconstruct its program state, we break even, with deep sleep becoming attractive, after 45ms of sleep.

If our firmware takes 10,000 cycles to reconstruct its program state, we break even, with deep sleep becoming attractive, after 4.5s of sleep.
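All three cases follow from one formula: the charge spent restoring state, divided by the sleep current saved. Here is a short sketch using the typical datasheet figures quoted above; `deep_sleep_breakeven_s` is a hypothetical helper, and it only counts the restore cycles, not any additional oscillator or reset startup time.

```python
# Minimum sleep duration for deep sleep to pay off, counting only the
# cycles needed to restore program state after wake-up.

def deep_sleep_breakeven_s(restore_cycles, mips=4.0, i_run_a=1e-3,
                           i_sleep_a=630e-9, i_deep_sleep_a=80e-9):
    """Sleep time in seconds beyond which deep sleep uses less charge."""
    q_restore = i_run_a * restore_cycles / (mips * 1e6)   # cost of restoring state
    return q_restore / (i_sleep_a - i_deep_sleep_a)       # time to earn it back

breakevens = {n: deep_sleep_breakeven_s(n) for n in (100, 1000, 10000)}
for n, t_s in breakevens.items():
    print(f"{n:6d} restore cycles: break even after {t_s * 1e3:7.1f} ms asleep")
```

The breakeven scales linearly with the restore cost, so a measured wake-up cost can be plugged in directly in place of the cycle count.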

At any rate, you can’t just look at the quiescent power; each of the sleep modes incurs different costs to come out of sleep and re-enter normal operation.

Measure, don’t assume!

The datasheet will have numbers that give you an idea of how much power a device draws, or how long it takes to come out of sleep.

If you have a really critical system design decision, don’t just use the datasheet; check with actual measurements! They may be subject to variations in operating conditions (temperature, part-to-part variation, etc.) but they will give you actual measurements rather than on-paper calculations.

It can be really difficult to make some of these measurements, especially when you have a wide dynamic range of current consumption (for example, alternating 10μA and 10mA for Idd). My best advice is to use a small series resistor in the Vdd line of your circuit board, along with an RC filter and a high-input-impedance low-offset-voltage amplifier to measure the voltage drop across the sense resistor. Don’t expect high gain and high dynamic range and high bandwidth and high accuracy all in the same op-amp circuit. What you care about is charge drawn over a certain time window, so high bandwidth isn’t usually needed.
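Once the amplified sense voltage has been sampled, turning it into charge is a matter of converting each sample to current and summing over time. The sketch below shows that step with synthetic data; the resistor value, amplifier gain, sample interval, and current profile are all made up for illustration, and `charge_from_samples` is a hypothetical helper.

```python
# Charge over a time window from sampled sense-amplifier voltages.
# i = v / (gain * r_sense); charge = sum of i * dt (rectangle rule).

def charge_from_samples(v_samples, r_sense_ohm, gain, dt_s):
    """Integrate current over time from evenly spaced amplifier output samples."""
    return sum(v / (gain * r_sense_ohm) for v in v_samples) * dt_s

# Synthetic one-second window (made-up values): 10 ohm sense resistor,
# amplifier gain of 10, 1ms sample interval, 990ms at 10uA then 10ms at 10mA.
r_sense, gain, dt = 10.0, 10.0, 1e-3
v_samples = [10e-6 * r_sense * gain] * 990 + [10e-3 * r_sense * gain] * 10

q_window_c = charge_from_samples(v_samples, r_sense, gain, dt)
print(f"charge in window: {q_window_c * 1e6:.1f} uC")
```

In this made-up profile the 10ms burst accounts for most of the charge even though the system spends 99% of the window asleep, which is the kind of thing a measurement makes obvious.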


I touched on just a few areas of low-power design. For more information, see these application notes:

Some academic papers on the idea of “racing to sleep” are:


We talked about a number of areas of low-power design today:

  • Don’t guess; make a power/energy budget by measuring quiescent current and the charge drawn over each event that takes the system out of a quiescent state (remember, charge = current × time)
  • The system components that cause the highest energy usage are very dependent on circumstances; don’t guess, make a power/energy budget
  • Power conversion efficiency matters when the power source is at one voltage, and the consuming electronics are at another
  • Budgeting accuracy and power savings are most critical for the parts of the system that use the most energy
  • When events are short and infrequent, energy matters, not power
  • Racing to sleep is the principle that for the same number of instruction cycles (= tasks that are computation-bound), dynamic power consumption (mA/MIPS) leads to the same amount of charge consumed, whereas static power consumption leads to less charge consumed if we run at a higher clock frequency and finish faster
  • Running in a low-power mode is sometimes necessary when there are tasks that are not computation-bound, but depend on waiting for some other device that uses power
  • Going to a low-power mode has an energy cost to wake up and wait for the oscillator to stabilize, and to recover system state (in cases w/o retention of RAM or the program counter). Do your homework, because when you’re looking for the lowest overall energy consumption:

    • sometimes it doesn’t make sense to enter the lowest power mode, because it costs more to wake up
    • sometimes it doesn’t make sense to wake up and race to sleep at the fastest clock frequency, because it costs more for that clock frequency to become operational

  • Measure your system’s current draw! It will sanity-check your calculations.
  • It’s worth repeating: Don’t guess; make a power/energy budget!

Got any more low-power tips? Let me know!

Best of luck saving energy in your designs, and have a great new year!

© 2019 Jason M. Sachs, all rights reserved.
