
How to Build a Fixed-Point PI Controller That Just Works: Part II

In Part I we talked about some of the issues around discrete-time proportional-integral (PI) controllers:

  • various forms and whether to use the canonical form for z-transforms (don't do it!)
  • order of operation in the integral term: whether to scale and then integrate (my recommendation), or integrate and then scale.
  • saturation and anti-windup

In this part we’ll talk about the issues surrounding fixed-point implementations of PI controllers. First let’s recap the conceptual structure and my “preferred implementation” for floating-point.

For a PID controller without saturation:

(Diagram from the Wikipedia entry on PID controllers. The actuator signal, which is the output of the rightmost summing block and the input to the plant/process, is not labeled, but corresponds to x(t).)

For a PI controller with saturation, using floating-point arithmetic:

if ((sat < 0 && e < 0) || (sat > 0 && e > 0))
    ;
/* do nothing if there is saturation, and error is in the same direction;
 * if you're careful you can implement as "if (sat*e > 0)"
 */
else
    x_integral = x_integral + Ki2*e;
(x_integral,sat) = satlimit(x_integral, x_minimum, x_maximum);
x = limit(Kp*e + x_integral, x_minimum, x_maximum);

/* satlimit(x, min, max) does the following:
 * if x is between min and max, return (x,0)
 * if x < min, return (min, -1)
 * if x > max, return (max, +1)
 *
 * limit(x, min, max) does the following:
 * if x is between min and max, return x
 * if x < min, return min
 * if x > max, return max
 */

Here e is the error, x is the output, and Kp and Ki2 are the proportional and integral gains, where Ki2 = Ki * the timestep dt.
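The pseudocode above returns a pair from satlimit, which plain C cannot do. As a minimal sketch, assuming a small state struct (pi_state_t) and an output pointer for the saturation flag, both of which are my own illustrative choices and not part of the original listing, the same update could be written like this:

```c
/* Hypothetical state struct; field names mirror the pseudocode above. */
typedef struct {
    float x_integral;  /* integrator state */
    int   sat;         /* -1, 0 or +1: integrator saturation flag from the last step */
} pi_state_t;

/* Clamp x to [lo, hi] and report which limit (if any) was exceeded via *sat. */
static float satlimit(float x, float lo, float hi, int *sat)
{
    if (x < lo) { *sat = -1; return lo; }
    if (x > hi) { *sat = +1; return hi; }
    *sat = 0;
    return x;
}

/* Clamp x to [lo, hi]. */
static float limit(float x, float lo, float hi)
{
    return (x < lo) ? lo : (x > hi) ? hi : x;
}

/* One PI step. e is the error and Ki2 = Ki * dt. Returns the output x. */
static float pi_update(pi_state_t *s, float e, float Kp, float Ki2,
                       float x_min, float x_max)
{
    /* Skip the integration when saturated and the error pushes further out. */
    if (!((s->sat < 0 && e < 0) || (s->sat > 0 && e > 0)))
        s->x_integral += Ki2 * e;
    s->x_integral = satlimit(s->x_integral, x_min, x_max, &s->sat);
    return limit(Kp * e + s->x_integral, x_min, x_max);
}
```

Calling pi_update once per timestep with a persistent pi_state_t (initialized to zero) reproduces the structure of the pseudocode; the statement order is kept the same on purpose.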

Fixed-point basics

In fixed-point arithmetic, we typically use 16-bit or 32-bit quantities (rarely 8-bit or 64-bit) to represent engineering quantities. Since physical quantities have units, you are forced to make a decision about the scaling factor that relates the integer stored in software to the physical quantity. This probably sounds abstract, so let’s use a specific example:

Suppose I have a 12-bit analog-to-digital converter measuring an analog input voltage from 0 to 3.0V, and the input voltage comes from a 20:1 voltage divider. This becomes fairly easy: 0 counts on the ADC represents 0V and 4096 counts (well, really 4095 counts but ignore that for now) represents 60V. That’s a scaling factor of 14.65mV/count.

In fixed-point arithmetic, for conceptual purposes we often imagine a binary point (analogous to a decimal point) that is scaled by 2^Q for some number Q. In my example above, the choice of Q=12 is convenient: 4096 counts = 2^12 * 1.0, which represents 1.0 times a certain full-scale ADC scaling factor, which is 60V in this case. This “Q” number is used to refer to a fixed-point representation: Q12 means that floating-point numbers are scaled by 2^12 to be represented as integers, along with an additional scaling factor for engineering units. The full specification of this engineering encoding is 60V Q12. To convert from integer to engineering values, we divide by 4096 and multiply by 60V; to convert from engineering values, we divide by 60V, multiply by 4096, and round to the nearest integer.
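The two conversions for the 60V Q12 encoding can be sketched as a pair of helpers. The function and macro names here are illustrative assumptions, and int16_t from <stdint.h> stands in for the int16 type used elsewhere in this article:

```c
#include <stdint.h>

#define FULL_SCALE_V 60.0   /* engineering unit scaling factor: 60V */
#define Q_BITS       12     /* Q12: 4096 counts = 1.0 x full scale */

/* Engineering value (volts) -> integer counts, rounded to the nearest integer. */
static int16_t volts_to_counts(double volts)
{
    double scaled = volts / FULL_SCALE_V * (double)(1 << Q_BITS);
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Integer counts -> engineering value (volts). */
static double counts_to_volts(int16_t counts)
{
    return counts * FULL_SCALE_V / (double)(1 << Q_BITS);
}
```

These helpers use floating point, so they would normally live in a host-side tool or a debug/display path rather than in the control loop itself.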

If I measured some voltage V1, and I wanted to multiply by a gain K where K = 1.237, I could also represent K by a Q12 number. 1.237*4096 = approximately 5067, so 5067 counts represents 1.237 in Q12 encoding.

A floating-point example (and here I’ll use {} to denote floating-point quantities): {V1} = 38.2V, {K} = 1.237, {V2} = {V1} * {K} = 47.25V. Very simple.

To do this in fixed point, we’d start out with V1 = {V1} / 60V * 4096 = 2608 counts, and K = {K} * 4096 = 5067 counts.

In order to get {V2} = {V1} * {K}, we substitute:

{V2} = V2 * 60V / 4096
{V1} = V1 * 60V / 4096
{K} = K / 4096

V2 * 60V / 4096 = (V1 * 60V / 4096) * (K/4096)

and simplifying:

V2 = V1 * K / 4096

For our example, V1 = 2608 counts and K = 5067 counts, so V2 = V1 * K / 4096 = 3226 counts. Let’s see if that makes sense, when we translate back to engineering units:

{V2} = V2 * 60 / 4096 = 3226 * 60 / 4096 = 47.26V

The resulting computation nearly matches the floating-point case; the discrepancy is due to the limited resolution of fixed-point.
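In C, the V2 = V1 * K / 4096 step is a widening multiply followed by a right shift. This is a minimal sketch with a made-up function name (mul_q12); it does no overflow limiting, and it assumes the compiler implements right shifts of negative values as arithmetic shifts, which is common but implementation-defined:

```c
#include <stdint.h>

/* Multiply a quantity by a Q12 gain; the result keeps the scaling of v. */
static int16_t mul_q12(int16_t v, int16_t k)
{
    int32_t product = (int32_t)v * k;   /* 16x16 -> 32-bit product */
    return (int16_t)(product >> 12);    /* divide by 4096, discarding the fraction */
}
```

With v = 2608 and k = 5067 the product is 13214736, and shifting right by 12 gives 3226 counts, matching the hand calculation.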

An important note: Please understand that in fixed-point math, the floating-point numbers {V1} = 38.2V, {V2} = 47.26V, and {K} = 1.237 are never actually calculated unless we need to convert them into meaningful numbers for human beings or other consumers of data; instead, we only deal with the fixed-point quantities of V1 = 2608, V2 = 3226, K = 5067.

The generalization of this multiply step

{C} = {A} * {B}

with fixed-point representation

{A} = A * A_U / 2^QA
{B} = B * B_U / 2^QB
{C} = C * C_U / 2^QC

can be simplified as follows. Note the Q numbers may all be different, and so may the engineering unit scaling factors A_U, B_U, C_U.

C * C_U / 2^QC = (A * A_U / 2^QA) * (B * B_U / 2^QB)

C = A * B * (A_U * B_U / C_U) / 2^(QA + QB - QC)

We usually choose the unit factors to be related by a power of two, so that the net calculation

C = (A * B) >> k

can be done, where k is the integer k = QA + QB - QC - log2(A_U*B_U/C_U)
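As a worked instance, start from C = A * B * (A_U * B_U / C_U) / 2^(QA + QB - QC): if the unit ratio A_U*B_U/C_U equals 2^r for an integer r, the whole scaling collapses to a right shift by QA + QB - QC - r. The sketch below uses hypothetical helper names and assumes the output is deliberately stored as 120V Q12 (so r = -1 for a 60V Q12 input and a unitless Q12 gain):

```c
#include <stdint.h>

/* Shift count for C = (A * B) >> k when A_U*B_U/C_U = 2^log2_unit_ratio. */
static int shift_count(int qa, int qb, int qc, int log2_unit_ratio)
{
    return qa + qb - qc - log2_unit_ratio;
}

/* Generic fixed-point multiply: 16x16 -> 32-bit product, then shift by k. */
static int32_t fixmul(int16_t a, int16_t b, int k)
{
    return ((int32_t)a * b) >> k;
}
```

For A = 2608 (38.2V in 60V Q12) and B = 5067 (1.237 in Q12), k = 12 + 12 - 12 + 1 = 13 and the result is 1613 counts, which is about 47.26V in 120V Q12.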

How do you pick a Q number, unit scaling factor, and integer bit size?

These are critical steps in the design of fixed-point systems. What you need to look at are issues of resolution and overflow. Overflow relates to the minimum and maximum values that can be represented by an integer quantity, whereas resolution relates to the smallest possible incremental value.

If I have an unsigned 16-bit integer representing voltage with units 60V Q12, the minimum voltage is 0, the voltage per count is 60V / 4096 = 14.65mV/count, and the maximum voltage is 65535 * 60V / 4096 = 959.99V.

We could choose a larger scaling factor for Q12 representation, and it would raise both the maximum voltage and the voltage per count.

If you choose too large a scaling factor, you won’t have the resolution you need. If you choose too small a scaling factor, the “ceiling” of your numeric range will be too low and you won’t be able to represent large enough quantities.

Please note also that these two quantities (the Q number and unit scaling factor) are not unique for the same representation: 60V Q12 = 15V Q10 = 480V Q15: in all three cases, the integer 2608 represents 38.2V. So the choice of whether to call a certain representation Q12 or Q10 or Q15 or something else is really arbitrary; it just forces a different engineering unit scaling factor.
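Resolution and range follow directly from the encoding, so they are easy to check numerically before committing to a scaling. A small sketch, with helper names that are my own and only meant for a host-side sanity check:

```c
#include <stdint.h>

/* Volts represented by one count, for a "full_scale Q q" encoding. */
static double volts_per_count(double full_scale, int q)
{
    return full_scale / (double)(1L << q);
}

/* Largest voltage an unsigned 16-bit integer can hold in that encoding. */
static double max_volts_u16(double full_scale, int q)
{
    return UINT16_MAX * volts_per_count(full_scale, q);
}
```

volts_per_count(60.0, 12) and volts_per_count(15.0, 10) return the same value (0.0146484375 V), which is the sense in which 60V Q12 and 15V Q10 name the same representation.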

If you are running into both problems (poor resolution and overflow) in different situations, it means you need to use a larger number of bits to store your engineering quantities.

My rule of thumb is that 16-bit integers are almost always sufficient to store most ADC readings, output values, and gain and offset parameters. 32-bit integers are almost always sufficient for integrators and intermediate calculations. There have been times that I’ve needed 48-bit or even 64-bit intermediate storage values, but this is rare except in situations of wide dynamic range.

If you have an ADC between 14 and 16 bits, you can use a 16-bit integer to store the raw reading, but you may still want to use 32-bit storage to handle gain/offset calibration for your ADC. Otherwise, with gains that are slightly offset from 1 (e.g. 1.05 or 0.95), you may see problems in the low-order bits when trying to store the result in a 16-bit number: if the raw ADC count increases by one, sometimes the scaled result increases by 1 count and sometimes by 0 or 2 counts. This is a resolution problem and the cure is to use extra bits to minimize quantization error without running into overflow.
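The uneven steps are easy to reproduce. This sketch assumes a hypothetical calibration gain of 1.05 stored as a Q12 constant (4301 counts) and compares a 16-bit result with a 32-bit result that keeps the 12 fractional bits; the names are illustrative:

```c
#include <stdint.h>

#define GAIN_Q12 4301   /* 1.05 * 4096, rounded; hypothetical calibration gain */

/* Calibrated reading truncated back to 16 bits: the fractional bits are lost. */
static int16_t calibrate16(int16_t raw)
{
    return (int16_t)(((int32_t)raw * GAIN_Q12) >> 12);
}

/* Calibrated reading kept as a 32-bit value with 12 fractional bits. */
static int32_t calibrate32(int16_t raw)
{
    return (int32_t)raw * GAIN_Q12;
}
```

Raw readings of 18, 19, 20 give 16-bit results of 18, 19, 21: one step of 1 count and then one step of 2 counts. The 32-bit version increases by exactly 4301 for every raw count.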

For PI controller inputs and outputs, the obvious choice is to pick the scaling factor such that the integer overflow range corresponds to the maximum input or output value, but sometimes this works well and sometimes it doesn’t.

Back to the PI controller in fixed-point

OK, ready to implement a PI controller in fixed point? Here goes:

int16 x, sat;
int32 x_integral, p_term;
int16 Kp, Ki2;
const int16 N = [controls proportional gain scaling: see discussion later]
int16 x_min16, x_max16;
const int32 nmin = -((int32)1 << (15+N));
const int32 nmax = ((int32)1 << (15+N)) - 1;

/* ... other code skipped ... */

int16 e = u - y;
if ((sat < 0 && e < 0) || (sat > 0 && e > 0))
    ;
/* do nothing if there is saturation, and error is in the same direction;
 * if you're careful you can implement as "if (sat*e > 0)"
 */
else
    x_integral = x_integral + (int32)Ki2*e;
const int32 x_min32 = ((int32)x_min16) << 16;
const int32 x_max32 = ((int32)x_max16) << 16;
(x_integral,sat) = satlimit(x_integral, x_min32, x_max32);
p_term = limit((int32)Kp*e, nmin, nmax);
x = limit((p_term >> N) + (x_integral >> 16), x_min16, x_max16);

/* satlimit(x, min, max) does the following:
 * if x is between min and max, return (x,0)
 * if x < min, return (min, -1)
 * if x > max, return (max, +1)
 *
 * limit(x, min, max) does the following:
 * if x is between min and max, return x
 * if x < min, return min
 * if x > max, return max
 */
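For readers who want something that compiles, here is one possible rendering of the listing above in standard C. It is a sketch under several assumptions of mine, not the definitive form: N is fixed at 8, the state lives in a hypothetical pi16_state_t struct, the error subtraction is saturated, the integrator sum goes through a 64-bit temporary, the limits are scaled by multiplying by 65536 (left-shifting a negative value is undefined in C), and right shifts of negative values are assumed to be arithmetic.

```c
#include <stdint.h>

#define N 8   /* proportional gain shift; hypothetical choice */

typedef struct {
    int32_t x_integral;  /* top 16 bits: integrator output; bottom 16: extra resolution */
    int16_t sat;         /* -1, 0 or +1 */
} pi16_state_t;

static int32_t limit32(int32_t x, int32_t lo, int32_t hi)
{
    return (x < lo) ? lo : (x > hi) ? hi : x;
}

/* One fixed-point PI step: u is the command, y the feedback. Returns the output x. */
static int16_t pi16_update(pi16_state_t *s, int16_t u, int16_t y,
                           int16_t Kp, int16_t Ki2,
                           int16_t x_min16, int16_t x_max16)
{
    const int32_t nmin = -((int32_t)1 << (15 + N));
    const int32_t nmax = ((int32_t)1 << (15 + N)) - 1;
    const int32_t x_min32 = (int32_t)x_min16 * 65536;
    const int32_t x_max32 = (int32_t)x_max16 * 65536;

    /* Saturate the error instead of letting u - y wrap around. */
    int16_t e = (int16_t)limit32((int32_t)u - y, INT16_MIN, INT16_MAX);

    /* Integrate in a 64-bit temporary so the sum itself cannot overflow. */
    int64_t acc = s->x_integral;
    if (!((s->sat < 0 && e < 0) || (s->sat > 0 && e > 0)))
        acc += (int32_t)Ki2 * e;

    /* satlimit: clamp the integrator and record which limit was exceeded. */
    if (acc < x_min32)      { acc = x_min32; s->sat = -1; }
    else if (acc > x_max32) { acc = x_max32; s->sat = +1; }
    else                    { s->sat = 0; }
    s->x_integral = (int32_t)acc;

    int32_t p_term = limit32((int32_t)Kp * e, nmin, nmax);
    return (int16_t)limit32((p_term >> N) + (s->x_integral >> 16),
                            x_min16, x_max16);
}
```

With Kp = 256 and N = 8 the proportional gain is exactly 1 count of output per count of error, and with Ki2 = 16384 an error of 4 counts advances the integrator output by 1 count per step.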

There are a few subtleties here.

Integrator scaling

In the implementation I’ve given above, the integrator is a 32-bit value with 65536=2^16 counts of the integrator equal to 1 count of the output. Or stated another way, the bottom 16 bits of the integrator are extra resolution to accumulate increases over time, and the top 16 bits of the integrator are the integrator’s output.

You need more bits in integrator state variables than regular input/outputs, to handle the effects of small timesteps. This is also true for low-pass filter state variables. What you want is for the largest expected integral gain to not cause overflow, and the smallest expected integral gain to do something useful.

If Ki2 is very small (e.g. 1 count), the integrator will take a while to accumulate errors, but you will not lose any resolution. If we implemented the integrator as a 16-bit value like

 x_integral = x_integral + (((int32)Ki2*e) >> 16);

then for low values of the integral gain, small error values would just disappear and never accumulate in the integrator.
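The difference shows up immediately in a side-by-side test. In this sketch (function names are illustrative, there is no saturation, and right shifts are assumed to be arithmetic), both integrators see the same small gain and error:

```c
#include <stdint.h>

/* 32-bit integrator: accumulates the full product, then exposes the top 16 bits. */
static int16_t integrate32(int32_t *state, int16_t Ki2, int16_t e)
{
    *state += (int32_t)Ki2 * e;
    return (int16_t)(*state >> 16);
}

/* 16-bit integrator: each increment is truncated before it is accumulated. */
static int16_t integrate16(int16_t *state, int16_t Ki2, int16_t e)
{
    *state = (int16_t)(*state + (((int32_t)Ki2 * e) >> 16));
    return *state;
}
```

With Ki2 = 1 and e = 100, each increment is 100/65536 of an output count. After 1000 steps the 32-bit integrator has accumulated 100000 counts (output = 1), while the 16-bit version has added zero 1000 times and is still at 0.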

A corollary of the “you need more bits in integrator state variables” rule is that you should never execute a PI loop at a rate that is slower or faster than is reasonable. If you run the PI loop too slowly, the bandwidth and/or stability of the loop will suffer. That’s fairly straightforward. But if you run the PI loop too quickly, that has a problem too: integrators that integrate 100,000 times per second have to integrate smaller amounts than integrators that integrate 1,000 times per second. The faster the rate at which you execute an integrator or filter, the more resolution you need in state variables. If you’re executing your control loop at less than 5 times the intended bandwidth, that’s too slow, and if you’re executing your control loop at more than 500 times the intended bandwidth, that’s probably too fast.

So one solution for dealing with numerical problems is to run the control loop at a slower rate; just make sure you filter your inputs so you don’t run into aliasing problems.

The integrate-then-scale vs. scale-then-integrate debate, yet again

When I brought up the issue of when to apply the integral gain (integrate-then-scale vs. scale-then-integrate), I said there were 5 reasons to prefer scale-then-integrate. Here is reason #5:

Scaling and then integrating is a very natural way to pick fixed-point scaling factors. Once you choose a scaling factor for the input error, and a scaling factor for the PI controller output, it very naturally comes together and the integrator usually just works if you use an extra 16 bits of integrator resolution, as I discussed above.

If you integrate and then scale, the integrator is measured in this weird intermediate scaling factor that you have to manage, and you have to handle resolution and overflow twice: once in the integrator, and once in the final gain scaling step. I find it more cumbersome to design around, so it’s yet another reason not to implement the integration step before applying the integral gain.

Proportional gain scaling

I’ve written this control loop’s proportional term with a variable shift term N between 0 and 16. Picking N=0 isn’t appropriate in some cases, and picking N=16 isn’t appropriate in some cases. You need to figure out what range of proportional gains you want, and make sure that the highest gain can be implemented without overflow, but the smallest gain has sufficient resolution: if the smallest gain translates to fixed point as a count of 1, and you want to adjust it by 10%, you’re stuck, because the gain is either 0, 1, or 2 counts. Minimum gain values should be at least 5 or 10 counts when converted to fixed point. If you need a proportional gain adjustment range of more than 3000:1 (which is very rare), you’ll probably need to use 32-bit scaling factors and 32×32-bit math rather than 16×16 math.

The gain scaling factor is fixed by the choice of input and output scaling factors and the choice of N. For example, suppose the input scaling factor is 2A = 32768 counts, and the output scaling factor is 14.4V = 32768 counts, and N = 8.

For a gain of 10V/A, you’d scale an input of 1A = 16384 counts, to an output of 10V = 22756 counts.

(16384 * K) >> 8 = 22756 means that K = 355.56 counts corresponds to 10V/A. If your system gain needs to be between 1V/A (=35.556 counts) and 100V/A (=3555.6 counts), the choice of N=8 is a good one. If your system gain needs to be between 0.1V/A (=3.5556 counts) and 10V/A, then N=8 is too small; N=10 or N=12 is a better choice. If your system gain needs to be between 10V/A and 1000V/A (=35556 counts, which does not fit in a signed 16-bit gain), then N=8 is too large; N=6 or N=4 is a better choice.
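The same arithmetic can be wrapped in a small design-time check. This is a host-side sketch, not loop code: the function names are mine, both full-scale values are assumed to map to 32768 counts as in the example, and the lower bound of 5 counts is taken from the rule of thumb above.

```c
/* Gain in counts: K = gain * (input full scale / output full scale) * 2^n. */
static double gain_to_counts(double gain_v_per_a, double in_fs_amps,
                             double out_fs_volts, int n)
{
    return gain_v_per_a * in_fs_amps / out_fs_volts * (double)(1L << n);
}

/* Usable as a signed 16-bit gain: at least 5 counts, at most 32767. */
static int gain_fits(double counts)
{
    return counts >= 5.0 && counts <= 32767.0;
}
```

For the 2A / 14.4V example, gain_to_counts(10.0, 2.0, 14.4, 8) is about 355.56. A gain of 0.1V/A fails gain_fits at N=8 (about 3.56 counts) but passes at N=10 (about 14.2 counts), and 1000V/A fails at N=8 but passes at N=6.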

That Darned Derivative!

Let’s take a break for a moment and come back to the derivative term that’s been left out of all this discussion: we’ve been talking about PI controllers but have mentioned PID controllers a few times.

The D term is something I rarely if ever use, because it provides low gain for slowly-varying errors and high gain for the high-frequency content in the error term, which is usually just noise.

If you can get by with a PI controller without the D term, do so: your system will be simpler and you won’t have to worry about it.

There are systems that do benefit from a D term. They usually involve systems with long delay or phase lag where there’s a quantity that you’d like to observe, but can’t.

For example, consider a thermal controller where you have a heater block and a temperature sensor. If the heater block is large and the temperature sensor is mounted on the outside, it may take a long time before the sensor sees any change in power applied to the heating element. Ideally you’d also measure the temperature deep inside the heater block and use a relatively fast control loop to regulate that, and then a slower control loop to regulate the outside temperature. But if you can’t add an extra temperature sensor, you need a way to notice when the heater block is starting to heat up. If you use a PI loop, the proportional term isn’t fast enough: by the time the temperature error drops appreciably, you’ve already applied power for a long time to the heating element, and even if you suddenly turn the heating element off, the temperature at the sensor is going to continue rising as heat diffuses to the outside of the heating block. The integral term is even slower; integral terms are there to handle DC and low-frequency errors, and they’re intentionally the slowest-responding part of your control loop. So a derivative term can help you throttle back the output of your controller when the sensor reading starts to increase but the error is still positive.

Because of noise content, it usually makes sense to implement the derivative term with a rolloff or low-pass filter: take differences between readings, but then filter out the really high-frequency content so you are left with frequencies that you care about. In a thermal control loop, unless the system in question is really small, the response times are measured in seconds, so anything higher than 10 or 20Hz probably isn’t going to be useful.
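One simple way to realize "differences, then a low-pass filter" is a first-order filter applied to the scaled difference. The sketch below is floating-point and purely illustrative: the struct, the names, the gain Kd2 (assumed to be Kd divided by the timestep) and the smoothing factor alpha are all assumptions of mine rather than anything specified in this article.

```c
/* Hypothetical state for a filtered derivative term. */
typedef struct {
    float prev_e;   /* previous error sample */
    float d_filt;   /* filtered derivative output */
} dterm_state_t;

/* alpha in (0, 1]: smaller alpha means a lower cutoff; alpha = 1 disables filtering. */
static float dterm_update(dterm_state_t *s, float e, float Kd2, float alpha)
{
    float diff = e - s->prev_e;                       /* difference between readings */
    s->prev_e = e;
    s->d_filt += alpha * (Kd2 * diff - s->d_filt);    /* first-order low-pass */
    return s->d_filt;
}
```

A step in the error produces a D output that jumps by alpha times the unfiltered value and then decays geometrically, instead of a single full-height spike.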

Just remember that you shouldn’t use a D term unless you need to.

OK, now back to the discussion of arithmetic and overflow.

Overflow in addition and subtraction

Let’s look at this simple line:

int16 e = u - y;

There can’t possibly be any errors here, right?

WRONG! If y = 32767 counts and u = -32767 counts, then e = -65534 counts. But that can’t fit in an int16 variable, which only holds values between -32768 and +32767 counts; -65534 counts will alias to +2 counts, which is an incorrect calculation.

What we have to do instead is one of three things:

1. use 32-bit math (which is kind of a pain) to leave room for the calculation to reach its full range of +/- 65535 counts

2. make sure that under the worst-case conditions, we can never get overflow (e.g. if we’re absolutely positive u and y are limited to the range 0-4095 counts), which isn’t always possible

3. saturate the calculation:

int16 e = limit((int32)u - y, -32768, +32767);

Some processors have built-in arithmetic saturation, but it tends to be inaccessible from high-level languages like C.
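In portable C, option 3 can be written as a small helper. This is a minimal sketch with a name of my own choosing (sub_sat16); int16_t and int32_t from <stdint.h> stand in for int16 and int32:

```c
#include <stdint.h>

/* Saturating 16-bit subtraction: compute in 32 bits, then clamp. */
static int16_t sub_sat16(int16_t u, int16_t y)
{
    int32_t diff = (int32_t)u - y;          /* range is -65535 .. +65535 */
    if (diff > INT16_MAX) return INT16_MAX;
    if (diff < INT16_MIN) return INT16_MIN;
    return (int16_t)diff;
}
```

For u = -32767 and y = 32767 this returns -32768 rather than aliasing to +2, so the error at least keeps the correct sign.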

Other lines that are causes for concern are the following:

x_integral = x_integral + (int32)Ki2*e;
x = limit((p_term >> N) + (x_integral >> 16), x_min16, x_max16);

That’s right: you need to look at any calculation that has an addition or subtraction.

The upper line (the integrator update) is not a problem as long as the right-hand side doesn’t cause overflow, and the major factors here are the maximum values of x_integral (determined by x_max32 and x_min32) and the maximum integral gain and error. You may have to limit the Ki2*e product before adding it in, or use a temporary 64-bit variable first.

The lower line (limiting the sum of the proportional term and integral term) is not a concern as long as the intermediate value below is a 32-bit value:

(p_term >> N) + (x_integral >> 16) 

In C, this will be the case because p_term and x_integral are both 32-bit values.

Overflow in multiplication

As long as you indicate to the compiler that you want a 16×16 = 32-bit multiply (or 8×8=16 or 32×32=64, as the case may be), you will never get an overflow. (Try it! Check with the extremes for both signed and unsigned integers.) Unfortunately, the arithmetic rules for C/C++ are to make the result of a multiplication the same type as its promoted inputs, so in lines like this:

int16 a = ..., b = ...;
int32 c = a*b;

the result for c can be incorrect, because on a processor where int is 16 bits wide, a*b has an implicit type of int16. To get intermediate values in C to calculate 16×16=32 correctly, you have to promote one of the operands to a 32-bit integer:

int16 a = ..., b = ...;
int32 c = (int32)a*b;

A good compiler will see that and understand that you want to do a 16×16=32 multiply. A mediocre compiler will promote the 2nd operand b to 32-bit, do a 32×32 bit multiply, and take the bottom 32 bits of the result, which gives the correct answer, but wastes CPU cycles in a library function call. If you have this kind of compiler, you have a few choices. One choice is to use a better compiler, and a second choice is that you may have access to a compiler intrinsic function that does the right thing:

int32 c = __imul1632(a,b);

where __imul1632 is the appropriate intrinsic function that does a 16×16 multiply; I made that function name up but there may be one on your system. When the compiler sees an intrinsic function, it replaces it with the appropriate short sequence of assembly instructions rather than actually making a library function call.

The third choice is that you should complain very loudly to your compiler vendor to make sure there is a mechanism for you to use in C to calculate a 16×16=32 multiply.

Once you’ve done the multiply, you either need to store it in a 32-bit value, or you need to shift and cast back down to a 16-bit value:

int16 a = ..., b = ...;
int16 c = ((int32)a*b) >> 16;

This calculation is overflow-free. But shift counts of less than 16 are prone to overflow, and you’ll have to limit the intermediate result:

int16 a = ..., b = ...;
const int32 limit12 = ((int32)1 << (12+15)) - 1;
int16 c = limit((int32)a*b, -limit12, limit12) >> 12;

This is true even for a shift of 15, for one unfortunate case…

Other pathological cases of overflow

If you square -2^15 = -32768, you’ll get 2^30. Shift right by 15 and you get 2^15, which will not fit in a signed 16-bit integer. (-32768)*(-32768) is the only calculation that will not fit in a c = (a*b)>>15 calculation. If you know definitively that one or both operands cannot be -32768 (e.g. if one number is a nonnegative gain), then you don’t have to worry, but otherwise you’ll have to limit the result before right-shifting:

int16 a = ..., b = ...;
const int32 limit15 = ((int32)1 << (15+15)) - 1;
int16 c = min32((int32)a*b, limit15) >> 15;
// min32(x,max) = minimal of the 2 values 
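Packaged as a function, the limited Q15 multiply might look like the following. The name mul_q15_sat is hypothetical, and the sketch assumes right shifts of negative values are arithmetic, which is the usual behavior but is implementation-defined in C:

```c
#include <stdint.h>

/* Q15 multiply with the single overflow case clamped before the shift. */
static int16_t mul_q15_sat(int16_t a, int16_t b)
{
    const int32_t limit15 = ((int32_t)1 << 30) - 1;
    int32_t product = (int32_t)a * b;           /* largest value is exactly 2^30 */
    if (product > limit15)
        product = limit15;                      /* only (-32768)*(-32768) gets here */
    return (int16_t)(product >> 15);
}
```

mul_q15_sat(-32768, -32768) returns 32767 instead of wrapping, and every other input pair passes through unchanged.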

The other pathological cases of overflow also involve this “evil” value of -32768.

int16 evil = -32768;
int16 good1 = -evil;
int16 good2 = abs(evil);

Unfortunately good1 also evaluates to -32768, even though logically it should be 32768 (which doesn’t fit into a signed 16-bit integer). If you use an abs() function, make sure you understand how it handles -32768: if it’s implemented like these, then you have the same problem:

#define abs(x) ((x)>=0 ? (x) : -(x))
inline int16 abs(int16 x) { return x>=0 ? x : -x; }

If it’s implemented using a built-in assembly instruction, you need to check how that instruction handles the input. The TI C2800 DSP has an ABS instruction, and it behaves differently if an overflow mode bit is set; if OVM is set, then ABS(-32768) = +32767, but otherwise, ABS(-32768) = -32768.
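If you would rather not depend on a processor mode bit, saturating versions of negate and absolute value are short to write by hand. The names neg_sat16 and abs_sat16 are made up for this sketch:

```c
#include <stdint.h>

/* Negate, mapping the one unrepresentable case (-32768) to +32767. */
static int16_t neg_sat16(int16_t x)
{
    return (x == INT16_MIN) ? INT16_MAX : (int16_t)-x;
}

/* Absolute value with the same saturation for -32768. */
static int16_t abs_sat16(int16_t x)
{
    return (x < 0) ? neg_sat16(x) : x;
}
```

Both return +32767 for an input of -32768, which is off by one count but has the right sign and magnitude, unlike the wrapped result.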

A step back

Whoa! Where were we again? Weren’t we talking about a PI controller? Let’s zoom back out to the big picture, and summarize.

As far as the choice of whether to use floating-point vs. fixed-point arithmetic, perhaps now you can understand that

  • if we use floating-point arithmetic, we can represent engineering values directly in our software, we have a few subtleties to deal with, and it may cost us in resources (some processors don’t have floating-point instructions, so floating-point math has to be handled in library function calls that are much slower; other processors do have floating-point instructions but they usually run more slowly than integer math)
  • if we use fixed-point arithmetic, the math is really simple and fast for the computer to perform, but as control system designers we have to do a lot of grunt-work to make sure that we don’t run into resolution or overflow errors in our systems.

If you do decide to use fixed-point arithmetic, consider these issues:

  • choose scaling factors for input and output wisely
  • add an extra 16 bits to the integrator size for the added resolution
  • the scaling factor for proportional gain may need to have 32 bits if you have a system with large dynamic range
  • check all of your arithmetic steps for overflow, even simple add, subtract, and negate operations

In any case, remember that you can implement a PI controller and have it just work right; it just may take some extra work on your part to make sure it does.

Good luck on your next control system!
