<![CDATA[Zero State Machine]]> 2019-11-23T09:54:25+01:00 http://www.radiofreerobotron.net/blog/ Octopress <![CDATA[How to: Zero Delay State Variable Filter Part 4]]> 2019-11-23T00:00:00+01:00 http://www.radiofreerobotron.net/blog/2019/11/23/how-to-zero-delay-state-variable-filter-part-4 Nonlinear ZDF SVF part 2

Previously we derived a nonlinear zero delay feedback filter where the nonlinear tanh() elements were placed at the inputs of the integrators. This is not ideal, since it will lead to instabilities at high input amplitudes even with implicit integration.

Another way to model nonlinearities present in active analog integrators is to place the tanh() clipping at the “output” of the integrator, where it will limit the system to the $[-1, 1]$ interval.

#### Nonlinear “derivative”

Expressing tanh()-limited integration as a differential equation is not really possible with traditional methods since it breaks the fundamental theorem of calculus and the definition of the derivative as a linear difference quotient.

However, it is easy to express as an equation of discrete integration where for example

taken as the linear slope approximation (Euler integration)

becomes

which is more than enough for the needs of dealing with approximations of differential equations as difference equations.

#### Nonlinear SVF difference equation

Let’s now use this nonlinear discrete integrator as a basis for building the discrete-time version of the state-variable filter.

Taking the state-variable filter differential equation

and applying the trapezoidal rule to discretize it as a difference equation yields the following system.

The latter difference equation can be then nonlinearized as the following system.

If we want to use this as a practical time-stepping integration scheme, we need a way to solve for $bp_{t+1}$ to calculate the filter state from. It is possible to solve it from this system numerically with Newton-Raphson iteration as an implicit integration method.

By substitution $a=\frac{1}{2}h$ and gathering terms the equation for $bp_{t+1}$ becomes

Further

and substituting $C_{t} = bp_{t} - alp_{t}-a\sqrt{2}bp_t+au_{t}+au_{t+1}$ as a constant

Using tanh summation identity

and gathering terms to LHS

Further substitutions $D_t=lp_{t}+\frac{1}{2}hbp_{t}$, $E_t=tanh(C_{t})$ and $x=bp_{t+1}$ reduce the equation to

This still looks more complicated than it has to be so substituting the functions

and notating it as a function of x

Newton-Raphson needs a derivative of this function

and of the substituted functions

The iteration for solving $x=bp_{t+1}$ can now be carried over as

from where the full filter state can be integrated as

The rough pseudocode for achieving the full implicit time-stepping is as follows.

#### Conclusions

This method is implemented in my state-variable filter module for VCV Rack together with other methods from this blog. The performance is excelent with a constant Q factor across the frequency spectrum and practically unconditional stability, although the SVF filter structure together with nonlinear tanh() integration is stable even with the most basic explicit integration methods.

The mathematical derivation can be carried over to other nonlinear filter structures where stability is not so great (such as the Moog ladder filter) with the same basic steps so hopefully this serves the reader mainly as a good example and a learning stepping stone for further experimentation.

Happy hacks.

]]>

The challenge description together with the ELF binaries are located at https://ropemporium.com/challenge/pivot.html. The walkthrough will focus on Radare2 for the binary analysis and debugging, so basic knowledge is assumed. You should definitely learn how to use this free open source software!

In this post, the 64-bit binary will be cracked and analysed (see the 32-bit binary walkthrough here).

The basic idea in this challenge is to pivot the stack using a ROP chain to a heap address which is given by the executable and leak the addresses of the library functions which contain the target, the ret2win() function which prints out the flag.

There are multiple strategies to attack this CTF. In the 32-bit version the library function addesses were leaked with puts() libc function and in this 64-bit walkthrough the library offsets are calculated with a ROP chain.

#### Prerequisites

There are a few basic computing concepts used in this walkthrough the reader should be familiar with:

• Intel x86/x86-64 processor instruction set and architecture
• ELF binary structure
• Dynamic linking, relocatable code and Procedure link table and Global offset table (PLT/GOT)
• Virtual memory
• Stack frames
• Return oriented programming and stack overflow
• Stack smashing mitigation techniques such as NX, ASLR and canaries

## 64-bit binary

#### ELF analysis

The ELF binaries, pivot and libpivot.so link library, can be analyzed for basic properties with the File utility and Radare 2:

Listing the functions in the main binary:

Disassembly of the functions:

Analysis of the libpivot.so link library:

Essentially, the functionality is the same as the 32-bit version. The main difference being the 64-bit register set calling convention.

Running the binary:

To conclude the analysis, as with the 32-bit version, there are no import entries for the target function ret2win() in the Procedure link or Global offset tables (PLT/GOT). The foothold_function() is the only imported function from the target link library.

#### Attack strategy

The attack on the 32-bit version consisted of two stages, but attacking the 64-bit binary will be done in a single stage. The 32-bit binary can be attacked with the technique described in this walkthrough and vice versa (although the two-stage attack on the 64-bit requires special care).

First the stack will be smashed to execute a ROP chain which will pivot the stack to the heap address after which a second ROP chain will call foothold_function() to fill in the GOT with the function address. The ROP chain will then calculate the address for ret2win() using knowledge of the static link library offsets and a call to it will print the flag and exit the binary.

Overview of the attack:

• Stack smash
• Stack pivot
• address offset calculation for ret2win()
• call to ret2win()
###### Stack smash

As with the 32-bit version, the stack smash offset needs to be uncovered for ROP chain execution. This is done with the cyclic buffer generated by ragg2:

Radare2 supports stdin input read from a file when debugging in the following setup:

Debugging the binary with these inputs and determining the stack smash offset (note that in the 64-bit case the return address is at the top of the stack):

The stack smash offset is at 40 and once again, the stack size is limited to 3 qwords or 3 RETs in the ROP chain.

The Python exploit can now be started:

Running this and attaching to the process with Radare2 to confirm stack smashing:

Done – stack was smashed and instruction pointer was overwritten! We can now start building the ROP chains to do our bidding.

###### Stack pivot

The stack space is limited to 3 RETs, meaning the pivoting must be done with 3 ROP gadgets. As on the 32-bit walkthrough, Ropper will be used to search for gadgets.

The XCHG gadget with RAX is what will be used to rewrite the stack pointer. Next we need to POP a value into RAX:

We now have what we need to pivot the stack. All that is left is to read the heap address from the process stdout.

Constructing the pivot ROP chain:

The exploit now looks like this:

Running with Radare2 attached to confirm the pivot:

Success: RSP was overwritten to the given heap address. Next up is leaking an address from the link library which contains the target function ret2win().

Even though ASLR randomizes the library memory map location, the link library has a static structure in the form of offsets to various functions, which can be exploited to calculate arbitrary addresses to unlinked functions.

Thus, if you were to add 0x14e to the foothold_function() address and call that address, you would reach ret2win() and print the target flag.

Picking up the offsets into the exploit:

The ROP chain to call the foothold_function() is simply the offset to the PLT:

Injecting this with the first fgets() should now call and fill out the GOT. The exploit now looks like this:

Running the exploit and attaching with Radare2:

The function gets linked and the GOT is filled with the relocatable code offset. Only thing left to do is to use the leaked address to calculate a callable address to the target function.

The full second ROP chain, which will be located in the new pivoted stack, will need to call foothold_function() so as to dynamically link the library and fill the GOT entry. Afterwards a series of gadgets need to move the GOT entry from memory into a register and add the function offset to reach the ret2win() address.

Lets use Ropper to find the gadgets:

The second ROP chain can now be constructed, as described above.

The full exploit now looks like the following:

Running it prints out the flag:

Which means that we now have succesfully completed the CTF!

##### Conclusions

The walkthrough presented two different approaches, one for the 32-bit and 64-bit binaries, both resulting in succesful exploitation.

All-in-all, this is a good challenge to learn how to handle a stack pivoting case in the ROP chain scenario of exploit writing. The helper functionality of the binary is well designed to focus on the meat of the exploit.

I will be writing a few more ROPEmporium writeups in the near future. Stay tuned and happy hacking.

]]>

The challenge together with the ELF binaries are located at https://ropemporium.com/challenge/pivot.html. This walkthrough will focus on using the Radare2 for the binary executable analysis and debugging, so basic knowledge of this awesome tool is assumed.

In this post, the 32-bit binary will be cracked and analysed and a future post will do the same for the 64-bit binary.

UPDATE: 64-bit walkthrough is here

The challenge in short is to pivot the stack using a ROP chain to a heap address which is given by the executable and leak the addresses of the library functions which contain the target, the ret2win() function which prints out the flag.

There are multiple ways to approach this challenge. The first strategy is to build a ROP chain to leak the library function addresses with the standard C library puts() function, second is to compile a ROP chain which does the same thing using registers. The first one will be used in this post for the 32-bit version and the second one for the 64-bit in the next walkthrough post.

#### Prerequisites

There are a few basic computing concepts used on this walkthrough the reader should be familiar with:

• Intel x86/x86-64 processor instruction set and architecture
• ELF binary structure
• Dynamic linking, relocatable code and Procedure link table and Global offset table (PLT/GOT)
• Virtual memory
• Stack frames
• Return oriented programming and stack overflow
• Stack smashing mitigation techniques such as NX, ASLR and canaries

## 32-bit binary

#### ELF analysis

Downloading the ELF binaries gives you the pivot32 executable and the libpivot32.so link library. Radare2 and File linux utility can be used to quickly determine the ELF binary properties:

Further, we need to identify the general structure of the binary. The following analyzes the source code structure and prints out the functions and imported functions in the binary:

Disassembly of the main() function gives a good overview of the code:

Doing the same for libpivot32.so:

Now, running the binary and deducing from the above gives a good idea what the binary is doing:

Concluding the analysis, several key points should be noted before going further. There are no import entries for the target function ret2win() in the Procedure link or Global offset tables (PLT/GOT). The only function imported from the target link library is the aptly named foothold_function(), which doesn’t do anything else except print a string to stdout.

#### Attack strategy

The attack will consists of two stages, where the stack will be smashed to gain ROP execution, pivot the stack to a heap address and ROP chain a link library address leak and a call to ret2win(). The address leak will be done using the puts() function which is imported in the binary from the system libc link library, and the ROP chain will be looped back to main() to smash the stack again with a ROP call to the leaked and offset calculated ret2win() function address.

Overview of the attack:

• 1st stage:
• Stack smash
• Stack pivot
• 2nd stage:
• Stack smash to call ret2win()

#### 1st stage

###### Stack smash

The second fgets() function call in pwnme() is vulnerable to a stack overflow attack and will grant the attacker ROP execution, although only for 3 RETs since the stack space is truncated.

First the stack overflow offset needs to determined using a cyclic buffer. Radare2 framework include a handy utility called ragg2 which can be used to generate such buffers:

Radare2 can also debug a program and feed the stdin during the execution from a simple text file. We need two lines in this file, as follows:

Radare2 requires a profile.rr2 file to define the stdin feed:

Now it is possible to debug the binary with the input feeds automatically:

The wopO command gives the stack offset from the cyclic De Bruijn Pattern.

To verify the limited stack space, viewing the stack makes it obvious:

We can now start to write the Python exploit. Define the offset for the stack smash and doublecheck to see if the EIP can be overwritten with it:

Running the exploit with Radare2 attached to the process:

Stack is successfully smashed and we can start executing a ROP chain on the stack.

###### Stack pivot

As mentioned before, there is only space for 3 RETs in the stack and these will be used to rewrite the ESP stack pointer to the address provided by the executable.

Ropper is a good tool to search for ROP gadgets. We need to find a gadget to rewrite ESP:

Looks like the XCHG gadget is what is needed here (XCHG swaps the register contents, effectively writing EAX to ESP). Next we need a POP gadget to write a value to EAX:

We have everything we need to pivot the stack to an arbitrary address. First we need to grab the heap address from the stdout as provided by the binary:

Writing the pivot ROP chain:

The full exploit now looks like this:

Running this with Radare2 attached will now rewrite ESP and pivot the stack to the heap address:

You can now start writing the second ROP chain to leak and calculate the libpivot32 addresses.

To leak the library address, we need to find offsets for functions in the PLT and GOT. This can be done easily with Radare2:

Picking up the offsets into the exploit:

The second pivoted ROP chain should now do the following: Call foothold_function() to engage the dynamic linker to fill in the GOT, call puts() with the GOT entry as the argument and leak the address as raw bytes to the stdout stream and call back the binary main() function to enter the second stage of the exploit.

Let’s construct the second ROP chain (which consists of just function calls):

The foothold_function will print a string to the stdout and puts() will leak the address from the GOT. You can now read the leak in a variable in the exploit:

The overall exploit is now

Running it and attaching to the process with Radare2 shows how the dynamic linker fills the GOT with the function address:

All that is left to do now is to calculate the correct offset to ret2win() and receive the flag!

#### 2nd stage

Analysis of libpivot32.so reveals the static relative offset from foothold_function() to ret2win():

Thus we need to add 0x1f7 to the address of foothold_function() to reach ret2win():

###### Stack smash to call ret2win()

The second stage stack smash ROP chain will simply be a RET to the ret2win() address:

Adding a recv() for the flag string, the final exploit now looks like this:

Running it will now succesfully receive the flag and exit():

###### Conclusion

The walkthrough described all the steps necessary to succesfully exploit the binary with multiple ROP chains in two stages. It is also possible to achieve the same steps in a single stage exploit. This will be the focus of the next blog post where the 64-bit binary will be cracked.

Happy hacking until then!

]]>
<![CDATA[How to: Zero Delay State Variable Filter Part 3]]> 2016-07-06T14:51:46+02:00 http://www.radiofreerobotron.net/blog/2016/07/06/how-to-zero-delay-state-variable-filter-part-3 Zero delay feedback SVF

We will now build a linear and nonlinear zero delay feedback filters starting from the state variable filter differential equation

### Linear model

The linear model simply assumes the differential equation above, without adding any nonlinear elements. Changing the variables with more descriptive ones, we get

where $u(t)$ is the filter input function.

Starting with the trapezoidal rule

as in part 2, the expression for $bp_{t+1}$ becomes

where

The equation ultimately resolves to

This equation now has two unknown variables on the left hand side, $bp_{t+1}$ and $lp_{t+1}$, so we need to first express $lp_{t+1}$ in terms of $bp_{t+1}$ and $bp_{t}$.

Since $lp’(t) = bp(t)$, applying trapezoidal rule results into

which can now be substituted back into the first equation. Gathering $bp_{t+1}$ terms to the left-hand side of the equation after the substitution

or after combining terms and solving for $bp_{t+1}$

This and

provide the final closed-form time stepping equations for the linear model. In this case, fortunately, there was no need to solve an implicit equation with Newton’s method.

The pseudocode for the integration then looks like

The end product here is practically the same as what a semi-implicit integration would get you - the added benefit is the stability of the fully implicit integration.

However, it usually doesn’t make sense to apply implicit integration methods to fully linear differential equations. Nonlinear differential equations are where they show their teeth.

### Nonlinear model

Starting with the same differential equation as previously and adding $tanh()$ as nonlinear elements (which is what ideal transistor curves usually resolve to in circuits mathematically)

and once again applying the trapezoidal rule, first for $lp_{t+1}$ (which is trivial)

and for $bp_{t+1}$

which requires an implicit solution. Gathering $bp_{t+1}$ terms to the left-hand side of the equation, substituting for $lp_{t+1}$ and gathering the right-hand side as a constant

we get

Forming the Newton’s method iteration functions (see part 2), substituting $x=bp_{t+1}$, $a=lp_{t}$, $b=bp_{t}$ and $c=u_{t+1}$

and it’s derivative

which gets us the following to iterate Newton’s method with

Pulling it all together, the pseudocode for the filter is now

### Conclusions

We have now succesfully built a zero delay feedback filter and detailed the steps to achieve it. It is left to the reader to experiment and determine if zero delay feedback is actually worth the trouble.

]]>
<![CDATA[How to: Zero Delay State Variable Filter Part 2]]> 2016-01-01T02:24:45+01:00 http://www.radiofreerobotron.net/blog/2016/01/01/how-to-zero-delay-state-variable-filter-part-2 In this part we will define what implicit integration is and develop the tools to solve differential equations with it.

### Implicit vs. Explicit

Explicit integration of difference equations relies on information that is derived from the current and past steps. This seems like a very intuitive and natural idea - surely all dynamics we observe in our physical reality happen in this causal way. One thing follows from the past things - this is taken for granted by everyone.

In practice however, this idea does not work in all situations as will be shown in the next sections. In contrast to explicit integration, implicit integration uses information on the next state as well as the current and past states. But then there is an obvious problem - where do you get information on the next state if you’re not there yet? Naively, this seems absurd.

As discussed before in the first part analog circuitry seems to solve this issue naturally, there is no need for explicit integration schemes since the feedback paths are zero delay. How can a deterministic finite state computer compete with this, surely it must be impossible to achieve this with digital filters!

### Worst case nonlinear differential equation

Let us now compare implicit and explicit integration in detail with a nasty example that has an exact solution.

#### Exact closed-form solution

The nonlinear differential equation

has an exact closed-form solution by separating the variables.

For initial conditions $x=1, y(1)=1$, we get

It’s obvious that there is a singularity at $\sqrt{3}$ for this solution: #### Solution by explicit integration

Let’s build an algorithm to solve this nonlinear differential equation by explicit integration. Euler method will suffice to make the point clear.

The derivative is evaluated as $y’(x)=f(x, y)=xy^2$ and the state integrating algorithm is

The pseudocode for the integration is simply

With initial conditions $x=1, y(1)=1$ and $h=0.0001$ the explicit integration gets us the result in the next figure. It’s obvious that something went horribly wrong! The integration can’t resolve the singularity and instead just shoots up to infinity.

Noteworthy, that analog computation wouldn’t do any better. The integration would just hit the positive voltage rail instead of reaching to infinity exponentially. Even if an analog circuit can be called a zero delay feedback system, it is still working with explicit integration.

Enter implicit integration.

#### Solution by implicit integration

The usual starting point for implicit integration is the trapezoidal rule:

Since $y’=f(x, y)=xy^2$, we get

The left-hand side of the equation now contains a quadratic function of the $y_{i+1}$ term, which you need to solve to advance to the next integration step. This is where the term implicit comes from - you do not have an explicit function to calculate the next step from.

We need to attack the equation with a tool developed by the foremost alchemist/occultist to have ever lived - our friend Sir Isaac Newton. The plan is to arrange the equation as an quadratic function of $y_{i+1}$ and use the Newton’s method to iterate towards a numerical approximation.

The right-hand side of the equation can be simplified as a constant $A_i=y_{i} + \frac{1}{2} h x_{i} y_{i}^2$ and

Further, swapping the variables $x=y_{i+1}, a=x_{i+1}$, we get the function

and it’s derivative

Newton’s method can now be constructed as

to compute $x=y_{i+i}$ to arbitrary approximation.

The pseudocode for the implicit solver is then

The iteration in the loop takes from 3 to 8 cycles in the worst case to reach the maximum floating point accuracy, which should be more than enough for practical purposes.

The result is the following: If the timestep $h=0.0001$ is chosen to be small enough, the integration now succesfully captures the singularity, unlike explicit integration.

Obviously, there are errors near the point of singularity $\sqrt{3}$ (which can be succesfully minimized using adaptive step-lengths) but it is truly a show of force for the implicit methods to take down the absolute worst you can throw at an integration algorithm and come out of it unscathed.

We will finally conclude the quest for the zero-delay feedback state-variable filter in the next part, having first built up the tools and concepts to deal with it.

Happy hacking, as always!

]]>
<![CDATA[How to: Zero Delay State Variable Filter Part 1]]> 2015-09-04T14:24:45+01:00 http://www.radiofreerobotron.net/blog/2015/09/04/how-to-zero-delay-state-variable-filter Preface

According to traditional textbook DSP, digital IIR filter structures achieve phase shifting by using cascaded single sample delay blocks. This is in contrast to ‘real’ filters in the analog domain, where phase shifting is due electronical properties of capacitors and inductors - these electronic circuits contain no signal delaying elements and can be then called zero delay.

It then seems like achieving zero delay filtering in the digital domain is impossible, since we are stuck with delay elements by definition. This is not the case. To see how this is possible, we first have to understand what digital IIR filters are mathematically.

Almost all the engineering tools in DSP are tools for simplifying the handling of Ordinary Differential Equations with the use of transfer functions. Handling and solving general ODEs requires a significant amount of mathematical baggage and this is the reason why DSP has not adopted the usage of these equations directly, but have instead developed their own jargon and engineering tradition based on the discretization of ODEs.

‘Infinite Impulse Response’ filter is then nothing more than a solver for a differential equation. This is also true for an analog circuit, which is just a physical implementation of a differential equation solver.

In any case, a practical ODE solver (in contrast to the exact closed-form symbolic solution) is based on integration. Traditional DSP techniques involve transforming the ODE to a difference equation which is then integrated numerically.

This is usually done by means which is equivalent to the explicit Euler method, the most basic approximation method available for solving ODEs by explicit integration. The resulting structure is simple enough to be interpreted as a cascade of delay elements - hence all the silly diagrams of delay element networks in DSP textbooks.

### Defining zero delay

What is then a ‘zero delay’ digital filter? Nothing more than a numerical integration method. Now, instead of working with explicit discrete delayed values, you use various ways to get a better implicit approximation of the derivative you are going to integrate the state of the equation with.

This might seem like cheating, to call something zero delay when you’re obviously still working with discrete delayed values and I agree to the notion to some degree. Zero delay seems like a confusing and misleading term in this context.

### From a differential equation to an algorithm

Let’s define a state variable filter as an ordinary second order differential equation:

where $u(t)$ is the input function.

Knowing the initial conditions of the state functions and the input function, a closed-form solution for $x_1$ and $x_2$ could be constructed, but we are interested in transforming this equation to an integration algorithm.

We are now going to look into explicit integration methods.

##### Euler method

The traditional way to do this is done by applying the Euler method to discretize the differential equation to an explicit difference equation.

In the method, the derivatives in the equation are evaluated as $y’(t)=f(t, y)$ and the algorithm for integrating the state is

where $h=t_{i+1}-t_{i}$ is the time step between $t_{i+1}$ and $t_{i}$.

Applying this to the differential equation gets us

It is now easy to form a practical pseudocode integration algorithm out of this:

To get the well known Chamberlin SVF you need to apply a slight modification and use the semi-implicit Euler method:

and in pseudocode, replacing the state variables with more descriptive names:

##### Heun’s method

To improve the Euler method, you can form a corrector step after the first Euler integration step. This is called the Heun’s method:

Being an explicit method, although comparable to implicit trapezoidal integration, it is straight-forward to apply to pseudocode:

We have looked at the two simplest explicit integration methods. In the next part we will go into the implicit methods and build a ‘zero delay’ digital state variable filter.

Happy hacking!

]]>