# Stable Fiddusion — Acko.net

*by* Phil Tadros

A naive approach doesn't work properly, however. That's because the spectrum of a subregion doesn't match the spectrum of the whole.

The Fourier transform assumes every signal is periodic. If you take a random subregion and forcibly repeat it, its new spectrum will have aliasing artifacts. This can cause you to consistently misjudge swaps.

To fix this, you need to window the signal in the space/time-domain. This forces it to start and end at 0, and eliminates the effect of non-matching boundaries on the scoring. I used a `smoothStep` window because it's cheap, and haven't needed to try anything else:

*16×16 windowed data*

$$ w(t) = 1 - (3|t|^2 - 2|t|^3), \quad t \in [-1, 1] $$
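As a minimal sketch (function names are my own, not from the article), the window above translates to plain code, applied separably to a square tile:

```python
def smoothstep_window(t):
    # w(t) = 1 - (3|t|^2 - 2|t|^3) for t in [-1, 1]; zero outside.
    a = abs(t)
    if a >= 1.0:
        return 0.0
    return 1.0 - (3.0 * a * a - 2.0 * a * a * a)

def windowed_tile(tile):
    # Apply the separable window to a square tile (list of rows),
    # mapping pixel index i to t in [-1, 1].
    n = len(tile)
    w = [smoothstep_window(2.0 * i / (n - 1) - 1.0) for i in range(n)]
    return [[tile[y][x] * w[y] * w[x] for x in range(n)] for y in range(n)]
```

The windowed tile starts and ends at exactly 0 along both axes, which is the only property the scoring needs.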

This still alters the spectrum, but in a predictable way. A time-domain window is a convolution in the frequency domain. You don't really have a choice here: *not* using a window is mathematically equivalent to using a *very bad* window. It's effectively a box filter covering the cut-out area inside the larger volume, which causes spectral ringing.

The effect of the chosen window on the target spectrum can be modeled via convolution of their spectral magnitudes:

$$ \mathbf{target}' = |\mathbf{target}| \circledast |\mathcal{F}(\mathbf{window})| $$

This can be done via the time domain as:

$$ \mathbf{target}' = |\mathcal{F}(\mathcal{F}^{-1}(|\mathbf{target}|) \cdot \mathbf{window})| $$

Note that the forward/inverse Fourier pair is not redundant, as there is an absolute value operator in between. This discards the phase component of the window, which is irrelevant.
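A sketch of that time-domain route, using a small pure-Python DFT (helper names are mine). Multiplying by the window in time is the same as circularly convolving the spectra, which the test below verifies:

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * m * k / n) for k in range(n)) / n
            for m in range(n)]

def windowed_target(target_mag, window):
    # target' = |F( F^-1(|target|) * window )|
    time = idft(target_mag)
    prod = [t * w for t, w in zip(time, window)]
    return [abs(v) for v in dft(prod)]
```

By the convolution theorem, this equals the magnitude of the circular convolution of `target_mag` with `dft(window)`, divided by N.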

Curiously, while it's essential to window the noise data, it's not necessary to window the target. The effect of the spectral convolution is small, amounting to a slight blur, and the extra error is random and absorbed by the smooth scoring function.

The resulting local loss tracks the global loss function quite closely. It massively speeds up the search in larger volumes, because the big FFT is the bottleneck. But it stops working well before anything resembling convergence in the frequency domain. It doesn't make true blue noise, only a lookalike.

The overall problem is still that we can't tell good swaps from bad swaps without trying them and verifying.

## Sleight of Frequency

So, let's characterize the effect of a pixel swap.

Given a signal `[A B C D E F G H]`, let's swap C and F.

Swapping the two values is the same as adding `F - C = Δ` to `C`, and subtracting that same delta from `F`. That is, you add the vector:

`V = [0 0 Δ 0 0 -Δ 0 0]`

This remains true if you apply a Fourier transform and do it in the frequency domain.
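That's just linearity of the Fourier transform; a quick pure-Python check (values and helper names are my own):

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

signal = [5.0, 1.0, 2.0, 7.0, 3.0, 8.0, 4.0, 6.0]  # [A B C D E F G H]
i, j = 2, 5                                         # swap C and F
delta = signal[j] - signal[i]                       # F - C = Δ

# V = [0 0 Δ 0 0 -Δ 0 0]
v = [0.0] * 8
v[i], v[j] = delta, -delta

swapped = list(signal)
swapped[i], swapped[j] = swapped[j], swapped[i]

# FFT(swapped) == FFT(signal) + FFT(V), bin by bin
lhs = dft(swapped)
rhs = [a + b for a, b in zip(dft(signal), dft(v))]
```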

To best understand this, you need to develop some intuition around FFTs of Dirac deltas.

Consider the short filter kernel `[1 4 6 4 1]`. It's a little-known fact, but you can actually sight-read its frequency spectrum straight off the coefficients, because the filter is symmetrical. I'll teach you.

The extremes are easy:

- The DC amplitude is the sum 1 + 4 + 6 + 4 + 1 = 16
- The Nyquist amplitude is the modulated sum 1 - 4 + 6 - 4 + 1 = 0

So we already know it's an 'ideal' lowpass filter, which reduces the Nyquist signal +1, -1, +1, -1, … to exactly zero. It also has 16x DC gain.

Now for all the other frequencies.

First, remember that the Fourier transform works in symmetric ways. Every statement *"____ in the time domain = ____ in the frequency domain"* remains true if you swap the words *time* and *frequency*. This has led to the grotesquely named sub-field of *cepstral* processing, where you have *quefrencies* and *vawes*, and it kinda feels like you're having a stroke. The cepstral convolution filter from earlier is called a *lifter*.

Usually cepstral processing is applied to the real magnitude of the spectrum, i.e. $ |\mathrm{spectrum}| $, instead of its true complex value. This is a coward move.

So, decompose the kernel into symmetric pairs:

```
[· · 6 · ·]
[· 4 · 4 ·]
[1 · · · 1]
```

All but the first row is a pair of real Dirac deltas in the time domain. Such a row is normally what you get when you Fourier transform a *cosine*, i.e.:

$$ \cos \omega = \frac{\mathrm{e}^{i\omega} + \mathrm{e}^{-i\omega}}{2} $$

A cosine in time is a *pair* of Dirac deltas in the frequency domain. The phase of a (real) cosine is zero, so both its deltas are real.

Now flip it around. The Fourier transform of a *pair* `[x 0 0 ... 0 0 x]` is a *real cosine* in the frequency domain. Must be true. Each new pair adds a new, higher cosine on top of the existing spectrum. For the central `[... 0 0 x 0 0 ...]` we add a DC term. It's just a Fourier transform in the other direction:

```
|FFT([1 4 6 4 1])| =
[· · 6 · ·] => 6
[· 4 · 4 ·] => 8 cos(ω)
[1 · · · 1] => 2 cos(2ω)
= |6 + 8 cos(ω) + 2 cos(2ω)|
```

Normally you'd use the z-transform to analyze a digital filter. But the above is a shortcut. FFTs and inverse FFTs do have opposite phase, but that doesn't matter here because `cos(ω) = cos(-ω)`.

This works for the symmetric-even case too: you offset the frequencies by half a band, ω/2, and there's no DC term in the middle:

```
|FFT([1 3 3 1])| =
[· 3 3 ·] => 6 cos(ω/2)
[1 · · 1] => 2 cos(3ω/2)
= |6 cos(ω/2) + 2 cos(3ω/2)|
```

So, symmetric filters have spectra that are made up of regular cosines. Now you know.
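Both sight-read spectra can be checked against a direct DFT. A small pure-Python sketch (helper names are mine), computing the worst-case mismatch across all bins:

```python
import cmath, math

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

# Odd length: |FFT([1 4 6 4 1])| should equal |6 + 8 cos(w) + 2 cos(2w)|.
odd = [1.0, 4.0, 6.0, 4.0, 1.0]
odd_err = max(
    abs(abs(z) - abs(6 + 8 * math.cos(w) + 2 * math.cos(2 * w)))
    for n, z in enumerate(dft(odd))
    for w in [2 * math.pi * n / len(odd)]
)

# Even length: |FFT([1 3 3 1])| should equal |6 cos(w/2) + 2 cos(3w/2)|.
even = [1.0, 3.0, 3.0, 1.0]
even_err = max(
    abs(abs(z) - abs(6 * math.cos(w / 2) + 2 * math.cos(3 * w / 2)))
    for n, z in enumerate(dft(even))
    for w in [2 * math.pi * n / len(even)]
)
```

Both errors come out at floating-point noise: the cosine decomposition is exact, up to the overall phase caused by centering the filter away from index 0.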

For the purpose of this trick, we centered the filter around $ t = 0 $. FFTs are usually aligned to *array index* 0. The difference between the two is just phase, however, so it can be disregarded.

What about the delta vector `[0 0 Δ 0 0 -Δ 0 0]`? It isn't symmetric, so we have to decompose it:

```
V1 = [· · · · · Δ · ·]
V2 = [· · Δ · · · · ·]
V = V2 - V1
```

Each is now an unpaired Dirac delta. Each vector's Fourier transform is a complex wave $ \Delta \cdot \mathrm{e}^{-i \omega k} $ in the frequency domain (the k'th *quefrency*). It lacks the usual complementary, oppositely twisting wave $ \Delta \cdot \mathrm{e}^{i \omega k} $, so it is *not* real-valued. It has constant magnitude Δ and varying phase:

*(figure: FFT(V1) and FFT(V2) — complex waves with constant magnitude Δ and linearly varying phase)*

These are *vawes*.

The effect of a swap is still just to add `FFT(V)`, aka `FFT(V2) - FFT(V1)`, to the (complex) spectrum. The effect is to transfer energy between all the bands simultaneously. Hence, `FFT(V1)` and `FFT(V2)` function as a *source* and *destination* mask for the transfer.

However, 'mask' is the wrong word, because the magnitude of $ \mathrm{e}^{i \omega k} $ is always 1. It doesn't have varying amplitude, only varying phase. `-FFT(V1)` and `FFT(V2)` define the complex *direction* in which to add/subtract energy.

When added together, their phases interfere constructively or destructively, resulting in an amplitude that varies between 0 and 2Δ: an actual mask. The resulting phase will be halfway between the two, as it's the sum of two equal-length complex numbers.

*(figure: FFT(V) = FFT(V2) - FFT(V1), the resulting interference mask)*
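That 0-to-2Δ amplitude can be checked numerically. A pure-Python sketch (helper names mine), using the example swap at indices 2 and 5 with N = 8:

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

N, delta = 8, 1.0
v1 = [0.0] * N; v1[5] = delta        # the subtracted delta
v2 = [0.0] * N; v2[2] = delta        # the added delta
v = [b - a for a, b in zip(v1, v2)]  # V = V2 - V1

mags = [abs(z) for z in dft(v)]
# Each bin's magnitude lies between 0 and 2Δ, depending on how the two
# unit-magnitude vawes interfere. The DC bin is always exactly 0,
# because a swap never changes the total sum of the signal.
```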

For any given pixel A and its delta `FFT(V1)`, it can pair up with other pixels B to form N-1 different interference masks `FFT(V2) - FFT(V1)`. There are N(N-1)/2 unique interference masks, if you account for (A,B) (B,A) symmetry.

Worth pointing out: the FFT of the first index is

`FFT([Δ 0 0 0 0 0 0 0]) = [Δ Δ Δ Δ Δ Δ Δ Δ]`

This is the DC quefrency, and the Fourier symmetry continues to work. Shifting values in time causes the vawe's quefrency to change in the frequency domain. This is the upside-down version of how moving energy to another frequency band causes the wave's frequency to change in the time domain.
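A tiny check of that statement (pure Python, names mine): a delta at index 0 gives a flat, zero-phase spectrum, and shifting it only changes the phase twist, never the magnitude.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

delta = 2.0
flat = dft([delta] + [0.0] * 7)   # delta at index 0: the DC quefrency
# Every bin equals Δ exactly: flat magnitude, zero phase.

shifted = dft([0.0] * 3 + [delta] + [0.0] * 4)  # same delta moved to index 3
# Magnitude stays Δ everywhere; only the phase slope (quefrency) changes.
mags = [abs(z) for z in shifted]
```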

## What's the Gradient, Kenneth?

Using vectors as masks… shifting energy in directions… this implies gradient descent, no?

Well.

It is indeed possible to calculate the derivative of your loss function as a function of input pixel brightness, with the usual bag of automatic differentiation/backprop tricks. You can also do it numerically.

But this doesn't help you directly, because the only way you can act on that per-pixel gradient is by swapping a *pair* of pixels. You have to find two quefrencies `FFT(V1)` and `FFT(V2)` which interfere in *exactly* the right way to decrease the loss function across all bad frequencies simultaneously, while leaving the good ones untouched. Even if the gradient were to help you pick a good starting pixel, that still leaves the problem of finding a good partner.

There are still O(N²) possible pairs to choose from, and the entire spectrum changes a little bit on every swap. Which means new FFTs to analyze it.

Random greedy search is actually difficult to beat in practice. Whatever extra work you spend on getting better samples translates into fewer samples tried per second. E.g. taking a best-of-3 approach is worse than just trying 3 swaps in a row. Swaps are almost always orthogonal.
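As an illustrative sketch (not the article's actual implementation), here is a 1-D random greedy swap search that uses the `FFT(V)` trick to update the spectrum incrementally, instead of recomputing a full FFT per candidate swap. The target and loss are my own toy choices:

```python
import cmath, random

N = 32
random.seed(42)

def dft(x):
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / N) for k in range(N))
            for m in range(N)]

# Toy blue-noise-ish target: no energy at DC, rising toward Nyquist.
target = [0.0] + [float(min(m, N - m)) for m in range(1, N)]

def loss(spectrum):
    return sum((abs(s) - t) ** 2 for s, t in zip(spectrum, target))

signal = [i / (N - 1) for i in range(N)]
random.shuffle(signal)

spectrum = dft(signal)
initial = best = loss(spectrum)

for _ in range(2000):
    i, j = random.sample(range(N), 2)
    d = signal[j] - signal[i]
    # Incremental update: add FFT(V) = Δ(e^{-iωi} - e^{-iωj}) per bin n.
    trial = [s + d * (cmath.exp(-2j * cmath.pi * n * i / N)
                      - cmath.exp(-2j * cmath.pi * n * j / N))
             for n, s in enumerate(spectrum)]
    trial_loss = loss(trial)
    if trial_loss < best:                 # greedy: keep only improvements
        signal[i], signal[j] = signal[j], signal[i]
        spectrum, best = trial, trial_loss
```

Each trial costs O(N) instead of the O(N log N) of a fresh FFT, and rejected swaps cost nothing beyond the trial itself.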

But `random()` still samples unevenly, because it's white noise. If only we had…. oh wait. Indeed, if you already have blue noise of the right size, you can use it to mildly speed up the search for more. Use it as a random permutation to drive sampling, with some inevitable whitening over time to keep it fresh. You can't, however, use the noise you're generating to accelerate its own search, because the two are highly correlated.

What's really going on is all a consequence of the loss function.