It is common, when looking at time series, to see a cyclic pattern: the values in the series oscillate predictably. This may be driven by an underlying effect tied to the time of day, the day of the week, or even the day of the year. We can model this with a little feature engineering.

What we are talking about is a wave, and a simple wave can be modelled as a sine function. Suppose we have a time variable \(t\) that is normalised to take values between \(0\) and \(1\). Then an arbitrary wave will be of the form

\[ y(t) = A\sin(2\pi f t + o) \]

where

- \(A\) is the amplitude of the wave, that is, how high each crest and how low each trough is
- \(f\) is the frequency, that is, how many times the wave repeats in a single unit of time
- \(o\) is the offset, that is, how far the wave is shifted along the time axis
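To make the three parameters concrete, here is a minimal sketch of such a wave in Python. It assumes the sine form \(A\sin(2\pi f t + o)\) with \(t\) scaled to the unit interval; the function name `wave` and its argument names are illustrative, not part of any library.

```python
import math

def wave(t, amplitude, frequency, offset):
    """Value at time t of a sine wave, assuming the form A*sin(2*pi*f*t + o)."""
    return amplitude * math.sin(2 * math.pi * frequency * t + offset)

# With frequency 4 the wave completes four cycles as t runs from 0 to 1,
# so values a quarter of a unit apart should coincide.
first = wave(0.10, amplitude=1.8, frequency=4, offset=0.2)
second = wave(0.35, amplitude=1.8, frequency=4, offset=0.2)
print(math.isclose(first, second))
```

The amplitude only scales the result, while the frequency and offset both act inside the sine.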

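Now, these terms are quite non-linear (both \(f\) and \(o\) sit inside the sine), so they will be hard to model. However, we can do a little trigonometry. Thanks to the identity

\[ \sin(a + b) = \sin(a)\cos(b) + \cos(a)\sin(b) \]

we can rewrite the wave function as

\[ y(t) = A\cos(o)\sin(2\pi f t) + A\sin(o)\cos(2\pi f t) \]

We can then define \(B = A\sin(o)\) and \(C = A\cos(o)\), and then write any wave in the form

\[ y(t) = C\sin(2\pi f t) + B\cos(2\pi f t) \]

Now the model is linear in \(B\) and \(C\), and the only nonlinear term is the frequency of the oscillations.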
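A quick numerical check of this rewrite, as a sketch: the values of `A`, `f` and `o` below are arbitrary illustrative choices, and the two expressions should agree up to floating-point error.

```python
import math

A, f, o = 2.5, 3, 0.7          # arbitrary example parameters
B = A * math.sin(o)            # coefficient of the cosine term
C = A * math.cos(o)            # coefficient of the sine term

def original(t):
    # The wave written with amplitude, frequency and offset
    return A * math.sin(2 * math.pi * f * t + o)

def linear_form(t):
    # The same wave, linear in B and C once the frequency f is fixed
    return C * math.sin(2 * math.pi * f * t) + B * math.cos(2 * math.pi * f * t)

# Largest disagreement between the two forms over a grid of times in [0, 1)
max_gap = max(abs(original(i / 500) - linear_form(i / 500)) for i in range(500))
print(max_gap)
```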

Let's look at an example. Suppose we have a time series that looks like this:

By eyeballing it, we can see a seasonal effect which repeats four times. So let's try a linear regression against the features \(\sin(4 \cdot 2\pi x)\) and \(\cos(4 \cdot 2\pi x)\).

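```
import numpy as np
from sklearn.linear_model import LinearRegression
# X (time, scaled to [0, 1]) and Y (the series) are 1-D arrays from wherever
# our data comes from; load_data is a hypothetical placeholder for that step
X, Y = load_data()

# Four cycles per unit of time, so the angle is X * 4 * 2*pi
sin_feature = np.sin(X*4*2*np.pi)
cos_feature = np.cos(X*4*2*np.pi)
features = np.column_stack([sin_feature, cos_feature])
harmonic_model = LinearRegression().fit(features, Y)
```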

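Now, I ran this regression myself, and the coefficients I got were 1.69 and 0.38. So, we have

\[ y(x) \approx 1.69\sin(4 \cdot 2\pi x) + 0.38\cos(4 \cdot 2\pi x) \]

This means our model's offset is

\[ o = \arctan\left(\frac{B}{C}\right) = \arctan\left(\frac{0.38}{1.69}\right) \approx 0.22 \]

and our amplitude is

\[ A = \sqrt{B^2 + C^2} = \sqrt{0.38^2 + 1.69^2} \approx 1.73 \]

This is pretty good, as I generated the above data with an amplitude of 1.8 and an offset of 0.2 via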

```
import numpy as np
X = np.arange(0, 1, 0.002)
Y = 1.8*np.sin(0.2 + X*4*2*np.pi)
```
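As a sanity check, here is a sketch that fits the same two features to noise-free data generated this way and converts the fitted coefficients back into an amplitude and an offset. It makes a few assumptions of its own: `np.linalg.lstsq` stands in for `LinearRegression` (so no intercept is fitted), and the helper `amplitude_and_offset` is an illustrative name rather than anything from a library.

```python
import numpy as np

def amplitude_and_offset(sin_coef, cos_coef):
    """Convert the coefficients of the sin and cos features to (amplitude, offset).

    Assumes sin_coef = A*cos(o) and cos_coef = A*sin(o).
    """
    amplitude = np.hypot(sin_coef, cos_coef)
    offset = np.arctan2(cos_coef, sin_coef)
    return amplitude, offset

# Noise-free data, generated as above
X = np.arange(0, 1, 0.002)
Y = 1.8 * np.sin(0.2 + X * 4 * 2 * np.pi)

# Design matrix with one sine and one cosine column at frequency 4
features = np.column_stack([np.sin(X * 4 * 2 * np.pi), np.cos(X * 4 * 2 * np.pi)])
(sin_coef, cos_coef), *_ = np.linalg.lstsq(features, Y, rcond=None)

fitted_amplitude, fitted_offset = amplitude_and_offset(sin_coef, cos_coef)
print(fitted_amplitude, fitted_offset)

# The same conversion applied to the coefficients reported earlier
print(amplitude_and_offset(1.69, 0.38))
```

Using `np.arctan2` rather than a plain arctangent of the ratio keeps the offset in the right quadrant when the sine coefficient is negative or zero.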

We won't always have such an obvious cyclical pattern where we can spot the frequency by eye. We'll be looking at how to determine the appropriate frequency in these cases in a future post.