Embedding models
This notebook contains examples of using embedding models.
Table of contents
Using embedding models directly
Using embedding models with transforms
Baseline
EmbeddingSegmentTransform
EmbeddingWindowTransform
Saving and loading models
[1]:
import warnings
warnings.filterwarnings("ignore")
1. Using embedding models directly
We have two models to generate embeddings for time series: TS2VecEmbeddingModel and TSTCCEmbeddingModel.
Each model has the following methods:
fit: trains the model.
encode_segment: generates embeddings for the whole series. These features are regressors.
encode_window: generates embeddings for each timestamp. These features aren’t regressors, so a lag transformation should be applied to them before they are used in forecasting.
freeze: enables or disables skipping training in the fit method. It is useful, for example, when you have a pretrained model and only want to generate embeddings without retraining during backtest.
save and load: save and load pretrained models, respectively.
[2]:
from pytorch_lightning import seed_everything
seed_everything(42, workers=True)
Global seed set to 42
[2]:
42
[3]:
from etna.datasets import TSDataset
from etna.datasets import generate_ar_df
df = generate_ar_df(periods=10, start_time="2001-01-01", n_segments=3)
ts = TSDataset(df, freq="D")
ts.head()
[3]:
segment | segment_0 | segment_1 | segment_2 |
---|---|---|---|
feature | target | target | target |
timestamp | |||
2001-01-01 | 1.624345 | 1.462108 | -1.100619 |
2001-01-02 | 1.012589 | -0.598033 | 0.044105 |
2001-01-03 | 0.484417 | -0.920450 | 0.945695 |
2001-01-04 | -0.588551 | -1.304504 | 1.448190 |
2001-01-05 | 0.276856 | -0.170735 | 2.349046 |
Now let’s work with the models directly.
They expect an array of shape (n_segments, n_timestamps, num_features). The example uses TS2VecEmbeddingModel; TSTCCEmbeddingModel works the same way.
[4]:
x = ts.df.values.reshape(ts.size()).transpose(1, 0, 2)
x.shape
[4]:
(3, 10, 1)
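To make the expected layout concrete, the sketch below rebuilds the same (n_segments, n_timestamps, num_features) nesting from a wide table using plain Python lists. It assumes the wide table stores one row per timestamp with columns grouped by segment, as in the ts.head() output above. The helper name to_segment_major is hypothetical and not part of ETNA.

```python
from typing import List

Row = List[float]


def to_segment_major(wide_rows: List[Row], n_segments: int, n_features: int) -> List[List[Row]]:
    """Regroup rows of shape (n_timestamps, n_segments * n_features)
    into nested lists of shape (n_segments, n_timestamps, n_features).

    Assumes columns are grouped by segment: all features of segment 0,
    then all features of segment 1, and so on.
    """
    for row in wide_rows:
        if len(row) != n_segments * n_features:
            raise ValueError("row width does not match n_segments * n_features")
    return [
        [row[s * n_features:(s + 1) * n_features] for row in wide_rows]
        for s in range(n_segments)
    ]


# First two rows of the generated dataset above: 2 timestamps, 3 segments, 1 feature.
wide = [
    [1.624345, 1.462108, -1.100619],
    [1.012589, -0.598033, 0.044105],
]
x_nested = to_segment_major(wide, n_segments=3, n_features=1)
shape = (len(x_nested), len(x_nested[0]), len(x_nested[0][0]))
print(shape)  # (3, 2, 1)
```

The segment axis comes first, so x_nested[1] holds the whole history of segment_1.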
[5]:
from etna.transforms.embeddings.models import TS2VecEmbeddingModel
from etna.transforms.embeddings.models import TSTCCEmbeddingModel
model_ts2vec = TS2VecEmbeddingModel(input_dims=1, output_dims=2)
model_ts2vec.fit(x, n_epochs=1)
segment_embeddings = model_ts2vec.encode_segment(x)
segment_embeddings.shape
[5]:
(3, 2)
Because we are using encode_segment, we get output_dims features, each consisting of a single value per segment.
And what about encode_window?
[6]:
window_embeddings = model_ts2vec.encode_window(x)
window_embeddings.shape
[6]:
(3, 10, 2)
We get output_dims features, each consisting of n_timestamps values per segment.
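The two output shapes differ only in the time axis: a segment-level embedding has it removed. As a rough illustration of that relationship, the sketch below collapses a per-timestamp embedding of shape (n_segments, n_timestamps, output_dims) into (n_segments, output_dims) by taking the maximum over time. Max pooling is an assumption made for this example, and the models may aggregate differently internally. pool_over_time is a hypothetical helper, not an ETNA function.

```python
from typing import List


def pool_over_time(window_emb: List[List[List[float]]]) -> List[List[float]]:
    """Collapse (n_segments, n_timestamps, output_dims) to (n_segments, output_dims)
    by taking the maximum of every embedding dimension over the time axis."""
    return [
        [max(dim_values) for dim_values in zip(*segment)]
        for segment in window_emb
    ]


# Toy per-timestamp embeddings: 2 segments, 3 timestamps, 2 dimensions.
toy_window_emb = [
    [[0.1, 0.5], [0.4, 0.2], [0.3, 0.9]],
    [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]],
]
toy_segment_emb = pool_over_time(toy_window_emb)
print(toy_segment_emb)  # [[0.4, 0.9], [1.0, 1.0]]
```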
You can change some attributes of the model after initialization, for example device, batch_size or num_workers.
[7]:
model_ts2vec.device = "cuda"
2. Using embedding models with transforms
In this section we will test our models on an example dataset.
[8]:
HORIZON = 6
2.1 Baseline
Before working with embedding features, let’s make forecasts using ordinary lag features.
[9]:
from etna.datasets import load_dataset
ts = load_dataset("m3_monthly")
ts.drop_features(features=["origin_timestamp"])
ts.df_exog = None
ts.head()
[9]:
segment | M1000_MACRO | M1001_MACRO | M1002_MACRO | M1003_MACRO | M1004_MACRO | M1005_MACRO | M1006_MACRO | M1007_MACRO | M1008_MACRO | M1009_MACRO | ... | M992_MACRO | M993_MACRO | M994_MACRO | M995_MACRO | M996_MACRO | M997_MACRO | M998_MACRO | M999_MACRO | M99_MICRO | M9_MICRO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
feature | target | target | target | target | target | target | target | target | target | target | ... | target | target | target | target | target | target | target | target | target | target |
timestamp | |||||||||||||||||||||
0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 1428 columns
[10]:
from etna.metrics import SMAPE
from etna.models import CatBoostMultiSegmentModel
from etna.pipeline import Pipeline
from etna.transforms import LagTransform
model = CatBoostMultiSegmentModel()
lag_transform = LagTransform(in_column="target", lags=list(range(HORIZON, HORIZON + 6)), out_column="lag")
pipeline = Pipeline(model=model, transforms=[lag_transform], horizon=HORIZON)
metrics_df, _, _ = pipeline.backtest(ts, metrics=[SMAPE()], n_folds=3)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 4.3s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 8.4s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 13.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 13.1s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.4s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.1s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s finished
[11]:
print("SMAPE: ", metrics_df["SMAPE"].mean())
SMAPE: 14.719683971886594
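For readers unfamiliar with the metric, here is a minimal pure-Python sketch of SMAPE on the common 0 to 200 scale. It is assumed, not verified here, that etna.metrics.SMAPE follows this definition; the eps guard and the function name smape are choices made for this example.

```python
from typing import Sequence


def smape(y_true: Sequence[float], y_pred: Sequence[float], eps: float = 1e-15) -> float:
    """Symmetric mean absolute percentage error on the 0..200 scale."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denominator = max(abs(t) + abs(p), eps)  # guard against division by zero
        total += 2.0 * abs(t - p) / denominator
    return 100.0 * total / len(y_true)


perfect = smape([10.0, 20.0], [10.0, 20.0])
tripled = smape([100.0], [300.0])
print(perfect)  # 0.0
print(tripled)  # 100.0
```

Lower is better, so the baseline value above is the number the embedding features need to beat.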
2.2 EmbeddingSegmentTransform
EmbeddingSegmentTransform calls the model’s encode_segment method internally.
[12]:
from etna.transforms import EmbeddingSegmentTransform
from etna.transforms.embeddings.models import BaseEmbeddingModel
def forecast_with_segment_embeddings(emb_model: BaseEmbeddingModel, training_params: dict) -> float:
    """Backtest CatBoost on lag features plus segment embeddings and return the mean SMAPE."""
    model = CatBoostMultiSegmentModel()

    emb_transform = EmbeddingSegmentTransform(
        in_columns=["target"], embedding_model=emb_model, training_params=training_params, out_column="emb"
    )
    pipeline = Pipeline(model=model, transforms=[lag_transform, emb_transform], horizon=HORIZON)
    metrics_df, _, _ = pipeline.backtest(ts, metrics=[SMAPE()], n_folds=3)
    smape_score = metrics_df["SMAPE"].mean()
    return smape_score
You can inspect the parameters of the model’s fit method to see what can be passed to the transform through training_params.
Let’s begin with TSTCCEmbeddingModel.
[13]:
?TSTCCEmbeddingModel.fit
[14]:
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
emb_model = TSTCCEmbeddingModel(input_dims=1, tc_hidden_dim=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_segment_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 34.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.1min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.7min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.7min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 1.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 2.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 3.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 3.1s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s finished
[15]:
print("SMAPE: ", smape_score)
SMAPE: 14.214904390075835
Better than without embeddings. Let’s try TS2VecEmbeddingModel.
[16]:
emb_model = TS2VecEmbeddingModel(input_dims=1, hidden_dims=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_segment_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 27.7s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 58.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.7min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.7min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 1.7s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 3.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 4.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 4.1s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s finished
[17]:
print("SMAPE: ", smape_score)
SMAPE: 13.549340740762041
Much better. Now let’s try another transform.
2.3 EmbeddingWindowTransform
EmbeddingWindowTransform calls the model’s encode_window method internally. As discussed above, these features are not regressors, so they must be lagged by at least the forecast horizon before being used for forecasting.
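The need for lagging can be seen on a toy series. Per-timestamp embeddings exist only for observed timestamps, so a feature used to predict timestamp t has to come from t - HORIZON or earlier. The sketch below shifts one hypothetical embedding dimension by the horizon. The lag helper and the sample values are illustrative only, and in the notebook this role is played by LagTransform.

```python
from typing import List, Optional


def lag(values: List[float], n: int) -> List[Optional[float]]:
    """Shift a series forward by n steps; the first n positions have no value."""
    if n < 0:
        raise ValueError("lag must be non-negative")
    return [None] * min(n, len(values)) + values[: max(len(values) - n, 0)]


HORIZON = 6
# Hypothetical values of one embedding dimension for 8 observed timestamps.
emb_dim = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

lagged = lag(emb_dim, HORIZON)
print(lagged)  # [None, None, None, None, None, None, 0.1, 0.2]
```

After the shift, the value at position t was computed HORIZON steps earlier, so it is already known for every timestamp inside the forecast horizon.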
[18]:
from etna.transforms import EmbeddingWindowTransform
from etna.transforms import FilterFeaturesTransform
def forecast_with_window_embeddings(emb_model: BaseEmbeddingModel, training_params: dict) -> float:
    """Backtest CatBoost on lag features plus lagged window embeddings and return the mean SMAPE."""
    model = CatBoostMultiSegmentModel()

    output_dims = emb_model.output_dims
    emb_transform = EmbeddingWindowTransform(
        in_columns=["target"], embedding_model=emb_model, training_params=training_params, out_column="embedding_window"
    )
    # Window embeddings are not regressors, so lag every embedding dimension by the horizon.
    lag_emb_transforms = [
        LagTransform(in_column=f"embedding_window_{i}", lags=[HORIZON], out_column=f"lag_emb_{i}")
        for i in range(output_dims)
    ]
    # Drop the raw (unlagged) embedding columns so the model sees only the lagged ones.
    filter_transforms = FilterFeaturesTransform(exclude=[f"embedding_window_{i}" for i in range(output_dims)])

    transforms = [lag_transform] + [emb_transform] + lag_emb_transforms + [filter_transforms]

    pipeline = Pipeline(model=model, transforms=transforms, horizon=HORIZON)
    metrics_df, _, _ = pipeline.backtest(ts, metrics=[SMAPE()], n_folds=3)
    smape_score = metrics_df["SMAPE"].mean()
    return smape_score
[19]:
emb_model = TSTCCEmbeddingModel(input_dims=1, tc_hidden_dim=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_window_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 53.9s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.8min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 2.7min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 2.7min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 10.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 20.9s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 31.6s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 31.6s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s finished
[20]:
print("SMAPE: ", smape_score)
SMAPE: 104.68988621650867
Oops… What about TS2VecEmbeddingModel?
[21]:
emb_model = TS2VecEmbeddingModel(input_dims=1, hidden_dims=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_window_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 34.5s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.2min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.8min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.8min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 8.6s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 17.4s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 26.3s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 26.3s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s finished
[22]:
print("SMAPE: ", smape_score)
SMAPE: 29.776520212845234
Window embeddings don’t help with this dataset: both results are worse than the baseline SMAPE of 14.72. In practice, you should try both models and both transforms and keep the combination that gives the best results.
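The scores from this section can be collected in one place for comparison. The snippet below only restates the SMAPE values printed in the cells above and picks the lowest one; the dictionary keys are labels chosen for this example.

```python
# SMAPE values reported in the cells above (lower is better).
results = {
    "baseline (lags only)": 14.719683971886594,
    "segment + TSTCC": 14.214904390075835,
    "segment + TS2Vec": 13.549340740762041,
    "window + TSTCC": 104.68988621650867,
    "window + TS2Vec": 29.776520212845234,
}

baseline = results["baseline (lags only)"]
for name, score in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:<22} SMAPE={score:7.2f}  diff vs baseline={score - baseline:+7.2f}")

best = min(results, key=results.get)
print("best:", best)  # best: segment + TS2Vec
```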
3. Saving and loading models
If you have a pretrained embedding model and don’t want to train it when fit is called, you should “freeze” its training loop. This is helpful when using the model inside transforms, which call the model’s fit method on each fit of the pipeline.
[23]:
MODEL_PATH = "model.zip"
[24]:
emb_model.freeze()
emb_model.save(MODEL_PATH)
Now you can load the pretrained model.
[25]:
model_loaded = TS2VecEmbeddingModel.load(MODEL_PATH)
If you need to fine-tune the pretrained model, you should “unfreeze” the training loop. After that, calling the fit method will train the model again.
[26]:
model_loaded.freeze(is_freezed=False)
To check whether the model is frozen, use the is_freezed property.
[27]:
model_loaded.is_freezed
[27]:
False
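The freeze behavior described in this section can be summarized with a toy class. This is a hypothetical stand-in that mimics only the interplay between freeze, is_freezed and fit as described above. It is not ETNA's implementation and does no real training.

```python
class ToyEmbeddingModel:
    """Hypothetical stand-in that mimics only the freeze/fit interplay."""

    def __init__(self) -> None:
        self._is_freezed = False
        self.n_fit_calls = 0  # counts the fit calls that actually trained

    @property
    def is_freezed(self) -> bool:
        return self._is_freezed

    def freeze(self, is_freezed: bool = True) -> None:
        self._is_freezed = is_freezed

    def fit(self, x) -> "ToyEmbeddingModel":
        if not self._is_freezed:
            self.n_fit_calls += 1  # real training would happen here
        return self


toy = ToyEmbeddingModel()
toy.fit([1.0, 2.0])           # trains
toy.freeze()
toy.fit([1.0, 2.0])           # skipped: the model is frozen
toy.freeze(is_freezed=False)
toy.fit([1.0, 2.0])           # trains again
print(toy.n_fit_calls)  # 2
```

A pipeline that refits its transforms on every backtest fold would therefore leave a frozen model's weights untouched.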