What is Tangent?

Tangent is Tangent Works’ automatic model building engine. It is designed specifically for time series forecasting and anomaly detection.

Automatic model building

Tangent differs from traditional time series modeling techniques because its focus lies on feature engineering rather than on hyperparameter tuning and model selection.

In time series modeling, we believe that identification of significant features and the overall modeling framework (how to address changing dynamics in a time series, dynamic data availability, multi-situational forecasts, etc.) are far more important than the choice of a specific modeling technique and its associated hyperparameters.

Tangent generates a single high-quality model in a single pass through the data. Its modeling strategy identifies the relevant features present in the data, and it works significantly faster (seconds to minutes on standard hardware) than traditional strategies such as AutoML.

Real time model building

We took forecasting to the next level and unlocked the use of machine learning technology to work almost in real time. How does it work?

Tangent generates a model and produces a forecast in one single request. The model is then discarded; the next time a forecast is required, the entire process is simply repeated. This is inherently different from AutoML. All Tangent needs to know from the user is how many steps ahead the forecast should be calculated.
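The request pattern described above can be illustrated with a small Python sketch. This is not Tangent's API or algorithm: the function name forecast_once and the simple drift model inside it are assumptions chosen only to show that model building and forecasting happen in the same call and that nothing is stored between calls.

```python
def forecast_once(history, steps_ahead):
    """Build a throwaway model from `history`, forecast, and keep nothing.

    Hypothetical stand-in for a single build-and-forecast request.
    Assumes `history` holds at least two numeric observations.
    """
    # "Model building": estimate the average step-to-step change (drift).
    drift = (history[-1] - history[0]) / (len(history) - 1)
    # "Forecasting": extrapolate the drift from the last observation.
    return [history[-1] + drift * h for h in range(1, steps_ahead + 1)]


# Each request starts from the data alone; no model survives between calls.
first = forecast_once([1.0, 2.0, 3.0], steps_ahead=2)
second = forecast_once([1.0, 2.0, 3.0, 5.0], steps_ahead=2)
```

The only inputs are the data and the number of steps ahead, which mirrors the single user setting mentioned above.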

Designed for time series

Changing data patterns

Phenomena represented by time series are dynamic. Consequently, a working model is not guaranteed to stay up to date. Numerous examples in different industries illustrate this reality: changing portfolios of assets in finance; changing influential factors in trading, where changes are so frequent that models must be rebuilt from scratch; and changing portfolios of production assets in the utility sector.

Previously valuable models can suddenly become useless in these new situations. This forces users to repeat the model building process, for example by performing a new AutoML search, though this is often not an optimal solution. These new situations tend to require the identification of new significant features rather than a different modeling technique or slightly adjusted hyperparameters.

Tangent empowers users to adapt to new situations by allowing models to be rebuilt or recalibrated continuously. Whereas model recalibration only adjusts the model’s parameters and leaves the model’s structure (features) intact, model rebuilding starts by identifying new features and then builds a completely new model. Through the identification of new features, the process of model rebuilding is made robust to change.
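The distinction between recalibrating and rebuilding can be sketched in Python. The code below is a toy illustration under stated assumptions, not Tangent's method: the "structure" of the model is reduced to a single lag of the target, the "parameters" are an intercept and a slope, and the names fit_lag, rebuild and recalibrate are hypothetical.

```python
def fit_lag(series, lag):
    """Fit y(t) = intercept + slope * y(t - lag) by ordinary least squares.

    Assumes the lagged values are not all identical.
    """
    xs, ys = series[:-lag], series[lag:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    mse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)) / n
    return {"lag": lag, "intercept": intercept, "slope": slope, "mse": mse}


def rebuild(series, candidate_lags):
    """Re-identify the feature (here: the best lag), then fit a new model."""
    return min((fit_lag(series, k) for k in candidate_lags),
               key=lambda model: model["mse"])


def recalibrate(model, series):
    """Keep the model's feature (its lag) and refit only the parameters."""
    return fit_lag(series, model["lag"])


model = rebuild([1.0, 2.0, 4.0] * 6, candidate_lags=[1, 2, 3])
updated = recalibrate(model, [1.0, 2.0, 4.0, 2.0, 4.0, 8.0,
                              4.0, 8.0, 16.0, 8.0, 16.0, 32.0])
```

In this sketch, recalibrate can only change the intercept and slope, whereas rebuild is free to pick a different lag when the data pattern changes.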

Data availability

Data availability introduces another complexity: the data that is actually available at the moment the model is used must always be considered, and this can vary from one forecast to the next. Imagine your model uses a lagged value of the target variable from 6 hours ago, i.e. y(t-6). It is possible that this data is not yet available when a forecast needs to be made. As a result of this unavailable feature, the model cannot be used for forecasting.

In Tangent, a data availability scheme is attached to each model building effort. This allows Tangent to take expected data availability conditions into account when building models. Even so, it is possible that data availability changes over time or that some unexpected changes in data availability occur. Thanks to Tangent’s real time capabilities, a model can instantly be rebuilt from scratch using the constraints of this new situation. In the above example, Tangent can easily build a new model without this feature that can be used to make the desired forecast.
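To make the availability constraint concrete, the following Python sketch filters candidate lags of the target by whether they are known when the forecast is issued. The function usable_lags and its parameters horizon and delay are illustrative assumptions, not part of Tangent.

```python
def usable_lags(candidate_lags, horizon, delay):
    """Return the lags of the target that are available at forecast time.

    A lag k refers to y(t - k), where t is the time being forecast.
    The forecast is issued `horizon` steps before t, and the newest
    observation is `delay` steps older than the issue time, so y(t - k)
    is known only if k >= horizon + delay.
    """
    return [k for k in candidate_lags if k >= horizon + delay]


# Data arrives on time: every candidate lag can be used.
print(usable_lags([1, 6, 12, 24], horizon=1, delay=0))  # [1, 6, 12, 24]
# Data arrives 8 steps late: y(t-6) is no longer available.
print(usable_lags([1, 6, 12, 24], horizon=1, delay=8))  # [12, 24]
```

A model restricted to the second list of lags can still produce the forecast when y(t-6) is missing.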

Multi-point forecasts

Situations where only a single-point forecast is required are rare in industrial practice; most industrial verticals need multi-point forecasts. These have traditionally been addressed by multi-output models and recurrent strategies.

Intuitively, building and optimizing a multi-output model is harder than doing so for a single-output model, because model parameters need to be optimized against all the outputs simultaneously. These models thus tend to exhibit a higher complexity compared to single-output models (e.g. more hidden layers in a neural net, a larger decision tree…). Sometimes, this can even result in a contradictory optimization problem, in which improving the fit for one output degrades the fit for another.

Recurrent strategies, on the other hand, optimize a single-output model that is then recurrently propagated in time, i.e. the forecast for y(t+1) is reused for calculating y(t+2). Because every forecast error is fed back into the next step, recurrent strategies are prone to fast divergence. This renders them impractical for widespread industrial adoption.
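The recurrent strategy (often also called a recursive strategy) can be sketched in a few lines of Python. The example assumes a simple autoregressive model without intercept; the function recursive_forecast and the coefficient values are chosen purely for illustration.

```python
def recursive_forecast(history, coefs, steps):
    """Forecast `steps` ahead by feeding each prediction back as an input.

    coefs[i] multiplies y(t - 1 - i); the model has no intercept.
    Assumes `history` holds at least len(coefs) observations.
    """
    window = list(history[-len(coefs):])
    forecasts = []
    for _ in range(steps):
        # The newest value is last in the window, so reverse it for pairing.
        y_next = sum(c * y for c, y in zip(coefs, reversed(window)))
        forecasts.append(y_next)
        # Drop the oldest value and reuse the forecast as if it were data.
        window = window[1:] + [y_next]
    return forecasts


# Noise-free process with coefficient 0.90 versus a slightly
# misestimated model with coefficient 0.95.
truth = recursive_forecast([1.0], [0.90], 5)
estimate = recursive_forecast([1.0], [0.95], 5)
errors = [abs(e - t) for e, t in zip(estimate, truth)]
```

In this toy setting the gap between estimate and truth widens with every one of the five steps, which shows how a small modeling error compounds when forecasts are reused as inputs.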

Tangent addresses multi-point forecasts by creating a set of single-point models. In this way, every forecast point has its own individual model that considers the corresponding data availability constraints. This set of models is referred to as a Model Zoo. Out of this Model Zoo, Tangent automatically dispatches the correct model for the calculation of each point of the multi-point forecast.
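The idea of one model per forecast point can be illustrated with a generic "direct" multi-step sketch in Python. It is not Tangent's implementation: each model here is just a straight line fitted on one lagged value, and the names fit_line, build_zoo and forecast are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Assumes the values in `xs` are not all identical.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def build_zoo(series, horizons):
    """Fit a separate model y(t + h) ~ y(t) for every horizon h."""
    return {h: fit_line(series[:-h], series[h:]) for h in horizons}


def forecast(zoo, last_value):
    """Dispatch each forecast point to the model built for its horizon."""
    return {h: intercept + slope * last_value
            for h, (intercept, slope) in zoo.items()}


series = [float(i) for i in range(10)]  # toy series with a linear trend
zoo = build_zoo(series, horizons=[1, 2, 3])
points = forecast(zoo, last_value=series[-1])
```

Because no forecast is fed back as an input, an error at one horizon does not propagate to the others, and the fitted parameters of each horizon's model can be inspected separately.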

This approach has several advantages. First, by addressing each point individually, Tangent can often achieve greater accuracy, because each model is optimized for a single point while considering the specific data availability. Second, each model’s features can be examined and compared. As rather different features might drive different forecasts, this can provide great insights.

Often, multiple forecasts are needed at multiple points in time. In these cases, the user defines a forecasting routine: a set of forecasting situations with their corresponding data availability schemes. Tangent’s multi-situational layer then assembles a Model Zoo that accounts for all the multi-point forecasts in all the situations of the user’s forecasting routine.
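One way to picture a forecasting routine is as a list of situations, each with its own horizons and data delay, where every (situation, horizon) pair needs its own model. The Python sketch below only enumerates those pairs; the Situation class, its fields and the function zoo_slots are assumptions made for illustration and do not reflect Tangent's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Hypothetical description of one forecasting situation."""
    name: str        # label for when the forecast is issued
    horizons: tuple  # steps ahead that must be forecast
    delay: int       # age of the newest observation at issue time


def zoo_slots(routine):
    """Map each (situation, horizon) pair to its smallest usable target lag.

    A target lag k is usable only if k >= horizon + delay.
    """
    return {(s.name, h): h + s.delay for s in routine for h in s.horizons}


routine = [
    Situation("morning", horizons=(1, 2, 3), delay=0),
    Situation("evening", horizons=(1, 2), delay=4),
]
slots = zoo_slots(routine)
```

Here the routine yields five slots, one per model, and the evening situation forces its models to rely on older data than the morning one.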

Transparent models

The models Tangent generates are not black-box models. The features Tangent identifies in a model can be provided to the user for further inspection. Tangent also extracts canonical features from a Model Zoo to provide an overview of the driving features of the dataset at hand. This white-box view of the models can enable the user to gain important insights about the data and its structure. The tree map shown below is an example of how Tangent can demonstrate the importance of the different identified features.

In many industries, it is a legal requirement to base decisions only on the output of explainable models. In other cases, explainability is preferred simply because of the possibilities it creates for gaining new insights. Tangent is designed to provide the benefits of white-box models while minimizing any loss in performance. In many situations, Tangent performs better than black-box alternatives.