Curve fitting is an important subfield of statistics that deals with estimating mathematical functions that describe the relationship between one or more independent variables and a dependent variable.
In this article, we will provide an overview of the theory, methods, and applications of curve fitting. Initially, we will remain quite general.
Theory of Curve Fitting
The theory of curve fitting is based on the assumption that the relationship between the independent and dependent variables can be described by a mathematical function.
The goal of curve fitting is to find this function by analyzing a sample of data. The choice of function depends on the type of data to be analyzed.
We will not enumerate or explain the candidate functions (quadratic and exponential, among others) at this point, because our focus is on trading.
It is not necessary to understand every relationship down to the last detail. Much more important are the practical application of the different methods and the question of how to avoid curve fitting.
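For readers who want a concrete picture anyway, here is a minimal sketch of curve fitting in its neutral, statistical sense. It assumes Python with NumPy; the sample data, the quadratic form, and all variable names are made up for illustration and do not come from any trading system.

```python
import numpy as np

# Hypothetical sample: y follows a quadratic in x plus a little noise.
rng = np.random.default_rng(seed=0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x - 0.3 * x**2 + rng.normal(0.0, 0.5, size=x.size)

# Least-squares fit of a degree-2 polynomial; coefficients come back
# ordered from the highest power down to the constant term.
coeffs = np.polyfit(x, y, deg=2)
fitted = np.polyval(coeffs, x)

# Root-mean-square error of the fit on the sample itself.
rmse = float(np.sqrt(np.mean((y - fitted) ** 2)))
print("coefficients (x^2, x, const):", np.round(coeffs, 3))
print("in-sample RMSE:", round(rmse, 3))
```

The fit recovers coefficients close to the ones used to generate the data, which is exactly what curve fitting is meant to do when the chosen function matches the underlying relationship.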
So let’s first deal with the application areas of curve fitting in general.
Applications of Curve Fitting
Curve fitting is used in many areas, including:
- Finance: In financial analysis, curve fitting is used to analyze and make predictions about future price movements of stocks, bonds, and other securities.
- Engineering: In engineering, curve fitting is used to describe the relationship between physical quantities such as force and strain or stress and deformation.
- Biology: In biology, curve fitting is used to describe the growth rate of populations or the concentration of substances in the body over time.
- Medicine: In medicine, curve fitting is used to describe the relationship between different parameters such as the dosage and effect of drugs or between age and certain health indicators.
- Earth Sciences: In earth sciences, curve fitting is used to describe the relationship between various geological factors such as the depth of the soil and the concentration of minerals.
- Climatology: In climatology, curve fitting is used to describe the relationship between climatic factors such as temperature, humidity, and rainfall.
- Data Analysis: In data analysis, curve fitting is used to describe the relationship between different variables and to make predictions about future trends. Opinion polls and election projections are practical examples.
Do you notice anything?
Curve fitting is well regarded as a scientific method. Strictly speaking, curve fitting in its pure form is nothing more than backtesting. In trading, however, we want to clearly distinguish between these terms, because for us traders and system developers, curve fitting means the over-optimization of trading systems. And that is not a good thing.
That is what we turn to now.
The dangers of curve fitting in trading systems
Curve fitting or over-optimization is a common problem in trading system development. It consists of optimizing a trading system so much for historical data that it no longer works for future data.
Note: this is also where our Quant Master 1 training comes in. It graphically contrasts in-sample and out-of-sample periods so that the developer can adjust the parameters of their indicators to reduce over-optimization.
Overfitting: Overfitting happens when a trading system is tuned so closely to historical data that it captures the noise in that data rather than a repeatable pattern. Such a system is too specific: it cannot adapt to new data or cope with unknown future events, so it works in the backtest and is unreliable afterwards. Unfortunately, we see this very often in trading.
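The effect can be illustrated with a small sketch, assuming Python with NumPy. The data is a made-up linear trend with noise, not a real price series, and the polynomial degrees 1 and 8 are arbitrary choices for a simple and an overly flexible model. Both models are fitted on the earlier part of the data only and then checked on the later part.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical data: a plain linear trend plus noise, ordered in time.
t = np.linspace(0.0, 1.0, 40)
y = 1.0 + 2.0 * t + rng.normal(0.0, 0.2, size=t.size)

# The first 30 points are "history" (in-sample), the last 10 are "future".
t_in, y_in = t[:30], y[:30]
t_out, y_out = t[30:], y[30:]

def rmse(coeffs, t_vals, y_vals):
    """Root-mean-square error of a fitted polynomial on the given points."""
    return float(np.sqrt(np.mean((np.polyval(coeffs, t_vals) - y_vals) ** 2)))

results = {}
for degree in (1, 8):
    coeffs = np.polyfit(t_in, y_in, deg=degree)  # fitted on history only
    results[degree] = (rmse(coeffs, t_in, y_in), rmse(coeffs, t_out, y_out))
    print(f"degree {degree}: in-sample RMSE {results[degree][0]:.3f}, "
          f"out-of-sample RMSE {results[degree][1]:.3f}")
```

The flexible model fits the history at least as well as the simple one, because it has more freedom to follow the noise, and that same freedom is what makes it fail on the unseen points.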
Data mining: This refers to the practice of sifting through a large amount of data to develop a trading system, and overshooting the mark in the process. Data mining can lead a trading system to pick up random correlations between variables that do not reflect any real relationship. A trading system based on random correlations will be useless on future data. We see this danger as more theoretical in nature.
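A rough sketch of how random correlations arise, assuming Python with NumPy: the "market" below is pure noise and the 500 candidate rules are random long/short signals, all of them hypothetical. Picking the best rule on the first half of the data imitates what an overly eager search does.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

n_days, n_candidates = 1000, 500
split = n_days // 2

# Hypothetical market: daily returns that are pure noise, so no rule has a real edge.
returns = rng.normal(0.0, 0.01, size=n_days)

# Hypothetical "mined" rules: each candidate is a random long (+1) / short (-1) signal.
signals = rng.choice([-1.0, 1.0], size=(n_candidates, n_days))
strategy_returns = signals * returns  # broadcasts the returns across all candidates

in_sample_mean = strategy_returns[:, :split].mean(axis=1)
out_sample_mean = strategy_returns[:, split:].mean(axis=1)

best = int(np.argmax(in_sample_mean))  # the rule a data miner would pick
print(f"best rule, in-sample mean daily return:     {in_sample_mean[best]:+.5f}")
print(f"same rule, out-of-sample mean daily return: {out_sample_mean[best]:+.5f}")
print(f"share of rules profitable in-sample:        {(in_sample_mean > 0).mean():.2f}")
```

With enough candidates, some rule will look profitable on the first half purely by chance. Since the signals are unrelated to the returns, there is no reason for that result to repeat on the second half.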
Overzealous parameter optimization: This occurs when a trading system is optimized so tightly for certain parameter values that it no longer works for other parameter values. We have already covered the topic “Optimize until the doctor comes” elsewhere.
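One way to make this visible is to look at how the backtest score behaves around a parameter value instead of only at the value itself. The sketch below is plain Python with invented scores for a single hypothetical indicator parameter; the "neighborhood floor" criterion is an illustrative choice, not a method prescribed by this article.

```python
# Hypothetical backtest results: one score (e.g. a Sharpe-like ratio) per
# value of a single indicator parameter. The numbers are made up.
params = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
scores = [0.2, 0.3, 1.9, 0.1, 0.8, 0.9, 1.0, 0.9, 0.8, 0.4]

def neighborhood_floor(values, index, radius=1):
    """Worst score among a parameter value and its neighbors within `radius`."""
    lo = max(0, index - radius)
    hi = min(len(values), index + radius + 1)
    return min(values[lo:hi])

# Naive choice: the single best backtest score, which here is an isolated spike.
peak_index = max(range(len(scores)), key=lambda i: scores[i])

# Conservative choice: the value whose whole neighborhood still scores well.
floors = [neighborhood_floor(scores, i) for i in range(len(scores))]
robust_index = max(range(len(scores)), key=lambda i: floors[i])

print("peak pick:  ", params[peak_index], "score", scores[peak_index])
print("robust pick:", params[robust_index], "score", scores[robust_index],
      "neighborhood floor", floors[robust_index])
```

In this made-up grid the peak pick is 15, whose neighbors score poorly, while the robust pick is 35, which sits in a region where small parameter changes barely affect the result.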
How to avoid curve fitting?
There are several ways to avoid curve fitting in trading systems. Here are some tips:
- Use a data set large enough to test and refine the trading system and small enough to avoid falling into the trap of data mining.
- Use out-of-sample validation to ensure that the trading system can be applied to future data.
- Use conservative optimization to ensure that the trading system is not too specific. Intentionally sacrifice some backtest performance if you find more stable regions while scanning across prices and parameters, even if those regions are not among the best performers.
- Avoid overzealousness by optimizing conservatively. Less is often more, and always pay attention to statistical relevance. Also measure which parameter has which influence on system-relevant key figures such as APR or drawdown, which brings us back to Quant Master 1.
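The out-of-sample tip can be sketched as follows, assuming Python with NumPy. The price series is a synthetic random walk, and the moving-average rule, the 70/30 split, and the candidate windows are all hypothetical choices for illustration, not the workflow of any specific tool or training.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Hypothetical price series: a geometric random walk, so any "edge" found is luck.
log_returns = rng.normal(0.0, 0.01, size=2000)
prices = 100.0 * np.exp(np.cumsum(log_returns))

def strategy_mean_return(prices, window):
    """Mean daily return of a long-only rule: hold when price > moving average.

    The signal from the close of day t is applied to the return of day t+1,
    so the rule never uses information from the future.
    """
    kernel = np.ones(window) / window
    moving_avg = np.convolve(prices, kernel, mode="valid")  # aligned to prices[window-1:]
    signal = (prices[window - 1:] > moving_avg).astype(float)
    next_day_returns = np.diff(prices[window - 1:]) / prices[window - 1:-1]
    return float(np.mean(signal[:-1] * next_day_returns))

split = 1400  # first 70% for optimization, the rest is held back
in_sample, out_sample = prices[:split], prices[split:]

windows = [5, 10, 20, 50, 100]
in_scores = {w: strategy_mean_return(in_sample, w) for w in windows}
best_window = max(in_scores, key=in_scores.get)

# The held-back data is touched once, only for the already chosen parameter.
out_score = strategy_mean_return(out_sample, best_window)
print(f"best window in-sample: {best_window} ({in_scores[best_window]:+.5f} per day)")
print(f"same window out-of-sample: {out_score:+.5f} per day")
```

The important design choice is that the out-of-sample segment plays no role in selecting the window. If it were used to choose between candidates, it would simply become part of the optimization and lose its value as a check.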
What is curve fitting - Video
In this video, Thomas explains how to avoid curve fitting.
Conclusion: Curve Fitting
Curve fitting, in trader parlance, is a common problem in the development of trading systems. It occurs when a model is optimized so much for historical data that it no longer works for future data.
Curve fitting can happen unintentionally or intentionally. The effect is the same: the trading system performs significantly worse than the backtest would suggest, and in the worst case it does not work at all.
There are several approaches to avoid curve fitting, all of which are covered in our Quant Master 1 training module.
Overall, curve fitting is a serious problem that can affect the accuracy and reliability of trading systems. By avoiding curve fitting, you improve the chances that your trading system carries over to future data and that its live results stay in line with the backtest.