Log–log plot

== Log-log linear regression models ==
 
Log–log plots are often used to visualize log-log linear regression models with (roughly) [[log-normal]] errors. In such models, after log-transforming the dependent and independent variables, a [[simple linear regression]] model can be fitted, with the errors becoming [[Homoscedasticity|homoscedastic]]. This model is useful for data that follow a power-law relationship in which the spread of the errors grows as the independent variable grows (i.e., [[heteroscedasticity|heteroscedastic]] error on the original scale).
 
As above, in a log-log linear model the relationship between the variables is expressed as a power law: a given percentage change in the independent variable produces a constant percentage change in the dependent variable. The model is expressed as:
 
:<math>y = a \cdot x^b</math>
 
Taking the logarithm of both sides, we get:
 
:<math>\log(y) = \log(a) + b \cdot \log(x)</math>
 
This is a linear equation in the logarithms of <math>x</math> and <math>y</math>, with <math>\log(a)</math> as the intercept and <math>b</math> as the slope.
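As a minimal sketch of this procedure (not taken from the article; the parameter values and NumPy-based simulation are illustrative assumptions), one can recover <math>a</math> and <math>b</math> by fitting a straight line to the log-transformed data:

```python
import numpy as np

# Simulate a power law y = a * x^b with multiplicative (log-normal) noise.
rng = np.random.default_rng(0)
a, b = 2.0, 0.5                       # assumed "true" parameters
x = np.linspace(1.0, 100.0, 500)
y = a * x**b * rng.lognormal(mean=0.0, sigma=0.3, size=x.size)

# log(y) = log(a) + b*log(x): ordinary least squares in log-log space.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
b_hat, log_a_hat = np.polyfit(np.log(x), np.log(y), deg=1)
a_hat = np.exp(log_a_hat)
```

Because the noise is multiplicative and log-normal, the residuals in log space are approximately normal and homoscedastic, so ordinary least squares is appropriate there.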
 
[[File:Visualizing Loglog Normal Data.png|thumb|Figure 1: Visualizing Loglog Normal Data]]
 
Figure 1 presents two plots generated from a simulated dataset of 10,000 points. The left plot, titled 'Concave Line with Log-Normal Noise', is a scatter plot of the observed data (y) against the independent variable (x). The red line is the median line, and the blue line is the mean line. This plot shows a power-law relationship between the variables, which appears as a concave curve on linear axes.
 
When both variables are log-transformed, as shown in the right plot of Figure 1, titled 'Log-Log Linear Line with Normal Noise', the relationship becomes linear. This plot again shows the observed data against the independent variable, but with both axes on a logarithmic scale. Here the mean and median lines coincide (the red line). This transformation allows a [[simple linear regression]] model to be fitted; transforming the fitted line back to the original scale yields the median line rather than the mean line.
 
[[File:Sliding Window Error Metrics Loglog Normal Data.png|thumb|Figure 2: Sliding Window Error Metrics Loglog Normal Data]]
 
The transformation from the left plot to the right plot in Figure 1 also demonstrates the effect of the log transformation on the distribution of noise in the data. In the left plot, the noise appears to follow a [[log-normal distribution]], which is right-skewed and can be difficult to work with. In the right plot, after the log transformation, the noise appears to follow a [[normal distribution]], which is easier to reason about and model.
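This effect of the transformation on the noise can be checked numerically. The following sketch (an illustrative simulation, not the article's dataset) compares the skewness of the residuals before and after the log transform: the multiplicative log-normal residuals are strongly right-skewed, while their logarithms are approximately symmetric:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1.0, 100.0, 10_000)
noise = rng.lognormal(mean=0.0, sigma=0.5, size=x.size)  # multiplicative noise
y = 2.0 * x**0.5 * noise

resid_raw = y / (2.0 * x**0.5)   # residuals on the original scale: log-normal
resid_log = np.log(resid_raw)    # residuals after the log transform: normal

def skewness(v):
    """Sample skewness: third central moment over variance^(3/2)."""
    c = v - v.mean()
    return float((c**3).mean() / (c**2).mean() ** 1.5)
```

Here `skewness(resid_raw)` is clearly positive (right-skewed), while `skewness(resid_log)` is close to zero, consistent with normal errors in log space.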
 
This normalization of noise is further analyzed in Figure 2, which presents a line plot of three error metrics ([[Mean Absolute Error]] - MAE, [[Root Mean Square Error]] - RMSE, and [[Mean Absolute Logarithmic Error]] - MALE) calculated over a sliding window of size 28 on the x-axis. The y-axis gives the error, plotted against the independent variable (x). Each error metric is shown in a different color, with a smoothed line overlaying the raw line (because the data are simulated, the windowed error estimates are noisy). These error metrics provide a measure of how the noise varies across different x values.
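A sliding-window computation of these three metrics can be sketched as follows (the function name and interface are illustrative, not from the article; `y_true` and `y_pred` are assumed to be arrays sorted by x):

```python
import numpy as np

def sliding_window_errors(y_true, y_pred, window=28):
    """Windowed MAE, RMSE, and mean absolute log error along the x axis."""
    mae, rmse, male = [], [], []
    for i in range(len(y_true) - window + 1):
        t = y_true[i:i + window]
        p = y_pred[i:i + window]
        mae.append(np.mean(np.abs(t - p)))
        rmse.append(np.sqrt(np.mean((t - p) ** 2)))
        male.append(np.mean(np.abs(np.log(t) - np.log(p))))
    return np.array(mae), np.array(rmse), np.array(male)
```

For heteroscedastic data on the original scale, MAE and RMSE grow with x while MALE stays roughly constant, which is the pattern Figure 2 illustrates.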
 
Log-log linear models are widely used in various fields, including economics, biology, and physics, where many phenomena exhibit power-law behavior. They are also useful in regression analysis when dealing with heteroscedastic data, as the log transformation can help to stabilize the variance.
 
 
== Applications ==