# Log-transformation walkthrough

## Installation

***In RStudio:***

```r=
install.packages("remotes")

# dev version
remotes::install_github("qdm/qdmtools")
```

## Example 1

N = 30 in each group
SD = 0.6

Example 1 (LOGT):

> - MV = 0.83; TV = 0.83; Confidence MV = 80%; Confidence TV = 80%
> - Prior: LOGT(-0.203, 0.0465) [0.1, 2], DF = 3

Set up the design using [`design_normal_two()`](https://warp-view.gsk.com/qdm/qdmtools/reference/design_normal_two.html) from `qdmtools` with 'regular' **MV** and **TV** values:

***In RStudio:***

```r=
library(qdmtools)

des <- design_normal_two(30, 30, 0.6, 0.83, 0.83, 0.8, 0.8)
```

To generate the wave plot, we log10-transform the MV and TV of *des* before running the [`evaluate()`](https://warp-view.gsk.com/qdm/qdmtools/reference/evaluate.html) function, so that the design matches the logged values:

***Application code:***

![](https://i.imgur.com/lb4Uzjs.png)

***In RStudio:***

```r=
des[['mv']] <- log10(des[['mv']])
des[['tv']] <- log10(des[['tv']])
```

We then feed the normal-two design into the `des_rng()` function from `qdmtools` to generate the wave-plot limits, and then into the `evaluate()` function (note: I use only two values here, but there would be many more), which gives the values needed to generate the wave plot:

***In RStudio:***

```r=
xlim <- des_rng(des, c(0.01, 0.99))
xlim
```

![](https://i.imgur.com/5W6J7NH.png)

***In RStudio:***

```r=
dat <- evaluate(des, seq(xlim[1], xlim[2], length.out = 505))
dat
```

***This generates the following output:***

![](https://i.imgur.com/38xJzWU.png)

We then antilog the values above (**val**, **mv**, **tv**, **crit_mv**, **crit_tv**) so that the wave plot shows the values in 'regular' form, as expected. See below:

***In RStudio:***

```r=
des %>% plotly_design(logscale_x = TRUE)
```

![](https://i.imgur.com/8CEeGx8.png)

### Assurance Issue

To get the assurance values, the values generated by the `evaluate()` function from `qdmtools` are multiplied by the values generated by the density function of the specified prior distribution:

***Application code:***

![](https://i.imgur.com/ojIKCRM.png)

***In RStudio:***

```r=
evaluate(design, xlim)[, type] * dens(dist, xlim)
```

where *type* refers to one of **go**, **stop**, **inter** or **incon**, and `x` is a range of values, which in this case is the x limits. These values are then fed into the [`stats::integrate` function](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/integrate.html).

### Example 1 Log-t prior

LOGT(-0.203, 0.0465) [0.1, 2], DF = 3

Using the same x limits from the wave plot (generated with the logged MV and TV design) in the [`dens()` function](https://warp-view.gsk.com/qdm/qdmtools/reference/dens.html):

***In RStudio:***

```r=
dens(dist_logt, xlim)
[1]         NaN 0.001183207
```

Normal prior with logged mean and sd:

***In RStudio:***

```r=
dist_norm <- dist_norm(-0.0379, 0.0294)
dens(dist_norm, xlim)
[1] 4.916308e-71 1.351117e-49
```

So I then tried using the antilogged x limits for the log-t distribution, but the values don't match:

***In RStudio:***

```r=
antilog_xlim <- 10^xlim
dens(dist, antilog_xlim)
[1] 3.166382e-04 9.033388e-05
```

This is where I think we need to rethink the underlying arithmetic, as these numbers need to match in order to obtain matching assurance values.
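One piece of arithmetic worth checking as part of that rethink: under a change of variables the prior density picks up a Jacobian factor, so the original-scale density evaluated at the antilogged limits would not be expected to equal the log10-scale density on its own. The sketch below only illustrates that relationship in base R, using a plain (unshifted) lognormal as a stand-in for the LOGT prior rather than the `qdmtools` distributions; the -0.203 and 0.0465 are taken from Example 1 and assumed here to be a natural-log location and scale.

```r=
# Illustration only: change-of-variables check in base R, using a plain
# lognormal as a stand-in for the shifted log-t prior (shift and df ignored).
# If X has density f_X on the GMR scale and Y = log10(X), then
#   f_Y(y) = f_X(10^y) * 10^y * log(10),
# so the original-scale density at 10^xlim only matches the log10-scale
# density after the Jacobian factor is applied.
meanlog <- -0.203                 # assumed natural-log location from Example 1
sdlog   <- 0.0465                 # assumed natural-log scale from Example 1
y <- c(-0.12, -0.05)              # example points on the log10 scale (stand-in for xlim)
x <- 10^y                         # the same points on the GMR scale

dens_log10 <- dnorm(y, mean = meanlog / log(10), sd = sdlog / log(10))  # density of Y = log10(X)
dens_gmr   <- dlnorm(x, meanlog = meanlog, sdlog = sdlog)               # density of X

all.equal(dens_log10, dens_gmr * x * log(10))  # TRUE once the Jacobian is included
```

If the same factor applies to the `qdmtools` densities, it may explain why `dens(dist, antilog_xlim)` and `dens(dist_logt, xlim)` above give different numbers.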
### Gamma Example 9

```r=
des <- design_normal_two(30, 30, 0.6, 1, 1.5, 0.9, 0.99)
dist <- dist_gamma(4.78, 1.28, 0.5)

assurance(des, dist)
# A tibble: 1 × 4
     go    stop inter   incon
  <dbl>   <dbl> <dbl>   <dbl>
1 0.996 0.00323     0 0.00116
```

## Possible Solution (Example 1)

> [name=Naveen] Hi All,
> I looked at the example, just brainstorming with options….
> For the log-t example, I think the approach of just taking LOG10 works… could you check this once? I will check the same for the other distributions as well….

```r=
# Normal prior on the log10 scale, and its antilog on the GMR scale
Norm <- rnorm(100000, -0.0379, 0.0294)
quantile(Norm, probs = c(0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.50,
                         0.75, 0.8, 0.85, 0.90, 0.95, 0.975))
mean(Norm)

Lnorm <- 10^Norm
quantile(Lnorm, probs = c(0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.50,
                          0.75, 0.8, 0.85, 0.90, 0.95, 0.975))
mean(Lnorm)

# Shifted log Student-t prior on the GMR scale, and its log10 transform
Logt <- 0.1 + exp(-0.203 + 0.0469 * rt(n = 100000, df = 3))
quantile(Logt, probs = c(0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.50,
                         0.75, 0.8, 0.85, 0.90, 0.95, 0.975))
mean(Logt)

t <- log10(Logt)
quantile(t, probs = c(0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.50,
                      0.75, 0.8, 0.85, 0.90, 0.95, 0.975))
mean(t)

win.graph(8, 8)
par(yaxt = "n")

# plot priors on the GMR scale
plot(density(Lnorm), xlab = "GMR", xlim = c(0.1, 2), ylim = c(0, 20),
     main = "GMR", col = 3, lwd = 3)
lines(density(Logt), col = 2, lwd = 3, lty = 2)

# plot priors on the log10 scale
plot(density(Norm), xlab = "GMR", xlim = c(-0.5, 0.5), ylim = c(0, 20),
     main = "GMR", col = 3, lwd = 3)
lines(density(t), col = 2, lwd = 3, lty = 2)
```

![](https://i.imgur.com/w7A63ux.png)

> [name=Naveen] Alternatively, there was a procedure in the QDM package earlier which will fit a distribution and provide the parameters. So, if we pass in the log-t draws as I have shown below, you may take the log and then use this function to fit a distribution and obtain the parameters for the assessment of assurance.

```r=
approx_unimodal <- function(x, .lwr, .upr) {
  .probs <- seq(0.1, 0.9, length.out = 9)
  .vals <- as.numeric(quantile(x, .probs))

  .fit <- SHELF::fitdist(
    vals = .vals,
    probs = .probs,
    lower = .lwr,
    upper = .upr
  )

  qdmtools_dens_names <- c(
    normal = "norm",
    t = "t",
    gamma = "gamma",
    lognormal = "lnorm",
    logt = "logt",
    beta = "beta"
  )

  ll <- list(
    normal = c(
      mean = .fit$Normal$mean,
      sd = .fit$Normal$sd
    ),
    t = c(
      mean = .fit$Student.t$location,
      sd = .fit$Student.t$scale,
      df = .fit$Student.t$df
    ),
    gamma = c(
      shape = .fit$Gamma$shape,
      rate = .fit$Gamma$rate,
      lower = .fit$limits$lower
    ),
    lognormal = c(
      meanlog = .fit$Log.normal$mean.log.X,
      sdlog = .fit$Log.normal$sd.log.X,
      lower = .fit$limits$lower
    ),
    logt = c(
      meanlog = .fit$Log.Student.t$location.log.X,
      sdlog = .fit$Log.Student.t$scale.log.X,
      df = .fit$Log.Student.t$df.log.X,
      lower = .fit$limits$lower
    ),
    beta = c(
      alpha = .fit$Beta$shape1,
      beta = .fit$Beta$shape2,
      lower = .fit$limits$lower,
      upper = .fit$limits$upper
    )
  )

  best_fit <- as.character(.fit$best.fitting[1, 1])

  rslt <- list(
    dens = as.character(qdmtools_dens_names[best_fit]),
    params = ll[[best_fit]]
  )

  return(rslt)
}

draws <- 0.1 + exp(-0.203 + 0.0469 * rt(n = 100000, df = 3))
dr <- data.frame(draws)

approx_unimodal(log10(draws), .lwr = -0.5, .upr = 0.5)

t1 <- -0.03781589 + 0.01821280 * rt(n = 100000, df = 3)
quantile(t1, probs = c(0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.50,
                       0.75, 0.8, 0.85, 0.90, 0.95, 0.975))
mean(t1)

Logt1 <- 10^t1
quantile(Logt1, probs = c(0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.50,
                          0.75, 0.8, 0.85, 0.90, 0.95, 0.975))
mean(Logt1)

win.graph(8, 8)
par(yaxt = "n")

# plot priors on the GMR scale
plot(density(Lnorm), xlab = "GMR", xlim = c(0.1, 2), ylim = c(0, 20),
     main = "GMR", col = 3, lwd = 3)
lines(density(Logt), col = 2, lwd = 3, lty = 2)
lines(density(Logt1), col = 1, lwd = 3, lty = 2)

# plot priors on the log10 scale
plot(density(Norm), xlab = "GMR", xlim = c(-0.5, 0.5), ylim = c(0, 20),
     main = "GMR", col = 3, lwd = 3)
lines(density(t), col = 2, lwd = 3, lty = 2)
lines(density(t1), col = 1, lwd = 3, lty = 2)
```

![](https://i.imgur.com/MfK2DXM.png)
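As a possible next step, one way of connecting the fitted parameters back to the assurance calculation is sketched below. This is only a sketch under assumptions: it reuses `des` (the Example 1 design with logged MV and TV), `draws` and `approx_unimodal()` from the code above, and it assumes the best fit comes back as a normal so that `dist_norm()` and `assurance()` can be reused as shown earlier; constructors for the other fitted families are not shown in this note.

```r=
# Sketch only: fit the log10-transformed prior draws, then reuse dist_norm()
# and assurance() from earlier in this note if the best fit is a normal.
# `des`, `draws` and approx_unimodal() are all defined in the code above.
fit <- approx_unimodal(log10(draws), .lwr = -0.5, .upr = 0.5)

fit$dens    # best-fitting family on the log10 scale, e.g. "norm" or "t"
fit$params  # fitted parameters on the log10 scale

if (fit$dens == "norm") {
  dist <- dist_norm(fit$params[["mean"]], fit$params[["sd"]])
  assurance(des, dist)
}
```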
## Questions for Naveen

1. When you get the prior elicitation, what scale are the distributions on? Are they on the geometric mean ratio scale?
   - Do you also get the log10 transformation of this scale?

## Changes to `qdmtools`

1. Create new `design_log_normal_one()` and `design_log_normal_two()` functions and classes (a rough sketch of the idea follows below)
2. Handle the transformation in the `des_rng()` function here
3. Create a *possible* dedicated method for `assurance()`
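For item 1, a very rough sketch of what such a constructor might look like is shown below. This is hypothetical and not part of `qdmtools`: it simply wraps `design_normal_two()` and repeats the manual log10 transformation from Example 1, with an extra class added so that `des_rng()` and `assurance()` methods could later dispatch on it.

```r=
# Hypothetical sketch only -- not part of qdmtools.
# Wraps design_normal_two() and applies the manual log10 transformation used in
# Example 1, tagging the result with an extra class for later method dispatch.
design_log_normal_two <- function(n1, n2, sd, mv, tv, conf_mv, conf_tv) {
  des <- design_normal_two(n1, n2, sd, mv, tv, conf_mv, conf_tv)
  des[["mv"]] <- log10(des[["mv"]])
  des[["tv"]] <- log10(des[["tv"]])
  class(des) <- c("design_log_normal_two", class(des))
  des
}

# Example 1 rewritten with the sketch above:
# des <- design_log_normal_two(30, 30, 0.6, 0.83, 0.83, 0.8, 0.8)
```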