Factorial Experiments

---
title: Factorial Experiments
useMath: True
---

# Factorial Experiments

$e^{i \pi} = -1$

*My understanding is that a full factorial design set is the best training data set we have available. We get our testing data set by taking the dot product of the TrueBetas found on Earth with the full factorial design data. The crux of the project lies in the epsilon we add on Earth and the epsilon we add on Mars. These are the artificial variance and covariance we add to our simulation, and they are also what we are ultimately trying to estimate.*

*We do that by using our bastardized bisection method to converge onto new xis from the beta estimates that are nudged with epsilons. With the new xis, the beta coefficient estimates, the variance-covariance matrices, and the estimated Yis nudged with epsilons, we use what is called the multivariate delta method to generate the 2x2 variance-covariance matrix that should in theory do a feature mapping from Earth to Mars.*

### Beta Variables

- **TrueBetas0**: (B00, B01, B02, B03, B04, B05) = (0, 1, 0.05, 0.2, 0.1, 0.05)
- **TrueBetas1**: (B10, B11, B12, B13, B14, B15) = (0, 0.07, 1, 0.05, 0.1, 0.1)

### Equations

- **Equation0**: $0 = b_{00} + b_{01} x_0 + b_{02} x_1 + b_{03} x_0^2 + b_{04} x_1^2 + b_{05} x_0 x_1 - y_0$
- **Equation1**: $0 = b_{10} + b_{11} x_0 + b_{12} x_1 + b_{13} x_0^2 + b_{14} x_1^2 + b_{15} x_0 x_1 - y_1$

### Variable Definitions

*green dots come from everything below that says assumed*

- **(assumed_est_x0, assumed_est_x1)**: true values generated from -1 to 1
- **(assumed_new_Y0, assumed_new_Y1)**: true Yis obtained with TrueBeta0, TrueBeta1 (no epsilons)
- **(assumed_new_ye0, assumed_new_ye1)**: equal to assumed_new_Y0, assumed_new_Y1 only in this case (green dots)
  - assume epsilon equals 0
- **(assumed_guess_ye0, assumed_guess_ye1)**: obtained from assumed_est_x0, assumed_est_x1, beta0_est, beta1_est
  - does not use epsilons
- **(assumed_top_left, assumed_bottom_right)**: comes from assumed_est_x, beta0_est, beta1_est and assumed_guess_ye0, assumed_guess_ye1
  - no epsilons
  - comes out square-rooted in the simulation

---
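As a rough illustration of how the green-dot true responses follow from Equation0 and Equation1, here is a minimal Python sketch. It assumes NumPy and uses the raw quadratic terms exactly as the equations are written, not the orthogonalized R columns of the design matrix. The helper names `design_line` and `true_y` are hypothetical and are not taken from the project code.

```python
import numpy as np

# Coefficients copied from the Beta Variables section.
TRUE_BETAS0 = np.array([0.0, 1.0, 0.05, 0.2, 0.1, 0.05])
TRUE_BETAS1 = np.array([0.0, 0.07, 1.0, 0.05, 0.1, 0.1])


def design_line(x0, x1):
    """Raw terms of Equation0/Equation1: 1, x0, x1, x0^2, x1^2, x0*x1.

    Assumption: no orthogonalization is applied here.
    """
    return np.array([1.0, x0, x1, x0**2, x1**2, x0 * x1])


def true_y(x0, x1):
    """Return (Y0, Y1) with no epsilon added, as for the green dots."""
    line = design_line(x0, x1)
    return float(line @ TRUE_BETAS0), float(line @ TRUE_BETAS1)


# Example point inside the [-1, 1] x [-1, 1] region.
y0, y1 = true_y(0.5, -0.5)
print(y0, y1)
```

Adding a normally distributed epsilon to each returned value would give the corresponding `ye` quantities. For the green dots epsilon is taken to be 0, so the two coincide.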
*blue dots come from everything below that says blue*

- **(blue_est_x0, blue_est_x1)**: new_x0, new_x1
- **(blue_new_Y0, blue_new_Y1)**: true Yis obtained with TrueBeta0, TrueBeta1 (no epsilons)
  - equals new_Y0, new_Y1
- **(blue_new_ye0, blue_new_ye1)**: obtained from blue_est_x0, blue_est_x1, beta0_est, beta1_est
  - does not use epsilons
- **(blue_top_left, blue_bottom_right)**: comes from blue_est_x, beta0_est, beta1_est and blue_guess_ye0, blue_guess_ye1
  - no epsilons

---

*red dots come from everything below that says red*

- **(red_new_x0, red_new_x1)**: same thing as new_x0, new_x1
- **(red_est_x0, red_est_x1)**: actual est_x0, est_x1
- **(red_new_Y0, red_new_Y1)**: true Yis obtained with TrueBeta0, TrueBeta1 (no epsilons)
  - same as new_Y0, new_Y1
  - based on new_x0, new_x1
- **(red_new_ye0, red_new_ye1)**: exactly the same as new_ye0, new_ye1
- **(red_guess_ye0, red_guess_ye1)**: obtained from red_est_x0, red_est_x1, beta0_est, beta1_est
  - does not use epsilons
- **(red_top_left, red_bottom_right)**
  - no epsilons

---

- **(new_x0, new_x1)**: randomly generated pair between -1 and 1
- **(est_x)**: consists of estimated est_x0 and estimated est_x1 (only talking about the last set)
  - est_x is based on beta0_est, beta1_est and new_ye0, new_ye1
- **(new_Y0, new_Y1)**: from new_x0, new_x1 and TrueBeta0, TrueBeta1
- **(new_ye0, new_ye1)**: new_Y0, new_Y1 + normally distributed random epsilon term
- **(guess_ye0, guess_ye1)**: comes from plugging est_x into the equations (refers to the last set of guess_yis)
  - differences between new_ye0, new_ye1 and guess_ye0, guess_ye1 should be on the order of 1e-10
  - should use beta0_est, beta1_est

### Sigma

*working recipe: (general rule of thumb sigmas need to be the same)*

- Standard
  - step 1
    - sigma0 = std * 2
    - sigma1 = std
  - keep original TrueBeta order
- ALT
  - step 1
    - sigma0 = std
    - sigma1 = std * 1
  - switch the TrueBetas

##### One Full Factorial Design Matrix (1 repetition)

| I | x0 | x1 | R x0^2 | R x1^2 | R x0x1 |
|---|----|----|-----------|-----------|--------------|
| 1 | 1 | 1 | -0.333333 | -0.333333 | -1 |
| 1 | 1 | 0 | -0.333333 | 0.666667 | 4.78359e-16 |
| 1 | 1 | -1 | -0.333333 | -0.333333 | 1 |
| 1 | 0 | 1 | 0.666667 | -0.333333 | -3.08196e-16 |
| 1 | 0 | 0 | 0.666667 | 0.666667 | 2.21333e-17 |
| 1 | 0 | -1 | 0.666667 | -0.333333 | 2.15998e-16 |
| 1 | -1 | 1 | -0.333333 | -0.333333 | 1 |
| 1 | -1 | 0 | -0.333333 | 0.666667 | -3.64027e-16 |
| 1 | -1 | -1 | -0.333333 | -0.333333 | -1 |

##### 3 Full Factorial Design Matrix (3 repetitions)

| I | x0 | x1 | R x0^2 | R x1^2 | R x0x1 |
|---|----|----|--------|--------|--------|
| 1 | 1 | 1 | -0.33 | -0.33 | -1 |
| 1 | 1 | 0 | -0.33 | 0.67 | 0 |
| 1 | 1 | -1 | -0.33 | -0.33 | 1 |
| 1 | 0 | 1 | 0.67 | -0.33 | 0 |
| 1 | 0 | 0 | 0.67 | 0.67 | 0 |
| 1 | 0 | -1 | 0.67 | -0.33 | 0 |
| 1 | -1 | 1 | -0.33 | -0.33 | 1 |
| 1 | -1 | 0 | -0.33 | 0.67 | 0 |
| 1 | -1 | -1 | -0.33 | -0.33 | -1 |
| 1 | 1 | 1 | -0.33 | -0.33 | -1 |
| 1 | 1 | 0 | -0.33 | 0.67 | 0 |
| 1 | 1 | -1 | -0.33 | -0.33 | 1 |
| 1 | 0 | 1 | 0.67 | -0.33 | 0 |
| 1 | 0 | 0 | 0.67 | 0.67 | 0 |
| 1 | 0 | -1 | 0.67 | -0.33 | 0 |
| 1 | -1 | 1 | -0.33 | -0.33 | 1 |
| 1 | -1 | 0 | -0.33 | 0.67 | 0 |
| 1 | -1 | -1 | -0.33 | -0.33 | -1 |
| 1 | 1 | 1 | -0.33 | -0.33 | -1 |
| 1 | 1 | 0 | -0.33 | 0.67 | 0 |
| 1 | 1 | -1 | -0.33 | -0.33 | 1 |
| 1 | 0 | 1 | 0.67 | -0.33 | 0 |
| 1 | 0 | 0 | 0.67 | 0.67 | 0 |
| 1 | 0 | -1 | 0.67 | -0.33 | 0 |
| 1 | -1 | 1 | -0.33 | -0.33 | 1 |
| 1 | -1 | 0 | -0.33 | 0.67 | 0 |
| 1 | -1 | -1 | -0.33 | -0.33 | -1 |

### Plan of how system should work

- step 0:
  - step 0a: generate 1-rep design matrix (generate non-orthogonal x0 squared, x1 squared, x0 * x1)
  - step 0b: get orthogonal vector (residual coefficients), find residuals (orthogonalize step 0a)
  - step 0c: make 3 reps of design matrix
  - step 0d: instantiate random number generator with seed
    * scale = 0.01 = standard deviation (need it for statistics) (will do scale = 0.0 for test next time)
- step 1:
  - step 1a: use design matrix, TrueBetas0 (0, 1, 0.05, 0.2, 0.1, 0.05) and the epsilons to simulate (design line * TrueBetas) + epsilon(Sigma*2) ← set 0 (alternative: 0, 0.07, 1, 0.05, 0.1, 0.1; Sigma)
    - every time step 1 is repeated, use different epsilons
  - step 1b: use design matrix, TrueBetas1 (0, 0.07, 1, 0.05, 0.1, 0.1) and the epsilons to simulate (design line * TrueBetas) + epsilon(Sigma) ← set 1 (alternative: 0, 1, 0.05, 0.2, 0.1, 0.05; Sigma*2)
    - every time step 1 is repeated, use different epsilons
- step 2:
  - step 2a: estimate betas (beta0_est) from set 0
    * get cov_beta0, MSE0
  - step 2b: estimate betas (beta1_est) from set 1
    * get cov_beta1, MSE1
  - step 2c: put together variance-covariance matrix as below
    * get 14 x 14 matrix whose only nonzero entries are on the diagonal
- step 3:
  - step 3a: randomly generate new new_x0 (true values from the Martian atmosphere), new_x1 values between -1 and 1 (generate just one set), totally separate from the design matrix; only use the first set
  - step 3b: create design line corresponding to those values
    - step 3b1: design line is 1, new_x0, new_x1, R x0^2, R x1^2, x0x1*-1
  - step 3c: use design line and TrueBetas (we never know what they really are, we only estimate them) to create new_Y0, new_Y1 (generate just 1 set)
  - step 3d: take new_y0 (measured from the Martian atmosphere), new_y1 and add new epsilons new_e0 (Sigma*2; alt Sigma), new_e1 (Sigma; alt Sigma*2) (can be 0.0) (end up with new_ye0, new_ye1)
- step 4: ACCURATE TO 10 DECIMAL PLACES
  - step 4a:
    - step 4a1: inputs are the initial guess est_x0, est_x1 (0, 0 the first time through), design line, step_size = 0.5, new_ye0 (measured on Mars), new_ye1, beta0_est (betas estimated on Earth), beta1_est, step_size = 0.1
    - step 4a2: make all 4 points around the initial guess +/- step size and their design lines (make the 4 points)
    - step 4a3: evaluate 2 functions using beta0_est and 2 functions using beta1_est at the 4 points around the initial guess; gives me 4 y guesses
    - step 4a4: metric is mean squared error
    - step 4a5: the point that gives the least mean squared error becomes the new guess and gives guess_ye0, guess_ye1
  - step 4b: keep repeating 4a, reusing est_x as the initial guess, until |guess_y - new_ye| < 1e-10
    * don't care about new_x0 vs. est_x0 (these will be equal if epsilon is 0, otherwise in the same ballpark; don't care as long as they are nearby)
- step 5: get gradient matrix
  - step 5a: dynamically generate symbolic equations, xis, betas, yis
  - step 5b: implicitly differentiate equation 0 with respect to all betas + y0, y1
  - step 5c: implicitly differentiate equation 1 with respect to all betas + y0, y1
  - step 5d: set the differentials of equation 0 and equation 1 equal to each other
    * substitute dx1_db00 first, then all the betas, xis, yis afterwards
    * verify dx1_dy1
  - step 5e: get gradient matrix
- step 6: plot meshgrid
  - plot
  - get var_cov 2x2
  - get inverse of 2x2
  - (est_x - true_x) dot() inv 2x2 multiplied by (est_x - true_x) transposed
    - 1x2, 2x2, 2x1
  - refer to "Get Assumed Plots"

## Get Assumed Plots

**Lay out the grid (go through steps 1-4 just like normal)**

- alternate step 1: let assumed_est_x go from -1 to 1 in steps of 0.1 (21 values per axis, 441 grid points)
- alternate step 2: plug assumed_est_x, beta0_est and beta1_est into the equations to get assumed_guess_ye0, assumed_guess_ye1
- alternate step 3: use assumed_guess_ye0, assumed_guess_ye1 to get the variance (nothing to do with true values); use assumed_est_x and beta0_est, beta1_est too
  - putting in assumed_guess_ye0, assumed_guess_ye1 with assumed_est_x gives the variance (assumed_top_left, assumed_bottom_right)
- alternate step 4: estimate each point (x0, x1, variances), plot each point and do the same thing for top left and bottom right (2 plots: top left plot, bottom right plot)
- alternate step 5: do step 4 for red and blue dots: get est_x, get guess_ye0, guess_ye1, get variance from guess_ye0, guess_ye1
  - plot est_x, variance top left (on top left plot), corresponding new_x0, new_x1 (actual generated new_x0, new_x1)
  - plot est_x, variance bottom right (on bottom right plot)

*new_x0, new_x1 and est_x may differ at any decimal place. guess_ye0, guess_ye1 and new_ye0, new_ye1 should agree to within 10 decimal places.*

##### Evaluation algorithm

- for each green dot, have assumed_top_left, assumed_bottom_right, assumed_new_x0, assumed_new_x1
- calculate assumed_new_x0 +/- 2.08 * SQRT(assumed_top_left)
- calculate assumed_new_x1 +/- 2.08 * SQRT(assumed_bottom_right)
- get confidence interval box with these 2
- see if red dot is inside the 2d box
  - if it is, keep green
  - else make black
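
The evaluation algorithm above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `top_left` and `bottom_right` are taken to be variances (not already square-rooted), and the function names `confidence_box` and `green_dot_color`, along with the sample numbers, are hypothetical rather than taken from the project code.

```python
import math

Z = 2.08  # multiplier used in the evaluation algorithm


def confidence_box(x0, x1, top_left, bottom_right, z=Z):
    """Return ((lo0, hi0), (lo1, hi1)) around a green dot.

    Assumption: top_left and bottom_right are variances, so the
    half-widths are z * sqrt(variance).
    """
    half0 = z * math.sqrt(top_left)
    half1 = z * math.sqrt(bottom_right)
    return (x0 - half0, x0 + half0), (x1 - half1, x1 + half1)


def green_dot_color(green, red, z=Z):
    """green = (x0, x1, top_left, bottom_right); red = (x0, x1).

    Keep the dot green if the red dot lies inside the 2D box,
    otherwise mark it black.
    """
    (lo0, hi0), (lo1, hi1) = confidence_box(*green, z=z)
    inside = lo0 <= red[0] <= hi0 and lo1 <= red[1] <= hi1
    return "green" if inside else "black"


# Made-up illustrative values, not simulation results.
green = (0.10, -0.20, 1e-4, 4e-4)
color_near = green_dot_color(green, (0.11, -0.23))
color_far = green_dot_color(green, (0.20, -0.20))
print(color_near, color_far)
```

If the simulation already stores the square-rooted values, the `math.sqrt` calls would need to be dropped so the half-widths are not shrunk twice.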