
Script: Generating Stochastic Processes

tags: cadCAD edu


Transcript


Slide 1 (Logo)

  • Welcome to cadCAD Hacks, the format in which we introduce you to the tips and tricks used by professional Web 3 cadCAD engineers.

Slide 2 (How to)

  • In this cadCAD Hack, we will demonstrate step by step how to generate stochastic processes as part of a cadCAD model.

Slide 3 (Why Should You Care)

  • Why should you care about this Hack as a cadCAD Engineer?
  • You'll find this very useful for simulating values that are representative of real data for modeling and counterfactuals!

Slide 4 (What this video covers)

  • Alright, here is what we'll cover in this video:
    • How to generate Gamma random numbers
    • How to generate Gaussian random numbers

Slide 5 (Let's get started)

  • Now, let’s jump straight into the Jupyter Notebook, and get started!

Notebook (Table of Contents)

  • Alright, our Notebook follows the cadCAD Edu Standard Notebook Layout and is based on the Live API Data hack, which shows how to load historical Ethereum per Bitcoin price data inside a cadCAD model. If the Live API Data hack is new to you, we highly recommend revisiting cadCAD Hack number 7 before continuing.
  • We’ll first import any required dependencies.
  • In the Setup and Preparatory Steps we will download historical prices from the CoinGecko API.
  • In the modeling and simulation sections we will use the same Live API Data hack model.
  • We'll spend most of our time here in the Analysis section (NOTE: show with mouse), where we're going to generate Normal and Gamma random distributions and visually compare them against historical data.
  • Alright, let's jump in.

Notebook (Dependencies)
As in any cadCAD Hack, we have two dependency sections:

  • The first one contains the cadCAD standard dependencies you are familiar with from the Complete Foundations Bootcamp or the cadCAD Edu Cheat Sheet.
  • As for the hack-specific dependencies we'll import:
    • json and requests libraries for pulling the price data from the API;
    • math for generating random numbers;
    • the SciPy stats module for generating stochastic processes. SciPy is a scientific library that contains all kinds of statistical and numerical features that one could desire!;
    • Plotly Express and pyplot for visualization.
  • Given that our hack is based on Hack number 7, Live Data, let's jump straight to the analysis section.

Notebook (Analysis)

  • Okay. So as a refresher, the output of the Live Data API hack is a set of figures that show how the Ethereum price, denominated in Bitcoin, evolves daily over time, as well as the percent change on each day.
  • Suppose now that we want to simulate data points for the price feed that resemble the real data. One way of doing that is to draw random numbers that follow a similar probability distribution, and one typical choice is the Gamma distribution.
    • Long story short, the Gamma distribution generates random numbers that lie between zero and infinity, and it has several statistical properties that make it a natural choice for price data.
    • In order to generate data with it, we define the gamma_proc function, which takes as input the number of points to be generated and returns a list of random numbers. It works by invoking the scipy stats gamma class and using its rvs method. We are setting arbitrary parameter values here to generate something that is close to the real data, although this can be done automatically by using statistical fitting procedures.
    • Then, we create a function called generate_gamma that takes a cadCAD results dataframe and returns a new dataframe that contains two additional columns: one called origin, which keeps track of whether each data point is generated or real, and another called daily_price, which holds the price values. We then concatenate the old and the new dataframe so that we can compare them in a single visualization.
    • With that function ready, we perform our data preparation step: we get the simulation results, we drop the first row, we assign the origin column so that it indicates that the source is the real data, and we pipe that dataframe into our generate_gamma function.
    • The result is a dataframe that contains generated and real price data for each timestep, and we then plot it through the px.histogram method.
    • By passing daily_price into x, we can visualize the distribution of the prices, and by passing color equals origin, we can differentiate between the sources. Also, we pass marginal equals violin so that we have a second visualization of the distribution.
  • And that's it! Notice that the bars and margin for the real data, indicated in blue, are remarkably similar to those for the generated data, indicated in red. This demonstrates that the generated data looks relatively similar to the real data.
  • We can also use a different probability distribution for a different variable. For instance, this time we'll be using a Normal distribution for modelling the relative daily price changes. The procedure is just like the previous one, but we use the normal_proc function instead of gamma_proc, where we invoke st.norm.rvs for generating normal numbers rather than gamma ones. Also, in the fig_df definition, we set the variable of interest to normed, which is equal to the difference in daily prices divided by the current price.
    • By doing that, we can again visualize the distribution when comparing real data versus simulated data. Notice that although they're generally similar, the real data is long-tailed while the simulated data is not. This is a common phenomenon in complex systems, and it is the basis of a lot of common knowledge and concepts in finance, like the notion of anti-fragility.
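
For readers following along without the Notebook, the steps narrated above can be sketched roughly as follows. This is a minimal illustration, not the Notebook's actual code: the function names gamma_proc, normal_proc and generate_gamma and the column names origin and daily_price come from the narration, while the parameter values, the seed argument, the timestep column, and the small stand-in results dataframe are assumptions made up for the example.

```python
import pandas as pd
import scipy.stats as st


def gamma_proc(n_points, shape=2.0, scale=0.03, seed=None):
    """Return a list of n_points Gamma-distributed random numbers.

    shape and scale are arbitrary illustrative values (assumption).
    """
    return st.gamma.rvs(a=shape, scale=scale, size=n_points,
                        random_state=seed).tolist()


def normal_proc(n_points, mean=0.0, std=0.03, seed=None):
    """Return a list of n_points Normal-distributed random numbers.

    mean and std are arbitrary illustrative values (assumption).
    """
    return st.norm.rvs(loc=mean, scale=std, size=n_points,
                       random_state=seed).tolist()


def generate_gamma(df, seed=None):
    """Append one generated price per row of df, labelled by origin."""
    generated = pd.DataFrame({
        "timestep": df["timestep"].to_numpy(),
        "daily_price": gamma_proc(len(df), seed=seed),
        "origin": "generated",
    })
    # Stack real and generated rows so they can share one plot.
    return pd.concat([df, generated], ignore_index=True)


# Stand-in for the cadCAD simulation results (made-up numbers).
results = pd.DataFrame({
    "timestep": range(6),
    "daily_price": [0.060, 0.061, 0.063, 0.060, 0.065, 0.064],
})

# Data preparation: drop the first row, tag the real data, pipe into generator.
fig_df = (
    results.iloc[1:]
    .assign(origin="real")
    .pipe(generate_gamma, seed=42)
)

# Relative daily price change: difference in daily prices over current price.
normed = (results["daily_price"].diff() / results["daily_price"]).dropna()
simulated_changes = normal_proc(len(normed), seed=42)
```

The combined fig_df can then be handed to Plotly Express, for example px.histogram(fig_df, x="daily_price", color="origin", marginal="violin"), to get the side-by-side comparison described above.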

Notebook (OUTRO)

  • And that's it for this Hack! By having a grasp of how to generate random numbers that follow different probability distributions and to compare them with real data, you're now equipped to perform stochastic simulations that are more realistic and nuanced. This is a key skill when dealing with real world complex systems!

Slide 5 (Happy Hacking)

  • We hope you enjoyed this Hack and that it will be useful on your journey as a cadCAD engineer.
  • As always, happy hacking, and see you next time!