# Skunkworks 2020

## Kube-fs

Kubefs will allow you to mount a Kubernetes view as a virtual filesystem. Kubernetes objects will be represented as directories when they can "contain other objects" and as files when they cannot. It makes sense for a "namespace" to be a directory, because it is a container primitive that holds other objects inside. A Pod can also be a "holder" because it can have its own set of Containers, which could be accessed as files as well.

## Tech

* Rust: Want to learn something, better if it is interesting!
* libfuse: And some Rust wrapper around that idea. This needs to work on both macOS and Linux.
* GitLab + CI: GitLab CI is great, and I already have a pretty good understanding of how to make this work.
* GitHub + Actions: A good plan would be to try to use Actions for CI/CD. If I get to a sane product in a few days, I could implement this.

### Reading/watching material

* [awesome-rust-streaming](https://github.com/jamesmunns/awesome-rust-streaming)

## Libfuse

This is how you write filesystems in user-space. For libfuse to work you need a kernel module (I'm not sure about Darwin's terminology), which means we don't have to play with a running kernel (risking breaking it) to build our filesystem.

## Rust wrappers around libfuse

There's one actively maintained crate on crates.io:

### fuser

* url: https://crates.io/crates/fuser

Documentation-wise this crate is not spectacular, but maybe that's something I could help with. They have at least a TravisCI environment, but I could not run `cargo test` locally: it required `pkg-config` and `osxfuse`. The ones provided with nix or from https://osxfuse.github.io/ don't seem to work at first -- need to get this running.

# Project description

* I can mount a Kubernetes context as a FS.
* I can navigate this FS using a terminal.
* I can view a file (like `my-pod.pod.yaml`) and get a yaml representation of it.
* I can edit this file and kubefs will patch the object in kube-api with my changes.
* I can copy a yaml object into this filesystem and it will be materialized into an actual object created on the Kubernetes cluster. For instance, `cp my-deployment.yaml <context>/<namespace>/` will create whatever is defined in that file in that namespace.

## Open questions

1. What if I want to see the contents of a directory? How many requests to kube-api? Is this doable?
2. How to handle cluster-wide objects? Maybe a directory called `cluster` that contains certificates, ClusterRoles, etc.?

## Goals

### Friday 23rd

* [**DONE**] Be able to run fuser (installing `pkg-config` and `libfuse` on macOS). This was a bit tricky, I had to:

  ```shell
  nix-env -iA nixos.pkg-config
  nix-env -iA nixos.osxfuse

  git clone git@github.com:cberner/fuser.git
  cd fuser
  PKG_CONFIG_PATH=${HOME}/.nix-profile/lib/pkgconfig cargo test
  ```

  `PKG_CONFIG_PATH` needs to be exported and pointing at where osxfuse saves its `osxfuse.pc` file.

* [**DONE**] Be able to run `cargo test` and succeed.
* [**DONE**] Get familiar with one of the examples in https://github.com/cberner/fuser/tree/master/examples. Play with it for a bit.
* [**DONE**] ~~Be able to mount a kubecontext and list namespaces as directories in /~~ (see next day TODO). I started with this small [project](https://github.com/rodrigovalin/kubefs) that has the dependencies at least, and a file copied from `fuser`. It compiles, so we should be good to start doing some stuff to it (listing namespaces, for instance); a stripped-down sketch of the fuser boilerplate it builds on is below.
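Roughly, this is what the fuser `hello` example boils down to once trimmed to a `readdir` for the root inode, with hard-coded namespace names standing in for real kube-api data. This is only a sketch of the idea, not the project's actual code: the struct and entry names are made up, method signatures vary a bit between fuser versions, and `lookup`/`getattr` plus the mount call are omitted.

```rust
use fuser::{FileType, Filesystem, ReplyDirectory, Request};
use libc::ENOENT; // libc provides the errno constants fuser replies expect

struct KubeFS;

impl Filesystem for KubeFS {
    // Called by fuser whenever a directory is listed; inode 1 is the mount root.
    fn readdir(
        &mut self,
        _req: &Request<'_>,
        ino: u64,
        _fh: u64,
        offset: i64,
        mut reply: ReplyDirectory,
    ) {
        if ino != 1 {
            // This sketch only knows about the root directory.
            reply.error(ENOENT);
            return;
        }

        // Hard-coded stand-ins; eventually these would come from kube-api.
        let entries = [
            (1, FileType::Directory, "."),
            (1, FileType::Directory, ".."),
            (2, FileType::Directory, "default"),
            (3, FileType::Directory, "kube-system"),
        ];

        for (i, entry) in entries.iter().enumerate().skip(offset as usize) {
            // `add` returns true once the reply buffer is full.
            if reply.add(entry.0, (i + 1) as i64, entry.1, entry.2) {
                break;
            }
        }
        reply.ok();
    }
}
```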
### Monday 26th

* [**DONE**] Make first docs contribution into fuser if needed! See it [here](https://github.com/cberner/fuser/pull/63). **Merged!**
* [TODO] Be able to show namespaces as directories at the top level.
* [TODO] When navigating into a directory, see its contents (Service, Secret, ConfigMap, etc.).

# Notes

* **WRONG** ~~https://osxfuse.github.io/ is not needed, we can get the same with nix-env. `osxfuse` is required and needs to be installed. The nix-env `osxfuse` package is needed because pkg-config from nix will not know about the osxfuse dependencies otherwise.~~
* Need to restart macOS to disable System Integrity Protection to be able to run `dtrace`. This is bad, I won't even try. Next, try with Linux.
  + This is true, but not needed anymore.

# Kubernetes Client

The [kube](https://crates.io/crates/kube) crate is the most active one, it seems. It has a nice API and support for `CustomResourceDefinition`. It is fairly active, in fact: the [changelog](https://github.com/clux/kube-rs/blob/master/CHANGELOG.md) shows steady work throughout 2020. It can derive the `Resource` trait for a struct if annotated.

## Problems

I was getting this error message:

    Error: ReqwestError(reqwest::Error { kind: Request, url: "https://127.0.0.1:49721/api/v1/namespaces?&fieldSelector=metadata.name%3Ddefault", source: hyper::Error(Connect, Custom { kind: Other, error: "invalid dnsname" }) })

I had to change `clusters[*].cluster.server` from `127.0.0.1` to `localhost`. No idea why, probably a bug in `reqwest`.

# Tokio

I had a bad time with Tokio today. In my `Cargo.toml` file I had `tokio = {version="0.3.1", features = ["all"]}`, and it was constantly failing with:

    there is no timer running, must be called from the context of Tokio runtime

when calling `Api<Namespace>::list()`. I figured out that it was because the `kube` crate depends on `tokio = "0.2.22"`, which seems to be incompatible with 0.3. It took me hours to figure this out, but hey, I made it!
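For context, the namespace-listing call at the center of all this looks roughly like the sketch below. It assumes the 2020-era combination of `kube`, `k8s-openapi` and tokio 0.2 discussed above (the exact API has changed between releases), so treat it as an illustration of the shape of the code rather than the project's actual source:

```rust
use k8s_openapi::api::core::v1::Namespace;
use kube::{
    api::{Api, ListParams},
    Client,
};

// tokio's version (and features) must line up with the one the kube crate was
// built against -- tokio 0.2 at the time of these notes.
#[tokio::main]
async fn main() -> Result<(), kube::Error> {
    // Reads the current kubeconfig context, the same one kubefs would mount.
    let client = Client::try_default().await?;

    // Namespaces are cluster-scoped, hence Api::all instead of Api::namespaced.
    let namespaces: Api<Namespace> = Api::all(client);

    for ns in namespaces.list(&ListParams::default()).await? {
        println!("{}", ns.metadata.name.unwrap_or_default());
    }
    Ok(())
}
```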
# Organizing the Filesystem

Now, how do we get to traverse our filesystem? When listing a directory, let's say with `ls /mount-point`, `fuser` will execute `readdir`, passing a few arguments. The important ones are:

1. `ino`: inode of the directory.
2. `offset`: this seems to be related to reading the contents in multiple passes, starting from `offset`.
3. `reply`: struct we have to populate with results.

Currently we are only listing from `inode == 1`, the root directory, but when changing directory into one of the "namespaces", the inode received will be completely different, and we'll have to find it in some kind of structure. I'm planning on using a [BTreeMap](https://doc.rust-lang.org/stable/std/collections/struct.BTreeMap.html) and populating it with the inodes and the corresponding directory... actually it could point at an enum that holds the KubernetesResource in question... this way I could figure out which object/resource, and even which API, to use.

Ok, there's no `std::collections` type for a hierarchical data structure, and I've spent the day looking for a solution to this problem. For a simple tree structure, I will use what is proposed here: [Arena-Allocated trees in Rust](https://dev.to/deciduously/no-more-tears-no-more-knots-arena-allocated-trees-in-rust-44k6).

# Github Actions

It is surprisingly easy to enable Actions in GH; I'm glad I gave it a try. The GH UI lets you click the "Enable actions" button and suggests Rust. It created a yaml file like this:

```yaml
name: Rust

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

env:
  CARGO_TERM_COLOR: always

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: Build
      run: cargo build --verbose
    - name: Run tests
      run: cargo test --verbose
```

I only had to add the following to it to make it work (as one of the `steps`):

```yaml
    - name: Install deps
      run: sudo apt-get install libfuse-dev pkg-config
```

# Tree Filesystem?

Most of the functions to implement in the `Filesystem` trait receive the inode of the file-like object in question. Examples:

* `readdir()`: called when listing the contents of a directory. It receives the directory's inode. For instance, the directory with inode 1 corresponds to the root.
* `read`: I presume this is called when reading a file, but I'm not sure yet... I will check now, in fact.
  * Update: still not sure, I think I need to first go over `getattr` to be able to read.

Anyway, it seems that the tree structure will be required, so I'll stop pretending I'll get anywhere without it. There's just one function, `lookup`, that receives the name of a file (type `OsStr`) and the inode of the directory where this file resides. It is not straightforward, and we'll have to do a bit of structural design to satisfy all of these functions.

Ideally I would have a structure with the full hierarchy of the filesystem, the way I want it to be represented. Each node will have as its leaves anything that can be represented as a container (directory) with some content in it. Let's say we have a node with inode 1; this represents the root `/` directory. The leaves of this node will be the namespaces, represented as root-level directories. Now, let's assume the user goes into the `default` directory and it needs to `readdir`: one leaf will be added for each entry of that directory. Every time we go into a directory we'll call `lookup` on each file contained in it.

The `lookup` function deviates from what the other functions do. It receives the inode of the parent directory and the `name` (type `OsStr`) of the file in question, so we can't use tree traversal to find it all the way to the leaf, only to the node (directory) containing it.

## End of the story

In this [PR](https://github.com/rodrigovalin/kubefs/pull/3/files) I simplified the data structure handling files and directories. It is still really far from complete, but at least it is good enough to satisfy the initial needs of the project. It is based on a `Map<Inode, KubernetesResource>`: each time a directory is traversed, it will add all of its "children" as new elements in the Map, and will add a list of their inodes to the parent's (this directory's) `subresources`. This is an `Option` field; if it is `Some`, it means that this resource contains more resources. I know that keeping track of resource removal will be a pain in the ass later, but for now we are ok.
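For reference, the shape of that structure is roughly the sketch below. The names are illustrative rather than the exact ones from the PR, and the `lookup` helper is only there to show what fuser's `lookup` call needs (parent inode plus file name), so read it as a mental model, not the real implementation:

```rust
use std::collections::BTreeMap;

type Inode = u64;

// Illustrative names -- not the exact types from the PR.
enum KubeKind {
    Root,
    Namespace,
    Pod,
    // ... Service, Secret, ConfigMap, etc.
}

struct KubernetesResource {
    name: String,
    kind: KubeKind,
    // `Some` means this resource contains other resources (it is a directory);
    // the Vec holds the inodes of its children.
    subresources: Option<Vec<Inode>>,
}

struct Registry {
    objects: BTreeMap<Inode, KubernetesResource>,
    next_inode: Inode,
}

impl Registry {
    // Called while traversing a directory: register the child and link its
    // inode into the parent's `subresources`.
    fn add_child(&mut self, parent: Inode, child: KubernetesResource) -> Inode {
        let ino = self.next_inode;
        self.next_inode += 1;
        self.objects.insert(ino, child);
        if let Some(p) = self.objects.get_mut(&parent) {
            p.subresources.get_or_insert_with(Vec::new).push(ino);
        }
        ino
    }

    // What fuser's `lookup` needs: parent inode + file name -> child inode.
    fn lookup(&self, parent: Inode, name: &str) -> Option<Inode> {
        let children = self.objects.get(&parent)?.subresources.as_ref()?;
        children
            .iter()
            .copied()
            .find(|ino| self.objects.get(ino).map(|o| o.name == name).unwrap_or(false))
    }
}

fn main() {
    let mut reg = Registry { objects: BTreeMap::new(), next_inode: 1 };

    // Inode 1 is the root of the mount.
    let root = reg.add_child(0, KubernetesResource {
        name: "/".into(),
        kind: KubeKind::Root,
        subresources: Some(vec![]),
    });
    reg.add_child(root, KubernetesResource {
        name: "default".into(),
        kind: KubeKind::Namespace,
        subresources: None,
    });

    assert_eq!(reg.lookup(root, "default"), Some(2));
}
```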