# Overview
## High-level architecture
<iframe frameborder="0" style="width:100%;height:458px;" src="https://viewer.diagrams.net/?tags=%7B%7D&highlight=0000ff&edit=_blank&layers=1&nav=1&title=Untitled%20Diagram.drawio#Uhttps%3A%2F%2Fdrive.google.com%2Fuc%3Fid%3D1WMaUYc9isImpJ8m1_HF4OVcvCY0fhC0Q%26export%3Ddownload"></iframe>
The Funda platform is composed of different websites, applications and services, as well as supporting infrastructure. Because thousands of users are on the platform concurrently at any given time, the need for scalability shapes the overall architecture.
Funda, FundaInBusiness, Fundadesk and other web applications run in a highly available multi-VM environment, while the dozens of supporting services and APIs used by those web applications run as Docker containers on a Kubernetes cluster. The same split applies to back-end applications and services: legacy ones run on VMs, and newly developed ones run on a Kubernetes cluster.
A load balancer and router sits in front of all our web-facing applications, and a CDN adds an extra layer of caching, optimization and security.
Media from frontend and backend applications is uploaded to cloud storage, processed (resized, transcoded, merged) and made available via a CDN.
Clickstream events are collected via Segment, sent to BigQuery for modeling, and made available for reporting.
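To make the collection step more concrete, here is a minimal sketch of what a clickstream event could look like before it is handed to Segment. The top-level field names (`type`, `event`, `anonymousId`, `properties`, `messageId`, `timestamp`) follow Segment's public `track` spec; the event name, the properties and the helper `build_track_event` are hypothetical and do not reflect the actual tracking plan.

```python
import json
import uuid
from datetime import datetime, timezone


def build_track_event(anonymous_id: str, event: str, properties: dict) -> dict:
    """Build a Segment-style `track` payload as a plain dictionary."""
    return {
        "type": "track",
        "event": event,
        "anonymousId": anonymous_id,
        "properties": properties,
        # A unique ID per message lets downstream consumers deduplicate.
        "messageId": str(uuid.uuid4()),
        # ISO 8601 timestamp in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical event name and properties, for illustration only.
example = build_track_event("anon-123", "Listing Viewed", {"listing_id": "42"})
print(json.dumps(example, indent=2))
```

In practice a Segment client library would build and send this payload; the sketch only shows the shape of the data that ends up in the warehouse.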
## Data setup
```mermaid
flowchart LR
stats(Statistic events)--> sv{Stat Verwerking}
sv --> os[Listing statistics]
os --> f
os --> fdk
ext(External events) --> seg[Segment]
cs(Clickstream) --> seg
seg --> bq[BigQuery]
cs -- GTM --> ga[Google Analytics]
bq -- dbt --> af{Airflow}
af -- dbt --> bq
af --> rpy((R<br/>Python))
rpy --> af
rpy --> f>Funda]
rpy --> fdk>Fundadesk]
bq --> ds((Data<br />Studio))
class stats,sv,os oldStats
classDef oldStats stroke:#f66,stroke-width:2px,stroke-dasharray: 5 5
```
To summarize, we collect both in-platform and off-platform (external) events via Segment and funnel them into our data warehouse in BigQuery. Airflow then orchestrates both the data modeling within the warehouse itself (with dbt) and our more advanced data operations and data science workloads. The modeled data in BigQuery serves as a source for data analysis in Data Studio, for advanced data modeling by data scientists using R/Python libraries, and for product features.
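The dependencies in the new flow can be sketched as a small graph, from which a valid execution order follows. This is only an illustration using Python's standard `graphlib` module, not the actual Airflow configuration; the stage names are hypothetical labels derived from the diagram above.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# Stage names are hypothetical, not real job or DAG identifiers.
PIPELINE = {
    "segment": {"clickstream", "external_events"},
    "bigquery_raw": {"segment"},
    "dbt_models": {"bigquery_raw"},
    "r_python_workloads": {"dbt_models"},
    "data_studio_reports": {"dbt_models"},
    "product_features": {"r_python_workloads"},
}


def run_order(pipeline: dict) -> list:
    """Return the stages in an order where every dependency comes first."""
    return list(TopologicalSorter(pipeline).static_order())


order = run_order(PIPELINE)
print(order)
```

An orchestrator such as Airflow resolves the same kind of ordering, with scheduling, retries and monitoring on top.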
Google Analytics is used as a fallback to validate that our collection and internal analytics are accurate.
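One way such a validation could work is to compare a metric (for example, daily page views) from both sources and flag days where they diverge too much. The sketch below assumes a simple relative-difference check; the 5% tolerance, the function names and the sample numbers are all made up for illustration and do not describe the actual validation process.

```python
def relative_difference(primary: int, reference: int) -> float:
    """Relative difference of `primary` with respect to `reference`."""
    if reference == 0:
        return 0.0 if primary == 0 else float("inf")
    return abs(primary - reference) / reference


def within_tolerance(primary: int, reference: int, tolerance: float = 0.05) -> bool:
    """True when the two counts differ by at most `tolerance` (assumed 5%)."""
    return relative_difference(primary, reference) <= tolerance


# Made-up counts: (count in the warehouse, count in Google Analytics).
daily_counts = {"day_1": (980, 1000), "day_2": (800, 1000)}

for day, (warehouse, ga) in daily_counts.items():
    status = "ok" if within_tolerance(warehouse, ga) else "investigate"
    print(day, status)
```

Some divergence between the two sources is expected (ad blockers, consent settings, sampling), so a tolerance band is generally more useful than an exact match.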
The red dashed parts represent the old Statistics flow, which is still used to power most Fundadesk dashboards but will soon be replaced by the new flow.
To work effectively on all parts of the data journey, from collection to transformation, modeling and reporting/visualization, we have a dedicated multidisciplinary team of data engineers, data scientists and data analysts.
## Fundadesk
Fundadesk is our broker-focused platform. It serves several purposes, which are summarized below.
| CMS | CRM | Billing | Analytics |
|----- |----- |--------- |----------- |
|- Upload brochures<br/>- Manage sponsored listings<br/>- Configure video, floorplans, 360<br/>- Manage open house days<br/>- Configure the office's landing page |- Access consumers' contact requests<br/>- Request reviews from customers<br/>- Access media suppliers' deliverables<br/>- Link customers to listings as sellers |- Order promotion and presentation products per listing, office promotion products<br/>- Manage running product purchases<br/>- Configure billing details<br/>- Access invoices and payment status |- Listing statistics<br/>- Listing performance indicators<br/>- Housing market analysis |
## Tools
| CRM | Marketing Automation | CMS | ERP | ATS | Customer service | CDP |
| -------- | -------- | -------- | -------- | -------- | -------- | -------- |
| Salesforce | **B2B**: Pardot<br/>**B2C**: Iterable | GX (*in the process of <br/>migrating to a headless CMS*)| AFAS | Recruitee | Zendesk | Segment |