---
title: Practical Project - 02
tags: data-engineer
---
# Overview
This exercise teaches you how to use Kafka to ingest streaming data and process it to detect anomalies.
**Level:** `Intermediate`
**Estimated time:** 15 hours
**Prerequisites:**
- Understand Kafka basic concepts
- Have basic knowledge of programming
- Know how to run Docker (we suggest firing up your own Kafka cluster on your local machine; a single node is good to go)
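If you want a quick local broker, a minimal sketch with Docker might look like the following. This assumes the `apache/kafka` image, which runs a single-node KRaft broker with sensible defaults; the image tag, the `/opt/kafka` path, and the topic names `transactions` and `abnormal-transactions` are our assumptions, so check the image's documentation for your version.

```shell
# Start a single-node Kafka broker on localhost:9092
# (assumption: the apache/kafka image with its default KRaft config)
docker run -d --name kafka -p 9092:9092 apache/kafka:latest

# Create the two topics used in this exercise (suggested names, not mandated)
docker exec kafka /opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 --create --topic transactions
docker exec kafka /opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 --create --topic abnormal-transactions
```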
# Requirements
In this exercise, you will build a simple system that processes bank customers' transactions and detects abnormal ones, as follows:

There are two main tasks you have to do:
1. Build a data generator, which should:
- Produce streaming transaction data in near real time (100 transactions per second)
- Produce records with the following structure:
```json
{
  "transactionId": "93151357815SJFHB",
  "accountId": "19084637648936",
  "customerId": "0931573195", // each customer can have multiple accounts
  "targetAccountId": "8913850315984579",
  "serviceId": "8356",
  "amount": "10500000",
  "currency": "VND"
}
```
- Publish data to a Kafka topic
2. Build an anomaly detector, which should:
- Read transaction data from Kafka and process it to detect abnormal transactions
- Write the abnormal transactions to a different Kafka topic
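Task 1 (the generator) could be sketched as below, assuming the `kafka-python` client and a broker on `localhost:9092`; the topic name `transactions` and the random-field ranges are our choices, not part of the spec.

```python
# Sketch of the data generator, assuming kafka-python (pip install kafka-python)
# and a local broker; topic name "transactions" is an assumption.
import json
import random
import string
import time

def make_transaction():
    """Build one random transaction matching the required schema."""
    return {
        "transactionId": "".join(random.choices(string.ascii_uppercase + string.digits, k=16)),
        "accountId": "".join(random.choices(string.digits, k=14)),
        "customerId": "".join(random.choices(string.digits, k=10)),
        "targetAccountId": "".join(random.choices(string.digits, k=16)),
        "serviceId": "".join(random.choices(string.digits, k=4)),
        "amount": str(random.randint(10_000, 500_000_000)),
        "currency": "VND",
    }

def run_producer(bootstrap="localhost:9092", topic="transactions", rate=100):
    from kafka import KafkaProducer
    producer = KafkaProducer(
        bootstrap_servers=bootstrap,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    while True:
        start = time.time()
        for _ in range(rate):  # ~100 transactions per second
            producer.send(topic, make_transaction())
        producer.flush()
        # Sleep for whatever is left of the 1-second window
        time.sleep(max(0.0, 1.0 - (time.time() - start)))

if __name__ == "__main__":
    run_producer()
```

Keeping `make_transaction` separate from the Kafka wiring makes the record format easy to unit-test without a running broker.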
**How is a transaction labeled abnormal?**
- Any transaction whose amount exceeds 200 million VND.
- Any account with more than 10 transactions per minute totaling more than 200 million VND.
- Any customer with more than 20 transactions per minute totaling more than 200 million VND.
# Guide
<p>
<details>
<summary>Click to expand the hints.</summary>
> Hints
</details>
</p>