---
title: 'Getting Started with Spring Cloud Data Flow'
disqus: hackmd
---
# Getting Started with Spring Cloud Data Flow
[TOC]
## What is this?
Spring Cloud Data Flow workshop on Kubernetes.
1. The workshop is primarily geared towards the students who attend the [SCDF workshop at SpringOne 2020](https://springone.io/2020/workshops/spring-cloud-data-flow).
2. Students will use the [Strigo](https://strigo.io/) platform to learn how to prepare the environment and exercise SCDF's feature capabilities on Kubernetes.
> It is possible to follow the instructions to repeat this guide in your laptop or an EC2 instance (image: `ami-03e97315b2269f290`; region: `us-west-2`).
## Prerequisite
1. A [Strigo](https://strigo.io/) account.
2. Registered for the SCDF workshop and you have an "access code".
3. The workshop is set up for 90mins total. However, the labs could take between 30-45mins.
> Ignore this if you're following this workshop to repeat it outside of the Strigo platform.
### Machine Requirements
We will be using Minikube as the single-node Kubernetes cluster with `--vm-driver=docker` as the driver. You need at least 4CPUs and 10GB RAM is required to run the lab exercise.
```
minikube start --vm-driver=docker --kubernetes-version v1.17.0 --memory=10240 --cpus=4
```
> If you are repeating it in EC2, you will have to launch the `ami-03e97315b2269f290` (region: `us-west-2`) AMI with `t2.xlarge` instance type.
> Kubernetes v1.17.x is used in the lab, but you can pick a version that is [compatible](https://dataflow.spring.io/docs/installation/kubernetes/compatibility/) with SCDF.
### Spring Cloud Data Flow
To walkthrough SCDF's streaming, batch, analytics, and observability features, the following components will be provisioned in the single-node Kubernetes cluster.
1. Spring Cloud Data Flow
2. Spring Cloud Skipper
3. Prometheus
4. Grafana
5. Apache Kafka
6. MariaDB
The lab relies on the Spring Cloud Data Flow's [Bitnami chart](https://github.com/bitnami/charts/tree/master/bitnami/spring-cloud-dataflow). Students will have to run the `scripts/deploy-scdf.sh` script that is available at [sabbyanandan/SpringOne2020](https://github.com/sabbyanandan/SpringOne2020).
### Tools / Programs
The `ami-03e97315b2269f290` (region: `us-west-2`) is prepared and ready to launch with the following tools.
1. Docker
2. Minikube
3. Helm
4. kubectl
5. K9s
6. Java
7. Maven
8. VS Code
## Agenda
```sequence
Strigo->Lab: Start the Lab in Strigo
Note left of Prepare: You'll do this *once*
Lab->Prepare: Start minikube
Lab->Prepare: Deploy SCDF stack
Lab->Prepare: Build applications
Lab->Prepare: Generate docker images
Prepare-->Lab: Stuck? Cleanup and repeat
Prepare->Streaming Lab: Build an IoT streaming data pipeline
Prepare->Streaming Lab: Deploy a stream from SCDF to K8s; verify results
Prepare->Streaming Lab: Monitor performance using Prometheus & Grafana
Streaming Lab-->Prepare: Stuck? Cleanup and repeat; ask questions
Prepare->Batch Lab: Build and design a batch data pipeline
Prepare->Batch Lab: Launch the batch-job from SCDF to K8s; verify results
Prepare->Batch Lab: Schedule the batch-job in SCDF to K8s; verify results
Prepare->Batch Lab: Monitor performance using Prometheus & Grafana
Batch Lab-->Prepare: Stuck? Cleanup and repeat; ask questions
```
## Labs
To get started with the labs, first, you will have to prepare the environment.
When you log in to Strigo with your access token, you will have to click the "My Lab" button on the left-nav to prepare the lab VM.
![My Lab](https://i.imgur.com/kglWS7X.png)
> After you login to the VM, test the environment by running the `scripts/test-env.sh` script — see example below.
```bash
[ec2-user@ip ]$ cd SpringOne2020
[ec2-user@ip ]$ git pull
[ec2-user@ip SpringOne2020]$ sh scripts/test-env.sh
docker is ready
minikube is ready
helm is ready
kubectl is ready
k9s is ready
java is ready
mvn is ready
```
### Lab 1: Prepare Environment
The first lab starts with setting up a single-node Kubernetes cluster using Minikube. Let's review the steps involved.
#### Set up Kubernetes
Start a single-node Minikube cluster with the `minikube start --vm-driver=docker --kubernetes-version v1.17.0 --memory=10240 --cpus=4` command.
```
[ec2-user@ip-100 ~]$ minikube start --vm-driver=docker --kubernetes-version v1.17.0 --memory=10240 --cpus=4
* minikube v1.12.2 on Amazon 2 (xen/amd64)
* Using the docker driver based on user configuration
* Starting control plane node minikube in cluster minikube
* minikube 1.12.3 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.12.3
* To disable this notice, run: 'minikube config set WantUpdateNotification false'
* Creating docker container (CPUs=4, Memory=10240MB) ...
* Preparing Kubernetes v1.17.0 on Docker 19.03.8 ...
* Verifying Kubernetes components...
* Enabled addons: default-storageclass, storage-provisioner
* Done! kubectl is now configured to use "minikube"
```
> We are using the Docker driver to install Kubernetes into an existing Docker daemon in the VM. This also simplifies the setup because we don't need to worry about extra virtualization to be enabled.
Verify that Minikube is successfully started: `minikube status`.
```
ec2-user@ip-100 ~]$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Runningkubeconfig: Configured
```
Confirm that the Minikube container is running.
```
[ec2-user@ip-100 ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS
NAMES
539c113531ae gcr.io/k8s-minikube/kicbase:v0.0.11 "/usr/local/bin/entr…" 9 minutes ago Up 9 minutes 127.0.0.1:32771->22/tcp, 127.0.0.1:32770->2376/tcp, 127.0.
0.1:32769->5000/tcp, 127.0.0.1:32768->8443/tcp minikube
```
This command roughly takes about 2-3mins to complete. If for any reason, you notice startup errors, likely, the Docker daemon is still starting in the VM, so re-run the same start command once again. The cluster will start eventually.
#### Set up Spring Cloud Data Flow
It is now time to deploy SCDF! To get started, you will run the `scripts/deploy-scdf.sh` script.
```
[ec2-user@ip-172-31-19-111 ~]$ sh SpringOne2020/scripts/deploy-scdf.sh
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈ Happy Helming!⎈
Error from server (NotFound): namespaces "monitoring" not found
A namespace called monitoring for prometheus should exist, creating it
A namespace called monitoring exists in the cluster
Error: release: not found
Install bitnami/prometheus-operator prometheus_release_name=prom prometheus_namespace=monitoring
....
....
....
```
This command will take ~5-6mins to finish. Behind the scenes, the script uses Bitnami's Prometheus and Grafana operator to provision the monitoring stack. With that foundation up and running, the script runs SCDF's [Bitnami chart](https://github.com/bitnami/charts/tree/master/bitnami/spring-cloud-dataflow) to deploy the following.
- MariaDB
- Apache Kafka + Zookeeper
- Spring Cloud Data Flow
- Spring Cloud Skipper
While all this is starting, open a terminal session in a new tab and run the [`k9s`](https://k9scli.io/) command to verify the current deployment status.
```
[ec2-user@ip-172-31-19-111 ~]$ k9s
```
![Review SCDF components](https://i.imgur.com/FVsZLQ2.jpg)
#### Verify Environment
When the script completes successfully, you will see the following output.
```bash
[ec2-user@ip-172-31-19-111 ~]$ sh SpringOne2020/scripts/deploy-scdf.sh
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "bitnami" chart repository
...
...
...
### Stack succesfully deployed ###
Connect to Data Flow
$ helm status scdf
Grafana password
$ kubectl -n monitoring get secret graf-grafana-admin -o jsonpath={.data.GF_SECURITY_ADMIN_PASSWORD} | base64 --decode
Forward grafana
$ kubectl port-forward -n monitoring svc/graf-grafana 3000:3000
```
Likewise, you can verify what version of the [Spring Cloud Data Flow's Bitnami helm chart](https://github.com/bitnami/charts/tree/master/bitnami/spring-cloud-dataflow) is currently in use.
```bash
[ec2-user@ip-100 SpringOne2020]$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
scdf default 1 2020-08-20 21:19:45.728756477 +0000 UTC deployed spring-cloud-dataflow-0.6.1 2.6.0
```
Before we start the next lab, though, let's verify that SCDF is up and running. We will apply port-forwarding rules for the following pods so that we can test the newly deployed components.
1. ***SCDF***: click `shift+f` on the `scdf-spring-cloud-dataflow-server-***` pod to open the port-forward window in `k9s`; change the "Address" to `0.0.0.0`.
2. ***Grafana***: click `shift+f` on the `graf-grafana-b6bf96c9c-***` pod to open the port-forward window in `k9s`; change the "Address" to `0.0.0.0`.
3. ***Prometheus***: click `shift+f` on the `prometheus-prom-prometheus-operator-prometheus-**` pod to open the port-forward window in `k9s`; change the "Address" to `0.0.0.0`.
Example:
![Port forwarding SCDF](https://i.imgur.com/kDNrRes.jpg)
Now that the stack is ready and the port-forwarding rules are active, we can access SCDF's dashboard by clicking the "SCDF Dashboard" tab in the Strigo platform. If the page doesn't load, hit the "refresh page" button inside the tab.
![SCDF Dashboard](https://i.imgur.com/VZ0MAez.png)
Alternatively, you can review SCDF's Shell access. To do that, switch to the terminal tab, and run the following.
```bash
[ec2-user@ip-172-31-19-111 ~]$ cd scdf-shell
[ec2-user@ip-172-31-19-111 scdf-shell]$ java -jar spring-cloud-dataflow-shell-2.6.0.jar --dataflow.uri=http://localhost:8080
```
![SCDF shell](https://i.imgur.com/cVzZ5WM.png)
> This command might take a few seconds to start. Hang tight. You will see the `dataflow:>` prompt.
Congrats! You have completed lab #1. :slightly_smiling_face:
### Lab 2: Event Streaming Data Pipelines
This lab will review how to build and deploy event-streaming applications using Spring Cloud Data Flow. Let's begin with reviewing the applications we will use in this lab.
#### Explore Application Code (optional)
To open VS Code, from the terminal window, you will have to start the `code-server` process inside the Strigo lab VM.
```bash
[ec2-user@ip-172-31-19-111 ]$ cd code
[ec2-user@ip-172-31-19-111 code]$ bin/code-server &
[1] 103335
[ec2-user@ip-172-31-19-111 code]$ info Using config file ~/.config/code-server/config.yaml
info Using user-data-dir ~/.local/share/code-server
info code-server 3.4.1 48f7c2724827e526eeaa6c2c151c520f48a61259
info HTTP server listening on http://0.0.0.0:1111
info - No authentication
info - Not serving HTTPS
```
> The `code-server` process takes a ~5-10 seconds to start. You can verify whether the port `1111` is running with `sudo lsof -i -P -n | grep 1111` command in the terminal.
When the `code-server` process is running, click the "IDE" tab in Strigo to open VS Code. Go to `home/ec2-user/SpringOne2020` folder to review the applications we will be using for the labs.
> If the window doesn't load, please click the "refresh page" button inside the "IDE" tab.
![VS Code 1](https://i.imgur.com/9jVNjeO.png)
![VS Code 2](https://i.imgur.com/9GKGWWy.png)
#### Build Applications
Now that we have had a look at the applications, let's build them.
```bash
[ec2-user@ip-172-31-19-111 ~]$ cd SpringOne2020
[ec2-user@ip-172-31-19-111 SpringOne2020]$ git pull
Already up to date.
[ec2-user@ip-172-31-19-111 SpringOne2020]$ ls -ltr
total 28
-rw-rw-r-- 1 ec2-user ec2-user 6608 Aug 12 20:17 mvnw.cmd
-rwxrwxr-x 1 ec2-user ec2-user 10070 Aug 12 20:17 mvnw
drwxrwxr-x 2 ec2-user ec2-user 47 Aug 14 20:16 scripts
-rw-rw-r-- 1 ec2-user ec2-user 3520 Aug 17 16:18 README.md
-rw-rw-r-- 1 ec2-user ec2-user 2609 Aug 17 19:02 pom.xml
drwxrwxr-x 5 ec2-user ec2-user 104 Aug 17 19:06 trucks
drwxrwxr-x 5 ec2-user ec2-user 104 Aug 17 19:06 brake-temperature
drwxrwxr-x 5 ec2-user ec2-user 104 Aug 17 19:06 brake-logs
drwxrwxr-x 5 ec2-user ec2-user 104 Aug 17 19:07 thumbinator
```
The build attempts to create container images for these applications using the `jib` plugin.
To configure your local environment to re-use the Docker daemon running inside the Minikube instance, we will run the `eval $(minikube docker-env)` before building the code and generating the docker images.
```bash
[ec2-user@ip-172-31-19-111 SpringOne2020]$ eval $(minikube docker-env)
```
Run the Maven build and generate Docker images using `mvn clean install com.google.cloud.tools:jib-maven-plugin:dockerBuild -DskipTests` command.
```bash
[ec2-user@ip-172-31-19-111 SpringOne2020]$ mvn clean install com.google.cloud.tools:jib-maven-plugin:dockerBuild -DskipTests
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] labs [pom]
[INFO] trucks [jar]
[INFO] brake-temperature [jar]
[INFO] brake-logs [jar]
[INFO] thumbinator [jar]
...
...
[INFO] Container entrypoint set to [java, -cp, /app/resources:/app/classes:/app/libs/*, com.springone.trucks.TrucksApplication]
[INFO]
[INFO] Built image to Docker daemon as dev.local/trucks, dev.local/trucks:0.0.1-SNAPSHOT
[INFO] Executing tasks:
[INFO] [==============================] 100.0% complete
...
...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for labs 0.0.1-SNAPSHOT:
[INFO]
[INFO] labs ............................................... SUCCESS [ 6.388 s]
[INFO] trucks ............................................. SUCCESS [ 42.855 s]
[INFO] brake-temperature .................................. SUCCESS [ 10.615 s]
[INFO] brake-logs ......................................... SUCCESS [ 4.851 s]
[INFO] thumbinator ........................................ SUCCESS [ 5.697 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:12 min
[INFO] Finished at: 2020-08-19T18:36:29Z
[INFO] ------------------------------------------------------------------------
```
Verify application images in the local registry under the `dev.local` repository.
```bash
[ec2-user@ip-172-31-19-111 SpringOne2020]$ docker images | grep dev.local
dev.local/thumbinator 0.0.1-SNAPSHOT a2f62650b367 50 years ago 233MB
dev.local/thumbinator latest a2f62650b367 50 years ago 233MB
dev.local/brake-logs 0.0.1-SNAPSHOT e914bf6237ab 50 years ago 250MB
dev.local/brake-logs latest e914bf6237ab 50 years ago 250MB
dev.local/trucks 0.0.1-SNAPSHOT f1da7c4eb7aa 50 years ago 250MB
dev.local/trucks latest f1da7c4eb7aa 50 years ago 250MB
dev.local/brake-temperature 0.0.1-SNAPSHOT 61e84d293eb9 50 years ago 264MB
dev.local/brake-temperature latest 61e84d293eb9 50 years ago 264MB
```
#### IoT — Real-time Truck's Brake Temperature Analysis
If you haven't already prepared the environment, please review [Lab-1-Prepare-Environment](https://hackmd.io/@sabbyanandan/B1bDf74fv#Lab-1-Prepare-Environment).
Secondly, this lab assumes that you have [locally built the applications](https://hackmd.io/@sabbyanandan/B1bDf74fv#Build-Applications) and that the container images are available in the Docker daemon running inside the single-node Minikube cluster.
*Use-case: Imagine there are 100s of freight-trucks on the road and that you're interested in finding out the fleet's performance in real-time. To narrow it down further, imagine if you want to understand the truck's peak performance, given the current load it is carrying. A vital factor to consider is the brake condition, which would have significant wear and tear depending on the freight. It could even be dangerous if it goes unnoticed.*
Given that background, we will deploy a streaming data pipeline in SCDF with three applications.
1. [`trucks`](https://github.com/sabbyanandan/SpringOne2020/tree/master/thumbinator) — generates truck data in a random interval
2. [`brake-temperture`](https://github.com/sabbyanandan/SpringOne2020/tree/master/brake-temperature) — computes moving average of a truck's brake temperature in 10s interval
3. [`brake-logs`](https://github.com/sabbyanandan/SpringOne2020/tree/master/brake-logs) — logs the output in real-time
> Applications rely-on and use Spring Cloud Stream's [Apache Kafka binder implementation](https://cloud.spring.io/spring-cloud-static/spring-cloud-stream-binder-kafka/3.0.6.RELEASE/reference/html/spring-cloud-stream-binder-kafka.html#_apache_kafka_binder). However, the real-time computations inside the `brake-temperature` application uses the Spring Cloud Stream's [Kafka Streams binder](https://cloud.spring.io/spring-cloud-static/spring-cloud-stream-binder-kafka/3.0.6.RELEASE/reference/html/spring-cloud-stream-binder-kafka.html#_kafka_streams_binder).
#### Review Applications
The source code for `trucks`, `brake-temperature`, and `brake-logs` are under the `SpringOne2020` directory. You can open the code in IDE as described at [Explore-Application-Code](https://hackmd.io/@sabbyanandan/B1bDf74fv#Explore-Application-Code-optional) section.
Given the lab's time constraints, we will attempt only to build and deploy the applications instead of extending or customizing the behavior. If you manage to run the lab quickly, feel free to crack at any customizations, and redeploy as you find appropriate.
#### Deploy Stream
1. Let's open SCDF and register the three new applications that are already available in the Docker registry.
The coordinates for the 3 applications are:
```properties
source.trucks=docker:dev.local/trucks:0.0.1-SNAPSHOT
processor.brake-temperature=docker:dev.local/brake-temperature:0.0.1-SNAPSHOT
sink.brake-log=docker:dev.local/brake-logs:0.0.1-SNAPSHOT
```
> If in case you aren't able to locally build the applications, you can register the applications from Docker Hub.
>```properties
>source.trucks=docker:sabby/trucks:0.0.1-SNAPSHOT
>processor.brake-temperature=docker:sabby/brake-temperature:0.0.1-SNAPSHOT
>sink.brake-log=docker:sabby/brake-logs:0.0.1-SNAPSHOT
Navigate to "Application(s)" section and select "Bulk import application" as the option. On the right-frame, copy+paste the three application coordinates to import the applications to SCDF's application registry.
![Bulk register stream apps](https://i.imgur.com/rk78efy.png)
![Register Apps](https://i.imgur.com/becR6tm.png)
2. Now that we have the applications registered, it is time to create and deploy a stream.
Click the "Streams" link from the left-navigation and open the "Create Stream(s)" page. Copy the following streaming DSL command into the DSL text area in the dashboard. Alternatively, you can drag +drop the apps in the canvas and interactively configure the desired properties.
```dsl
truck-performance = trucks --spring.cloud.stream.function.bindings.generateTruck-out-0=output | brake-temperature --spring.cloud.stream.function.bindings.processBrakeTemperature-in-0=input --spring.cloud.stream.function.bindings.processBrakeTemperature-out-0=output | brake-log --spring.cloud.stream.function.bindings.log-in-0=input
```
> In case you're wondering about `--spring.cloud.stream.function...` in-line properties, and what they mean, you can learn more about Spring Cloud Stream's function bindings and the naming conventions from the [reference guide](https://cloud.spring.io/spring-cloud-static/spring-cloud-stream/3.0.6.RELEASE/reference/html/spring-cloud-stream.html#_functional_binding_names).
![Build Stream](https://i.imgur.com/kXH1tSu.png)
Click "Create Stream(s)" button and deploy the `truck-performance` stream from the list page.
![Create Stream](https://i.imgur.com/Kwcbknn.png)
SCDF is now in the process of programmatically creating the Kubernetes deployment manifests for the three applications and deploying them onto Kubernetes. You can switch to the "Terminal" tab and review the new pods from the `k9s` output.
![Deploying Stream Pods](https://i.imgur.com/BrvLcDw.jpg)
> The applications take ~1-2 mins to start and for the liveness/readiness probes to be running correctly.
3. Let's review the results by tailing the logs of `truck-performance-brake-log-v1-***` application. Alternatively, you can open the logs from SCDF's dashboard, too.
![Truck Logs](https://i.imgur.com/AMK8H2f.jpg)
You will notice the computed moving-average in the logs, in real-time. See below an example that includes the average brake temperature for the truck with the ID="JH4KA8170MC002642".
```json
{
"average": 14.967257499694824,
"count": 3,
"end": 1597875620000,
"id": "JH4KA8170MC002642",
"start": 1597875610000,
"totalValue": 44.90177
}
```
#### Monitor Stream Performance
Now that the real-time streaming data pipeline is running in Kubernetes, we will next review the steps to monitor the streaming applications using Prometheus and Grafana.
All the heavy-lifting of configuring Prometheus, Grafana, and preparing the applications for metrics scrapping is already handled automatically by SCDF.
Click the `Grafana Dashboard` tab to navigate to the Grafana GUI. For the login, you need to find the admin password from the Kubernetes secrets. To retrieve and decode the password, run the following in the terminal window.
```command
kubectl -n monitoring get secret graf-grafana-admin -o jsonpath={.data.GF_SECURITY_ADMIN_PASSWORD} | base64 --decode
```
Example:
```cli
[ec2-user@ip-172-31-30-179 ~]$ kubectl -n monitoring get secret graf-grafana-admin -o jsonpath={.data.GF_SECURITY_ADMIN_PASSWORD} | base64 --decode
MaHOlr9Vxv
```
In my case, I am using `admin/MaHOlr9Vxv` as the credentials to log-in to Grafana dashboard.
> The user is `admin`, but the password would be a randomly generated token for every student. You will have to run the above `kubectl` command to retrieve it. The password will be different for *every* student.
The SCDF-specific metrics dashboards are preloaded in the Grafanaa service running in your Kubernetes cluster. First time when you login, you will have to go to the "Manage" section to find the dashboards.
![Grafana 1](https://i.imgur.com/fICQ2gZ.png)
Click the "Applications" dashboard to view the real-time performance of the streaming applications.
![Grafana 2](https://i.imgur.com/HAQs2dD.png)
That's it! You have completed lab #2. :boom: :rocket:
### Lab 3: Batch-style Data Pipelines
If you haven't already prepared the environment, please review [Lab-1-Prepare-Environment](https://hackmd.io/@sabbyanandan/B1bDf74fv#Lab-1-Prepare-Environment).
This lab assumes that you have already [built the applications](https://hackmd.io/@sabbyanandan/B1bDf74fv#Build-Applications) and that the container images are available in the Docker daemon running inside the single-node Minikube cluster.
#### Cloud-native ETL Batch Job
This lab aims to highlight how short-lived and ephemeral-style batch applications can be orchestrated, scheduled, and monitored using Spring Cloud Data Flow.
To demonstrate the features, we will build a Task application ([`thumbinator`](https://github.com/sabbyanandan/SpringOne2020/tree/master/thumbinator)) that includes two batch-jobs internally.
*The first job includes 3-steps to simulate extract, transform, and the loading of data — an ETL job.*
```java
@Bean
public Job extractImage() {
// extract an image
}
@Bean
public Job transformImage() {
// create a thumbnail for the image
}
@Bean
public Job loadImage() {
// load the thumbnail to a different directory
}
```
*The second job is going to query and print the result.*
```java
@Bean
public Job statusImage() {
// print the size of the original and the thumbnail images
}
```
> To keep it simple and to repeat the lab easily, this workshop includes multiple jobs inside the *same* application. However, you can choose to create new Task applications for each of the jobs instead, which would allow you to evolve them with bug-fixes and improvements independently, so you can continuously deliver them.
#### Review Application Code
In your Strigo VM, you can follow the [Explore-Application-Code](https://hackmd.io/@sabbyanandan/B1bDf74fv#Explore-Application-Code-optional) steps to review the code of `thumbinator` application.
#### Build and Launch Tasks
Click the "SCDF Dashboard" tab in Strigo to launch SCDF's dashboard.
1. First, you will have to register the locally built application in SCDF's application registry.
The Docker coordinate for the `thumbinator` applications is:
```properties
task.thumbinator=docker:dev.local/thumbinator:0.0.1-SNAPSHOT
```
> If in case you aren't able to locally build the application, you can register the application from Docker Hub.
>```properties
>task.thumbinator=docker:sabby/thumbinator:0.0.1-SNAPSHOT
![Register Task App](https://i.imgur.com/teM9hOT.png)
2. Open "Tasks" -> "Create Task(s)"
![Build Task](https://i.imgur.com/qGZU2a4.png)
Give a name to the task definition and create the task.
3. Let's launch the task. Click the "play" button from the task list page to manually launch the task with the default parameters and arguments.
![Task Launch 1](https://i.imgur.com/PX8Utea.png)
![Task Launch 1](https://i.imgur.com/BzP0RO6.png)
Look for a newly launched task pod in the `k9s` terminal.
![Task Pod](https://i.imgur.com/6kpZdEs.png)
4. Verify the results by tailing the logs of the task pod running in Kubernetes. Switch to `k9s` terminal and look for the "pod" with the name that you assigned to the task. You will find the results from both the jobs in the log.
![Task Logs](https://i.imgur.com/nlnU3kE.jpg)
#### Schedule Tasks
In SCDF, there's an out-of-the-box option to schedule task/batch-jobs to launch at a recurring cadence. To do that, SCDF builds on the primitives of `cronjob` spec in Kubernetes.
From the task list page, select the dropdown to choose "Schedule Task".
![Schedule Task](https://i.imgur.com/2F9en0d.png)
Give your schedule a name and schedule the task to launch every minute with the `*/1 * * * *` cron-expression.
![Schedule Cron](https://i.imgur.com/Sj96q1A.png)
Switch back to `k9s` terminal and verify the scheduled creation of pods automatically launching once every minute.
![Scheduled Pods](https://i.imgur.com/g54KDaP.png)
Alternatively, you can also use the `kubectl` command to query the `cronjob` and `job` resources that are running in the Kubernetes cluster.
```bash
[ec2-user@ip-100 ~]$ kubectl get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
sch-thumbnails */1 * * * * False 1 54s 4m27s
[ec2-user@ip-100 ~]$ kubectl get job
NAME COMPLETIONS DURATION AGE
sch-thumbnails-1597961580 1/1 39s 3m57s
sch-thumbnails-1597961640 1/1 49s 2m57s
sch-thumbnails-1597961700 1/1 40s 117s
sch-thumbnails-1597961760 0/1 57s 57s
```
Let's also verify the task execution details from SCDF's dashboard.
![Task Exec 1](https://i.imgur.com/V30KzCq.png)
![Task Exec 2](https://i.imgur.com/7YfFf5Z.png)
#### Monitor ETL Job
Similar to the monitoring steps discussed in the [event-streaming](https://hackmd.io/@sabbyanandan/B1bDf74fv#Monitor-Stream-Performance) lab, the task monitoring with Prometheus and Grafana is preloaded, and the metrics dashboard is prepared to monitor the tasks running in the Kubernetes cluster.
Go to "Grafana Dashboard" -> "Tasks".
![Task Grafana](https://i.imgur.com/D2sB8IA.png)
Kudos to you for completing lab-3! :trophy: :smile:
## Appendix
* Source code is at: [sabbyanandan/SpringOne2020](https://github.com/sabbyanandan/SpringOne2020)
* Slides: [SpeakerDeck](https://speakerdeck.com/sabbyanandan/getting-started-with-spring-cloud-data-flow)
:::info
* [Spring Cloud Data Flow Documentation](https://dataflow.spring.io/)
* [Spring Cloud Data Flow Reference Guide](https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started)
* [Spring Cloud Data Flow Samples](https://github.com/spring-cloud/spring-cloud-dataflow-samples)
:::
###### tags: `event streaming` `batch processing` `stateful streams` `predictive analytics` `cloud-native` `microservices`