Overall development performance is terrible. Probably caused by:
if (workflow.getVersion() == '1') { doSth()} else {doSthElse()}
According to its own description, Cadence is an "(…) orchestration engine to execute asynchronous long-running business logic(…)"
There are clients for Go and Java. A Python client is not ready yet: https://github.com/firdaus/cadence-python
Writing a workflow in Uber Cadence is not much different from writing ordinary code.
Uber Cadence ships with a Docker Compose setup that works without problems:
git clone https://github.com/uber/cadence.git
cd cadence/docker
docker-compose up
This spins up all required dependencies and the web UI.
Access the web UI at http://localhost:8088/
Cadence comes with a nice set of samples for the Java client:
git clone https://github.com/uber/cadence-java-client.git
cd cadence-java-client
./gradlew -q execute -PmainClass=com.uber.cadence.samples.common.RegisterDomain
Open the project in IntelliJ (just open it if you have the Gradle plugin installed) and go to src/main/java/com/uber/cadence/samples/hello
Each class has a main method - run it and things should just work.
https://bitbucket.sec.sony.com/projects/CRO/repos/cadence-samples/browse
The repo consists of two projects. One implements the activities (cadence-sample-activities) and the other one implements the workflow (kotlin-cadence).
The workflow is really simple. It just calls 2 activities. One activity is hosted together with the workflow and the other one is hosted by a different process. The first activity just returns a string. The other one fetches some information from Jira (it uses a proxy on localhost:4140).
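The shape of such a workflow can be pictured as ordinary sequential code. The following is a minimal sketch in plain Java that deliberately uses no Cadence classes. Only the name `quarantineHost` comes from this page; the interfaces `LocalActivities` and `JiraActivities`, their methods, and the stubbed Jira result are hypothetical, and no real Jira call is made.

```java
public class QuarantineSketch {
    /** Activity hosted together with the workflow (hypothetical name). */
    public interface LocalActivities {
        String greet(String host);
    }

    /** Activity hosted by a different process (hypothetical name). */
    public interface JiraActivities {
        String fetchIssueSummary(String host);
    }

    /** The workflow body is plain sequential code calling two activities. */
    public static final class QuarantineWorkflow {
        private final LocalActivities local;
        private final JiraActivities jira;

        public QuarantineWorkflow(LocalActivities local, JiraActivities jira) {
            this.local = local;
            this.jira = jira;
        }

        public String quarantineHost(String host) {
            String greeting = local.greet(host);          // first activity: returns a string
            String issue = jira.fetchIssueSummary(host);  // second activity: would query Jira
            return greeting + " | " + issue;
        }
    }

    public static void main(String[] args) {
        // Both activities are stubbed with lambdas; nothing leaves this process.
        QuarantineWorkflow workflow = new QuarantineWorkflow(
                host -> "quarantining " + host,
                host -> "stubbed Jira data for " + host);
        System.out.println(workflow.quarantineHost("host1"));
    }
}
```

In the real projects the two activity implementations are registered with separate workers, so the second call crosses a process boundary even though the workflow code reads the same.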
cd cadence-sample-activities
./gradlew -q execute -PmainClass=com.sony.crow.cadence.samples.quarantine.activities.QuarantineActivitiesImpl
Note: Running it from IntelliJ does not work (as of 12.11.2019) because the new Gradle version is not well supported.
3. Run workflow worker
Open kotlin-cadence in IntelliJ, navigate to src/main/kotlin/cadence/SampleFlow.kt and run the main class.
5. Start workflow
docker run --network host --rm ubercadence/cli:master --domain sample workflow run --tl quarantine1 --wt 'QuarantineWorkflow::quarantineHost' --et 6 -i '"host1"'
Cadence comes with a nice CLI. The above command is not very handy because it uses docker run. A standalone version is also available.
The flow should fail if you have no tunnel set up. To set up the tunnel:
ssh -f plladace1@43.194.55.86 -L 4140:localhost:4140 -N
Testing is really simple.
There is an in-memory workflow engine. You can easily provide mocked activities.
Activities are normal classes and can be tested easily.
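To illustrate the two points above, here is a minimal sketch in plain Java. It does not use the in-memory workflow engine or any Cadence class; it only shows that an activity implementation can be tested as a normal class and that workflow logic can be run against a hand-written mock. All names (`HostActivities`, `HostActivitiesImpl`, `RecordingActivities`, `runWorkflow`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ActivityTestSketch {
    public interface HostActivities {
        String describe(String host);
    }

    /** A "real" activity implementation is just a normal class. */
    public static final class HostActivitiesImpl implements HostActivities {
        @Override
        public String describe(String host) {
            return "host:" + host.toLowerCase();
        }
    }

    /** Workflow logic under test; it depends only on the interface. */
    public static String runWorkflow(HostActivities activities, String host) {
        return "quarantined " + activities.describe(host);
    }

    /** Hand-written mock that records every call it receives. */
    public static final class RecordingActivities implements HostActivities {
        public final List<String> calls = new ArrayList<>();

        @Override
        public String describe(String host) {
            calls.add(host);
            return "mocked";
        }
    }

    static void check(boolean ok, String message) {
        if (!ok) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        // 1. Test the activity directly, like any other class.
        check(new HostActivitiesImpl().describe("HOST1").equals("host:host1"), "activity output");

        // 2. Test the workflow logic against the mock.
        RecordingActivities mock = new RecordingActivities();
        check(runWorkflow(mock, "host1").equals("quarantined mocked"), "workflow output");
        check(mock.calls.equals(Collections.singletonList("host1")), "recorded calls");
        System.out.println("all checks passed");
    }
}
```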
The performance test will be performed locally using crowcompose, which removes the need to involve other teams. If the results are not satisfactory, the tests will be repeated on an environment with a more suitable setup.
The test measures the time needed to start the processes. It does not measure the time until they complete.
Glossary:
ww - workflow worker
aw - activity worker
xargs -P 50 -I{} ./cadence --domain sample workflow start --tl perftest --wt 'PerformanceTestWorkflow::quarantineHost' --et 10000 -i '"host1"'
No of proc | No of ww | No of aw | time |
---|---|---|---|
1000 | 1 | 1 | 15.7s |
1000 | 2 | 1 | 11.4s |
1000 | 2 | 2 | 10.9s |
10000 | 2 | 2 | 116s |
xargs -P 50 -I{} ./cadence --domain sample workflow start --tl perftest --wt 'PerformanceTestWorkflow::quarantineHostAsync' --et 10000 -i '"host1"'
No of proc | No of ww | No of aw | time |
---|---|---|---|
1000 | 1 | 1 | 11.6s |
1000 | 2 | 1 | 9.6s |
1000 | 2 | 2 | 10.9s |
Note: Running the cadence binary 1000 times, only to display its version, takes about 3.6s:
time seq 1 1000 | xargs -P 50 -I{} ./cadence -v
-> 3,573 total
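To make the measurements easier to compare, the timings can be converted to workflow starts per second, and the CLI start-up time from the note can be expressed as a share of a run. The sketch below is plain Java with hypothetical method names. It uses the numbers from the first table and the 3,573 figure above read as 3.573 seconds. Treating that figure as overhead contained in every run is an assumption, since the `-v` run only measures process start-up under the same `-P 50` parallelism.

```java
public class StartThroughput {
    /** Workflow starts per second for a run of `starts` starts taking `seconds`. */
    public static double startsPerSecond(int starts, double seconds) {
        return starts / seconds;
    }

    /** Share of a run's wall-clock time taken by the measured CLI start-up time. */
    public static double overheadShare(double overheadSeconds, double totalSeconds) {
        return overheadSeconds / totalSeconds;
    }

    public static void main(String[] args) {
        int[] starts = {1000, 1000, 1000, 10000};
        double[] seconds = {15.7, 11.4, 10.9, 116.0};
        for (int i = 0; i < starts.length; i++) {
            System.out.printf("%d starts in %.1fs -> %.1f starts/s%n",
                    starts[i], seconds[i], startsPerSecond(starts[i], seconds[i]));
        }
        // Rough estimate only: CLI start-up time relative to the fastest 1000-start run.
        System.out.printf("CLI start-up share of the 10.9s run: %.0f%%%n",
                100 * overheadShare(3.573, 10.9));
    }
}
```

Read this way, a noticeable part of each 1000-start run may be spent just launching the CLI binary, so the figures in the tables are likely a pessimistic view of the server's start throughput.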
CPU usage on Cassandra is 100%. Running one additional Cassandra node does not change this behaviour. Running 3 Cassandra nodes improved the performance.
Cadence server CPU usage is also high.
camunda.prd.crow.marathon.mesos
https://eng.uber.com/open-source-orchestration-tool-cadence-overview/
https://github.com/banzaicloud/banzai-charts/tree/master/cadence/
Polling that does not overwhelm the workflow history:
https://stackoverflow.com/questions/57562772/polling-for-external-state-transitions-in-cadence-workflows
Let’s start with the cluster setup first. If you look at the Cadence service config, there are a few knobs you need to pay attention to before you can start a production Cadence server.
Your workflows and activities are hosted outside of Cadence Server, within your own workers. You can scale them according to the needs of your own use case. The workers running your application continuously poll Cadence Server for tasks. When an event such as startWorkflowExecution, signalWorkflowExecution or activityCompletion happens, Cadence Server dispatches a task to your worker to execute your application logic.
All your workflow state is managed by Cadence Server, and it routes the signal or any other event to the worker hosting that particular workflow execution.
Please watch Maxim’s presentation to understand the model: https://www.youtube.com/watch?v=llmsBGKOuWI
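The polling model described above can be pictured with a toy, single-threaded sketch in plain Java. It is not the Cadence API: `Server`, `Worker`, `dispatch` and `pollOnce` are hypothetical names, and a real worker polls continuously over the network rather than once on demand. The sketch only shows that the server queues tasks per task list and that application logic runs inside the worker that polls that list.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Function;

public class TaskListSketch {
    /** Stand-in for the server: one FIFO queue of tasks per task list. */
    public static final class Server {
        private final Map<String, Queue<String>> taskLists = new HashMap<>();

        /** Called when an event (e.g. a workflow start) produces a task. */
        public void dispatch(String taskList, String task) {
            taskLists.computeIfAbsent(taskList, k -> new ArrayDeque<>()).add(task);
        }

        /** Returns the next task on the list, or null if there is none. */
        public String poll(String taskList) {
            Queue<String> queue = taskLists.get(taskList);
            return queue == null ? null : queue.poll();
        }
    }

    /** Stand-in for a worker: polls one task list and runs application logic. */
    public static final class Worker {
        private final Server server;
        private final String taskList;
        private final Function<String, String> logic;

        public Worker(Server server, String taskList, Function<String, String> logic) {
            this.server = server;
            this.taskList = taskList;
            this.logic = logic;
        }

        /** Polls once; returns the logic's result, or null if no task was waiting. */
        public String pollOnce() {
            String task = server.poll(taskList);
            return task == null ? null : logic.apply(task);
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        Worker worker = new Worker(server, "perftest", task -> "handled " + task);

        server.dispatch("perftest", "startWorkflowExecution(host1)");
        System.out.println(worker.pollOnce()); // the queued task is handled by the worker
        System.out.println(worker.pollOnce()); // nothing left on the task list
    }
}
```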
Just a couple of clarifications. Stickiness is indeed about caching a workflow instance on a worker. When an instance is cached, it receives only new events in a decision task instead of replaying the whole history on every task. As the word caching implies, the workflow can be evicted from the cache at any time, or lost due to a worker failure, and then be cached on another worker by replaying the whole history. So stickiness is purely a performance optimization and does not guarantee that the workflow is executed on a single worker.
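The difference between a cached instance and a full replay can be shown with a toy model in plain Java. It is a sketch under simplifying assumptions, not Cadence code: the history is a list of integers, the workflow state is their running sum, and `WorkflowInstance` and `catchUp` are hypothetical names. A cached instance applies only the events it has not seen, while a fresh instance on another worker replays everything and still arrives at the same state.

```java
import java.util.ArrayList;
import java.util.List;

public class StickyReplaySketch {
    /** Workflow state is derived purely from history events (here: a running sum). */
    public static final class WorkflowInstance {
        private int state = 0;
        private int applied = 0; // number of history events already applied

        /** Applies only the events not seen yet and returns how many were applied. */
        public int catchUp(List<Integer> history) {
            int newEvents = 0;
            for (int i = applied; i < history.size(); i++) {
                state += history.get(i);
                newEvents++;
            }
            applied = history.size();
            return newEvents;
        }

        public int state() {
            return state;
        }
    }

    public static void main(String[] args) {
        List<Integer> history = new ArrayList<>();
        history.add(1);
        history.add(2);

        WorkflowInstance cached = new WorkflowInstance();
        cached.catchUp(history);                 // first task: the whole history is applied

        history.add(3);
        int sticky = cached.catchUp(history);    // cached instance: only the new event

        // Cache eviction or worker failure: another worker starts from scratch.
        WorkflowInstance replayedInstance = new WorkflowInstance();
        int replayed = replayedInstance.catchUp(history);

        System.out.println("events applied with cache: " + sticky);
        System.out.println("events applied on full replay: " + replayed);
        System.out.println("same state: " + (cached.state() == replayedInstance.state()));
    }
}
```

Because both paths end in the same state, losing the cache costs replay time but not correctness, which is why stickiness can be treated as an optimization only.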