Part 2: Experiment
Q1
Q1.1 A naive way of splitting the model is to split it into equal-sized parts. Is this a good policy? What are the split points of this policy?
Speedup: 3.5167
I don't think it's a good policy, because it leads to problems such as:
Imbalanced Workload Distribution:
Splitting the model into equal sizes does not account for the computational cost of different layers. Some layers might require significantly more computation than others, leading to an imbalanced workload across the nodes.
Increased Communication Overhead:
Equal-sized splits also do not take the data flow between layers into account. If layers that exchange a lot of data are placed on different nodes, the communication overhead will increase.
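The workload-imbalance argument can be illustrated with a small sketch. It assumes "equal size" means an equal number of layers per stage, and the per-layer costs below are made-up numbers for illustration, not measurements from the model used in this experiment. The function names (`equal_split_points`, `balanced_split_points`, `stage_costs`) are hypothetical helpers, and the balanced version is only a simple greedy heuristic, not an optimal partitioner.

```python
from itertools import accumulate


def equal_split_points(num_layers, num_stages):
    """Split points that give each stage (nearly) the same number of layers."""
    return [round(k * num_layers / num_stages) for k in range(1, num_stages)]


def stage_costs(layer_costs, split_points):
    """Sum the per-layer costs inside each stage defined by the split points."""
    bounds = [0, *split_points, len(layer_costs)]
    return [sum(layer_costs[a:b]) for a, b in zip(bounds, bounds[1:])]


def balanced_split_points(layer_costs, num_stages):
    """Greedy heuristic: place the k-th cut where the running cost is
    closest to k/num_stages of the total, keeping every stage non-empty."""
    total = sum(layer_costs)
    prefix = list(accumulate(layer_costs))  # prefix[i] = cost of layers 0..i
    n = len(layer_costs)
    points, prev = [], 0
    for k in range(1, num_stages):
        target = k * total / num_stages
        # Allowed cut positions leave at least one layer for each stage.
        lo, hi = prev + 1, n - (num_stages - k)
        cut = min(range(lo, hi + 1), key=lambda c: abs(prefix[c - 1] - target))
        points.append(cut)
        prev = cut
    return points


# Hypothetical per-layer compute costs (arbitrary units), for illustration only.
costs = [1, 1, 1, 1, 8, 8, 2, 2]

equal = equal_split_points(len(costs), 4)
balanced = balanced_split_points(costs, 4)

# The slowest stage bounds pipeline throughput, so compare the maxima.
print("equal   :", equal, stage_costs(costs, equal))
print("balanced:", balanced, stage_costs(costs, balanced))
```

With these made-up costs, the equal split puts both expensive layers in one stage, so that stage becomes the bottleneck, while the cost-aware split spreads them across stages. A fuller version would also weigh the amount of data crossing each split point, which this sketch ignores.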