# cloud computing

## 1. In virtualization, why do we want a VM to be isolated?

* to be secure
* to have its own resources
* to have its own dependencies

## 2. In HDFS, what is the function of the NameNode? And why do we want the NameNode to have intelligence?

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. It does not store the data of these files itself.

The NameNode is a master server that manages the file system namespace and regulates access to files by clients. It executes file system namespace operations such as opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes in turn perform block creation, deletion, and replication upon instruction from the NameNode.

So, if our NameNode is smart enough, it can do all of the above efficiently. A toy sketch of the metadata it manages follows below.
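To make the NameNode's two jobs concrete (namespace operations and block-to-DataNode mapping), here is a minimal Python sketch. It is an illustration only: the class and method names are invented, not Hadoop's API, and it assumes a single NameNode with a fixed replication factor.

```python
# Toy sketch of NameNode-style metadata (illustrative only, not real HDFS
# code). The NameNode keeps two maps: the namespace (path -> block IDs)
# and the block map (block ID -> DataNodes currently holding a replica).

class ToyNameNode:
    def __init__(self, replication=3):
        self.replication = replication
        self.namespace = {}   # "/dir/file" -> [block_id, ...]
        self.block_map = {}   # block_id -> {datanode, ...}
        self.next_block = 0

    def create_file(self, path):
        """Namespace operation: register a new, empty file."""
        self.namespace[path] = []

    def allocate_block(self, path, live_datanodes):
        """Pick target DataNodes for a new block. The file data itself
        never flows through the NameNode, only this metadata does."""
        block_id = self.next_block
        self.next_block += 1
        targets = live_datanodes[: self.replication]
        self.namespace[path].append(block_id)
        self.block_map[block_id] = set(targets)
        return block_id, targets

    def lookup(self, path):
        """Tell a client which DataNodes hold each block of a file."""
        return [(b, self.block_map[b]) for b in self.namespace[path]]


nn = ToyNameNode()
nn.create_file("/logs/a.txt")
nn.allocate_block("/logs/a.txt", ["dn1", "dn2", "dn3", "dn4"])
print(nn.lookup("/logs/a.txt"))  # e.g. [(0, {'dn1', 'dn2', 'dn3'})]
```

An "intelligent" NameNode works on exactly this metadata: it can place replicas rack-aware, steer clients to nearby replicas, and re-replicate blocks when a DataNode stops reporting in.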
## 3. Why is HyperCuts better than HiCuts?

ref: Energy Efficient Packet Classification Hardware Accelerator

* HiCuts:

  > In the HiCuts algorithm the packet classification problem is viewed geometrically, meaning that each rule in the classifier is viewed as a D-dimensional rectangle in D-dimensional space, where D is the number of fields in the classifier. The D-dimensional space is partitioned by a number of cuts and an input key created from a packet header becomes a point in the space. The packet classification problem now reduces to finding the D-dimensional rectangle that contains the point. A decision tree is built from the subregions created by the cuts and searching in the decision tree is done by using the header fields of the incoming packet as the input key and traversing the decision tree until a leaf is found.

  * approach: build the decision tree first, then match the few remaining rules at each leaf (e.g., in a small TCAM)
  * pros:
    * can be tuned to trade off between search speed and storage limit
  * cons:
    * too many cuts, however, will result in an unacceptable amount of memory needed to store the decision tree

* HyperCuts:

  > The HyperCuts algorithm is based on the techniques in the HiCuts algorithm. HyperCuts takes a geometrical view of the packet classification problem, makes a number of cuts and builds a decision tree. Unlike HiCuts, in which each node in the decision tree is cut at one dimension, each node in the HyperCuts decision tree is cut at several dimensions simultaneously.

  * here we choose the combination that results in the smallest maximum number of rules stored in a child node
  * pros:
    * cutting several dimensions at once makes the decision tree shallower, without extra per-node search cost (see the evaluation excerpt and the sketch below)
  * cons:
    * the pointer arrays needed for simultaneous cuts can increase storage (mitigated by pointer reuse and bitmap compression)

compare: *A new Cutting Algorithm for the Packet Classification Problem - UpperCuts*, Josefine Åhl, http://www.diva-portal.org/smash/get/diva2:1027763/FULLTEXT01.pdf (Chapter 6):

```
Chapter 6: Evaluation of HiCuts and HyperCuts

This chapter evaluates the HiCuts algorithm and the HyperCuts algorithm and compares the two algorithms with each other.

6.1 HiCuts Evaluation

The HiCuts algorithm is tested in [6] to see how it works on real and synthesized classifiers. The worst-case lookup time and amount of storage are measured. To measure the lookup time, the depth of the decision tree built is counted. The first test is for a classifier with two dimensions. The classifier is created by randomly taking prefixes in both dimensions from publicly available routing tables, and wildcards are added at random to each dimension. For a binth value of four, a classifier with 20,000 rules consumes about 1.3 MB of storage and has a decision tree depth of four in the worst case and 2.3 in the average case.

When more than two dimensions are tested, classifiers with four dimensions taken from real ISP and enterprise networks are used. With a binth value of eight and a spfac value of four, the maximum storage is about 1 MB. The worst-case decision tree depth is twelve, and this is followed by a linear search on eight rules.

The HiCuts algorithm is also tested in [9] to see how it works on core router databases (CR), edge router databases (ER) and firewall databases (FW). All classifiers have five dimensions. The worst-case lookup time and the amount of storage are measured, and the results can be seen in Table E.1 to Table E.6 in Appendix E.

6.2 HyperCuts Evaluation

The HyperCuts algorithm is tested in [9] to see how it works on, among others, core router databases (CR), edge router databases (ER) and firewall databases (FW). All classifiers have five dimensions. The worst-case lookup time and amount of storage are measured. To measure the lookup time, the number of memory accesses is counted. One memory access in [9] is one word, where one word is 32 bits.

The amount of storage depends on the number and size of the nodes. A node consists of a header plus an array of pointers to child nodes, one for each cut. The header size is four bytes, each pointer takes four bytes, and the number of entries in the array is equal to the number of child nodes. A bitmap in the header is used to distinguish between types of nodes. If the refinements discussed in Chapter 4.3 are considered, it can result in an increase of two to eight bytes per dimension [9]. The results of the tests can be seen in Table F.1 to Table F.6 in Appendix F.

6.3 HiCuts versus HyperCuts

The HiCuts algorithm and the HyperCuts algorithm can be compared with each other in order to see which one has the better lookup time and storage requirements.

Decision Tree Height

The main difference between the HiCuts algorithm and the HyperCuts algorithm is in the cutting process. HiCuts chooses one dimension to cut on in each node. HyperCuts can choose more than one dimension to cut on in each node, and HyperCuts makes the cuts simultaneously in the chosen dimensions, resulting in a smaller decision tree height compared to HiCuts. For example, the decision tree created by the HyperCuts algorithm in Figure 4.1 has a height that is one less than the decision tree created by the HiCuts algorithm in the same figure. By doing multiple cuts, the lookup time for the HyperCuts search algorithm can be better than the lookup time for the HiCuts search algorithm.

Lookup Time

Cutting simultaneously in a node as HyperCuts does requires a list of NC pointers to the subregions the cuts generate. This can result in slower lookup time at each node because the list of pointers must be searched. This problem is solved in HyperCuts by using array indexing. To see how array indexing works, consider four equally spaced cuts in one dimension: [0:3], [4:7], [8:11] and [12:15]. Each cut has a pointer and the pointers are stored in an array of size four. To find the right pointer for an input point, the input point is divided by the cut width, which in this case is four. Let for example the input point be the number ten. The result when ten is divided by four is two when rounded down to the nearest integer. This means that the third element of the array is indexed if array indices start at zero. It does not matter how many cuts are made; array indexing will always cost one memory access. This means that HyperCuts can reduce the height of the decision tree without increasing the search time at each node [9].

Storage Requirement

When pointer arrays are used, the storage required for the HyperCuts structure can increase. Pointers can be removed in the same way as in HiCuts, i.e. when two pointers point to identical subtrees, one of the subtrees can be removed and the corresponding pointer is made to point to the other subtree. Moving up common rules reduces the storage further in the HyperCuts structure. It is also suggested in [9] that empty array pointers can be eliminated in the HyperCuts structure by using bitmap compression as in the Lulea Algorithm [8].

Test Results

If the tables in Appendix E and Appendix F are compared to each other, it can be seen that the HyperCuts algorithm in general uses less memory than the HiCuts algorithm and that the cost for lookups is lower in the HyperCuts algorithm. For core router databases, the total amount of storage occupied by the search structure in HyperCuts is at one point more than 25 times less than in HiCuts, and at no point is it more than in HiCuts. The total number of memory accesses for a lookup in HyperCuts for core router databases is at one point more than three times less than in HiCuts, and at no point is it more than in HiCuts. For edge router databases, the total amount of storage occupied by the search structure in HyperCuts is about the same as for HiCuts; in three cases HiCuts takes slightly less storage than HyperCuts. The total number of memory accesses for a lookup in HyperCuts for edge router databases is about the same as for HiCuts; HyperCuts requires slightly fewer memory accesses than HiCuts. For firewall databases, the total amount of storage occupied by the search structure in HyperCuts is at one point more than eight times less than in HiCuts; at one point HiCuts takes slightly less storage than HyperCuts. The total number of memory accesses for a lookup in HyperCuts for firewall databases is at one point more than four times less than in HiCuts.

Conclusion

The HyperCuts algorithm generates decision trees with equal or smaller depth than the decision trees generated by the HiCuts algorithm, without increasing the amount of storage required. The HyperCuts algorithm performs better than the HiCuts algorithm on the tests for core router databases and firewall databases. For edge router databases, HyperCuts and HiCuts performances are about the same. The reason for this, explained in [9], is that edge router databases only specify the two fields for IP source and IP destination, and two dimensions is not enough for HyperCuts to perform better than HiCuts. The conclusion is that HyperCuts performs better in general than HiCuts in solving the packet classification problem.
```
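The array-indexing trick in the excerpt is the key to HyperCuts' constant per-node cost, so here is a small self-contained sketch of it. This is my own illustration with invented names, not code from the thesis or the papers: a decision-tree node that makes equally spaced cuts, possibly in several dimensions at once, and finds the right child with one index computation.

```python
# Illustrative sketch (not from the papers): a decision-tree node that cuts
# a region with equally spaced cuts and locates the right child by array
# indexing, so the lookup cost per node is O(1) no matter how many cuts.

class Node:
    def __init__(self, lo, hi, cuts):
        """lo/hi: per-dimension bounds; cuts: number of equal cuts per
        dimension (HiCuts: >1 in one dimension only; HyperCuts: >1 in
        several dimensions simultaneously)."""
        self.lo, self.hi, self.cuts = lo, hi, cuts
        self.widths = [(h - l) // c for l, h, c in zip(lo, hi, cuts)]
        n_children = 1
        for c in cuts:
            n_children *= c
        self.children = [None] * n_children  # pointer array, one per subregion

    def child_index(self, point):
        """Array indexing: divide by the cut width in each cut dimension.
        E.g. cuts [0:3],[4:7],[8:11],[12:15] and input point 10 give
        10 // 4 = 2, the third array slot, exactly as in the excerpt."""
        idx, stride = 0, 1
        for d in range(len(point)):
            slot = min((point[d] - self.lo[d]) // self.widths[d],
                       self.cuts[d] - 1)
            idx += slot * stride
            stride *= self.cuts[d]
        return idx


# A HyperCuts-style node: 4 cuts on dimension 0 AND 2 cuts on dimension 1
# at the same time (8 children); a HiCuts node would use cuts=[4, 1].
node = Node(lo=[0, 0], hi=[16, 16], cuts=[4, 2])
print(node.child_index([10, 3]))   # -> 2 (slot 2 in dim 0, slot 0 in dim 1)
print(node.child_index([10, 12]))  # -> 6 (slot 2 in dim 0, slot 1 in dim 1)
```

With `cuts=[4, 1]` the same node behaves like a HiCuts node (one dimension per node); allowing several dimensions per node is exactly what lets HyperCuts flatten the tree without increasing the per-node search time.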
## 4. Why is DIFANE scalable?

* it keeps packets in the data plane
* it uses authority switches to ease the burden on the control plane
* the authority switches serve as a "second brain": not every query has to go to the controller, because an authority switch may already know how to deal with that flow

### lecture08_09 Flow Table Management

![](https://i.imgur.com/DIfFDEL.png)
![](https://i.imgur.com/Hs3FP3t.png)
![](https://i.imgur.com/U3cA6kS.png)
![](https://i.imgur.com/QnXthH2.png)

## 5. Please describe the workflow of the 'Version' method in SilkRoad and explain why this method can guarantee PCC (per-connection consistency).

A load balancer needs to fulfill two requirements:

* uniform load distribution of the incoming connections across the servers
* per-connection consistency (PCC): the ability to map packets belonging to the same connection (TCP SYN, ACK, ...) to the same server, even in the presence of changes in the number of active servers and load balancers

Yet meeting both requirements at the same time has been an elusive goal: today's load balancers minimize PCC violations at the price of non-uniform load distribution. SilkRoad's version method resolves this; a sketch of the idea follows below.
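The notes above state the requirements but not the mechanism, so here is my simplified reading of SilkRoad's version method (the table names and Python structure are my own; the real system implements these tables in switch ASIC memory with compact digests): every change to a VIP's DIP pool creates a new pool version, each connection is stamped at its first packet (SYN) with the then-current version, and all later packets of that connection are resolved against the stamped version.

```python
# Simplified sketch of SilkRoad-style versioning (illustrative only).
import zlib

class VersionedLB:
    def __init__(self, dips):
        self.version = 0
        self.pools = {0: list(dips)}  # version -> DIP pool snapshot
        self.conn_table = {}          # connection -> version stamped at SYN

    def update_pool(self, dips):
        """DIP pool change: snapshot under a NEW version; old versions
        stay around while connections stamped with them are alive."""
        self.version += 1
        self.pools[self.version] = list(dips)

    def dispatch(self, conn, syn=False):
        if syn or conn not in self.conn_table:
            # First packet: stamp the connection with the current version.
            # (The real system also needs care for connections whose SYN
            # races with an ongoing pool update.)
            self.conn_table[conn] = self.version
        # Every packet uses the pool snapshot of its stamped version,
        # so a later update_pool() can never remap it -> PCC holds.
        pool = self.pools[self.conn_table[conn]]
        return pool[zlib.crc32(conn.encode()) % len(pool)]


lb = VersionedLB(["dip1", "dip2", "dip3"])
first = lb.dispatch("10.0.0.7:4242->vip:80", syn=True)
lb.update_pool(["dip1", "dip2", "dip3", "dip4"])  # pool grows mid-connection
assert lb.dispatch("10.0.0.7:4242->vip:80") == first  # same DIP as before
```

PCC holds because the pool a connection hashes into is frozen at SYN time, so later pool updates cannot remap live connections; an old version can be retired once no live connection still references it, which keeps the version field small.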
## 6. In which scenario will Beamer present better performance than SilkRoad?

## 7. Since Spotlight is excellent at 'Good Balancing', what may be its disadvantages when compared with the other three load-balancing strategies?

## 8. Please try to come up with a method that helps pFabric sort out the packet with the highest priority.

## 9. Compare Baraat and pFabric: both use a prioritization mechanism to deal with task scheduling, but it seems that Baraat does a better job than pFabric. Do you agree with this statement? Please present your answer with your explanation.

# L4 (transport layer) LB

## traditional solution

* PCC ok
* expensive
* vulnerable

## DUET

**scalability**, but **no PCC**

* the LB handles the VIP; the servers hold the DIPs; DUET solves **scalability** and **availability**
* how? a switch can already hash (ECMP) and encapsulate, so use the existing edge switches as LBs by installing the mappings in them
* SMUX (software MUX): a software LB, kept around to handle failures of the switch-based LBs

## SilkRoad

**scalability** and **PCC**

![](https://i.imgur.com/rhqPJcn.png)

## Spotlight

knows the load of each DIP when assigning work

* **better load balancing**
* **can keep PCC**
* **can scale up**
* ECMP can spread packets evenly, but it is load-oblivious: it does not know the capacity of each server

## Beamer

**scalability** and **PCC**

* no connection table: PCC without per-connection state
* the SYN bit lets the server (holding the DIP) know whether the packet starts a new connection for it or belongs to an old one (old ones are forwarded on to the previously assigned server)

![](https://i.imgur.com/xy2FY55.png)
![](https://i.imgur.com/glFzoZX.png)

## CONGA

network-layer LB

PS: For reference only. I hope everyone can contribute some more questions; it will help all of us review. 🙏
PPS: chao will definitely ask some "what are the advantages / what are the disadvantages" style questions, so I did not post those.
PPPS: I did not learn qJump and DCTCP well enough to come up with questions for them; I hope someone can add a few.