Enterprise Application Architecture

# Enterprise Application Architecture

*adapted from the book, Patterns of Enterprise Application Architecture*

## What are Enterprise Applications?

### Examples of Enterprise Applications

- payroll
- patient records
- shipping tracking
- cost analysis
- credit scoring
- insurance
- supply chain
- accounting
- customer service
- forex trading

### Examples of **NOT** Enterprise Applications

- word processors
- elevator controllers
- chemical plant controllers
- telephone switches
- OS
- compilers
- games

### Characteristics of Enterprise Applications

#### Persistent data

Enterprise applications usually involve large amounts of persistent data. The data is persistent in the sense that it needs to exist between several runs of the application. Persistent data in enterprise systems often outlives the versions of the programs that use it, and it persists even after drastic system changes in the company.

#### Huge data

Enterprise applications usually handle a lot of data: organized records that amount to multiple gigabytes of storage or more. Because of this, managing data at this scale becomes an integral part of the system.

#### Concurrent access

Enterprise systems involve many users who use the applications and access the data at the same time. Issues arising from concurrent access must be addressed when engineering an enterprise application.

#### Many distinct UI

Users of enterprise systems vary from one another and have different access rights and privileges. Enterprise systems will usually have hundreds of distinct interface screens, because data must be presented in different ways for different types of users.

#### Integration

Enterprise applications are usually integrated with other enterprise applications. This can require integrating with applications that were built at a different time, and with differing business processes or data formats.

#### Business process constraints

Enterprise applications must also conform to various business rules. The application has to follow these rules because an enterprise engineer has no power to change them. The rules may be strange and illogical, which can cause problems for the application itself.

## Patterns

Defining patterns is quite difficult, but perhaps the best reference for a definition is Christopher Alexander himself: "Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice".

Christopher Alexander was an architect and design thinker who wrote multiple books about the nature of patterns and pattern languages. His works influenced different disciplines, including computer science. Alexander's book, *A Pattern Language*, influenced design patterns in computer science. Another book by Alexander, *The Nature of Order*, has been cited as a major influence on object-oriented programming.

### Patterns are discovered not invented

One key thing to remember about patterns is that they are born from practice. One finds patterns by studying what people do, observing things that work, and looking for the **core of the solution**. Patterns are not formulated or invented; patterns are discovered.

### Patterns are applied to similar but unique problems

Another thing to keep in mind is that a pattern is not the solution itself. Patterns are half baked, which means that in order to apply one to a unique problem, one must figure out how to adapt it to the nuances of the project. A pattern will always be tweaked in some way to fit the circumstances of a problem.

### Solutions are best communicated using patterns

One does not need to memorize all of these patterns to be able to solve problems.
The purpose of these patterns is not to provide new solutions to existing problems (again, patterns emerge from existing solutions), but to give engineers a way to classify and describe problems and solutions. The real use of this reference is to give engineers more efficient ways to communicate solutions. By being familiar with patterns, engineers can simply name the patterns present in a solution, and other engineers will have a clear idea of its architecture.

## Layering

Layering is a technique used by software developers to break apart a huge and complex software system. Layers are common in the study of computer systems. A programming language, for example, sits on top of OS calls, which sit on top of device drivers, then CPU instruction sets, and finally logic gates on chips. Networking has FTP layered above TCP, which is on top of IP, which is on top of Ethernet. Layering is the reason why we call software stacks like MEAN, LAMP, and XAMPP "stacks".

Layers in the study of information systems behave similarly. Higher-layer code in a complex program calls functions defined in lower-layer code. Each layer in the stack hides the layers below it from the layers above it.

### Why a system is broken down to layers

There are a lot of advantages in breaking down a system into layers:

- One does not need to know about all the underlying layers to understand a single layer.
- It is easy to substitute layers with different implementations.
- Building with separate layers minimizes dependencies.
- It is easy to standardize services when a system is built upon layers.
- A layer can be used to build more layers above it, offering higher-level services.

Although layering is widely used, it does have its disadvantages:

- Code that is not well encapsulated will lead to cascading changes across layers.
- Too many layers can hurt a system's performance.

### How layers evolved

#### The early days

During the early days of information systems, layers were not a thing.
Systems were simpler back then. Programmers simply wrote programs that manipulated some form of files (ISAM, VSAM). People didn't need to think about layering at all.

Layers started to become popular during the rise of client-server systems. These were simple two-layer distributed systems: the client held the user interface and some application code, while the server was usually a relational database. Client-server systems were usually created using tools like VB, PowerBuilder, and Delphi. These systems worked very well for applications that were mostly about displaying data and performing simple CRUD operations on the database.

Client-server systems started having problems with applications that had a lot of domain logic: business rules, validations, calculations, and so on. People usually placed this code on the client, cluttering UI code with application code. This became a huge maintainability issue, because bugs were very difficult to hunt down in scattered code.

The alternative was to place the domain logic in the database as stored procedures. Developers found this approach awkward. It was also unpopular because developers wanted to be able to switch database vendors, which was common since SQL is standardized. Stored procedure languages, however, are vendor-specific, so putting domain logic in stored procedures made switching vendors difficult.

While client-server became the norm in information systems, the object-oriented paradigm started gaining popularity. Object orientation provided a new way of layering information systems: simply add another layer between the client and server layers. This new layer would contain the domain logic that had cluttered up the UI code in the client layer.

Although this new layering approach would solve many problems of the client-server approach, it didn't gain a lot of traction at the start. This was partly because most systems at that time were simple enough for client-server.
Developers also felt overwhelmed trying to use the existing client-server tools in a new three-layer configuration.

Three-layer systems started gaining real popularity during the rise of the Web and Java. Client-server designs became impractical for Web systems, since domain logic had to be reworked anyway to run on web servers. The tools built for Web systems were also less coupled to SQL, and thus gave rise to the third layer of information systems.

### The three principal layers

| Layer        | Responsibility                                              |
| ------------ | ----------------------------------------------------------- |
| Presentation | Display of information and provision of services to the user |
| Domain       | Logic that is the real point of the system                  |
| Data source  | Communication with databases and other storage              |

### Where Each Layer Should Reside

When designing with layers, programmers should also consider where each layer will run: whether to place a layer in the **client** or in the **server**.

#### Client

Whether to put a layer in the client or the server isn't a question of which type of computer the system resides in. Client in this case can refer to a browser, a mainframe, another UI view on the same computer, or a computer miles away from the server. Client refers to the computer or interface that a regular user interacts with.

Putting all the layers in the client is a rare choice nowadays, especially for distributed information systems where millions of users share the same data. Putting everything in the client becomes a real option if the system must be able to run in disconnected operation. Putting everything on the client side also favors responsiveness, since the whole system can operate without slow server calls.

#### Server

A system server can refer to a dedicated, powerful machine or to an XAMPP service running on the same computer as the client. Server refers to the single interface shared by the clients.
Choosing to run all the layers on the server helps with the system's maintainability. It is easy to change and fix a system that resides on only one computer. One doesn't have to worry about the logistical nightmare of deploying to multiple computers with different operating systems.

#### Separating the Layers

Knowing these trade-offs, we can look into separating the layers into different locations.

The data source layer is almost always placed in the server. It is important that the system keeps only one copy of the data source, to ensure that different users accessing it concurrently see the same thing. The few exceptions are disconnected systems that don't deal with a lot of data source updates.

The presentation layer usually resides in the client (nowadays). This is because rich clients like browsers can easily run presentation layer code. Placing the presentation layer in the client also helps with responsiveness.

The right place to put the domain logic depends heavily on context. You can put all the domain logic in the server, put all of it in the client, or split the code between them. The most important trade-off to remember is that putting layers on the client hurts security and maintainability, while putting layers on the server hurts responsiveness.

## Unified Modeling Language

UML is a general-purpose modeling language used in information systems engineering. It is intended to provide a standardized way of visualizing the architecture of a system.

### Class Diagrams

#### Relationships

| notation |    meaning     |
| :------: | :------------: |
|  `..>`   |   dependency   |
|  `-->`   |  association   |
|  `o--`   |  aggregation   |
|  `*--`   |  composition   |
| `--\|>`  |  inheritance   |
| `..\|>`  | implementation |

##### Instance Level Relationships

- **Dependency** - This is the least formal relationship that can be described in UML. When one says that `A ..> B`, this means that class A is dependent on class B.
Dependency has a broad meaning, and people tend to overuse this relationship since any type of relationship can be viewed as some form of dependency. In coding terms, this generally means that class A uses B but does not contain an instance of class B as part of its own state.
- **Association** - This relationship is also a dependency, but a much stronger one. Saying `C --> D` means that C is associated with class D. This means that class C contains and uses an instance of class D, but D is unaware of C and does not contain an instance of it.
- **Aggregation** - Saying `E o-- F` means that E aggregates an F. This relationship is also called weak association. An instance of F can exist both inside and outside the lifetime of E. F's existence makes sense without being associated with E. *In other words, E merely uses an instance of F*.
- **Composition** - Saying `G *-- H`, on the other hand, means that G is composed of H. This relationship is also called strong association. H's lifetime exists only within G's lifetime. Without G, H's existence is meaningless. *G owns H*.

##### Class level relationships

- **Inheritance/Generalization** - If `A --|> B`, then A inherits from B; that is, A is a subclass of B.
- **Implementation/Realization** - Saying `A ..|> B` means that A implements/realizes the interface defined by B.

## OOP Concepts

- **Abstraction** is the concept of hiding the details of a model, a method, or a system. This is one of the rationales for designing systems in terms of layers. For example, one does not need to perform repeated multiplication, $b \cdot b \cdot b \cdots$, to calculate exponents. One only needs to call the method `power(b,e)` to calculate $b^e$. We say that the internal operation of power, which is repeated multiplication, is abstracted by the method `power(b,e)`. We can also say that `power(b,e)` is high-level code while `b * b * b * ...` is lower-level code.
- **Encapsulation** is the technique we use to implement abstraction in OOP.
Users are not aware of how exactly `power(b,e)` is implemented because its lower-level code is encapsulated in `power(b,e)`. This also applies to objects, where a user in a higher layer is unaware of the exact contents, state, and definition of an object.
- **Polymorphism** is the concept where an object is able to behave differently in different contexts or situations. Polymorphism can be runtime polymorphism or compile-time polymorphism. Compile-time polymorphism is supported by method overloading, where methods of the same name can have different parameters and definitions. Runtime polymorphism is supported by implementation/realization. Two different classes, A and B, which implement the same interface C, can each define the same method `d()`. Even if we do not know the exact class of an object `o` at runtime, we can still call the method `o.d()`.

## Organizing Domain Logic

### Transaction Script Pattern

The transaction script pattern is one of the simplest patterns one can implement. This is the pattern that we usually end up with when coding programs in a pre-OOP style.

#### Advantages

- Structured around procedures; simple and easy to understand
- Works well when paired with a simple data source layer
- Transaction boundaries are easily set

#### Disadvantages

- Tends to have duplicated code when transactions are similar
- Doesn't scale well with complicated domain logic

Most business applications can be thought of as a set of transactions. Viewing some information organized in a particular way is a transaction. Making changes to the data source is a transaction. Transactions can be as simple as CRUD actions on the data source or as complicated as multi-step transactions involving multiple calculations.

A system designed using transaction scripts organizes its business logic by procedures, where each procedure handles a single request from the presentation layer. Each transaction will have its own Transaction Script.
A single transaction script can have multiple subtasks, but the boundaries of a single business transaction are contained exactly within a single Transaction Script.

#### Designing Transaction Scripts

There isn't much to say about designing transaction scripts. All you have to remember is that each business transaction should be represented by one Transaction Script. There must be some sort of separation between Transaction Scripts, like this:

```java
import java.util.HashMap;
import java.util.Map;

// Each method below is one Transaction Script: it handles exactly one business transaction.
class PayrollServices {
    // In-memory stand-in for the data source layer (a real system would query a database).
    private final Map<Integer, Double> salaries = new HashMap<>();

    // Transaction: pay an employee's salary.
    void paySalary(int employeeID) {
        double payAmount = salaries.getOrDefault(employeeID, 0.0); // pull from the data source
        // Stands in for the calculation and payment steps of the transaction.
        System.out.println("Paying " + payAmount + " to employee " + employeeID);
    }

    // Transaction: change an employee's salary.
    void changeSalary(int employeeID, double newSalary) {
        salaries.put(employeeID, newSalary); // update the data source
    }

    // Transaction: show all employees.
    void showAllEmployees() {
        salaries.forEach((id, salary) -> System.out.println(id + ": " + salary));
    }
    // ... more transaction scripts
}
```

where each transaction is inside its own function. These functions could be contained in a single class or defined globally.

Each Transaction Script can also be placed in its own class, where each class can only be controlled by the presentation layer through a function like `transactionScriptObject.run()`. Invoking these `run()` functions executes the business transaction.

![classService](..\CMSC179\UML\TransactionScriptClasses.png)

Programmers sometimes choose to write transaction scripts as separate classes so that the system can enforce the boundaries of a business transaction. When the presentation layer requests the transaction, a new instance of a Transaction Script is created (the fields of the class are initialized with values from the data source layer; these are the data required to complete the transaction). The method `(new TransactionScript(...)).run()` is called, performing the business transaction procedure. The object is destroyed immediately afterwards, along with all possible access to its fields.

The main objective of designing Transaction Scripts, as observed in the two implementations above, is to enforce the separation between transactions.
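
The class-per-transaction form described above can be sketched roughly as follows. This is only an illustration under assumptions: `SalaryStore`, `ChangeSalary`, `PaySalary`, and the `TransactionScript` interface are hypothetical names that do not come from the original notes, and an in-memory map stands in for the data source layer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the data source layer.
class SalaryStore {
    private final Map<Integer, Double> salaries = new HashMap<>();

    double salaryOf(int employeeID) {
        return salaries.getOrDefault(employeeID, 0.0);
    }

    void setSalary(int employeeID, double salary) {
        salaries.put(employeeID, salary);
    }
}

// Common shape of every script: construct it with its inputs, then run it once.
interface TransactionScript {
    void run();
}

// One class per business transaction: changing a salary.
class ChangeSalary implements TransactionScript {
    private final SalaryStore store;
    private final int employeeID;
    private final double newSalary;

    ChangeSalary(SalaryStore store, int employeeID, double newSalary) {
        this.store = store;
        this.employeeID = employeeID;
        this.newSalary = newSalary;
    }

    @Override
    public void run() {
        store.setSalary(employeeID, newSalary);
    }
}

// One class per business transaction: paying a salary.
class PaySalary implements TransactionScript {
    private final SalaryStore store;
    private final int employeeID;
    private double amountPaid; // result of the transaction, readable after run()

    PaySalary(SalaryStore store, int employeeID) {
        this.store = store;
        this.employeeID = employeeID;
    }

    @Override
    public void run() {
        amountPaid = store.salaryOf(employeeID);
    }

    double amountPaid() {
        return amountPaid;
    }
}

class TransactionScriptDemo {
    public static void main(String[] args) {
        SalaryStore store = new SalaryStore();
        new ChangeSalary(store, 7, 1500.0).run(); // script object is discarded after use
        PaySalary pay = new PaySalary(store, 7);
        pay.run();
        System.out.println(pay.amountPaid()); // prints 1500.0
    }
}
```

In this sketch the presentation layer would create one script object per request, call `run()`, and discard it, so the fields of each script are reachable only for the duration of one business transaction.
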
Separating transactions from each other ensures that only the relevant data is accessed during a business transaction (e.g. grades are never accessed when viewing your schedule). This makes the system more secure and private.

### Domain Model

When businesses scale up, business logic can get very complex. As features, rules, and exceptions grow, simpler design patterns like transaction scripts start to run into problems. Maintainability becomes an issue, since code will likely get duplicated. It also becomes difficult to change code in the domain layer, since code is not well isolated from other code (this means that changing one thing will likely affect other parts).

This is where the Domain Model design pattern comes in. This pattern creates a web of objects with defined relationships to each other, where each object represents a meaningful component of an enterprise information system. Components can be as large as corporations or as small as a single line in a form. Creating a Domain Model architecture involves creating a whole layer of objects that represent the system you are modelling. The objects in a Domain Model architecture closely mimic the rules that the business uses.

![DomainModel](..\CMSC179\UML\DomainModelSalary.png)

There are usually two kinds of Domain Model architecture. A **Simple Domain Model** is modelled very much like the database: each object in the domain layer has a corresponding table in the database. A **Rich Domain Model**, on the other hand, looks different from the database. Rich Domain Models have a lot of inheritance, strategies, and other OOP design patterns. A rich Domain Model is better for more complex business logic, but it may be harder to integrate with the database.

#### Advantages

- Extensible and scalable
- Less dependency on other layers

#### Disadvantages

- More complicated than transaction scripts

#### Table Module Pattern

One pattern very similar to Domain Model is the Table Module Pattern.
The only difference between the two is that in Domain Model there is an instance of Employee for every employee in the company, while in Table Module there is only one instance of Employee that represents all employees in the company.

### Service Layer

Another common approach to implementing domain logic is splitting the domain layer into two. Instead of letting the domain layer interact directly with the presentation layer, we place a service layer in between them. A service layer can be thought of as an extra façade layer that adds operation boundaries to the complicated domain code. Service layers are usually implemented when the domain layer uses a complicated Domain Model. A service layer can be thought of as placing Transaction Scripts on top of the Domain Model code.

## Mapping to Relational Databases

The main responsibility of the data source layer is to communicate with the external infrastructure or services that the application needs to do its job. One of its main jobs is communicating with a database. Nowadays, "database" has become synonymous with "relational database".
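
As a rough sketch of what this responsibility might look like in code, the domain layer can depend on a small interface while the data source layer decides how rows are actually stored. Everything named here (`EmployeeGateway`, `InMemoryEmployeeGateway`, `RaiseService`) is a hypothetical illustration rather than something taken from the original notes, and a `HashMap` stands in for a relational table so that the example runs without a database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical contract the domain layer depends on; it hides how employees are stored.
interface EmployeeGateway {
    Optional<Double> findSalary(int employeeID);

    void saveSalary(int employeeID, double salary);
}

// Data source layer: in-memory stand-in for a relational table of (id, salary) rows.
class InMemoryEmployeeGateway implements EmployeeGateway {
    private final Map<Integer, Double> rows = new HashMap<>();

    @Override
    public Optional<Double> findSalary(int employeeID) {
        return Optional.ofNullable(rows.get(employeeID));
    }

    @Override
    public void saveSalary(int employeeID, double salary) {
        rows.put(employeeID, salary);
    }
}

// Domain layer code that only knows the gateway interface, not the storage behind it.
class RaiseService {
    private final EmployeeGateway gateway;

    RaiseService(EmployeeGateway gateway) {
        this.gateway = gateway;
    }

    // Raises a salary by the given percentage; returns false if the employee is unknown.
    boolean raise(int employeeID, double percent) {
        Optional<Double> current = gateway.findSalary(employeeID);
        if (!current.isPresent()) {
            return false;
        }
        gateway.saveSalary(employeeID, current.get() * (1 + percent / 100.0));
        return true;
    }
}

class DataSourceDemo {
    public static void main(String[] args) {
        EmployeeGateway gateway = new InMemoryEmployeeGateway();
        gateway.saveSalary(1, 1000.0);
        RaiseService service = new RaiseService(gateway);
        System.out.println(service.raise(1, 50.0)); // prints true
        System.out.println(service.raise(2, 50.0)); // prints false (no such employee)
    }
}
```

Because `RaiseService` sees only the interface, the in-memory implementation could in principle be swapped for one backed by a relational database without touching the domain code, which is the substitution benefit of layering described earlier.
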