Practical Microservice Architecture and Implementation Considerations

Much has been written about microservice architecture, as seen in blogs here (http://microservices.io/patterns/microservices.html) and here (http://martinfowler.com/articles/microservices.html). It is a promising approach for integrating software at large scale so that it remains maintainable, high performing, and scalable. Although the approach has a concrete foundation, there are serious practical considerations around scalability and performance that one must take into account in order to ensure good SLAs. Anyone who has tuned high-end platforms understands that this approach is not straightforward.

To give context to the move toward microservices, consider the software engineering trends of the last 15 years. Today, microservices and functional programming are prominent in many software development discussions; both patterns were born out of empirical findings about the limitations of object-oriented programming (OOP) and service-oriented architecture (SOA).

Both OOP and SOA had challenges, and neither comprehensively addressed how to keep a codebase agile without a brittle effect on the scalability and performance of the runtime application kernel. (Developers like myself understand the meaning of “brittle effect.”) From a programming perspective, OOP rose largely on data abstraction, encapsulation, inheritance, and polymorphism. However, large-scale systems became difficult to develop, and after the famous debacle of the earlier EJB specification, a new concept emerged based on dependency injection / inversion of control (IoC), popularized largely by the Spring framework. With this trend, code looked more like services: well-encapsulated functional pieces with well-defined interfaces, sets of inputs and outputs, and their relationships and dependencies injected by the container. During this time we saw less emphasis on inheritance and more on the functionality a developer delivers to the caller, with scalability and maintainability in mind, along with the concept of encapsulating that functionality into a well-defined set of services.

During the same period, SOA was well popularized, but there was no clear and distinct best practice for implementing it concretely. Many implementations did succeed, but some were very difficult and failed. Some had services that were just too large and monolithic, while others had so many small services (almost microservices-like) that it became difficult to achieve good performance. The concept of SOA was there, but designers and implementers failed to understand the full lifecycle of a service, its granularity, and its scalability impact on other services, and therefore paid a huge price during implementation.

It remains difficult to define what a service is, as this largely depends on the developer and how well they understand the end-to-end lifecycle of the service they are offering. Part of the problem is that the developer lacks full understanding of the underlying infrastructure services – for example, when to scale up or scale out, what the cost of scaling out is, and how it impacts response time. On the other hand, infrastructure service providers do not have insight into application components and how they should be best mapped.

Consider the following practical aspects when building new microservices-capable platforms, or in some cases platform as a service (PaaS) deployments: fragmented horizontal scalability, licensing considerations, performance, and code deployment flexibility.

1. Fragmented Horizontal Scalability

I came across a customer that had implemented the concept of microservices by defining 25 REST services as their services layer. Front-end web applications would connect to these REST services through a back-end middleware/services layer to conduct transactions. After close examination of the call paradigm, we found that approximately 90% of transactions required traversal of all 25 REST services to complete an end-to-end business transaction. From the day the platform launched, they had decided on a one-to-one relationship between a REST service and the JVM onto which it was deployed. Thus, RESTService1 was deployed on JVM1, and so on through RESTService25 deployed on JVM25.

Much of the influence for the one-to-one approach came from various writings on how microservices should be implemented, which unfortunately disregard performance, the overall total cost of ownership of the underlying infrastructure, and the licensing of the software components/containers.

On that first day when they launched the platform, they needed 25 REST services, and therefore 25 JVMs. Each of the JVMs had a heap size of 1 GB, so the total heap space across all JVMs was 25 GB. After being in operation for a few months they discovered that they needed to scale out by an additional 5 GB. Because the functionality of their scale-out architecture required a minimum of 25 JVMs (25 REST services) they were forced to add another 25 JVMs, or an additional 25 GB. Over time this pattern led to an environment made up of 400 JVMs, each with 1 GB heap, across 16 physical hosts.

Figure 1 shows on the right the 400 JVMs that constitute this platform across 16 physical hosts. The left-hand side illustrates how the load balancer interacts with the 400 REST service instances. The green arrow shows the completion of an entire transaction within one host, while the blue arrows show the potential of bouncing to another host to complete a transaction. Clearly this many network jumps can hurt response time and overall performance. The paradigm almost always assumes that everything is remote and makes a remote service call, when in fact, 90% of the time, it does not need to be.

Figure 1. Load Balancer Interaction with the 25 REST Services

 

 

[Image: FragmentedMicroServices]

Therefore, one issue is the ping-pong effect on completing a transaction. One approach is to consolidate all of the REST services into one JVM heap space and scale out to accommodate the 400 GB of total heap. This naturally leads to larger JVMs, but it has the distinct advantage that most or all of the calls will be local.

 

Figure 2. Highly Tuned High Performance Service

[Image: NonFragmentedHighPerfService]

Figure 2 shows how we can deploy the 25 REST services all onto one JVM, and scale out based on how many JVMs we actually need to fulfill the 400 GB heap requirement. The basic setup is 2 VMs running on a host, with 1 JVM per VM. In this deployment paradigm, when a transaction reaches the JVM at RESTService1, the call continues all the way through to RESTService25, all within the same JVM and thus the same heap space. We tested this and found a 3x improvement in response time compared with the old single-REST-service-per-JVM approach of Figure 1 (also shown at the right of Figure 2).
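As a rough illustration of the consolidated deployment (a hedged sketch, not the customer's actual code), all of the REST resources can be registered under a single JAX-RS application so that they share one JVM and one heap; the class names simply mirror the RESTService1 through RESTService25 services described above:

    import java.util.HashSet;
    import java.util.Set;
    import javax.ws.rs.ApplicationPath;
    import javax.ws.rs.core.Application;

    // Hypothetical JAX-RS application that co-locates all 25 REST resources in one JVM/heap.
    @ApplicationPath("/api")
    public class ConsolidatedServicesApplication extends Application {
        @Override
        public Set<Class<?>> getClasses() {
            // Registering every resource in one deployment unit keeps the RESTService1 -> RESTService25
            // call chain within the same heap space instead of crossing the network.
            Set<Class<?>> classes = new HashSet<>();
            classes.add(RESTService1.class);
            classes.add(RESTService2.class);
            // ... RESTService3 through RESTService24 ...
            classes.add(RESTService25.class);
            return classes;
        }
    }

Scaling out then means adding more of these consolidated JVMs behind the load balancer, rather than multiplying one JVM per service.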

The VM sizes are, of course, NUMA optimized. I will write about NUMA optimization in an upcoming blog, but for now Table 1 shows a deployment calculation for 400 GB and for 800 GB, in case there is a future need to extrapolate for traffic growth.

In this table we show how we collapsed the original 400 JVMs down to 12 JVMs of 34.56 GB heap each (12 × 34.56 GB ≈ 415 GB, which covers the 400 GB heap requirement).

Table 1. NUMA Optimized VM and JVM Size for 400 GB and 800 GB Platforms

[Image: ServiceJVMSize]

 

In this approach we still had 12 JVMs, which is plenty of scale-out capability. The downside is that you have very large JVMs that can potentially lose more data than with the original paradigm. However, a close examination of the old fragmented system showed that the assumed safety of many small JVMs was false. In fact, many times a JVM would crash and take down a service, for example RESTService3. If a transaction had completed its transit through RESTService1 and RESTService2, it sat hanging, waiting for RESTService3. The code had no retry logic (retry logic is not necessarily the best approach, but it can work). What we found is that although the remaining 24 REST service JVMs were running, they could not complete transactions; they were indeed up, but useless. In the refined deployment, where all 25 REST services are consolidated into one heap space, the entire transaction set can be rolled back cleanly, failed over, or made redundant. By contrast, with the fragmented scale-out approach, you have to chase down which JVM and REST service is currently performing or holding up the transaction.
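As an aside, a minimal sketch of the kind of retry logic that was missing might look like the following; callRemoteService is a hypothetical helper standing in for a remote REST call, and none of this is the customer's actual code:

    // Minimal retry wrapper (illustrative sketch only).
    public static String callWithRetry(String serviceUrl, String payload, int maxAttempts) throws Exception {
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // callRemoteService(...) is a hypothetical helper representing a call to, say, RESTService3.
                return callRemoteService(serviceUrl, payload);
            } catch (Exception e) {
                lastFailure = e;
                Thread.sleep(100L * attempt); // simple backoff before the next attempt
            }
        }
        // All attempts failed; the caller decides how to roll back the transaction.
        throw lastFailure != null ? lastFailure : new IllegalArgumentException("maxAttempts must be >= 1");
    }

Even with retries, a crashed RESTService3 JVM can still stall transactions mid-flight; the consolidated deployment avoids having to chase that failure across JVMs in the first place.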

This comes with the need to understand how to tune and size large JVMs, an area where we have done much research with our customer base and have been publishing various approaches.

Figure 3 shows each section of the JVM and the VM with various sizes.

[Image: NonFragmentedHighPerfService-2]

 

Figure 4 shows a snippet of the JVM options used, mostly a combination of the CMS collector in the old generation and ParNew in the young generation. In an upcoming blog I will explain each of these JVM options in detail.

Figure 4. High Performance Service JVM with Various GC Options

[Image: JVMConfig]
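The exact options in Figure 4 are not reproduced here, but a representative CMS/ParNew configuration for a large-heap service JVM might look like the list below; the heap sizes are illustrative placeholders rather than the tuned values from Table 1:

    -Xms34g -Xmx34g              (fixed heap size; illustrative, not the tuned Table 1 value)
    -Xmn8g                       (young generation size; illustrative)
    -XX:+UseConcMarkSweepGC      (CMS collector in the old generation)
    -XX:+UseParNewGC             (ParNew collector in the young generation)
    -XX:CMSInitiatingOccupancyFraction=75
    -XX:+UseCMSInitiatingOccupancyOnly
    -XX:+DisableExplicitGC       (ignore System.gc() calls made by libraries)
    -XX:+AlwaysPreTouch          (touch heap pages up front for steadier latency)
    -XX:+UseLargePages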

 

2. Licensing Considerations

If you plan to proliferate microservices, note that each time you spin up a new JVM there might be additional licensing costs – for example, if you need a license for each application server instance. In our case we were able to consolidate from 16 physical hosts down to 6. This directly reduced application server license cost, operating system license cost, and power consumption, on the order of 60%.

Consider the practical limitations of licensing costs: the majority of application servers are licensed by CPU core, so having licensed a NUMA node or CPU socket, you want to use as much as possible of the memory attached to those cores. Otherwise, you will have paid to license the use of that memory without utilizing it. Larger JVMs get you to those returns most quickly, but you can also stack up multiple smaller JVMs instead. In that case, however, every new JVM instance comes with its own overhead and needs additional CPU cycles to service its GC cycles.

After a JVM has been launched and has paid its initial overhead, it continues to scale vertically very well without a proportional increase in CPU utilization. We specifically experimented with this using the GemFire in-memory database and noticed that a cluster of 8 very large JVMs would outperform a cluster of 30 smaller JVMs with the same total heap size. We also conducted other performance studies, back in 2010, showing the vertical scalability of JVMs running web applications.

3. Performance – How Much to Scale Up or Scale Out

More recently we tested this assumption yet again on a PaaS installation, comparing 2 JVMs of 2 GB each with 4 JVMs of 1 GB each. We found that the 2-JVM case outperformed the 4-JVM case, with 26% better response time and 60% less CPU utilization.

Figure 5 shows the response time chart. The blue lines represent Scenario-1, a scale-out of 4 JVMs using 1 GB heap each, while the red line represents Scenario-2, 2 JVMs using 2 GB heap each – a good mix of scale-up and scale-out. In Figure 5 we see that Scenario-2 has 26% better response time, and in Figure 6 we see that Scenario-2 has 60% less CPU utilization.

Figure 5. Response Time (y Axis) Compared with the Number of Test Iterations (x Axis)

[Image: RT-2JVMs-vs-4JVMs]

 

Figure 6. CPU Utilization of Scenario-1 as Compared with Scenario-2

[Image: CPU-2JVMS-vs-4JVMS]

 

While scale-out is an important attribute of Java applications, knowing when to scale up versus scale out is key to any successful PaaS platform. At one end of the spectrum, if every component of your system were wrapped in a service, and each service mapped to a single JVM, you would quickly find that a small system of 1000 components turns into a system of 1000 very small JVMs. What was formerly a call to another method or function within the same heap space is now remote, and you contend with the efficiency of your network. Your network will never be as fast as an in-memory call. Even a call to localhost (assuming the neighboring service is running on the same operating system) might instead be load balanced to a microservice instance a few hops away, hurting response time. This is why services, and the definition of services, must take into account the complete lifecycle of the service as driven by its transactional usage.
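To make the cost difference concrete, here is a hedged sketch contrasting the two call styles; OrderService and the URL are hypothetical names used only for illustration:

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class CallCostSketch {
        public static void main(String[] args) throws Exception {
            // In-process call: stays within the same heap, typically sub-microsecond.
            OrderService orders = new OrderService();          // hypothetical local service class
            String confirmation = orders.placeOrder("order-123");

            // The same operation as a remote REST call: every invocation now pays for the
            // network hop, serialization, and possibly a load balancer (typically milliseconds).
            URL url = new URL("http://order-service.example.local/orders/order-123");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            int status = conn.getResponseCode();               // blocks on the network round trip
            System.out.println(confirmation + " / HTTP " + status);
        }
    }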

Services that transactions traverse together, all or most of the time, must be placed as close to each other as possible; assuming that everything is remote in a soup of microservices hurts performance. Good PaaS platforms attempt to first provide a decent vertical heap size for your workload during incubation time (before you go live), and later allow you to adjust the configuration and decide when to scale up or scale out. In the background, the PaaS kernel attempts to vertically fill the NUMA socket of the physical server, either with more JVM containers if you choose to scale out, or with a smaller number of larger JVMs amounting to the same net heap size.

Deciding between fewer larger JVMs and many smaller ones depends entirely on your workload. Mixing microservices of the same tenant onto the same JVM is relatively good practice, as it allows for faster transactional response times. Mixing multitenant microservices, in contrast, is not recommended, because a multitenant JVM is not yet a mainstream approach. As multitenant JVMs become more common, we will start to see a shift in the market and in PaaS deployments toward larger JVMs, primarily because you can achieve greater performance at smaller infrastructure cost. The downside, of course, is that a larger JVM that crashes means you need to protect against the loss of a larger dataset or state.

PaaS platforms of the future should provide platform-based fault tolerance, where deploying a service onto a JVM comes with an option for fault tolerance or redundancy. Of course this comes at a cost, and only key state management services would be protected through this approach. PaaS platforms can certainly provide a mechanism that copies the JVM memory state to a redundant copy at runtime to safeguard against failures. Whether this is done at the JVM level, the application container level, the VM level, or all of the above is up to the PaaS platform engineer to decide. This is where cost, performance, and scalability drive a categorization of platform type dictated by the nature of the workload. In http://tinyurl.com/lrkhakf I discuss various categories of platforms: Category-1, web-application-based platforms, and Category-2, in-memory database platforms. What you design for one might not be appropriate for the other.

4. Code Deployment and Flexibility Myths

The notion of deploying and updating a microservice independently of everything else around it has some practical limitations. Consider a soup of many microservices, perhaps multiple microservice instances of the same type. As an example, consider 3 main types: say 10 instances of Microservice1, 5 instances of Microservice2, and 3 instances of Microservice3. When you update a service definition, what happens to Microservice1 and Microservice2 while you are updating Microservice3? How do you know which of the Microservice1 and Microservice2 instances must be quiesced, or told that the Microservice3 instance they are about to access is going down? One answer is that it does not matter, but in practice this leads to many transactional failures; the main issue is the fragmented complexity of not knowing the mesh graph of transactions.

In the case of the 25 REST services discussed earlier, the original model, where every JVM ran one type of REST service, had the same problem: a particular REST service type could not be updated independently of the other instances. Even if it could, you would end up with one RESTService3 instance running a newer version than the others. You could hide this by keeping the updated instance out of the rotation and updating the other instances in turn, but that requires a lot of coordination. With all of the REST services stacked onto one JVM, by contrast, you can cleanly take the entire JVM out of service; you could even (not that you necessarily would) run one JVM with an entirely newer REST service definition than the others, because transactions do not cross to other JVMs.

I have been lucky in my career: I spent the first 12 years as a pure software engineer writing many applications in C++ and Java. Since 2005 I have been delving more into platforms and infrastructure, which has given me the unique opportunity to fill the gap that exists between the code and the infrastructure, an area I like to call the application runtime, or application kernel.

I will write more about this, but in the meantime I think microservices are promising. However, they should not immediately imply micro-application kernels or JVMs. In some cases, a 1:1 ratio of microservices to JVMs might be excessive, but in other cases it is warranted.

There is quite a bit more to address on this topic, much of which is discussed in my book, http://tinyurl.com/lrkhakf.

 

Look for me at VMworld at this session: http://tinyurl.com/lnd5wpj


The Era of Application Platforms is Born

Figure 1 superimposes the notion of an Application Platform Architect (APA) onto the IT enterprise. An APA is someone who has done extensive software development, has worked on how to scale up and scale out the application runtime, and has a solid understanding of infrastructure platforms. The APA has worked critically across the scalability and performance concerns of the application and its associated application platform.

Figure 1 – Application Platform Architect: a New Persona for the CNA Era

[Image: ApplicationPlatformArchitect]

As an example of what an APA is, consider Company X, which is in the business of selling a trading platform as a service. In its latest iteration, the platform has multiple components: a messaging system, custom distributed Java-based services, and a critical GemFire data platform (in-memory database). The core of this platform is the in-memory database. GemFire is a Java-based distributed database, in this case running on VMware vSphere. Hence, you need someone with a decent understanding of these three areas, or at least two of the three. When the platform was first released, developers were always involved during its operation. This placed a huge strain on the development efforts; later, however, they managed to train and hire APAs, essentially re-skilling their existing DevOps team. Their existing DevOps team had great skills in automated deployment of application platforms, but lacked the skillset around scalability and reliability techniques. In a few cases, we trained their existing engineers to become good APAs. In other cases, they could hire SREs as the closest persona to the notion of what an APA does. A popular saying in that team was “…we put SRE back into DevOps.” This is to say that the founding principle of DevOps was reliability engineering, but over the years DevOps drifted into meaning anything and everything for everyone, more about CI/CD than about its core founding principle of reliability. Many DevOps engineers they interviewed did not have scalability and performance backgrounds like SREs – an artifact of the industry steering DevOps in the direction of certain vendors’ CI/CD tools.

Here is a list of what the APAs/SREs at Company X do on a day-to-day basis:

1. They own the performance and reliability of the trading platform.

2. They are paired with multi-discipline senior engineers from development and infrastructure. They form close relationships with them and learn what they need to learn daily. This is important because often they find themselves chasing multiple organizations and disciplines to resolve a specific challenge with the platform.

3. The trading platform is made up of 40 VMs and the various software components within them. They have many environments like this, so the SREs fully automated the creation and teardown of these trading platform environments. This was the minimum necessary to support their agile, high-iteration development shop, which generates multiple requests per day for creating such a trading platform.

4. They look at the most feasible way to scale the platform, whether it is a certain amount of healthy scale-out or a combination of scale-up and scale-out strategy. Feasible scalability that meets adequate performance is a daily pursuit for them, because they profit as a company by charging a fee for transactions on their volume-driven trading platform.

5. Their platform has stringent requirements for reliability and predictable runtime execution of transactions, regardless of volume. In fact, if they miss the agreed-upon transaction response time, they pay their customers a penalty (the stock exchanges of the world that use this trading platform are their customers).

6. If there is a performance challenge, they are responsible for investigating it from both the code and infrastructure platform perspectives. Often, by troubleshooting with thread dumps, SAR reports, heap dumps, and many other metrics, they can narrow down the cause of the issue and organize multi-faceted team meetings to address the concern, usually war-room style if needed. The SRE team owns most production problems, but if they need help they will organize wider teams to be involved.

7. They work with software engineers in architecting the runtime for new applications, helping them understand the capabilities of the platform, as well as what is feasible and what is not from a scale-up and scale-out perspective – for example, which components will need affinity or anti-affinity rules. (Affinity means some components need to run next to each other; anti-affinity means components that form redundant pairs must not run on the same VM/physical host.)

8. They create various platform and infrastructure abstraction layers and infrastructure/platform APIs to help developers write smart cloud-native-era applications – for example, applications that can understand and remediate themselves when a certain platform failure event occurs.

9. They create very advanced metrics that can show the transactional volume versus cost of each transaction, and continue to improve the application platform as a result. They also have further metrics on hours spent on administration, chasing and fixing reliability issues. (Trading platform business analysts use this rich platform data to help further analyze cost vs. profit of the business.)

Let’s highlight further what the key skills are in each of these categories: in the case of a development background, it is critical that the APAs have had a significant software engineering background at some point in their career, have written code for major applications/services, and have become super-versed in at least one of the modern computer languages such as Java, JavaScript, Node.js, Ruby, Scala, Python, or Go. It is also critical that they have not only worked as pure software engineers developing narrow product features, but have worked in an area where multiple components/platform layers are involved—typically ones found in every enterprise IT shop. Systems they’ve worked on should include load balancers, web servers, application servers, middleware, messaging systems, in-memory databases, traditional relational databases, operating systems, containers, and virtualization. Most critically, they have been able to deal with common production reliability and scalability issues as software engineers. The best software engineers are ones that write and learn how to best run and scale their code in production environments.

Application Platform Architect in a Nutshell

In Figure 2, we show the three main categories of skill sets needed for the new role of Application Platform Architect: a development background, a deployment/application-runtime background gained from having deployed and scaled applications in production, and a good grounding in the infrastructure stacks on which the application platform relies.

Figure 2 – Application Platform Architect Combines Three Skillsets

[Image: APAInaNutshell-9]

The Deployment/App runtime background is a critical piece of the puzzle, and the hardest skill set to find. In fact, most organizations don’t even know that this needs to exist as a skill. Over the past decade at VMware we have developed a specialized technical workshop for coaching teams on application platform thinking. We typically run this workshop with a mixed audience of SREs, VMware architects, platform/infrastructure architects, developers, and application owners, all at the same time. This is by design, because we want to identify the gaps in interpretation at each layer and address them holistically in real time. The workshop covers the material presented in this paper, but also has a deep-dive interactive session where we review an existing application platform and make design improvement recommendations. When the collective audience sees how a discussion that crosses application development, application runtime, and infrastructure leads to a better design, most customers walk away saying that they would like to build an Application Platforms team within their own organization.

When we said SDDC, what we really meant was the Software Defined Application Platform

When we talked about SDDC a few years back, we were really saying that a Software Defined Application Platform (SDAP) is a specialized multi-cloud runtime common layer that is application-workload aware. This is a layer that understands application workload behavior, the cost of its placement on various clouds, the cost of its scale on a private or public cloud, and the needs of its data affinity or proximity to other services. It’s an intelligent control plane that is application-workload aware, which we believe is the essence of true HCR.

Keep in mind that if one doesn’t deliver on these vectors, one is simply building generic infrastructure layers that are blind to what application workloads need. Every set of generic hardware that gets configured ends up being customized to suit a customer’s application workload, and while there can be many such workload instances, most can be categorized into about a dozen use cases. There is a huge opportunity in the phenomenon of common multi-cloud runtimes: applications are starting to look more like networks, and as such, network layers need to build specialized application-workload-aware functionality to deliver value.

We are seeing the drive toward building specialized multi-cloud application platforms, but these platforms cannot simply be split in an ad-hoc fashion across two separate clouds. As one customer said, “putting a load balancer in front of a private and a public cloud does not constitute a hybrid/multi-cloud system. I can do that all day myself. What I need is a common runtime between the two.” Hence, one must think of a common stretched layer between these clouds. This common layer could be referred to as the HCR, which would be the main engine of the Application Platform runtime, with service mesh capabilities, application workload behavior simulation utilities, VMware HCX-like workload mobility services, software-defined compute, network, and storage, and distributed trace capabilities using our management products such as Wavefront and others. One can think of these layers as SRE-as-a-Service (SREaaS), where there is an effort to fully automate anything that an SRE does to keep an application platform optimized, providing constant real-time analytics data and making intelligent placement and optimization decisions. SREs may have a manual job today, but their current job will be fully automated by these specialized layers – a layer like an HCR, an application platform runtime with a service mesh at the core of its foundation, as shown in Figure 1. In this figure, we also show some specialized controllers, such as the Predictable Response Time Controller (PRTC) for optimizing response time at specific scale across services, the Topology Mapper to determine the call graph between services, including their placement in compute space, and the Dedup Controller that removes compute space fragmentation.

Figure 1 – The HCR, a Core Runtime for an Application Platform

[Image: HCR]

 

In Figure 2, we show a functional diagram for an application platform containing a Cloud Native Infrastructure (CNI) layer, CI/CD, service mesh, security and policy management, and distributed trace capabilities that will help drive SREaaS initiatives. This application runtime also applies to non-cloud-native products for the enterprise, where we think a truly federated service mesh layer can marshal between cloud-native and non-cloud-native services.

Figure 2 – Application Platforms Functional Diagram

[Image: AppPlatformsDiagram]

HCR with Service mesh

Figure 2 shows a service mesh layer with key functions. (Note: in Figure 1 we also show a service mesh as a core function of HCR.) Some of these functions are well evangelized in the industry, and others are based on our empirical findings in terms of what we think a service mesh should remedy in a typical microservices cloud-native application platform. The following is a list of functions of a service mesh:

Service QoS Load Balancer: This is at the core of any service mesh implementation. With so many microservices implementations suffering from extended response times as the service mesh grows, an intelligent workload/service-aware load balancer is needed to help optimize routes and reduce the impact of latency. This functionality can be critical for achieving the intended behavior of “Predictable Response Times Across Service Routes”. At a minimum, the load balancer would provide traffic optimization for HTTP, gRPC, WebSocket, and TCP.

Service Sidecar Proxy: The reference to “sidecar” comes from the fact that it is a supporting service that attaches to a main service context. Just like a motorbike’s sidecar, it attaches to extend the motorbike’s capabilities. To create this notion of a highly intelligent workload/service-aware control layer such as a service mesh, one must be able to inject lots of service mesh introspection and optimization information, but in a non-hardcoded manner. Often developers are offered libraries that can enhance the performance of their service, but with the heavy requirement of hardcoding certain callouts within their code, often littering the primary business logic code with plumbing code. On the other hand, the sidecar proxy concept alleviates this burden by intercepting calls and adding the needed optimization information to the call, all at runtime without any specific hardcoded code callouts.

Service mesh is gaining traction rapidly in the market because there are many microservice implementations that are failing to meet their performance, scale, and reliability objectives. It is important to focus on clear use case areas that would allow microservices implementers to gain control of their SLAs. Below we focus on new use cases, skipping over the common use cases that have already been written about such as load balancing, security, scale controller, redundancy, and sidecar proxy.

Service Replicator: The service mesh can easily create various availability zones and, based on configuration injected by the user, can maintain anti-affinity of services by replicating them to other zones. In most cases the replication can be simple; in others, the user may provide service dependencies that require a sub-branch of the service mesh to be replicated. The main objective is that the replication behavior is abstracted from the main code and housed in the metadata configuration that the service mesh uses to apply the replication logic.

Service Circuit Breaker: Just as with electrical circuits, when a certain current level is reached a breaker is tripped and no more current can flow. In similar fashion, a microservice can be protected by a breaker pattern that encapsulates it with a service call breaker. The encapsulating breaker detects any potential deterioration of the SLA and breaks all calls to the base microservice by throwing an error, and an error handler in the system can have additional logic to act on the situation. Rudimentary implementations simply cut off ensuing calls to the service, but more advanced ones can shut down the microservice and provide cascading shutdown behavior, meaning that if one service is shut down, its neighboring or related services are shut down as well. Finagle (https://twitter.github.io/finagle/) and Hystrix (https://github.com/Netflix/Hystrix) were some of the early examples of this pattern.
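A minimal sketch of the breaker pattern, far simpler than Finagle or Hystrix and intended only to illustrate the idea, might look like this:

    import java.util.concurrent.Callable;

    // Minimal circuit breaker sketch: trips open after a threshold of consecutive failures
    // and rejects calls until a cool-down period has elapsed.
    public class ServiceCircuitBreaker {
        private final int failureThreshold;
        private final long openDurationMillis;
        private int consecutiveFailures = 0;
        private long openedAt = 0L;

        public ServiceCircuitBreaker(int failureThreshold, long openDurationMillis) {
            this.failureThreshold = failureThreshold;
            this.openDurationMillis = openDurationMillis;
        }

        public synchronized <T> T call(Callable<T> serviceCall) throws Exception {
            if (isOpen()) {
                // Breaker is open: fail fast instead of calling the deteriorating service.
                throw new IllegalStateException("Circuit open: calls to the service are currently rejected");
            }
            try {
                T result = serviceCall.call();
                consecutiveFailures = 0;                       // success closes the breaker again
                return result;
            } catch (Exception e) {
                consecutiveFailures++;
                if (consecutiveFailures >= failureThreshold) {
                    openedAt = System.currentTimeMillis();     // trip the breaker
                }
                throw e;
            }
        }

        private boolean isOpen() {
            return consecutiveFailures >= failureThreshold
                && (System.currentTimeMillis() - openedAt) < openDurationMillis;
        }
    }

An error handler wrapped around call(...) could then implement the cascading shutdown behavior described above.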

Service Discovery and Registration: In any large service deployment, as a service mesh grows, knowing which services are in the mesh is paramount; equally important is a well-known and trusted service registration mechanism to which certain policies can be applied.

Service Scaler: This is a service mesh sidecar process that listens to certain service execution metrics and determines which services need to be scaled. At first this appears to be a simple problem, but in fact it requires the ability to sample execution data over time, heuristically apply changes, and learn from them to improve the decision on the next sample. Many of the use cases at the service mesh layer depend critically on Service Scaler functionality, particularly Achieving Predictable Response Times Across Service Routes and Service-to-Service Co-location and Service vMotion.
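As a rough sketch of the sampling idea (the class name, window size, and SLA threshold are hypothetical), a scaler might keep a sliding window of a response-time metric and only recommend scale-out once the sustained average crosses the target:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical service-scaler sketch: samples a latency metric over a sliding window
    // and recommends scale-out only when the sustained average exceeds the SLA target.
    public class ServiceScaler {
        private final int windowSize;
        private final double slaMillis;
        private final Deque<Double> samples = new ArrayDeque<>();

        public ServiceScaler(int windowSize, double slaMillis) {
            this.windowSize = windowSize;
            this.slaMillis = slaMillis;
        }

        // Called by the mesh each time a response-time sample arrives.
        public boolean recordSampleAndCheck(double responseTimeMillis) {
            samples.addLast(responseTimeMillis);
            if (samples.size() > windowSize) {
                samples.removeFirst();
            }
            if (samples.size() < windowSize) {
                return false;                     // not enough history yet to decide
            }
            double sum = 0;
            for (double s : samples) {
                sum += s;
            }
            double average = sum / windowSize;
            return average > slaMillis;           // true = recommend scaling out this service
        }
    }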

Service Failure Compensator: There are a lot of manual steps that a typical SRE/APA performs to handle the myriad of day-to-day service failures. SREs can write many handlers that plug into the service mesh and are called back by the compensator when a certain event happens. The essence of the mesh is that it is partly developed by a vendor with a certain baseline, but then it grows quickly through the many plugins/compensation handlers that users plug in. This is where the huge value of the service mesh comes in: many manual steps are captured and automated away.
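A hedged sketch of such a plugin model (the interface and event names are invented for illustration) could be as simple as:

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative plugin model for failure compensation handlers.
    interface FailureCompensationHandler {
        boolean handles(String eventType);                 // e.g. "SERVICE_CRASH", "SLA_BREACH"
        void compensate(String serviceName, String eventType);
    }

    class FailureCompensator {
        private final List<FailureCompensationHandler> handlers = new ArrayList<>();

        // SREs/APAs register handlers that capture their manual runbook steps.
        public void register(FailureCompensationHandler handler) {
            handlers.add(handler);
        }

        // Called back by the mesh when a failure event is observed.
        public void onEvent(String serviceName, String eventType) {
            for (FailureCompensationHandler h : handlers) {
                if (h.handles(eventType)) {
                    h.compensate(serviceName, eventType);
                }
            }
        }
    }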

Service Rolling Update: This function provides the ability to update service instances without causing interruption to current midflight transactions.

 

VMworld 2016 and the Renaissance of the Application Platform Engineer

This year we will do a deep dive into:

How to Design and Tune Virtualized Trading Platforms – VIRT7507

Last year at VMworld 2015 and Spring-2015

Last year we highlighted the notion of Platform Engineering in the above sessions, as briefly described here.

It is important to understand the elements of this rapidly moving phenomenon through our industry, a phenomenon of building platforms – not just business logic software, but infrastructure as software. We humbly believe that the drive towards these platform solutions is due to the following fact: approximately half of new applications fail to meet their performance objectives, and almost all of these have roughly 2x more cloud capacity provisioned than what is actually needed. As developers we live with this fact every day, always chasing performance and feasible scalability, but never actually cementing it into a scientific equation where it is predictable; instead it has always been trial based, and heavily prone to error.

As a result we find ourselves delving into some of the interesting platforming patterns of this decade, and unfortunately we are led to believe that patterns such as microservices, 3rd platform, cloud native, and 12-factor are mainly a change in coding patterns. To the contrary – these patterns reveal a much deeper shift in the way developers view and consume infrastructure. In fact these patterns represent a major change in “deployment” approach: a change in how we deploy and structure code artifacts within application runtimes, and how those application runtimes can leverage the underlying cloud capacity. These patterns are not code design patterns, but rather platform engineering patterns, with a drive to using APIs/software to define application platform policies that manage scalability, availability, and performance in a predictable manner.

In the above session we will briefly inspect platform patterns that we built over the last decade, and the ones we are building for the next decade. The main objective of the session will be to lay out the definition of Platform Engineering as the software engineering science needed to understand how to precisely deploy application components onto application runtimes, and how, in turn, one should appropriately map the application runtimes onto the infrastructure they need. With this knowledge you should be able to more effectively decide when to scale out (and by how many instances), and when to scale up, both manually and dynamically as needed within your software-defined platform.

My VMworld Sessions for 2015

VAPP4449 – How VMware Customers Build and Tune High Performance Application Platforms

The session will cover various platform architectures used in high-end transactional systems at VMware customer sites, such as those found in trading platforms and high-volume order management systems. The deep dive will take a tour into the structure of the platform, how to best scale it with virtualization, and how to deeply tune it for optimal performance. The session will share many best practices and actual examples used with customers that are running such high-end platforms. Specifically, Java GC and vSphere tuning techniques, and the rationalization of large deployments with 1000s of VMs and JVMs, will be discussed. Come to this session to learn about GC tuning and vSphere recipes that can give you the best configuration for latency-sensitive applications and high-end performance platforms. We will showcase in-memory DBs with multi-terabyte clusters, and also look at a monster JVM with a 360 GB heap.

Emad Benjamin – Principal Architect, VMware
Alessandro Quargnali-Linsley – Systems Engineer, Societe Generale
Program Location: Europe and US

VAPP4732 – Enterprise Application Architecture Influence on SDDC

Understanding the influence of enterprise application architecture on good SDDC design is at the core of what makes a successful SDDC implementation. Often the divide between application developers and infrastructure architects is so great that the gap becomes very difficult to fill. On the development side, developers look at application layer concerns and tend to ignore infrastructure implementation paradigms; conversely, infrastructure architects start at the bottom and never quite reach the design requirements of the application. In this session we demonstrate how this gap can be addressed.

Emad Benjamin – Principal Architect, VMware

Marco Caronna, Solutions Architect, VMware

Eamon Ryan, Sr. Solutions Architect
Program Location: US

CTO6659 – Ask the Experts – Cloud Native Applications

In the brave new world of IT there is a shift in customer expectations, which is driving a change in how applications are developed and operated (e.g., DevOps), how they’re architected (e.g., microservices and 12-factor apps), and how they’re deployed (e.g., Docker and containers). This is a shift in the balance of focus from an Ops-led to a Dev-led world, where developers become first-class citizens of the datacenter. VMware calls these applications “cloud-native” as they’re designed for the mobile-cloud era. VMware launched Cloud-Native Apps at the beginning of the year to address the shift currently taking place in how applications are developed today.

Emad Benjamin – Principal Architect, VMware
Joe Baguley – CTO EMEA, VMware
Robbie Jerrom – Senior Solutions Architect, VMware
Martijn Baecke – Solutions Consultant, VMware
Ed Hoppitt – CTO Ambassador, VMware
Program Location: Europe and US

What Should Happen on Day-2 of Containerization and Microservices Deployment Architecture?

Software component packaging is a Day-1 concern for containers. Day-2 optimization, however – which container, how big the container should be, how many transactions the container will service in a scalable fashion, what other containers are related to the current container, and their dynamic movement for better scalability and availability – is the key to taking a niche technology to the next level. These are all concerns for the next wave of commercialized products. No doubt developers want to quickly package something and deploy it to production; however, the fundamentals of distributed computing and scalability don’t magically disappear.

The current container packaging wave is yet another developer behavior of trying to encapsulate the various software components, libraries, and process details. Packaging and containerization is a great concept; after all, containment is a natural progression of trying to abstract implementation detail. We saw a similar progression with the concept of encapsulating many threads into a process, many files into a tar, many source files into jars/wars, and so on. The current wave of containerization is about encapsulating the software components and associated processes in the context of the application being deployed, as one deployment unit, without the repeated overhead of an entire operating system. The next wave will be about day-2 operations, and how to write DRS-like (VMware vCenter Distributed Resource Scheduler) functionality for containers, delivering the correct scale and meeting the response-time SLA of an app at the most cost-efficient utilization of hardware.

In almost every corner of container discussions, microservices architecture is part and parcel of the conversation, and the service unit of a microservice is being positioned as the containerized deployment unit. This leads to fragmented scale-out of the distributed platform, which negatively impacts response times; I wrote a little about this here. A better answer would be to containerize more appropriately based on the relationships between service usage, as suggested in my earlier blog. This type of containerization can be encoded into how the container is formed on day-1 if the developer intimately knows how the microservices graph, or the app, is used, and perhaps it can be calculated by the runtime at incubation time, pilot/trial time, or even production time. Say you deploy microservices in a deeply fragmented fashion, as touted by many early entrants; the underlying application/container/platform runtime engine can then determine and recommend how to better deploy them…more on this later.

Virtualizing and Tuning Java Workshop Videos

I would love to be able to come to your corner of the world and deliver these workshops; however, this is obviously not practical. The alternative is to take a look at videos of the workshops I have delivered at various events.

Spring2gx Session:

VMworld Sessions: