Dynamic Workload Management in IBM System z Mainframe Environment

Summary

Workload manager (WLM) employed in IBM System z environment is one of most sophisticated performance monitoring and control tool in IT industry. In this article, first WLM concepts and facilities has been summarized. After going over evolution from machine resource level specifications to customer service level specifications, some advanced characteristics are explained in more detail.

Introduction

One of the strengths of the System z mainframe environment is the ability to run multiple workloads at the same time. This is performed by dynamic WLM, which is implemented in the WLM component of the zOS operating system.

The idea of zOS WLM is to make a service level agreement (SLA) between service users (IT customers) and the service provider (zOS operating system). The installation classifies the work running on the zOS operating system in distinct service classes and defines goals for them that express the expectation of how the work should perform. WLM uses these goal definitions to manage the work across all systems of a SYSPLEX (SYStems comPLEX, which is zOS clustering facility) environment.

In earlier versions of mainframe operating systems, performance parameters were not reflecting user expectations. They were implemented separately in operating system and subsystems. They used to have their own separate terminologies for similar concepts and their own separate control procedures. Those uncoordinated policies and procedures were not defined in terms of customer expectations. All parameters and controls were defined in terms of system resources. This type of workload management was called compatibility mode. Components of compatibility mode were Installation Performance Specifications (IPS) and Installation Classification Specifications (ICS) PARMLIB members.

2000 onwards workload management focus shifted from tuning at system resources level to defining performance expectations. This brought goal mode workload management with fewer and simpler tuning parameters reflecting customer expectations. Goal mode WLM does not use low-level system tuning parameters any more. Performance goals defined directly from customer SLA.

In order to understand how WLM works, we can look at the traffic in cities. When we drive a car from one location to another location, the time is determined by the constant speed we are allowed to go, the amount of traffic on the streets, and the traffic regulations at crossings and intersections. Based on this, Time is the basic measure to go from a start to a destination point. We can easily understand that the amount of traffic is the determining factor of how fast we can go between these two points. While maximum driving speed is constant, the number of cars on the street determines how long we have to wait at intersections and crossings before we can pass through them. As a result, we can identify the crossings, traffic lights, and intersections as the points where we encounter delays on our way through the city.

A traffic management system can use these considerations to manage the running traffic. For example, dedicated lanes for public buses allow them to pass the waiting traffic during rush hours and therefore to travel faster than passenger cars. This is a common model of prioritizing a certain type of vehicle against others in the traffic systems. So far, we see that it is possible to use the time as a measure between two points for driving in a city, we identified the points where contention might occur, and we found a simple but efficient method of prioritizing and managing the traffic. Now we will go over WLM concepts. They are similar and analogous to traffic management systems.

Goal Mode WLM Concepts and Facilities

Managing mainframe workloads in a zOS system or across multiple systems in a Parallel SYSPLEX requires the end user to formulate their expectations as goals in different categories and associate goals with each category. This is known as workload classification. Workloads are classified into distinct service classes that define the performance goals to meet the business objectives and end-user expectations.

Classification is based on application-specific or middleware-specific qualifications that allow WLM to identify different work requests in the system. for example, defining a service class for batch programs or for all requests accessing the DB2 stored procedures. During runtime, the subsystems of zOS operating system, middleware components and applications inform WLM when new work arrives, through programming interfaces that all operating system and major middleware components exploit. The interfaces allow a work manager to pass the classification attributes that the component supports to WLM. WLM associates the new work request with the service class based on the user definitions and starts to manage the work requests toward the affiliated goal definition.

A performance goal belongs to a certain type with an importance level between 1 and 5. That number tells WLM how important it is that the work in this service class meets its goal, if it does not already. WLM makes every attempt to ensure that work with a level 1 service class is meeting its goal before moving to work in importance level 2 and so forth down to level 5.

Goal type expresses how the end user wants to see the work perform in a service class. Three types of goals exist:

  1. Response Time: This can be expressed either as “Average Response Time” (for example, all work requests in the service class should complete, on average, in one second) or as “Response Time with a Percentile” (for example, 90 percent of all work requests in a service class should complete within 0.8 seconds). Using a response time goal assumes that the applications or middleware tell WLM when a new transaction starts. This is the case when the component supports the WLM Enclave services, and its possible for CICS and IMS and subsystems like Job Entry Subsystem (JES), TSO (Time Sharing Option) and UNIX System Services. Using a response time goal also requires that a certain number of work requests constantly arrive in the system.

  2. Execution Velocity: Response Time is not appropriate for all types of work, for example address spaces that do not exploit WLM services or long-running transactions with infinite ends. To manage those workloads, a formula measures a transactions velocity as a number between 1 and 99, quantifying how much time a transaction spends waiting for system resources. Higher velocity means fewer delays have been encountered.

  3. Discretionary: Assign this goal to work that can run whenever the system has extra resources. Discretionary work is not associated with an importance level, so it accesses resources only when the requirements for all work with an importance level can be satisfied.

WLM Periods

A service class can also be subdivided into multiple periods. As the work consumes more resources, it may “age” from one period to another, at which point it can be assigned a new goal and importance level. The resource consumption is expressed in Service Units, a zOS definition for work consuming resources. For example, a service class for batch jobs can be broken into three periods. The first period has a “high” execution velocity goal of 50 and an importance level of 3 for the first 2,500 Service Units. Work that takes longer would go into the second period with a defined execution velocity of 20 and an importance level of 5 for the next 10,000 Service Units. Finally, long-running batch jobs age into the third period, which is associated with a discretionary goal.

We can now classify the work and define its performance goals. The subsystems, middleware components and applications use WLM services to inform WLM when new work arrives. They also summarize work requests into uniquely identifiable entities that can be monitored and managed.

WLM constantly collects performance data by service class, compares the results with the goal definitions and changes the access to the resources for the work entities contained in the service classes based on goal achievement and demand. Data collection occurs every 250 milliseconds with goal and resource adjustment executing every 10 seconds.

WLM calculates the performance index (PI) to determine whether a service class is meeting its goals. A response time goal is the quotient of the actual achieved response time divided by the goal value, and an execution velocity goal is the defined value divided by the measured value in the system. If the PI is less than one, the goal is overachieved, and if the value is greater than one the service class misses its goal. If a service class does not achieve its goal, WLM attempts to give that service class more of the specific resource it needs. As several service classes may be competing for the same resource, however, WLM must perform a constant balancing act, making trade-offs based on the business importance and goals of the different work types.

Whenever WLM decides to make a change, the current goal adjustment is completed and the system is monitored again for the next 10 seconds while WLM assesses whether additional changes are required. If no change was possible, WLM may look for another service class to help or attempt to help the selected service class for another resource.

The service class and goal definitions are part of the service definition for the entire Parallel SYSPLEX. The service definition can be subdivided into multiple service policies that allow you to dynamically change the performance goals of the service classes and have the same service definition with adapted goals in effect for certain times. For example, you may have different goals for your batch service classes during the night shift when primarily batch work runs in the system than you do during the day when the focus is online work.

Advanced WLM Characteristics

Some advanced WLM characteristics to mention are Intelligent Resource Director (IRD), dynamic channel path ID (CHPID) management, some small product enhancements (SPE) and dynamically managed batch initiators.

IRD is an important manageability enhancement. It comprised of Parallel SYSPLEX, PR/SM and WLM technologies. IRD processes work in a clustered environment in a new way. Rather than distributing work to the systems, the resources are moved to where the work runs. Systems, IPL’ed (Initial Program Load) in a logical partition (LPAR) on the same central processor complex (CPC), that belong to the same Parallel SYSPLEX and are running zOS, form an LPAR cluster. Within such a cluster, the CPU weight (which specifies how much CPU capacity is guaranteed to an LPAR if demanded) can be shifted from one LPAR to another while the overall LPAR cluster weight remains constant.

When all LPAR’s are busy, their current weights are all enforced. WLM can initiate a weight adjustment in favor of one system to help the most important work not meeting its goals. Along with CPU weight management, IRD can also manage the number of logical CPUs for a system. The LPAR overhead can be high when the logical-to-physical CPU ratio in a CPC exceeds 2.0. To avoid this, WLM manages the number of logical CPUs to be close to the physical number of CPUs that can be utilized based on the partitions current weight.

Dynamic CHPID management is based on the same idea, namely managing a set of channel paths (also called managed channels) as a pool of floating channels and assigning those channel paths dynamically to DASD control units based on the importance of the work doing the I/O and the channel path delay measured for this work. Dynamic CHPID management not only allows the system to react on changing I/O patterns based on business importance, but also helps reduce the impact of the 256 channel paths limit: fixed channel paths need only be defined for availability considerations, while additional channels can be added or removed dynamically to meet the business objectives and increase the overall I/O performance based on the workload demand.

IRD also allows control of the I/O priority from its start point within zOS operating system, through the channel subsystem, until its processed within the storage controller itself. Within zOS operating system, an I/O request is queued in front of a particular device control block. The I/O priority determines the position of a new I/O request within that queue. When the request is processed, it flows through the channel subsystem where the I/O priority queuing allows an installation to define the partitions I/O priority relative to other partitions in the CPC. Within the storage controller, its again the priority of the I/O request that determines how fast the data is accessed from disk if the request cannot be satisfied from the cache.

As an SPE, the scope of an LPAR cluster extends and allows it to include non-zOS members, particularly system images running Linux for System z. CPU weight can be shifted from zOS images to non-zOS images and vice versa. Non-zOS images are defined as work assigned to a service class in the WLM service definition and managed toward a velocity goal.

WLM introduced dynamically managed batch initiator address spaces. Instead of tuning the initiator address spaces manually, WLM-managed batch initiators let the system determine the optimal number of initiator address spaces.

Each batch job is associated with a job class that is either in JES or WLM mode. Accordingly, there are two types of initiators: JES-managed initiators selecting work from job classes in JES mode and WLM-managed initiators selecting work from job classes in WLM mode. Operators have full control over the JES-managed initiators but no control over WLM-managed initiators. When a job becomes ready for execution, an idle initiator associated with that job class selects the job. If no initiator exists or all initiators for that class are busy, the job must wait until an initiator becomes available.

That wait, or queue, time is factored into the actual goal achievement of the work as part of the overall response time or as an execution delay in case of velocity goals. If queue delay becomes a bottleneck and goals are no longer reached, WLM determines whether it can help the batch work by starting additional initiators. WLM calculates the maximum number of initiators that can be started to improve goal fulfillment without the expense of higher importance work. It also selects a system to start new initiators-systems with available capacity first, then systems with enough displaceable capacity. If more initiators are available than needed to process the batch work, WLM stops the initiators or assigns them to other job classes in WLM mode.

Reporting enhancements are the most extensive enhancement since the introduction of goal mode in MVS 5.1 is the implementation of report class periods. Before zOS, report classes appeared as simple containers for any kind of transaction-from those managed toward different velocities or different response time goals to those originating in different subsystems. This made general reporting possible, but not in specific cases where reporting makes sense for only a homogeneous set of transactions: response time distribution and subsystem work manager delays. Many installations add service classes to help solve the reporting deficiencies of report classes, but service classes should only be used for management purposes. We recommend using only 25 active service class periods at any time in a system.

To solve this dilemma, WLM implemented report class periods where the actual period number is dynamically derived from the service class period in which a transaction is currently running. Even then, there is still the possibility of mixing different kinds of transactions. However, WLM can track the transactions attributed to a particular report class, informing performance monitors such as the Resource Measurement Facility (RMF) about this when they request WLM reporting statistics. A homogeneous report class has all transactions attributed to it associated with the same service class. A heterogeneous report class has at least two transactions attributed to it that are associated with different service classes. A performance monitor can determine the homogeneity of the report class within the reporting interval and support the additional reporting capabilities.

Usually, response time distributions can only be generated for report classes when it reports on transactions managed by one service class with a response time goal. With a little trick, however, WLM can maintain the response time distribution for CICS and IMS workloads even though the CICS and IMS regions are managed towards execution velocity goals: CICS/IMS transactions are classified to a report class and a service class. The service classes response time goal is ignored for management; WLM will use it to maintain the response time distribution for the report class. RMF obtains the data and presents it throughout its reports.

One of operational facilities is that WLM has the capability to reset Enclaves to a different service class. Operators can reset address spaces through an operator command or another console interface, such as the System Display and Search Facility (SDSF). The RESET command allows the operator to specify a different service class, quiesce an address space (and swap it out if swappable) or resume it in the original service class (i.e., reclassify it according to the rules in the current service policy). In this way, address spaces can be slowed down or accelerated depending on the operators intention.

Until zOS V1.3, nothing could be done with Enclaves once they were created. That release provides a similar reset capability as address spaces. But rather than giving operators a new command to enter a somehow circumstantial Enclave token (Enclaves have no name comparable to an address spaces job name), WLM provides an API for authorized programs to reset an Enclave. SDSF uses that API to allow operators to reset the service class of an independent Enclave, quiesce the Enclave or resume it. That support is implemented on ENC-screen of SDSF and requires SPE PQ50025.

WLM enhanced its monitoring programming interfaces to allow middleware like WebSphere to collect and report on subsystem states. This, together with enhancing the capability of WebSphere to classify work on zOS, is to improve and ease the deployment of major middleware applications on zOS.

The installation and customization of WLM has been enhanced by integrating this function into zOS Managed System Infrastructure Setup. This integration allows users to more easily adjust the size of the WLM couple data set.

Conclusion

Before the introduction of zOS WLM, Operating systems required customers to translate their data processing goals from high-level objectives about what work needs to be done into the extremely technical terms that the system can understand. This translation requires high skill-level staff, and can be protracted, error-prone, and eventually in conflict with the original business goals. Multi-system, SYSPLEX, parallel processing, and data sharing environments add to the complexity.

In this article we summarized how zOS WLM provides a solution for managing workload distribution, workload balancing, and distributing resources to competing workloads like CICS, IMS/ESA, JES, APPC, TSO/E, zOS UNIX System Services, DDF, DB2, SOM, LSFM, and Internet Connection Server.

We also summarized how zOS WLM component dynamically allocates or redistributes server resources such as CPU, I/O and memory across a set of workloads based on user-defined goals and their resource demand within a zOS image.

References

  1. z/OS V2.1 MVS Planning: Workload Management, SC34-2662

  2. IBM Redbooks, System Programmer’s Guide to: Workload Manager, SG24-6472

  3. IBM Systems Magazine, Managing Workloads in z/OS, January 2004

Advertisements
This entry was posted in IBM zEnterprise Servers, IT Service Management and tagged , , , , , , , , , , , , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s