Web Services

Capturing and Analyzing Client Transaction Metrics for .NET-Based Web Services


Brian Connolly

This article discusses:
  • The importance of Web services metrics
  • Why you need to measure on the client
  • Writing a metrics measuring app in C#
  • Analyzing the results
This article uses the following technologies:
SOAP, Web Services, C#, SQL, MSMQ

Code download available at: TransactionMetrics.exe (155KB)
The success or failure of IT departments can depend on their ability to deliver a high throughput of requests to the services they support. This is especially important for Web services. In order to ensure adequate throughput, you really need to be able to measure response times. To achieve that goal, this article presents a general-purpose reporting mechanism that can be used in any .NET system that employs HTTP and SOAP.

Throughput measurement and the expected peak number of operations influence enterprise capacity planning, allowing planners to determine how many servers are required. These are important metrics, yet typical Web service logging consists solely of raw HTTP logs. Logging and reporting tools available for Web logs analyze only low-level HTTP requests. The architecture proposed here gathers more accurate and useful metrics than those offered by systems based solely on Web server HTTP logs.


Defining the Problem

A typical enterprise transaction system built using Web services is shown in Figure 1. Application servers execute business logic and request database operations and are usually placed on an internal network separate from the Web servers for security purposes. A set of Web servers running IIS and ASP.NET accept incoming requests and use a queuing resource manager such as Microsoft® Message Queue (MSMQ) to deliver them to application servers for processing. Client applications use a Web Services Description Language (WSDL)-generated proxy in order to submit operations to the Web server.

Figure 1 Web Services Transaction Architecture

Most client transactions will consist of multiple SOAP requests. For example, a financial market trade might consist of a SOAP request to submit an order and a subsequent request to check for its execution. Or a product purchase might consist of one request to submit an order, a second request to submit credit data, and a third request to obtain an order number for a completed order.

It can be very difficult to measure and report these operations from IIS logs. Suppose you need a report on the number of orders that were processed as well as their average completion time. You would need to do several things, such as developing a database representation of HTTP logs or purchasing one as part of a Web log reporting system. You would also need to identify individual clients by distinct IP addresses, cookies, or various other HTTP headers.

Finally, you would need to write custom SQL procedures to identify these clients and the user transactions they issued. This would be quite complex as a typical user transaction consists of multiple HTTP requests. The SQL procedure would need to identify the first HTTP request of a transaction and then use a cursor to look ahead in the log for the concluding status check request.

This can be a complicated and expensive project, and often you don't realize the need for it until a system is already in production and user support issues arise. Even if a Web server log reporting system is developed, it can't provide a clear picture of response times from the client's point of view. It cannot record client delays in connecting to a server, network transmission delays, or any client overhead involved in processing a transaction.

In addition, an HTTP request is logged and timestamped only after it is serviced, so delays receiving service from an overloaded Web server will not be recorded. These factors can add significantly to a client's perception of response time. An organization that relies solely on Web logs will lack a clear picture of the true quality of service they offer to their clients.

That said, architects must be aware of some issues in order to use this approach. While these client-based metrics will paint a picture of what the client actually sees, that data isn't necessarily useful to the service designers. Long delays due to network latency and bandwidth will affect these numbers and yet can't be easily changed for most systems. Issues also arise regarding how the client components are distributed to the users. The assumption I make is that these components are distributed with the software itself or are otherwise distributed to the clients in a robust manner. Regardless of these potential pitfalls, there are many valuable applications for the client-based performance data you obtain. This article explores one possible solution.


Defining the Solution: Client Metrics Recording

While there are many ways you can solve such a problem, my solution is based on one overriding principle. Client applications already implement the business logic that relates low-level requests to user transactions. Any reporting system based on server HTTP logs is an unnecessary reimplementation of a model that already exists in the client. The solution is to have the client application itself record metrics about the operations it generates. When a user initiates a set of these operations, the client can record the start time and the elapsed time before results are presented to the user. These records can be buffered in the client and then submitted to the server using SOAP headers. No extra requests or responses are sent between the client and server—these headers are sent as an additional payload with the next transaction initiated by the user.

Figure 2 shows the architecture used in this approach. New components are shown in red. A client Payload Manager is used to record metrics, such as transaction start and end times, number of items ordered, payment method, or any other business data that will aid product support and capacity planning. This transaction data is added in optional headers to the same SOAP messages that are used to complete the user operations. The ASP.NET Web services that process SOAP requests strip off any metrics headers and place them on a queue for low-priority processors. Metrics loggers dequeue these headers and place them into a database designed to allow general-purpose transaction reporting. IIS logs are no longer needed.

Figure 2 Web Services Transaction Architecture with Metrics

The solution I am presenting has the following features. First, the Payload Manager, the Metrics database schema, and the Metrics Loggers are written as general-purpose metrics-recording components that should be able to record data of interest to a wide variety of applications.

Second, client transaction metrics are implemented in a manner that doesn't lower the reliability of the system as a whole: any raised exceptions are caught and handled so that they don't impede successful execution of a business transaction. Third, the actual client response times that are recorded reflect the delays experienced by a user.

It is important to note that this is a sampling implementation—not all transactions will be recorded. In particular, the final operation of each application session will not show up. My intention was to record most operations in order to serve as a user support tool and to assist in capacity planning.

In writing this article I used Microsoft SQL Server 2000 and Visual Studio® .NET to develop a simple system to illustrate the design for gathering client metrics. The model transaction is simple: computing the prime factors of large numbers. Clients invoke one Web service to submit a number to be factored and invoke a second Web service to check if results are ready. Client response time is measured as the time difference between submitting a number to be factored and the final status check (the one that returns the prime factors).

The implementation I'm using here is based on Windows® XP Professional with MSMQ installed. All of the client, server, and database code I'll be discussing is available in the code download. The system consists of the components described in the sections that follow.


Database Design and Structure

Figure 3 shows the Factor database, which is used to service the application. When the Web service receives a Factor request, it creates an entry in tblTransaction, assigns a GUID as the TransactionID, and sets the ServerStartTime to the current time. The transaction Status is initially set to "In Progress." The transaction is queued, and the TransactionID is passed back to the client to be used as a handle to check for the completion of the operation.

Figure 3 The Factor Database

ConsoleReader computes the prime factors and stores them as rows in tblResults. ConsoleReader updates the ServerEndTime in tblTransaction and also sets the status to "Finished."

Three stored procedures are shown in Figure 4. These support transaction creation and completion, as well as results storage. The procedure upNewTransaction uses NEWID to create a GUID and create a transaction entry with the current time and "In Progress" status. The procedure upUpdateStatus is used to mark a completed transaction as "Finished." The procedure upCheckTransaction is used to check if a transaction has been completed.
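The download contains the authoritative procedure definitions in FactorDatabase.SQL; a minimal sketch of what upNewTransaction might look like, assuming the tblTransaction columns shown in Figure 3, is:

```sql
-- Illustrative sketch only; the code download has the actual version.
CREATE PROCEDURE upNewTransaction
    @TransactionID uniqueidentifier OUTPUT
AS
    SET @TransactionID = NEWID()
    INSERT INTO tblTransaction (TransactionID, ServerStartTime, Status)
    VALUES (@TransactionID, GETDATE(), 'In Progress')
```

The GUID generated here is the handle that travels to the client and back, tying the application and Metrics databases together.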

The Metrics recording database, shown in Figure 5, is kept distinct from the application database because it is designed to be application independent. In this article, it shares the same data server with the Factor database, although in practice it would reside on a separate data server in order to reserve system resources for application processing. The only entity the two databases have in common is the notion of a unique TransactionID value.

tblTransaction represents the client view of transactions. TransactionID is the same identifier as the TransactionID in the Factor database—it is passed to the client when the first service call of a transaction is invoked, then passed back to the server with the client metrics for the transaction. ClientID is a GUID generated by the client application that submitted the transaction, so tblClient represents all the client instances that have submitted any transaction metrics.

tblTransaction holds the most commonly useful metrics for transactions: the client's name for the transaction type (Type), the client system time when it was initiated (ClientStartTime), and the corresponding ServerStartTime. The server time allows transaction metrics to be aligned with server resource utilization logs. The table also contains the most important metric of all: ClientElapsedMS, the difference between the time the first SOAP call of a transaction was initiated and the time a status check indicated that the server completed the transaction.

Both tblClientAVPair and tblTransactionAVPair are intended as general-purpose descriptors for their associated entities. Examples of client attributes might be software version or machine type. In general, transaction attributes would be any attribute that makes a transaction more or less difficult for the server. For example, a trading application might add an attribute ("All-or-None", "True") to all-or-none orders.

All-or-none orders require special processing by electronic exchanges since the entire order quantity must be filled in one trade. Since these types of orders take more time to execute, it is important to distinguish them when analyzing system performance. In the sample factoring application developed for this article, the clients report the number of prime factors obtained from the server, since fewer factors generally imply more CPU-intensive factoring.

Both of those tables were implemented in a fully denormalized manner, using additional space to gain the ability to add new client and transaction attributes without changing the schema. In fact, the design allows the client application to add arbitrary attributes without requiring changes to other design components.
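Because the attribute tables are denormalized name-value pairs, recording a new attribute is just another row insert—no schema change is required. For the trading example above, a sketch along these lines would do (the column names are my assumption, not necessarily the download's exact schema):

```sql
-- Hypothetical column names; illustrates the name-value pair design.
INSERT INTO tblTransactionAVPair (TransactionID, Attribute, Value)
VALUES (@TransactionID, 'All-or-None', 'True')
```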


Server Work Queues

Servers usually contain a pool of resources that can process user requests. One way of distributing work among these resources is to use queues. Work queues allow operations to be executed asynchronously. Front-end Web servers can simply deposit new requests into a queue and confirm receipt to the user. When a back-end resource becomes available, it can read the next request from the queue and process it.

The Windows XP Computer Management application can be used to create two work queues for the simple transaction system: factorjob holds numbers to be factored by the ConsoleReader application server, transactionmetrics holds the SOAP headers uploaded by the client.
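If you prefer to create the queues programmatically rather than through Computer Management, a sketch like the following should work (the private-queue paths are my assumption based on the queue names the article uses):

```csharp
using System;
using System.Messaging;

class QueueSetup
{
    static void Main()
    {
        // Private queues on the local machine, matching the article's names.
        string[] paths = { @".\private$\factorjob",
                           @".\private$\transactionmetrics" };
        foreach (string path in paths)
        {
            if (!MessageQueue.Exists(path))
                MessageQueue.Create(path);
        }
    }
}
```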


Data Access Layer

The DataAccessLayer namespace source code is in DataAccessLayer.cs. This is used by both the Web services and the ConsoleReader application to access the Factor database. Since this is used by both components, the WorkItem (the object passed to the factorjob queue) is defined here. It contains the transaction GUID and the data to be passed to the application server—in this case, the decimal number to be factored.

The Factor Web service uses NewTransaction to create a new user transaction and to obtain a transactionID that will be returned to the client. The CheckJob Web service uses CheckStatus in order to check whether the transaction denoted by transactionID has, in fact, been completed.

The ConsoleReader application uses UpdateStatus to mark a transaction as "Finished," and it then uses SaveResults to save the prime factors for the problem it was given.

DataAccessLayer is a simplification of a typical database encapsulation layer. What makes it simple is that it only uses a single database connection, while a real data access layer would employ a pool of connections. Since multiple threads will service SOAP clients, it is necessary to single-thread the NewTransaction and CheckStatus methods. Note that for simplicity the database connection string that I'm using in the implementation has been hardcoded. This is an obvious exception to the standard practice of defining connection strings in an application configuration file rather than right in the code.
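A sketch of how NewTransaction might serialize access to that single connection follows. The method and procedure names come from the article; the body is my reconstruction, not the download's exact code:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public class DataAccessLayer
{
    // A single shared connection; a production layer would use pooling.
    // The hardcoded connection string mirrors the article's simplification.
    private SqlConnection mConnection = new SqlConnection(
        "server=(local);database=Factor;Integrated Security=SSPI");
    private object mLock = new object();

    public DataAccessLayer()
    {
        mConnection.Open();
    }

    // Creates a transaction row via upNewTransaction and returns its GUID.
    public Guid NewTransaction(out DateTime serverStartTime)
    {
        lock (mLock)  // single-threads access, as the article describes
        {
            SqlCommand cmd = new SqlCommand("upNewTransaction", mConnection);
            cmd.CommandType = CommandType.StoredProcedure;
            SqlParameter id = cmd.Parameters.Add(
                "@TransactionID", SqlDbType.UniqueIdentifier);
            id.Direction = ParameterDirection.Output;
            cmd.ExecuteNonQuery();
            serverStartTime = DateTime.Now;
            return (Guid)id.Value;
        }
    }
}
```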


TransactionMetricsPayload

The TransactionMetricsPayload namespace (see Figure 6) defines the metrics Header class that will hold client metrics reports for eventual logging to the Metrics database. Since I want the client to upload data within SOAP headers, I have subclassed Header from System.Web.Services.Protocols.SoapHeader. Header is used by the Web services and the MetricsReader application, and its design parallels the Metrics database schema. ClientReport has the general attributes of [Metrics].tblTransaction, as well as the application-specific arrays of AVPair objects that will be placed in [Metrics].tblClientAVPair and tblTransactionAVPair. The Header object itself has an array of ClientReport objects, so that a client can buffer multiple reports in a single SOAP header and upload them all to the server.
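The shape of these classes might be sketched as follows; the field names are my assumptions, chosen to parallel the Metrics schema described earlier:

```csharp
using System;
using System.Web.Services.Protocols;

// General-purpose attribute-value pair for clients and transactions.
public class AVPair
{
    public string Attribute;
    public string Value;
}

// One transaction's metrics, paralleling [Metrics].tblTransaction.
public class ClientReport
{
    public Guid TransactionID;
    public Guid ClientID;
    public string Type;
    public DateTime ClientStartTime;
    public DateTime ServerStartTime;
    public int ClientElapsedMS;
    public AVPair[] ClientAttributes;       // -> tblClientAVPair
    public AVPair[] TransactionAttributes;  // -> tblTransactionAVPair
}

// The SOAP header carries a batch of buffered reports.
public class Header : SoapHeader
{
    public ClientReport[] Reports;
}
```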


Web Services

The ASP.NET Factor service defines two WebMethods: Factor and CheckJob. A class, FactorAck, is defined as the return type of the Factor service. This is used to return the transaction GUID and the server start time to the client. The server start time is needed so that the metrics processing components on the server don't require access to the Factor application database.

To create the WebMethods in Visual Studio .NET, you first create an ASP.NET Web services project called Factor, which will generate a default Service1 service. Use the Server Explorer to drag the factorjob and transactionmetrics queues onto the component workspace, causing Visual Studio .NET to declare and configure two member variables in the service: messageQueue1 and messageQueue2 (see Figure 7).

Note the mHeader member variable. This will hold any SOAP header sent by the client. The SoapHeader attribute for each WebMethod maps the header to this variable and also specifies that the header is optional (it need not be present in a message). The first thing each WebMethod does is check for an mHeader and if found, it's placed on the transactionmetrics queue.

If an exception is raised when queuing the header, it is caught and handled so that you can continue to process the business transaction even if there is a problem with the metrics infrastructure. Recall that one of the design objectives was to gather metrics in a manner that doesn't make the overall system less reliable than it would be without metrics collection. In a production system, the application should still continue after header processing errors, but any errors should be reported to the event log.
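The pattern the article describes—an optional header, forwarded to the queue inside a guarded try block—might be sketched like this (member names follow the article; the business logic body is elided and the exact generated code will differ):

```csharp
using System;
using System.Messaging;
using System.Web.Services;
using System.Web.Services.Protocols;

public class Service1 : System.Web.Services.WebService
{
    public Header mHeader;              // bound to the optional SOAP header
    private MessageQueue messageQueue2; // the transactionmetrics queue

    [WebMethod]
    [SoapHeader("mHeader", Required=false)]
    public FactorAck Factor(decimal number)
    {
        // Forward any metrics header first, but never let a metrics
        // failure break the business transaction itself.
        if (mHeader != null)
        {
            try { messageQueue2.Send(mHeader); }
            catch (Exception) { /* production code would log to the event log */ }
        }

        FactorAck ack = new FactorAck();
        // ... create the transaction via the DataAccessLayer, queue the
        //     WorkItem, and fill in ack.TransactionID / ack.ServerStartTime ...
        return ack;
    }
}
```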


ConsoleReader—the Application "Server"

Normally, an application server is implemented as a Windows service, and a large transaction-based system will have a pool of application server machines. In this article, I chose to model the application server as a console application because it will allow me to run any number of application servers easily, together with multiple clients and Web services, on a single machine. Even though it is a console application, it is functionally equivalent to an application server since it performs the same basic operations—it reads work from a queue and deposits results into a database.

The code for the ConsoleReader application is available in the download. Each ConsoleReader application reads a WorkItem from the factorjob queue, computes the prime factors of its System.Decimal mItem object, and uses the DataAccessLayer to place the results into the Factor database. Multiple application servers can be simulated by creating multiple instances of ConsoleReader. Note that the factoring algorithm is very rudimentary and will take considerable time to factor extremely large numbers.
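A rudimentary trial-division algorithm in the spirit of the one ConsoleReader uses might look like this (a sketch, not the download's exact code):

```csharp
using System;
using System.Collections;

public class Factorizer
{
    // Trial division: returns the prime factors of n in ascending order,
    // e.g. 12 -> 2, 2, 3. Slow for very large n, as the article notes.
    public static ArrayList Factor(decimal n)
    {
        ArrayList factors = new ArrayList();
        decimal d = 2;
        while (d * d <= n)
        {
            while (n % d == 0)
            {
                factors.Add(d);   // d divides n, so d is a prime factor
                n /= d;
            }
            d++;
        }
        if (n > 1)
            factors.Add(n);       // whatever remains is itself prime
        return factors;
    }
}
```

Note that fewer factors generally mean more work: a prime near the scale limit forces the loop all the way to its square root, which is why the clients report the factor count as a transaction attribute.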


MetricsReader—the Metrics Logger

MetricsReader, shown in Figure 8, was also developed as a console application for simplicity. In a manner similar to the ConsoleReader, it reads TransactionMetricsPayload.Header items from the transactionmetrics queue and adds the array of ClientReport objects to the Metrics database.

Multiple MetricsReaders can be started. The app doesn't employ a data access layer because the design is simple and the performance of the metrics infrastructure is relatively unimportant. Metrics header logging would in practice be performed on distinct servers anyway to reserve other resources for application servers.
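The dequeuing side might be sketched as follows, assuming the TransactionMetricsPayload types are referenced; the XmlMessageFormatter setup is my assumption about how the queued Header objects are deserialized:

```csharp
using System;
using System.Messaging;

class MetricsReaderSketch
{
    static void Main()
    {
        MessageQueue queue = new MessageQueue(@".\private$\transactionmetrics");
        // Tell the formatter which type to deserialize from the message body.
        queue.Formatter = new XmlMessageFormatter(new Type[] { typeof(Header) });
        while (true)
        {
            Message msg = queue.Receive();   // blocks until a header arrives
            Header header = (Header)msg.Body;
            foreach (ClientReport report in header.Reports)
            {
                // ... insert into the Metrics database, then echo it ...
                Console.WriteLine("Metrics Report for Transaction: "
                    + report.TransactionID);
            }
        }
    }
}
```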


Client Design and Coding

The client application in this article is designed as a console application, one that creates multiple threads, each one simulating a client that is submitting factor transactions. It is easy to build because I use the proxy class that was automatically generated from the service's WSDL to handle communications with the server. A PayloadManager class is developed to handle metrics recording. This class is general enough to be used in any Web services client application. For example, to build a trading application client, you would use the server class generated from the trading service's WSDL and the identical PayloadManager class used in this sample.


WSDL-Generated Service1 Class

The WSDL for the factoring service can be viewed in Microsoft Internet Explorer by entering the URL http://yourserver/Factor/Service1.asmx?WSDL. When you open it, notice the definitions for Factor and the CheckJob services and the FactorAck structure returned by Factor. It also has the complete definitions for Header and the ClientReport and the AVPair arrays it contains. The .NET Framework wsdl.exe utility is used to generate this class by entering the following command line:

WSDL -l:cs http://yourserver/Factor/Service1.asmx?WSDL

I want to be able to move the client application to a different machine. Since the server name is embedded in the client class that wsdl.exe generates, this should be a name that can be resolved to the server host I want to use regardless of the machine on which the proxy will be used. The \ConsoleClient\Service1.cs file has the WSDL-generated code for the sample application.
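Alternatively, since the generated proxy derives from SoapHttpClientProtocol, the WSDL-baked endpoint can simply be overridden at run time instead of regenerating the class per machine (the hostname here is an example):

```csharp
Service1 service = new Service1();
// Redirect the proxy to whichever host should service requests.
service.Url = "http://yourserver/Factor/Service1.asmx";
```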


PayloadManager—The Client Metrics Recording Library

PayloadManager is the central part of the metrics recording design. It allows any client application to construct useful metrics payloads for response time analysis and system capacity planning (see Figure 9). It was designed with a number of objectives in mind. It must not require any new message exchanges between the client and server. It should be general enough to be used within any client application that implements business transactions with SOAP requests. In addition, it should provide a facility to buffer a configurable number of metrics before uploading them, and it should allow the application to choose which SOAP requests might have metrics headers. This allows the implementation to be tuned according to server and network resource limits.

The PayloadManager class constructor is used to specify the unique identifier of this client instance as well as to specify the number of ClientReports to buffer before sending. Client applications can use AddAttributeToClient to set client attributes.

BeginTransactionRecord is used to identify the start of a new business transaction and its type. A transaction will consist of several Service1 calls, the first of which must return a unique transaction identifier and the server time when the transaction was initiated. The client invokes SetTransaction to save these in the current ClientReport.

The client should precede any Service1 calls with a call to GetHeader to assign the HeaderValue member of the Service1 object. If the number of ClientReports is less than the numToBuffer threshold set in the PayloadManager constructor, the call will return null and no header will be sent in the next message.

AddAttributeToCurrentTransaction is used to add attributes to the transaction that is currently in progress. When a transaction is completed, the client application records the transaction status by invoking EndTransaction. Internally, PayloadManager records the elapsed time from the BeginTransaction call. It is important that both BeginTransaction and EndTransaction are invoked as close as possible to any user activity, so that elapsed times reflect the subjective experience of users.

In my implementation, PayloadManager is not thread safe. Even when a client application is multithreaded, server communications are almost always performed within a single thread.
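Putting the pieces together, a client's use of PayloadManager might look like the following. The method names come from the article's description, but the exact signatures are my guesses:

```csharp
// Hypothetical usage sketch for one factoring transaction.
PayloadManager payload = new PayloadManager(clientId, 3 /* numToBuffer */);
payload.AddAttributeToClient("Software Version", "1.0");

payload.BeginTransactionRecord("Factor");    // start timing close to the user action
service.HeaderValue = payload.GetHeader();   // null until enough reports are buffered
FactorAck ack = service.Factor(200472);
payload.SetTransaction(ack.TransactionID, ack.ServerStartTime);
payload.AddAttributeToCurrentTransaction("Number to factor", "200472");

// ... poll CheckJob until the prime factors come back ...

payload.EndTransaction("Finished");          // records ClientElapsedMS internally
```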


ConsoleClient—the Client Simulator

A console application is used to simulate a population of clients submitting transactions of varying complexity at fixed intervals. ConsoleClient uses a number of command-line arguments, which are listed in Figure 10.

The tdelay parameter allows a client population to be introduced slowly to the system so that the rate at which server capacity is exceeded can be determined. Note that buffer is an upper limit—when a client thread is created, it is passed a random buffering factor to use whose value is between 1 and buffer. Since each client reports the actual buffer threshold it was given as a client attribute, the metrics database can be used to investigate any effect this value has on client response time.

Each client thread creates its own instances of Service1 and PayloadManager. Each thread submits count number of factoring problems between 0 and scale. It uses Thread.Sleep to wait gdelay milliseconds after a factoring problem is completed before submitting another. When a transaction is begun, a call to checkjob is made every cdelay milliseconds until the results are available.


The Sample Application

The download consists of C# projects and associated database SQL files. It is configured so that the entire sample transaction system can be run on a single Windows XP Professional installation, although in this example I will move the client simulator to a different machine. It contains several items, including a solution developed in Visual Studio .NET named ClientTransactionMetrics which consists of seven projects: ConsoleClient, ConsoleReader, Factor, DataAccessLayer, MetricsReader, TransactionMetricsPayload, and TransactionMetricsServer. They are all located in subdirectories of the main solution directory, except for Factor and its directory which should be placed in the default IIS root directory. Also included are FactorDatabase.SQL and MetricsDatabase.SQL, found in ClientTransactionMetrics\DB.

In order to run the system, you will need to create and populate the databases, create the message queues, and modify the source code server machine names as necessary. See the README.txt file included with the download for complete setup information.


Running the Transaction System

Create a command window and start the MetricsReader application from the command line. Create another command window and start an instance of ConsoleReader. If possible, move the ConsoleClient executable to a different machine (one that has the .NET Framework installed) and start it there with the following command line:

>ConsoleClient 1000000 100 333 2 6 1 10
(Note that if the client reports a "401 access denied" error, it can be corrected by using CredentialCache.DefaultCredentials with the client proxy to authenticate to IIS with your domain credentials.)

This will start a single client thread, submitting 100 random factoring problems. The numbers to be factored will be between 0 and 1,000,000. It will wait a third of a second between problem submissions and it will wait two milliseconds after submitting a problem before checking results. The client will choose a random buffering factor between 1 and 6.

Each ConsoleClient thread reports the completion of 10 transactions at a time, so the output from the command just shown will be something like the following:

Thread:1 is at job:10
Thread:1 is at job:20
Thread:1 is at job:30
•••
If the system is running properly, ConsoleReader will report the count of completed factoring problems in increments of 50:
Jobs completed:2
Jobs completed:52
•••

Since ConsoleReader waits on the factorjob queue, it won't exit until it is shut down manually. MetricsReader will display every metrics report it receives after it logs it in the Metrics database. A sample metrics report follows. It shows the transaction and client IDs, the transaction start time, its elapsed time in milliseconds, and the attributes the client assigned to the transaction:

Metrics Report for Transaction: 9d873634-5f37-42c5-8348-730afe23b567
    ClientID: 9d873634-5f37-42c5-8348-730afe23b567
    Started at: 12/11/2002 1:49:29 PM
    Elapsed Time: 380
   Tx  Attribute: Number to factor Value:200472
   Tx  Attribute: Number of Factors Value:6
•••


Using the Metrics Database

In this section, I will demonstrate the usefulness of the collected metrics data by using it to determine the throughput that the system as a whole is capable of achieving. This measurement will account for client processing, network I/O, IIS, ASP.NET, queuing effects, database logging, and the CPU resources that are needed to factor a given problem.

This systemwide view must be distinguished from the analysis you would make to evaluate and optimize a particular component. Profiling and design analysis are necessary to ensure components make the best possible use of available resources. While this design will reveal capacity problems, you'll still have to use analysis to resolve them. The key benefit of client-oriented metrics gathering is that it allows overall transaction system capacity to be measured against its key performance target: client response time.


Performing the Test

In this section, I will use ConsoleClient to test the capacity of this system. I run two tests. In the first test, I run a single client with a low input rate. The purpose of this test is to examine response time in an unloaded system so I can study the effect that factoring problem complexity has on application response time.

In the second test, I run many clients in an attempt to overload the system. This test is designed to measure server throughput, which is the maximum number of transactions a server can complete per unit of time. It is important to distinguish throughput from expected response time. Consider a system that has a maximum throughput of 100 transactions per second. In an unloaded system, a single client may expect a response time of approximately 1/100th of a second (ignoring network delays). However, if I load this system with a real-world population that submits transactions at an average rate of 100 transactions per second, those clients will experience response times far slower than 1/100th of a second. While the average rate is 100 per second, the random behavior of real-world client populations means that some intervals may see request rates of 80 per second while others see 120 per second. Backlog queues accumulate in the server during the peak arrival intervals, leading to increased response times.
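This effect can be made concrete with the classic M/M/1 queueing approximation—a standard textbook model, not something from the sample code: with service rate mu and arrival rate lambda, the mean response time is W = 1/(mu - lambda).

```csharp
public class QueueDemo
{
    // Mean response time in an M/M/1 queue: W = 1 / (mu - lambda).
    // Valid only while lambda < mu; as lambda approaches mu, W blows up.
    public static double ResponseTime(double mu, double lambda)
    {
        return 1.0 / (mu - lambda);
    }
}
```

For mu = 100 transactions per second, an unloaded client sees about 10 ms; at lambda = 80 per second the mean climbs to 50 ms, and at 99 per second to a full second—exactly the backlog behavior described above.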

To begin this example, move the ConsoleClient application to another machine. While ConsoleClient will use only a small amount of CPU (less than five percent) to load the system, it has to be running on a separate machine in order to accurately measure server capacity. Otherwise, the simulated clients will have difficulty generating new transactions when the system is heavily loaded with request traffic.

To make the analysis easier, first remove any existing client metrics from the Metrics database. Start the SQL Query Analyzer, connect to the Metrics database, and execute the following SQL:

delete from tblTransaction
delete from tblClient
delete from tblClientAVPair
delete from tblTransactionAVPair

On the server machine, start two instances of the ConsoleReader and shut down any instances of MetricsReader, which is tangential to the test. Since the metrics reports are queued, you can run it after the factoring load test is complete. I use two instances of ConsoleReader so that one can be computing factors if the other is waiting for a database update to complete. Start the ConsoleClient with the following command:

>ConsoleClient 1000000 100 333 2 6 1 10

This creates a single client load on the system. The client submits a factoring problem, checks back frequently for the results, and then waits a third of a second before submitting the next problem. The use of a single client prevents the accumulation of work queue jobs in the system so it can be used as a baseline estimate of the total server time needed to receive factoring problems for random numbers between 1 and 1,000,000 and deliver them back to the client. The client will finish after it submits 100 factoring jobs.

After the first ConsoleClient simulation ends, start another with the following command:

>ConsoleClient 1000000 700 333 2 6 10 30

This will create a total of 10 threads, each behaving like the single client in the previous simulation. The threads are started at 30-second intervals in order to increase the load on the system slowly so I can find the point at which response times degrade. Since there are multiple clients submitting factoring jobs, a backlog will accumulate in the factorjob queue when the two ConsoleReader application servers are busy.

This simulation will take approximately 15 minutes to complete. When the client finishes, start a MetricsReader on the server and let it run until it has logged all the queued metrics reports into the Metrics database.


Analyzing the Results

Now I use the Metrics database to estimate the capacity of my simple factoring system. Recall that I designed the Metrics database for extensibility. I left the client and transaction attribute tables denormalized so that I could easily record new attributes in the client application without making schema or server design changes. This makes the analysis more difficult, but since analysis is performed offline on systems that don't affect production capacity, it's a manageable price to pay for the extensibility that is gained.

You can make the Metrics database more intuitive by creating a view. This view joins tblTransaction, tblClientAVPair, and tblTransactionAVPair into a single entity that relates transactions, response times, and both client and transaction attributes (see Figure 11 for an example).

For convenience, I define a summary view for clients:

create view vClient
as
    select 
       ClientID, 
           submitted=count(*),
           first=min(ClientStartTime), 
           last=max(ClientStartTime)
    from vTransaction
    group by ClientID 
This view summarizes all the distinct clients that have reported metrics, showing the time of their first and last transactions, and the total number each client has submitted. I ran all simulated clients on the same client machine so their times are synchronized. However, a real population of clients would not have synchronized times, so for this reason I also gathered ServerStartTime along with the metrics for each transaction. I could substitute it in the query just shown to get the best measure of relative time that is available for those situations.

The command "select * from vClient order by first asc" should display the client from the first test (the one where I only simulated one client) at the top of the list. I'll work with that client and study the behavior of its transactions. I embed the query (adding the "top 1" qualifier) in a dynamic table within an outer query to look at all the transactions that client reported on:

select * 
from vTransaction t,
(
    select top 1 *
    from vClient
    order by first asc
) loneclient
where 
   t.ClientID = loneclient.ClientID

Note that all the buffering values are the same. A simulated client chooses a random buffering value and uses it to control all of its uploads. Also note that if the buffering value is greater than 1, there will be missing reports, because the final request issued by a client is never reported. Once again, this is a sampling design, intended to give data on almost all operations. In my case, the client used a buffering factor of 3, so it only reported 99 of the 100 transactions it submitted.
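One simple model of that sampling loss can be sketched in Python (illustrative only; the actual buffering logic lives in the article's C# PayloadManager). Assuming reports are uploaded only when a full buffer is flushed and any partial buffer left at session end is dropped:

```python
def reported_count(total, buffering):
    """Transactions whose metrics reach the server, assuming reports are
    uploaded only in full buffers of `buffering` items and the final
    partial buffer is dropped when the session ends (a simplified model)."""
    return total - (total % buffering)

print(reported_count(100, 3))  # 99, matching the lone client's 99 of 100
```

With a buffering factor of 1 the model reports every transaction, which is consistent with the observation that only buffering values greater than 1 produce missing reports.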

Since that client ran alone, there was no system overload and no queuing, so I can study the relationship between the number of prime factors computed and the elapsed time experienced by that client. I ran the following query to look at the time that client experienced for each factoring problem:

select NumberOfFactors=ProblemFactors, ClientElapsedMS
from vTransaction t,
(
    select top 1 *
    from vClient
    order by first asc
) loneclient
where 
   t.ClientID = loneclient.ClientID
order by ClientStartTime

The first transaction takes almost eight seconds, an anomaly that can be attributed to the overhead of new object and connection creation when the server receives the first call to the Factor service. Almost all transactions are completed in well under a second; a handful, associated with factor counts of two, take approximately two seconds. This relationship is confirmed with the query shown in Figure 12. The output shows the average response time by the number of prime factors. Note the large difference between transactions with 2 ProblemFactors and the rest. The ConsoleReader prime number factoring algorithm returns an array of two items when it is given a prime number to factor; the number to be factored appears in both slots. The algorithm must work considerably harder attempting to factor a prime number because its main For loop is executed n times, where n is the number to be factored.
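The worst case is easy to see in a naive trial-division factorer. The following is an illustrative Python sketch, not the article's C# algorithm (which additionally returns the prime itself in both slots of a two-element array): composites shrink quickly, but a prime n forces the loop to run all the way up to n.

```python
def prime_factors(n):
    """Naive trial division: divide out each candidate factor in turn.
    Composite inputs shrink quickly, but a prime n cannot be reduced
    until the candidate reaches n itself, so the loop runs ~n times.
    This is why prime inputs dominate the response-time averages."""
    factors = []
    candidate = 2
    while candidate <= n:
        while n % candidate == 0:
            factors.append(candidate)
            n //= candidate
        candidate += 1
    return factors

print(prime_factors(12))   # [2, 2, 3]
print(prime_factors(997))  # [997] -- a prime, the slow path
```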

The average client response time in my test for the last 98 transactions (excluding the anomalous one at the start) is reported as 524 milliseconds. However, this is not a true measure of the server machine's capacity. Recall that two instances of ConsoleReader were running in both the first test and the second, multiclient one. In the first test, however, the client waited until a transaction was completed before submitting the next, so there was only one transaction in the system at a time. As a result, there was always an idle ConsoleReader and no opportunity for the system to use available CPU time while one process in the chain was I/O bound.

The second test, which was designed to overload the system, represents a better test of system capacity. Each client in that test behaved in the same manner as the lone client in the first test. But since there were many clients, there was always sufficient work queued to take advantage of any available resource. Overloading a system like this gives a measure of throughput—the maximum rate of work a server can complete per unit of time.
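Computing throughput from a vClient-style row is simple arithmetic: completed transactions divided by the client's active window. A minimal Python sketch, using hypothetical timestamps rather than figures from the Metrics database:

```python
from datetime import datetime

def throughput_per_sec(submitted, first, last):
    """Completed transactions per second over the client's active window,
    mirroring the submitted/first/last columns of the vClient view."""
    elapsed = (last - first).total_seconds()
    if elapsed <= 0:
        raise ValueError("window must be positive")
    return submitted / elapsed

# Hypothetical window: 700 transactions over 5 minutes
rate = throughput_per_sec(700, datetime(2007, 1, 1, 12, 0, 0),
                          datetime(2007, 1, 1, 12, 5, 0))  # ~2.33 tx/sec
```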

The second test introduced 10 clients at the rate of 1 new client every 30 seconds. Each client submits factoring problems at the rate of 3 per second. Each client submits 700 transactions so that for several seconds in the middle of the run, all 10 clients are actively loading the server. A test like this is designed to slowly ramp up the number of clients and slowly increase the transaction rate in order to reveal the point at which response times degrade.
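As a back-of-the-envelope check on the offered load, here is an illustrative Python sketch using the test parameters above; it ignores response-time stretching, so the per-client rate is an upper bound:

```python
# Parameters of the second, multiclient test (from the article)
clients = 10
rate_per_client = 3.0      # ~1 submission per 333 ms wait (upper bound)
jobs_per_client = 700

peak_rate = clients * rate_per_client          # tx/sec if all clients overlap
min_run_s = jobs_per_client / rate_per_client  # minimum seconds per client

print(peak_rate, round(min_run_s))  # 30.0 233
```

Because each client's cycle also includes the transaction's response time, real client runs last longer than this minimum, which helps explain how all 10 clients come to overlap in the middle of the run.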

Now I'll create a new view of tblClients, one that selects only the clients that ran during the second test:

create view vClientM
as
    select 
       ClientID, 
           submitted=count(*),
           first=min(ClientStartTime), 
           last=max(ClientStartTime)
    from vTransaction
    group by ClientID 
    having count(*) > 650

Notice that the number of reported transactions returned by "select * from vClientM" differs among the clients: each used a different buffering factor, so each left a different number of transactions unreported at the end of its session.

The query in Figure 13 uses vClientM. The output lists average transaction response times by the number of prime factors. While prime number problems (those with two factors) still take longer than the others, the difference is smaller than in the first test. There are two reasons for this. In the first test, since there was only one problem in the system at a time, the resulting response times are serially independent. This means that the time required for one problem is independent of the preceding one. This is no longer true in the second, multiclient test. Since there are many clients submitting transactions, the time required for each is dependent on two conditions. The first condition is the difficulty of the problem (whether the number to be factored is prime and also, to some extent, its magnitude), and the second is the state of the system (whether there is a backlog of unfinished work queued in the server that has to be completed first).

Let's examine the average response times against the number of input transactions over time. First, I define a view (vTransactionM) of only those transactions that occurred within the multiclient test. Then I do an inner join, getting the time offset from the beginning of the test. For each one-second interval I get the number of transactions that have been submitted and the average client response time (see Figure 14).
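The per-interval aggregation can be sketched in Python as well (illustrative only; the name bucket_response_times is mine, not from the code download):

```python
from collections import defaultdict
from datetime import datetime

def bucket_response_times(transactions, test_start):
    """Group (start_time, elapsed_ms) pairs into one-second buckets and
    return (offset_sec, submitted, avg_elapsed_ms) rows, mirroring the
    per-interval aggregation run against vTransactionM."""
    buckets = defaultdict(list)
    for start, elapsed_ms in transactions:
        offset = int((start - test_start).total_seconds())
        buckets[offset].append(elapsed_ms)
    return [(sec, len(times), sum(times) / len(times))
            for sec, times in sorted(buckets.items())]
```

The SQL in Figure 14 presumably achieves the same grouping with something like DATEDIFF(second, ...) between the test start and each ClientStartTime.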

Figure 15 Graphing Response Times

In Microsoft Excel, I take the query output and add 10-period trend lines, represented in Figure 15. Notice the correspondence between the input load trend and the response time trend. Normally, an overload test like this would saturate a transaction system, leading to exponentially increasing delays as the input load continued to exceed system capacity. But in this case, all server components run on a single machine. When the system reaches its capacity, it is unable to accept new work because the Web services have no available CPU time to accept an input transaction and queue it for processing. Recall also that in my simple example, the Web services check the same database that the application server uses to post results, leading to additional coupling between the two layers. This is why the input load rate in the example has the long plateau in the middle of the run.

In a real production system, the Web server, the application servers, and the database might all be running on different machines. In that architecture, the Web components can accept new work and queue it for delivery to internal servers. Queues can accumulate, leading to exponentially increasing delays when the capacity of server machines is exceeded.


Limitations

The design I've described makes it difficult to add application data to SOAP headers. For example, a system that uses SOAP headers to contain authentication credentials would have to modify the Header class that is implemented here in order to add application-specific fields. PayloadManager processing would have to be modified as well, since it would no longer be the sole owner of the Header. A better approach in this case might be to use a SOAP attachment, rather than a header, for metrics data. Alternatively, application services could implement an optional general-purpose metrics block, one similar to the Header class, within each Web services call. The Header class is also not optimized for size. A redesign might be necessary for very low-bandwidth connections.

The ClientID in this design really represents a session; it doesn't represent a client application. Recall that a new GUID is generated whenever a new PayloadManager instance is created. In practice, it would be better to use a machine identifier for ClientID. The Header should be extended to contain a SessionID in addition to the ClientID, and a new session entity should be added to the Metrics database as well.

As I mentioned previously, the DataAccessLayer should not be used as a model for a production data access layer. It doesn't employ connection pooling, and it doesn't use transactions to guarantee referential integrity in the Factor database.


Further Work and Possible Extensions

This design should be evaluated against other Web service stacks to ensure that the Header class works with other SOAP clients, such as the Microsoft SOAP Toolkit.

This has been presented as a sampling design—a program that gathers metrics on most, but not all, of the transactions. If deemed necessary, coverage could be increased by having PayloadManager save any buffered metrics reports to the client file system when an application session ends.

This design is not limited to response-time metrics; it can also be used for general-purpose quality reporting. Examples of other uses might be to add software version and operating system environment items as client attributes. Application faults could also be added as client or transaction attributes.

The Metrics database can be extended to include summaries of system resource utilization over time by loading operating system performance counters into the database. Client response time delays that indicate system bottlenecks could be matched against machine resource utilization to identify the limiting machine or application. Client application resources can contribute to delays as well, so items such as client CPU utilization or available bandwidth can be included as transaction attributes.

The MetricsReader implementation can be extended to allow near real-time intervention when clients experience problems. For instance, a metrics report that indicates an application software problem or an extremely long wait for a transaction could utilize SNMP to raise an alarm.



Brian Connolly is an independent consultant specializing in enterprise transaction system design and performance analysis. He has led development teams that have produced international trading systems and data warehouses. Contact Brian at brian@ideajungle.com.


© 2007 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.