Anyone who goes shopping or to a movie experiences the inconvenience of waiting in line. Not only do people spend time waiting in lines, but parts and products queue up prior to a manufacturing operation and wait to be worked on, machinery waits in line to be serviced or repaired, trucks line up to be loaded or unloaded at a shipping terminal, and planes wait to take off and land. Waiting takes place in virtually every productive process or service. Since the time spent by people and things waiting in line is a valuable resource, the reduction of waiting time is an important aspect of operations management.

Waiting time has also become more important because of the increased emphasis on quality, especially in service-related operations. When customers go into a bank to take out a loan, cash a check, or make a deposit, take their car into a dealer for service or repair, or shop at a grocery store, they equate quality service with quick service. Companies focus on reducing waiting time as a component of quality improvement. Companies are able to reduce waiting time and provide faster service by increasing their service capacity, which usually means adding more servers--that is, more tellers, more mechanics, or more checkout clerks. However, increasing service capacity has a monetary cost, and therein lies the basis of waiting line, or queuing, analysis, the trade-off between the cost of improved service and the cost of making customers wait.

Waiting lines are analyzed with a set of mathematical formulas which comprise a field of study called *queuing theory.* The origin of queuing theory is found in telephone network congestion problems and the work of A. K. Erlang. Erlang (1878-1929), a Danish mathematician, was the scientific adviser for the Copenhagen Telephone Company. In 1917 he published a paper outlining the development of telephone traffic theory, in which he was able to determine the probability of different numbers of calls waiting and the waiting time when the system was in equilibrium. Erlang's work provided the stimulus and formed the basis for the subsequent development of queuing theory.

Different queuing models and mathematical formulas exist to deal with different types of waiting line systems. Although we discuss several of the most common types of queuing systems, we do not investigate the mathematical derivation of the queuing formulas. They are generally complex and not really pertinent to our understanding of the use of queuing theory to improve service.

Waiting lines form because people or things arrive at the servicing function, or server, faster than they can be served. This does not mean that the service operation is understaffed or does not have the capacity to handle the influx of customers. Most businesses and organizations have sufficient serving capacity available to handle its customers *in the long run.* Waiting lines result because customers do not arrive at a constant, evenly paced rate, nor are they all served in an equal amount of time. Customers arrive at random times, and the time required to serve each individually is not the same. A waiting line is continually increasing and decreasing in length (and is sometimes empty) and in the long run approaches an average rate of customer arrivals and an average time to serve the customer. For example, the checkout counters at a grocery store may have enough clerks to serve an average of 100 customers in an hour, and in a particular hour only 60 customers might arrive. However, at specific points in time during the hour, waiting lines may form because more than an average number of customers arrive and they have larger than average purchases.

Decisions about waiting lines and the management of waiting lines are based on these averages for customer arrivals and service times. They are used in queuing formulas to compute **operating characteristics** such as the average number of customers waiting in line and the average time a customer must wait in line. Different sets of formulas are used, depending on the type of waiting line system being investigated. A bank drive-in teller window at which one bank clerk serves a single line of customers in cars is different than a single line of passengers at an airport ticket counter that are served by three or four airline agents. In this section we present the elements that make up waiting lines before looking at waiting line formulas in the following sections.

The basic elements of a waiting line, or **queue,** are arrivals, servers, and the waiting line. The relationship between these elements is shown in Figure 16.1 for the simplest type of *waiting line system,* a single server with a single queue. This is commonly referred to as a *single-channel* queuing system. Following is a brief description of each of these waiting line components.

In our discussions of queuing, a customer is a person or thing that wants service from an operation. The **calling population** is the source of the customers to the queuing system, and it can be either *infinite* or *finite.* An infinite calling population assumes such a large number of potential customers that it is always possible for one more customer to arrive to be served. For example, a grocery store, a bank, and a service station are assumed to have infinite calling populations; that is, the whole town or geographic area.

A finite calling population has a specific, countable number of potential customers. It is possible for all the customers to be served or waiting in line at the same time; that is, it may occur that there is not one more customer to be served. Examples of a finite calling population are a repair facility in a shop, where there is a fixed number of machines available to be worked on, a trucking terminal that services a fleet of a specific number of trucks, or a nurse assigned to attend to a specific number of patients.

The **arrival rate** is the rate at which customers arrive at the service facility during a specified period of time. This rate can be estimated from empirical data derived from studying the system or a similar system, or it can be an average of these empirical data. For example, if 100 customers arrive at a store checkout counter during a 10-hour day, we could say the arrival rate averages 10 customers per hour. However, although we might be able to determine a rate for arrivals by counting the number of customers during a specific time period, we would not know exactly when these customers would arrive. It might be that no customers would arrive during one hour and 20 customers would arrive during another hour. Arrivals are assumed to be independent of each other and to vary randomly over time.

We further assume that arrivals at a service facility conform to some probability distribution. Arrivals could be described by many distributions, but it has been determined (through years of research and the practical experience of people in the field of queuing) that the number of arrivals per unit of time at a service facility can frequently be defined by a *Poisson distribution.* In queuing the average arrival rate, or how many customers arrive during a period of time, is signified by *l*.

The queuing theory arrivals are described in terms of a *rate* and service in terms of *time.* **Service times** in a queuing process may also be any one of a large number of different probability distributions. The distribution most commonly assumed for service times is the *negative exponential distribution.* Although this probability distribution is for service *times,* service must be expressed as a *rate* to be compatible with the arrival rate. The average service rate, or how many customers can be served in a period of time, is expressed as *m*.

Empirical research has shown that the assumption of negative exponentially distributed service times is not valid nearly as often as is the assumption of Poisson-distributed arrivals. Therefore, for actual applications of queuing analysis, this assumption would have to be carefully checked before this distribution was used.

It is logical to assume that the rate at which services are completed must exceed the arrival rate of customers. If this is not the case, the waiting line will continue to grow, and there will be no "average" solution. Thus, it is generally assumed that the service rate exceeds the arrival rate, *l* < *m*.

The **queue discipline** is the order in which waiting customers are served. The most common type of queue discipline is *first come, first served*--the first person or item in line waiting is served first. Other disciplines are possible. For example, a machine operator might stack in-process parts beside a machine so that the last part is on top of the stack and will be selected first. This queue discipline is *last in, first out.* Or, the machine operator might reach into a box full of parts and select one at random. This queue discipline is *random.* Often customers are scheduled for service according to a predetermined appointment, such as patients at a dentist's office or diners at a restaurant where reservations are required. These customers are taken according to a prearranged schedule regardless of when they arrive at the facility. Another example of the many types of queue disciplines is when customers are processed alphabetically according to their last names, such as at school registration or at job interviews.

Queues can be of an infinite or finite size or length. An **infinite queue** can be of any size with no upper limit and is the most common queue structure. For example, it is assumed that the waiting line at a movie theater could stretch through the lobby and out the door if necessary. A **finite queue** is limited in size. An example is the driveway at a bank teller window that can accommodate only a limited number of cars.

Waiting line processes are generally categorized into four basic structures, according to the nature of the service facilities: single-channel, single-phase; single-channel, multiple-phase; multiple-channel, single-phase; and multiple-channel, multiple-phase processes. These are illustrated graphically in Figure 16.2.

The number of **channels** in a queuing process is the number of parallel servers for servicing arriving customers. The number of **phases,** on the other hand, denotes the number of sequential servers each customer must go through to complete service. An example of a *single-channel, single-phase* queuing operation is a post office with only one postal clerk waiting on a single line of customers. A post office with several postal clerks waiting on a single line of customers is an example of a *multiple-channel, single-phase* operation.

When patients go to a doctor for treatment or check into a hospital, they wait in a reception room prior to entering the treatment facility. When they get to the treatment room, the patients receive an initial checkup or treatment from a nurse, followed by treatment from a doctor. This arrangement constitutes a *single-channel, multiple-phase* queuing process. If there are several doctors and nurses, the process is a *multiple-channel, multiple-phase* process.

An example of another multiple-phase system is a manufacturing assembly in which a product is worked on at several sequential machines or operators at a workstation. An example of a single-channel, multiple-phase system is a manufacturing assembly line type operation in which in-process product units are fed to several sequential machines or operators at workstations to be worked on. Two or more of these lines operating in tandem and being fed by a single line of product units are an example of a multi-channel, multi-phase system.

You may immediately visualize a familiar waiting situation that fits none of these categories of waiting line structures. The four categories of queuing processes presented are simply the four basic categories; there are many variations. For example, rather than a single queue preceding the multiple-channel, single-phase case, there might be a separate queue preceding each server. In the multiple-channel, multiple-phase case, items might switch back and forth from one channel to the other between each of the various service phases. Queuing models can become quite complex. However, the fundamentals of basic queuing theory are relevant to the analysis of all queuing problems, regardless of their complexity.

The mathematics used in queuing theory do not provide an optimal, or "best," solution. Instead they generate measures referred to as *operating characteristics* that describe the performance of the queuing system and that management uses to evaluate the system and make decisions. It is assumed these operating characteristics will approach constant, average values after the system has been in operation for a long time, which is referred to as a *steady state.* These basic operating characteristics used in a waiting line analysis are defined in Table 16.1.

**16-3.** Discuss briefly the relationship between waiting line analysis and quality improvement.

**16-4.** List the elements that define a queuing system.

**16-5.** For each of the following queuing systems, indicate if it is a single-or multiple-server model, the queue discipline, and if its calling population is infinite or finite:

- Hair salon
- Bank
- Laundromat
- Doctor's office
- Adviser's office
- Airport runway
- Service station
- Copy center
- Team trainer
- Mainframe computer

**16-6.** Provide an example of when a first-in, first-out (FIFO) rule for queue discipline would not be appropriate.