Channel: Understanding Statistics | Minitab

How to Add an "Update Data from My Database" Button to a Minitab Menu or Toolbar


Many of us have data stored in a database or file that we need to analyze on a regular basis. If you're in that situation and you're using Minitab Statistical Software, here's how you can save some time and effort by automating the process.

When you're finished, instead of using File > Query Database (ODBC) each time you want to perform analysis on the most up-to-date set of data, you can add a button to a menu or toolbar that will update the data. To do this you will need to:

A. Create an Exec (.MTB) file that retrieves the data and replaces the current data.
B. Add a shortcut to that file to either a menu or toolbar.

Creating an Exec (.MTB) file

First, I'll create a Minitab script or "exec" that pulls in new data to my worksheet. This is easier than it might sound. 

1. Use File > Query Database (ODBC) to import the desired data. I have several fields that need to be updated, so I can just use File > Query Database (ODBC) repeatedly to pull required fields from multiple tables.

2. Open the History window by clicking the yellow notepad icon, then select the ODBC commands and their subcommands.

3. Right-click the selected commands and choose Save As...

4. In the Save As... dialog box, choose Exec Files (*.MTB) from the Save as Type: drop-down. Choose a filename and location—for example, I'm going to save this as GetData.MTB on my desktop.

5. In Minitab, choose Tools > Notepad.

6. In Notepad, choose File > Open. Change Files of Type: to All Files, and open the .MTB file you just created.

7. Do the following for each ODBC command and corresponding subcommands: 

  • Replace the period (.) at the end of the last subcommand with a semi-colon (;).
     
  • Add the following below the last subcommand, including the period (In this example, 'Date' and 'Measurement' are the columns I want to store the imported data in. Typically, these share the same name as the fields they are imported from):

Columns 'Date' 'Measurement'.

For example:
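Here is a sketch of what one edited block in GetData.MTB might look like. The Connect and SQLString lines below are only placeholders—yours will contain whatever connection string and query Minitab recorded in the History window when you ran File > Query Database (ODBC):

ODBC;
  Connect "DSN=MyDatabase";
  SQLString "SELECT Date, Measurement FROM Measurements";
  Columns 'Date' 'Measurement'.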


Make sure the column names you specify in the Columns subcommand already exist in the Minitab worksheet. You also can use column numbers such as C1 C2, without single-quotes. If you're importing many columns, instead of naming each one individually, you can specify a range like this: Columns C1-C10. 

8. Choose File > Save and then close Notepad. Each time this exec runs, it will execute the ODBC commands and refresh the data in my worksheet.

But I want to make it even easier. Instead of opening the script when I want to use it, I want to be able to just select it from a menu.

Adding a Shortcut to a Minitab Menu

To add the .MTB file to a menu in Minitab, I do the following:

1. Choose Tools > Customize.

2. Click the Tools tab.

3. Click the New (Insert) button, as shown. If you hover the cursor over the button, the ToolTip displays New (Insert).

4. Enter a name for the button, and then press [Enter]. (For example, enter Get My Data.)

5. Click the browse button to open the Open files dialog box. From Files of type, choose All Files (*.*), then navigate to the .MTB file and double-click it. The dialog box will look like this:

6. Click Close.

Now I can run the macro by choosing Tools > Get My Data.

I can also add the macro to a menu other than Tools. 

Adding a Button to a Minitab Toolbar

But now that I think about it, I really don't even want to bother with a menu. I'd prefer to just click on a button and have my data updated automatically. It's easy to do. 

7. Choose Tools > Customize.

8. On the Commands tab, under Categories, choose Tools. Note: If you did not complete the steps in the previous section, the macro will not yet appear in the list.

9. Click and drag Get My Data to the desired place on a menu or toolbar.

Basically, that's it. However, you can change what is displayed on the toolbar by right-clicking the button or text while the Tools > Customize dialog box is open. You can select Image, Text, or Image and Text.

To change the image that is displayed, choose Edit Button Image. To change the text that is displayed, choose Name Button. As shown below, I have inserted a red button with a circular arrow in the main toolbar, and named it "Get My Data."  

Now I can update my data at any time by clicking on the new button. And if you've been following along, so can you!  If you don't already have Minitab Statistical Software and you'd like to give it a try, download the free 30-day trial.

 

 


What Is Acceptance Sampling?


If you're just getting started in the world of quality improvement, or if you find yourself in a position where you suddenly need to evaluate the quality of incoming or outgoing products from your company, you may have encountered the term "acceptance sampling." It's a statistical method for evaluating the quality of a large batch of materials from a small sample of items, which statistical software like Minitab can make much easier.

Basic statistics courses usually teach sampling in the context of surveys: you administer the survey to a representative sample of individuals, then extrapolate from that sample to make inferences about the entire population the sample comes from. We hear the results of such sampling every day in the news when the results of polls are discussed.

The idea behind acceptance sampling is similar: we inspect or test a sample of a product lot, then extrapolate from that sample to make an inference about whether the entire batch is acceptable, or whether it needs to be rejected.

You can see why this is useful for safeguarding quality. If you work for an electronics manufacturer that is receiving a shipment of 500 capacitors, inspecting and testing every one will take too much time and cost too much money. It's much more efficient to examine a few to determine whether the full shipment is ready to use, or if you should send the lot back to your supplier.

But how many do you need to look at? Acceptance sampling will help you determine how many capacitors to examine, and how many defectives you can tolerate and still accept the shipment.

But it's important to remember that acceptance sampling won't give estimates of quality levels, and because you're inspecting items that are already complete, it doesn't give you any direct process control.

Acceptance Sampling by Attributes, or by Variables? 

If you want to use acceptance sampling to evaluate a batch of products, you first need to decide which method is best for your situation: acceptance sampling by attributes, or by variables. 

Acceptance sampling by attributes assesses either the number of defects or the number of defective items in a sample. You might tally the total number of defects, in which case each defect in a single item with multiple defects is counted. Alternatively, you can count defective items, in which case the first problem makes an item defective, and you move on to evaluate the next item in your sample. 

 

In Minitab, you can choose Stat > Quality Tools > Acceptance Sampling by Attributes to either create a new sampling plan or to compare various plans.

Attribute plans are generally easy to carry out: you randomly select a specified sample of n units from a lot of N units. If there are c or fewer defectives, accept the lot. If there are more than c defectives, reject it.

For example, suppose you're receiving 10,000 transistors. You will inspect 89 of them. If there are 0, 1, or 2 defective transistors, you can accept the shipment.  But if the sample contains more than 2 defectives, you'll reject the lot.
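If you're curious about the math underneath a plan like this, the probability of accepting a lot is (to a good approximation) the binomial probability of finding c or fewer defectives in a sample of n items when the true proportion defective is p:

P(accept) = sum from d = 0 to c of [n! / (d! (n − d)!)] × p^d × (1 − p)^(n − d)

Minitab works these probabilities out for you (using the binomial or, for small lots, the hypergeometric distribution), so you never need to compute them by hand; the formula is shown here only to make the logic of the accept/reject rule concrete.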

Acceptance sampling by variables is based on quality characteristics you can measure. For example, you might measure the length of the leads on capacitors, resistors, or other electronic components for circuit boards. 

In Minitab, you select Acceptance Sampling by Variables - Create / Compare to either devise a new sampling plan or to contrast different possible sampling plans. After you've collected data according to your variables plan, you need to calculate the mean, standard deviation, and Z value using those measurements. Just select Acceptance Sampling by Variables - Accept / Reject in Minitab to do those calculations and make a determination about the batch your sample came from. 

One thing to remember about variables sampling plans is that only one measurement can be examined per sampling plan. So if you need to assess the lead length of sample resistors as well as their performance, two separate sampling plans are required. However, variables sampling plans require much smaller sample sizes than attributes plans.

Risks of Acceptance Sampling

Because we do not inspect the entire lot, there are two types of risk that we must consider:

  • Rejecting a good-quality batch, also known as producer's risk, or alpha (α)
  • Accepting a poor-quality batch, also known as consumer's risk, or beta (β)

When you use Minitab for acceptance sampling, the software graphs an operating characteristic curve (OC curve) to quantify these risks. That graph illustrates the probability that a lot containing a certain fraction of defects or defective items will be accepted. 

In this graph, based on a sample of 89 items, there's a 50% chance of accepting the batch when 3% of it is defective, but if the percent defective is 9%, there's only a 10% chance of accepting the batch.

In my next post, I'll go through an example of acceptance sampling by attributes. 

 

 

How to Perform Acceptance Sampling by Variables, part 1


Earlier, I shared an overview of acceptance sampling and presented an example of acceptance sampling by attributes. Now we'll look at how to do acceptance sampling by variables, facilitated by the tools in Minitab Statistical Software. If you're not already using it and you'd like to follow along, you can get the free 30-day trial version.

In contrast to acceptance sampling by attributes, where inspectors make judgment calls about defective items, acceptance sampling by variables involves evaluating sampled items on properties that can be measured—for example, the diameter of a hole in a circuit board, or the length of a camshaft.

When you go to Stat > Quality Tools > Acceptance Sampling by Variables, you will find two options to select from.

Create / Compare lets you either create a new sampling plan or compare several different ones. Accept / Reject lets you evaluate and make an acceptance decision about a batch of items based on data you've collected according to a sampling plan. 

In this post, we'll look at what you can do with the Create / Compare tools. 

Creating a Plan for Acceptance Sampling by Variables

Suppose your electronics company receives monthly shipments of 1,500 LEDs, which are used to indicate whether a device is switched on or off. The soldering leads that go through the devices' circuit boards need to be a certain length. You want to use acceptance sampling to verify the length of the soldering leads. 

Select Stat > Quality Tools > Acceptance Sampling by Variables > Create / Compare, then choose Create a sampling plan. Since we're creating and comparing variables sampling plans, we don't need any real data yet. Instead, we'll just enter information about our process into the dialog box, which you can fill out as shown below.

For Units for quality levels, choose the appropriate units for your measurement type. You and your supplier have agreed to use defectives per million to represent the number of defectives in your sample.

You and your supplier also have agreed on the poorest process average that would be an Acceptable quality level (AQL), as well as the poorest average you will tolerate before a lot reaches the Rejectable quality level (RQL or LTPD). You and the supplier agree that for LEDs, the AQL is 100 defectives per million, and the RQL is 400 defectives per million.

You set the probability of accepting a poor lot (Consumer's risk) at 10 percent, and the chances of rejecting a good lot (Producer's risk) at 5 percent.

You also can enter upper and/or lower specifications for your measured property, and, optionally, the historical standard deviation of your process. The lower specification for your LEDs' leads is 2 cm.

The lot size refers to the entire population of units that the sample will be taken from. In this case, the size of your monthly LED shipment is 1500.

Interpreting Your Acceptance Sampling Plan

After you complete the dialog box as shown above and click OK, Minitab produces the following output in the Session Window. 

You need to randomly select and inspect 64 items from each batch of 1500 LEDs. You'll use the mean and standard deviation of your random sample to calculate the Z value, where Z = (mean - lower spec)/ standard deviation. You also can use historical data about the standard deviation, if available.

If Z.LSL is greater than the critical distance, in this case k = 3.51320, you can accept the entire batch of LEDs. If the Z value is less than the critical distance, reject the shipment. 
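To make the decision rule concrete, suppose (purely hypothetical numbers) that your 64 sampled leads had a mean length of 2.5 cm and a standard deviation of 0.14 cm. Then Z = (2.5 − 2) / 0.14 ≈ 3.57, which is greater than the critical distance k = 3.51320, so you would accept that lot. A mean of 2.49 cm with the same standard deviation would give Z ≈ 3.50, and the lot would be rejected.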

The probability of accepting a shipment at the AQL is 95%, and when the sampling plan was set up, you and your supplier agreed that lots of 100 defectives per million would be accepted approximately 95% of the time. You also have a 90% probability of rejecting a batch of LEDs at the RQL. This fits your agreement with the supplier that lots with 400 defectives per million would be rejected most of the time for your protection.

If, after a lot is rejected, the supplier's corrective action is to perform 100% inspection and rework any defective items, the Average Outgoing Quality (AOQ) represents the average quality of the lot and the Average Total Inspection (ATI) represents the average number of inspected items after additional screening.

The Average Outgoing Quality (AOQ) level is 91 defectives per million at the AQL and 38.2 defectives per million at the RQL. As we discussed in the overview of acceptance sampling, this is because outgoing quality will be good for lots that are either very good to begin with, or that undergo rework and reinspection due to a poor initial inspection. The Average outgoing quality limit (AOQL) represents the worst-case outgoing quality level, which usually occurs when a batch is neither very good nor very bad.

The Average Total Inspection (ATI) per lot represents the average number of LEDs inspected at a particular quality level and probability of acceptance. For the quality level of 100 defectives per million, the average total of inspected LEDs per lot is 135.6. For the quality level of 400 defectives per million, the average total number of items inspected is 1356.8.
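If you want to see where those figures come from, the standard textbook approximations line up well with Minitab's output here. With lot size N = 1500, sample size n = 64, and probability of acceptance Pa:

AOQ ≈ p × Pa × (N − n) / N, so at the AQL: 100 × 0.95 × (1436 / 1500) ≈ 91 defectives per million
ATI ≈ n + (1 − Pa) × (N − n), so at the AQL: 64 + 0.05 × 1436 ≈ 136 inspected items per lot

Minitab's exact calculations may differ slightly in the decimals, but the logic is the same: the better the incoming quality, the fewer lots receive the full 100% screening.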

You believe this is a reasonable and acceptable plan to follow, but your supervisor isn't sure, and asks you to see how this plan stacks up against some other possible options. I'll show you how to do that in my next post. 

How to Perform Acceptance Sampling by Variables, Part 2


In my last post, I showed how to use Minitab Statistical Software to create an acceptance sampling plan by variables, using the scenario of an electronics company that receives monthly shipments of LEDs that must have soldering leads that are at least 2 cm long. This time, we'll compare that plan with some other possible options.

The variables sampling plan we came up with to verify the lead length called for the random selection and inspection of 64 items from each batch of 1,500 LEDs. You and the supplier agree that the AQL is 100 defectives per million and the RQL is 400 defectives per million. If the calculated Z value is greater than the critical distance (3.51320), you'll accept the entire lot. Sounds great, right? 

But that's not what your boss thinks. He believes that inspecting 64 LEDs is a waste of time, and says he's got a gut feeling that you could probably get by with inspecting half of that number, maybe even fewer. Your gut tells you that's too low.

Fortunately, Minitab makes it easy to compare a few possible plans, so neither of you needs to place a bet on whose gut feeling is correct. If you'd like to follow along and you're not already using Minitab, please download the free 30-day trial.

Setting Up a Comparison of Acceptance Sampling Plans

Start by selecting Stat > Quality Tools > Acceptance Sampling by Variables > Create/Compare..., and when the dialog box appears, choose the option for Compare User Defined Sampling Plans.

In Units for quality levels, choose Defectives per million. Since you and your supplier have already established the acceptable and rejectable quality levels, for Acceptable quality level (AQL) enter 100, and for Rejectable quality level (RQL or LTPD), enter 400. (Strictly speaking, the AQL and RQL are optional when you are simply comparing sampling plans.)

In Sample sizes, enter 32 50 64 75 100, and in Critical distances (k values), enter 3.51320, the critical distance from the original plan. All that's left is to enter the lower spec of 2, the historical standard deviation of 0.145, and the lot size of 1500.

 

 

When you press OK, Minitab's Session Window displays the following output: 

How Do These Acceptance Sampling Plans Compare?

The table above shows the probabilities of accepting and rejecting lots of LEDs at quality levels of 100 and 400 defectives per million for different sample sizes.

Your boss suggested cutting the number you sampled in half, to 32 items. Under that scenario, however, the producer's risk of having a good shipment rejected has more than doubled, from 5% to 12.2%. You know your supplier won't accept that. Moreover, your odds of properly rejecting a poor-quality shipment have fallen from 90% to just 81%, a level you aren't comfortable with. It's clear that a sample size of 32 does not give you or your supplier sufficient protection.

Minitab also produces graphs that make it easy to see and understand the differences between sampling plans visually. In the graph below, the solid blue line represents the 32-item sampling plan, the dotted red line represents a plan that samples 50 items, and the green line represents the original sampling plan you devised, which called for evaluating 64 items. 

OC Curve for Acceptance Sampling Plans

Comparing these curves, it's easy to see how far the blue line representing a 32-item sample diverges from the others. But the lines for the 50- and 64-item sampling plans are quite close; the chance of rejecting a good lot rises only 2 percentage points, from 5% to 7%, while the odds of correctly rejecting a poor lot fall only 3 points, from 90% to 87%.

Evaluating 14 fewer LEDs would save a fair amount of time without adding much additional risk for either you or your supplier, so the 50-item sampling plan may be the best option for keeping yourself, your supplier, and your boss amenable to the inspection process.

In my next post, we'll use that sampling plan to evaluate the next lot of 1,500 LEDs, and make a decision about whether to accept the shipment, or reject it and return it to the supplier for corrective action.  

 

How to Perform Acceptance Sampling by Variables, part 3


Now that we've seen how easy it is to create plans for acceptance sampling by variables, and to compare different sampling plans, it's time to see how to actually analyze the data you collect when you follow the sampling plan. 

If you'd like to follow along and you're not already using Minitab, please download the free 30-day trial.

Collecting the Data for Acceptance Sampling by Variables

If you'll recall from the previous post, after comparing several different sampling plans, you decided that sampling 50 items from your next incoming lot of 1,500 LEDs would be the best option to satisfy your supervisor's desire to sample as few items as possible while still providing sufficient protection to you and your supplier. That protection stems from an acceptably low probability that lots will be accepted or rejected in error. Under this plan, you have just a 7% chance of rejecting a good lot, and an 87% chance of rejecting a poor lot.

So, on the day your next shipment of LEDs arrives, you select 50 of them and carefully measure the soldering leads. To make sure the sampling process will be effective, you're diligent about taking samples from throughout the entire lot, at random. You record your measurements and place the data into a Minitab worksheet. 

Analyzing the Acceptance Sampling by Variables Data

This time, when you go to Stat > Quality Tools > Acceptance Sampling by Variables, choose the Accept/Reject Lot... option. 

The goal of this analysis is to determine whether you should accept or reject this latest batch of LEDs, based on your sample data. If the calculated Z value is greater than the critical distance (3.5132), you will accept the entire lot. Otherwise, the lot goes back to your supplier for rework and correction.

In Measurement data, enter 'Lead Length'.  In Critical distance (k value), enter 3.5132. In Lower spec, enter 2. Finally, for Historical standard deviation, enter 0.145.  Your dialog box will look like this: 

 

When you click OK, the Session Window provides the following output: 

 

Interpreting the Acceptance Sampling Output

From the measurements of the 50 LEDs that you sampled, the mean length of the solder leads is 2.52254 centimeters, and the historical standard deviation is 0.145 centimeters. The lower specification for the lead length is 2 centimeters.

When you created the sampling plan, the critical distance was determined to be 3.5132. Because the calculated Z.LSL (3.60375) is greater than this critical distance, you will accept the lot of 1,500 LEDs.
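The arithmetic behind that Z value is just the formula from the earlier posts: Z.LSL = (mean − lower spec) / standard deviation = (2.52254 − 2) / 0.145 ≈ 3.604, in line with the 3.60375 reported in the output (the small difference is rounding).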

How to Perform Acceptance Sampling by Attributes


In an earlier post, I shared an overview of acceptance sampling, a method that lets you evaluate a sample of items from a larger batch of products (for instance, electronics components you've sourced from a new supplier) and use that sample to decide whether you should accept or reject the entire shipment.

There are two approaches to acceptance sampling. If you do it by attributes, you count the number of defects or defective items in the sample, and base your decision about the entire lot on that. The alternative approach is acceptance sampling by variables, in which you use a measurable characteristic to evaluate the sampled items. Doing it by attributes is easier, but sampling by variables requires smaller sample sizes. 

In this post, we'll do acceptance sampling by attributes using Minitab Statistical Software. If you're not already using it and you'd like to follow along, you can get our free 30-day trial version.

Getting Started with Acceptance Sampling by Attributes

You manage components for a large consumer electronics firm. In that role, you're responsible for sourcing the transistors, resistors, integrated circuits, and other components your company uses in its finished products. You're also responsible for making sure your vendors are supplying high-quality products, and rejecting any batches that don't meet your standards.

Recently, you've been hearing from the assembly managers about problems with one of your suppliers of capacitors. You order these components in batches of 1,000, and it's just not feasible to inspect every individual item coming in. When the next batch of capacitors arrives from this supplier, you decide to use sampling so you can make a data-driven decision to either accept or reject the entire lot.

Before you can devise your sampling plan, you need to know what constitutes an acceptable quality level (AQL) for a batch of capacitors, and what is a rejectable quality level (RQL). As you might surmise, these are figures that need to be discussed with and agreed to by your supplier. You'll also need to settle on levels of the "producer's risk," which is the probability of incorrectly rejecting a lot that should have been accepted, and the "consumer's risk," which is the probability that a batch that should have been rejected is accepted. In many cases, the consumer's risk is set at a higher level than the producer's risk.

Your agreement with the supplier is that the AQL is 1%, and the RQL is 8%. The producer's risk has been set at 5%, which means that about 95% of the time, you'll correctly accept a lot with a defect level of 1% or lower. You've agreed to accept a consumer's risk level of 10%, which means that about 90% of the time you would correctly reject a lot that has a defect level of 8% or higher. 

Creating Your Plan for Acceptance Sampling by Attributes

Now we can use Minitab to determine an appropriate sampling plan. 

  1. Choose Stat > Quality Tools > Acceptance Sampling by Attributes.
  2. Choose Create a sampling plan.
  3. In Measurement type, choose Go / no go (defective).
  4. In Units for quality levels, choose Percent defective.
  5. In Acceptable quality level (AQL), enter 1. In Rejectable quality level (RQL or LTPD), enter 8.
  6. In Producer's risk (Alpha), enter 0.05. In Consumer's risk (Beta), enter 0.1.
  7. In Lot size, enter 1000.
  8. Click OK.

Minitab produces the following output in the Session Window: 

Interpreting the Acceptance Sampling by Attributes Plan

For each lot of 1,000 capacitors, you need to randomly select and inspect 65. If you find more than 2 defectives among these 65 capacitors, you should reject the entire lot. If you find 2 or fewer defective items, accept the entire lot.

Minitab plots an Operating Characteristic Curve to show you the probability of accepting lots at various incoming quality levels. In this case, the probability of acceptance at the AQL (1%) is 0.972, and the probability of rejecting is 0.028. When the sampling plan was set up, you and your supplier agreed that lots of 1% defective would be accepted approximately 95% of the time to protect the producer. 

Operating Characteristic (OC) Curve

The probability of accepting a batch of capacitors at the RQL (8%) is 0.099 and the probability of rejecting is 0.901. The consumer and supplier agreed that lots of 8% defective would be rejected most of the time to protect the consumer.
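As a quick sanity check under the binomial model: with n = 65 and c = 2, the chance of finding 2 or fewer defectives works out to roughly 0.97 when 1% of the lot is defective and roughly 0.10 when 8% is defective, which is exactly what the 0.972 and 0.099 acceptance probabilities in the output reflect.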

What Happens If a Lot Gets Rejected? 

When the next batch of capacitors arrives at the dock, you pick out 65 at random and test them. Five of the 65 samples are defective. 

Based on your plan, you reject the lot. Now what? Typically, the supplier will need to take some corrective action, such as inspecting all units and reworking or replacing any that are defective.

Minitab produces two graphs that can tell you more. If we assume that rejected lots will be 100% inspected and all defects rectified, the Average Outgoing Quality (AOQ) plot represents the relationship between the quality of incoming and outgoing materials. The Average Total Inspection (ATI) plot shows the relationship between the quality of incoming materials and the number of items that need to be inspected.

When incoming lots are very good or very bad, the outgoing quality will be good, because poor lots get reinspected and fixed, and good lots are already good. In the graph below, the AOQ level is 1.4% at the AQL and 1.0% at the RQL. But when incoming quality is neither very good nor very bad, the number of bad parts that gets through rises, so outgoing quality gets worse. The maximum % defective level for outgoing quality is called the Average Outgoing Quality Limit (AOQL). This figure is included in the Session window output above, and you can see it in the graph below: at about 3.45% defective, the AOQL is 1.968%, the worst-case outgoing quality level.

Average Outgoing Quality (AOQ) Curve
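If you'd like to see where the AOQL comes from, the usual approximation is AOQ ≈ p × Pa × (N − n) / N, where p is the incoming proportion defective and Pa is the probability of accepting a lot at that quality level. At roughly 3.45% defective, Pa is about 0.61 under the binomial model, so AOQ ≈ 0.0345 × 0.61 × (935 / 1000) ≈ 1.97%, which matches the AOQL of 1.968% that Minitab reports.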

The ATI per lot represents the average number of capacitors you will need to inspect at a particular quality level. 

Average Total Inspection Curve

In the graph above, you can see that if the lot's actual % defective is 2%, the average total number of capacitors inspected per lot will approach 200 (including re-inspections after the supplier has rectified a rejected lot). At the RQL of 8% defective, the average total number of capacitors inspected per lot is 907.3.
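Those ATI figures follow the standard formula ATI ≈ n + (1 − Pa) × (N − n). At 8% defective, Pa is 0.099, so ATI ≈ 65 + 0.901 × 935 ≈ 907 capacitors; at 2% defective, Pa is roughly 0.86 under the binomial model, so ATI ≈ 65 + 0.14 × 935 ≈ 196, in line with the "approaching 200" reading from the graph.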

Check out my earlier posts for a walk-through of performing acceptance sampling by variables.

Imprisoned by Statistics: How Poor Data Collection and Analysis Sent an Innocent Nurse to Jail


If you want to convince someone that at least a basic understanding of statistics is an essential life skill, bring up the case of Lucia de Berk. Hers is a story that's too awful to be true—except that it is completely true.

A flawed analysis irrevocably altered de Berk's life and kept her behind bars for a full decade, and the fact that this analysis targeted and harmed just one person makes it more frightening. When tragedy befalls many people, aggregating the harmed individuals into a faceless mass helps us cope with the horror. You can't play the same trick on yourself when you consider a single innocent woman, sentenced to life in prison, thanks to an erroneous analysis.

The Case Against Lucia

It started with an infant's unexpected death at a children's hospital in The Hague. Administrators subsequently reviewed earlier deaths and near-death incidents, and identified 9 other incidents in the previous year they believed were medically suspicious. Dutch prosecutors proceeded to press charges against pediatric nurse Lucia de Berk, who had been responsible for patient care and medication at the time of all of those incidents. In 2003, de Berk was sentenced to life in prison for the murder of four patients and the attempted murder of three.

The guilty verdict, rendered despite a glaring lack of physical or even circumstantial evidence, was based (at least in part) on a prosecution calculation that only a 1-in-342-million chance existed that a nurse's shifts would coincide with so many suspicious incidents. "In the Lucia de B. case statistical evidence has been of enormous importance," a Dutch criminologist said at the time. "I do not see how one could have come to a conviction without it." The guilty verdict was upheld on appeal, and de Berk spent the next 10 years in prison.

One in 342 Million...?

If an expert states that the probability of something happening by random chance is just 1 in 342 million, and you're not a statistician, perhaps you'd be convinced those incidents did not happen by random chance. 

But if you are statistically inclined, perhaps you'd wonder how experts reached this conclusion. That's exactly what statisticians Richard Gill and Piet Groeneboom, among others, began asking. They soon realized that the prosecution's 1-in-342-million figure was very, very wrong.

Here's where the case began to fall apart—and not because the situation was complicated. In fact, the problems should have been readily apparent to anyone with a solid grounding in statistics. 

What Prosecutors Failed to Ask

The first question in any analysis should be, "Can you trust your data?" In de Berk's case, it seems nobody bothered to ask. 

Richard Gill graciously attributes this to a kind of culture clash between criminal and scientific investigation. Criminal investigation begins with the assumption a crime occurred, and proceeds to seek out evidence that identifies a suspect. A scientific approach begins by asking whether a crime was even committed.

In Lucia's case, investigators took a decidedly non-scientific approach. In gathering data from the hospitals where she worked, they omitted incidents that didn't involve Lucia from their totals (cherry-picking), and made arbitrary and inconsistent classifications of other incidents. Incredibly, events De Berk could not have been involved in were nonetheless attributed to her. Confirmation and selection bias were hard at work on the prosecution's behalf. 

Further, much of the "data" about events was based on individuals' memories, which are notoriously unreliable. In a criminal investigation where witnesses know what's being sought and may have opinions about a suspect's guilt, relying on memories of events that happened weeks and months ago seems like it would be a particularly dubious decision. Nonetheless, the prosecution's statistical experts deemed the data gathered under such circumstances trustworthy.

As Gill, one of the few heroes in this sordid and sorry mess, points out, "The statistician has to question all his clients’ assumptions and certainly not to jump to the conclusions which the client is aiming for." Clearly, that did not happen here. 

Even If the Data Had Been Reliable...

So the data used against de Berk didn't pass the smell test for several reasons. But even if the data had been collected in a defensible manner, the prosecution's statement about 1-in-342-million odds was still wrong. To arrive at that figure, the prosecution's statistical expert multiplied p-values from three separate analyses. However, in combining those p-values the expert failed to perform the necessary statistical corrections, resulting in a p-value that was far, far lower than it should have been. You can read the details about these calculations in this paper.

In fact, when statisticians, including Gill, analyzed the prosecution's data using the proper formulas and corrected numbers, they found the odds that a nurse could experience the pattern of events exhibited in the data could have been as low as 1 in 25.

Justice Prevails at Last (Sort Of)

Even though de Berk had exhausted her appeals, thanks to the efforts of Gill and others, the courts finally re-evaluated her case in light of the revised analyses. The nurse, now declared innocent of all charges, was released from prison (and quietly given an undisclosed settlement by the Dutch government). But for an innocent defendant, justice remained blind to the statistical problems in this case across 10 years and multiple appeals, during which de Berk experienced a stress-induced stroke. It's well worth learning more about the role of statistics in her experience if you're interested in the impact data analysis can have on one person's life. 

At a minimum, what happened to Lucia de Berk should be more than enough evidence that a better understanding of statistics could set you free.

Literally. 

A Surgeon's View of Data-Driven Quality Improvement


There has been plenty of noisy disagreement about the state of health care in the past several years, but when you get beyond the controversies surrounding various programs and changes, a great deal of common ground exists.

Everyone agrees that there's a lot of waste and inefficiency in the way we've been doing things, and that health care should be delivered as efficiently and effectively as possible. But while a lot of successful models exist for using data to improve quality, the medical field has been slower than many other industries to adopt such data-driven quality improvement (QI) methods.

We have been talking to physicians, nurses, administrators, and other professionals at health care organizations in the United States and other countries to get more insight into the challenges of using data to improve health care processes, and to learn how Minitab might be able to help.

Operating with a Scalpel—and Statistics

We had a particularly enlightening conversation with Dr. David Kashmer, chief of surgery for Signature Healthcare in Brockton, Mass.

In addition to being a surgeon, Kashmer is a Lean Six Sigma Black Belt. In the 10 years since earning his belt, he's become passionate about using QI methods to improve trauma and acute care surgery. He also helps fellow practitioners do the same, and talks about his experiences at the Surgical Business Model Innovation blog.

Kashmer told us about the resistance he encountered when he first began using statistical methods in his practice: “I kept hearing, ‘This guy is nuts...what’s he even talking about?’” 

Nobody's saying that any more. Kashmer has shown that applying even basic statistical methods can yield big improvements in patient outcomes, and those once-skeptical colleagues are now on board. "When they saw the results from using statistical process control rather than typical improvement methods, they understood and began to appreciate their value," Kashmer said.

The Human Face of Health Care Quality

I've written previously about the language of statistics and how it can get in the way of our efforts to communicate what's really important about our analyses. Kashmer keyed in on similar themes when we asked him about the apparent reluctance among some in the medical profession to use data analysis for quality improvement. 

"The language of the motivation for using statistics—to guard against type 1 and type 2 errors—is lost on us," he said. "We focus more on what we think will help an individual patient in a particular situation. But when we learn how statistics can help us to avoid making a change when nothing was wrong with the patient, or to avoid thinking there wasn’t a problem when there was one…well, that’s when these techniques become much more powerful and interesting."

For Kashmer, the most compelling way to show the value of data analysis is to draw a direct connection to the benefits patients experience from an improved process. 

"Making decisions with data is challenging since it doesn't resonate with everyone," he told us. "Putting a human face on data and using it to tell a story that people can feel is key when talking about the true performance of our system."

Big Insights from a Little Data

Kashmer shared several stories with us about how using data-driven methods solved some tenacious problems. One thing that struck me was that even very straightforward analyses have had big impacts by helping teams see patterns and problems they otherwise would have missed.

In one case, an answer was found by simply graphing the data. 

"We felt we had an issue with trauma patients in the emergency department, but the median time for a trauma patient looked great, so the group couldn’t figure out why we had an issue," Kashmer explained. "So we used Minitab to see the distribution, and it was a nonnormal distribution that was much different than just a bell curve."

 histogram of time

Simply looking at the data graphically revealed why the team felt there was a problem despite the median.

"We saw that the median was actually a bit misleading—it didn’t tell the whole story, and that highlighted the problem nicely: the distribution revealed a tail of patients who were a lot worse when they stayed in the emergency department for over six hours, so we knew to focus on this long tail instead of on the median. Looking at the data this way let us see something we didn’t see before."

Read Our Full Interview

We'd like to thank Dr. Kashmer for talking with us, and for his efforts to help more healthcare organizations reap the benefits of data-driven quality improvement. He had much more to say than we can recap here, so if you're interested in using data to improve health care quality, I encourage you to read our full interview with Dr. Kashmer.


A Six Sigma Healthcare Project, part 1: Examining Factors with a Pareto Chart


Over the past year I've been able to work with and learn from practitioners and experts who are using data analysis and Six Sigma to improve the quality of healthcare, both in terms of operational efficiency and better patient outcomes. I've been struck by how frequently a very basic analysis can lead to remarkable improvements, but some insights cannot be attained without conducting more sophisticated analyses. One such situation is covered in a 2011 Quality Engineering article on the application of binary logistic regression in a healthcare Six Sigma project.

In this series of blog posts, I'll follow the path of the project discussed in that article and show you how to perform the analyses described using Minitab Statistical Software. (I am using simulated data, so my analyses will not match those in the original article.)

The Six Sigma Project Goal

The goal of this Six Sigma project was to attract and retain more patients in a hospital's cardiac rehabilitation program. On being discharged, heart-surgery patients are advised to join this program, which offers psychological support and guidance on a healthy diet and lifestyle. Program participants also have two or three physical therapy sessions per week, for up to 45 sessions.

An average of 33 new patients begin participating in the program per month, and participants attend an average of 29 sessions. But many discharged patients do not enroll in the program, and many who do drop out before they complete it. Greater rates of participation would benefit individual patients' health and increase the hospital's revenues.

The project team identified two critical metrics they might improve:

  • The number of patients participating in the program each month
  • The number of therapy sessions for each participant

The team set a goal to increase the average number of new participants to 36 per month, and to increase the average number of sessions each patient attends to 32.

Available Patient Data

Existing data on the hospital's cardiac patients includes:

  • The distance between each patient's home and the hospital
  • Patient's age and gender
  • Whether or not the patient has access to a car
  • Whether or not the patient participated in the rehabilitation program

To illustrate the analyses conducted for this project, we will use a simulated set of data for 500 patients. Download the data set to follow along and try these analyses yourself. If you don't already have Minitab, you can download and use our statistical software free for 30 days.

Exploring Why Patients Leave the Program with a Pareto Chart

Encouraging patients who start the program to complete it, or at least to attend a greater number of sessions, has the potential to be a quick and easy "win," so the project team began by looking at why 156 patients who started the program eventually dropped out.

The reasons patients gave for dropping out of the rehabilitation program were placed into several different categories, then visualized with a Pareto chart.

The Pareto chart is a must-have in any analyst’s toolbox. The Pareto principle states that about 80% of outcomes come from 20% of the possible causes. By plotting the frequencies and corresponding percentages of a categorical variable, a Pareto chart helps identify the "vital few"—the “20%" that really matter, so you can focus your efforts where they can make the most difference.

To create this chart in Minitab, open Stat > Quality Tools > Pareto Chart... From our worksheet of simulated hospital data, select the Reason column as shown:

Pareto Dialog

When you press OK, Minitab creates the following chart:

Pareto Chart of Reasons

Along the x-axis, Minitab displays the reasons people dropped out of the rehabilitation program, along with the percent of the total and the cumulative percentage each reason accounted for. We can see that some 80% of these patients dropped out of the program for one of the following reasons:

  • They were readmitted to the hospital.
  • Work or other obligations conflicted with the program schedule.
  • They could not participate for medical reasons.
  • They had their own exercise facilities.

While encouraging existing participants to complete the program seemed like a good strategy, the Pareto chart shows that most people stop participating due to factors that are beyond the hospital's control. Therefore, rather than focusing on keeping existing participants, the team decided to explore how to attract more new participants.

Getting More Patients to Participate in the Program

Having decided to focus on increasing initial enrollment, the project team next gathered cardiologists, physical therapists, patients, and other stakeholders to brainstorm about the factors that influence participation.

At these brainstorming sessions, many stakeholders insisted that more people would participate in the rehabilitation program if the brochure about it were better. Another suggested solution involved sending a letter to cardiologists encouraging them to be more positive about the program and to mention it to patients at an earlier point in their treatment.

The project team recorded these suggestions, but they were wary of jumping to conclusions that weren't supported by data. They decided to look more closely at the data they had from existing patients before proceeding with any potential solutions.

In part 2, we will review how the team used graphs and basic descriptive statistics to get quick insight into the influence of individual factors on patient participation in the program.

A Six Sigma Healthcare Project, part 2: Visualizing the Impact of Individual Factors


My previous post covered the initial phases of a project to attract and retain more patients in a cardiac rehabilitation program, as described in a 2011 Quality Engineering article. A Pareto chart of the reasons enrolled patients left the program indicated that the hospital could do little to encourage participants to attend a greater number of sessions, so the team focused on increasing initial enrollment from 33 to 36 patients per month.

Stakeholders offered several solutions. Before implementing any improvement strategy, however, the team decided to look at how other individual factors influenced patient participation in the program. Taking this step can help avoid devoting resources to "fixing" factors that have little impact on the outcome.

In this post, we will look at how the team analyzed those individual factors. We have (simulated) data from 500 patients, including:

  • Address and distance between each patient's home and hospital
  • Each patient's age and gender
  • Whether or not the patient had a car
  • Whether or not the patient participated in the program

Download the data set to follow along and try these analyses yourself. If you don't already have Minitab, you can download and use our statistical software free for 30 days.

The team used simple statistics and graphs to get some preliminary insight into how these different factors affected whether or not patients decided to participate in the rehabilitation program. 

Looking at the Influence of Distance on Patient Participation

The team looked first at the influence of distance on participation using a boxplot. Also known as a box-and-whisker diagram, the boxplot gives you an indication of your data's general shape, central tendency, and variability at a single glance. Displaying boxplots side-by-side makes it easy to compare the central value and spread of each group's distribution, and to see whether the data for each group are symmetric about the center.

To create this graph, open the patient data set in Minitab and select Graph > Boxplot > One Y With Groups.  

boxplot dialog

In the dialog box, select "Distance" as the graph variable, choose "Participation" as the categorical variable, and click OK.

Boxplot of Distance dialog

Minitab generates the following graph: 

Boxplot of Distance vs. Patient Participation

The boxplot indicates that patients who live closer to the hospital are more likely to participate in the program. This is valuable, but it would be interesting to know more about the relationship between distance and participation. Because "Participation" is a binary response—a patient either participates, or does not—we can't visualize that relationship directly with graphs that require a continuous response.

However, to get a bit more insight, the project team divided the patients into groups according to how far away from the hospital they live, then calculated the relative percentage of participation for each group. To do this, select Data > Recode > To Text... and complete the dialog box using the following groups. The picture below shows only the first five of the seven groups, so here is the complete list:  

Group 1: 0 to 25 km
Group 2: 25 to 35 km
Group 3: 35 to 45 km
Group 4: 45 to 55 km
Group 5: 55 to 65 km
Group 6: 65 to 75 km
Group 7: 75 to 200 km

recode distance

When you recode the data, Minitab creates new columns of coded data and provides a summary in the Session Window:

distance group summary

Minitab automatically names the new column of data "Recoded Distance," which I've renamed as "Distance Group."

To determine the relative frequency of participation among each group, choose Stat > Tables > Descriptive Statistics... In the dialog box, select 'Distance Group' as the variable for rows, and Participation as the variable for columns, as shown. Click on the "Categorical Variables" button and make sure 'Counts' and 'Row percents' are selected, then press OK twice. 

table of descriptive statistics for distance dialog

In the session window, Minitab will display a table that shows the total number in each distance group, the number participating, and the relative frequency of participation for each group.

Tabbed Data

If we enter that information into the Minitab worksheet like this: 

table of descriptive statistics for distance

we can create a scatterplot that reveals more about the relationship between distance and participation. Select Graph > Scatterplot..., and choose "With connect line."

scatterplot dialog

Select 'Part %' as the Y variable and 'Distance Grp' as the X variable, and Minitab creates the following graph, which shows the relationship between distance and participation more clearly:

scatterplot of participation vs distance

We can see that the percentage of participation is very high among patients who live closest to the hospital, but decreases steadily among the groups who live farther than 45 km away.

Looking at the Influence of Age on Patient Participation

We can use the same methods to get initial insight into how age affects a patient's likelihood of participation in the program. The boxplot below indicates age does have some influence on participation: 

Boxplot of Age

By dividing the patient data into groups based on Age, just as we did for Distance, we can create a similar rough scatterplot to enhance our understanding of the relationship between these variables. We'll divide the data into the groups shown below before using Stat > Tables > Descriptive Statistics... to determine the relative participation rates:

table of age groups

The scatterplot of the relative frequency of participation for patients in each Age group again yields greater insight into the relationship between this factor and the likelihood of participation. In this case, a much higher percentage of patients in the younger groups take part. 

Scatterplot of participation vs age group

Looking at the Influence of Mobility and Gender on Patient Participation

Because both "Mobility" and "Participation" are binary variables, we can select Stat > Tables > Descriptive Statistics... to give us a tabular view of the data. Select "Mobility" as the row, and Participation as the columns, and Minitab will provide the following output, which gives you percentages of participation among those patients who do not own a car and those who do.  

We can put these data into a bar chart for a quick visual assessment. Minitab offers several ways to accomplish this easily; I opted to place the table data for each variable into the worksheet as shown here:

Gender and Mobility Data

Now, by selecting Graph > Bar Chart, and choosing a simple chart in which "Bars represent values from a table"...

Bar Chart dialog

we can create the following bar charts that show the proportion of those with and without cars who participate in the program, and the proportion of men and women who participate: 

Bar Chart of Gender

Participation by Mobility  

It appears that gender could have a slight influence on participation, but having a car clearly has a major impact.

An initial look at these factors indicates that access to the hospital is very important in getting people to participate. Offering a bus or shuttle service for people who do not have cars might be a good way to increase participation, but only if such service doesn't cost more than the amount of increased revenue it might generate by increasing participation. 

In the next part of this series, we'll use binary logistic regression—which is not as scary as it might sound—to develop a model that will let us predict the probability a patient will join the program based on the influence factors we've looked at. A good estimate of that probability will enable us to calculate the break-even point for such a service. 

 

A Six Sigma Healthcare Project, part 3: Creating a Binary Logistic Regression Model for Patient ...


In part 2 of this series, we used graphs and tables to see how individual factors affected rates of patient participation in a cardiac rehabilitation program. This initial look at the data indicated that ease of access to the hospital was a very important contributor to patient participation.

Given this revelation, a bus or shuttle service for people who do not have cars might be a good way to increase participation, but only if such a service doesn't cost more than the amount of revenue generated by participation.

A good estimate of the probability that a given patient will participate would enable us to calculate the break-even point for such a service. We can use regression to develop a statistical model that lets us estimate that probability.

We have a binary response variable, because only two outcomes exist: a patient either participates in the rehabilitation program, or does not. To model these kinds of responses, we need to use a statistical method called "Binary Logistic Regression." This may sound intimidating, but it's really not as scary as it sounds, especially with a statistical software package like Minitab.
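In case the term is unfamiliar, a binary logistic regression model relates the probability of the event (here, participation) to the predictors through the log-odds: ln(p / (1 − p)) = b0 + b1×Age + b2×Distance + ..., which can be rearranged to give a predicted probability p = 1 / (1 + e^−(b0 + b1×Age + b2×Distance + ...)). Minitab estimates the coefficients for you; the point is simply that the model outputs a probability between 0 and 1 rather than a continuous measurement.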

Download the data set to follow along and try these analyses yourself. If you don't already have Minitab, you can download and use our statistical software free for 30 days.

Using Stepwise Binary Logistic Regression to Obtain an Initial Model

First, let's review our data. We know the gender, age, and distance from the hospital for 500 cardiac patients. We also know whether or not they have access to a vehicle ("Mobility") and whether or not they participated in the rehabilitation program after their surgery (coded so that 0 = no, and 1 = yes). 

data

The process of developing a regression equation that can predict a response based on your data is called "Fitting a model." We'll do this in Minitab by selecting Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model... 

Binary Logistic Regression menu

In the dialog box, we need to select the appropriate columns of data for the response we want to predict, and the factors we wish to base the predictions on. In this case, our response variable is "Participation," and we're basing predictions on the continuous factors of "Age" and "Distance," along with the categorical factor "Mobility." 

binary logistic regression dialog 1

After selecting the factors, click on the "Model" button. This lets us tell Minitab whether we want to consider interactions and polynomial terms in addition to the main effects of each factor. Complete the Model dialog as shown below. To include the two-way interactions in the model, highlight all the items in the Predictors window, make sure that the “Interactions through order:” drop-down reads “2,” and press the Add button next to it:

Binary Logistic Regression Dialog 2

Click OK to return to the main dialog, then press the “Coding” button. In this subdialog, we can tell Minitab to automatically standardize the continuous predictors, Age and Distance. There are several reasons you might want to standardize the continuous predictors, and different ways of standardizing depending on your intent.

In this case, we’re going to standardize by subtracting the mean of the predictor from each row of the predictor column, then dividing the difference by the standard deviation of the predictor. This centers the predictors and also places them on a similar scale. This is helpful when a model contains highly correlated predictors and interaction terms, because standardizing helps reduce multicollinearity and improves the precision of the model’s estimated coefficients. To accomplish this, we just need to select that option from the drop-down as shown below:

Binary Logistic Regression - Coding
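In symbols, each standardized value is z = (x − x̄) / s, where x̄ and s are the mean and standard deviation of that predictor column; after standardizing, Age and Distance each have a mean of 0 and a standard deviation of 1.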

After you click OK to return to the main dialog, press the "Stepwise" button. We use this subdialog to perform a stepwise selection, which is a technique that automatically chooses the best model for your data. Minitab will evaluate several different models by adding and removing various factors, and select the one that appears to provide the best fit for the data set. You can have Minitab provide details about the combination of factors it evaluates at each "step," or just show the recommended model.

Binary Logistic Regression - stepwise 

Now click OK to close the Stepwise dialog, and OK again to run the analysis. The output in Minitab's Session window will include details about each potential model, followed by a summary or "deviance" table for the recommended model.

Assessing and Refining the Regression Model

Using software to perform stepwise regression is extremely helpful, but it's always important to check the recommended model to see if it can be refined further. In this case, all of the model terms are significant, and the deviance table's adjusted R2 indicates that the model explains about 40 percent of the observed variation in the response data. 

stepwise regression selected model

We also want to look at the table of coded coefficients immediately below the summary. The final column of the table lists the VIFs, or variance inflation factors, for each term in the model. This is important because VIF values greater than 5–10 can indicate unstable coefficients that are difficult to interpret.

None of these terms have VIF values over 10.  

variance inflation factors (VIF)

Minitab also performs goodness-of-fit tests that assess how well the model predicts observed data. The first two tests, the deviance and Pearson chi-squared tests, have high p-values, indicating that these tests do not support the conclusion that this model is a poor fit for the data. However, the low p-value for the Hosmer-Lemeshow test indicates that the model could be improved.

goodness-of-fit tests

It may be that our model does not account for curvature that exists in the data.  We can ask Minitab to add polynomial terms, which model curvature between predictors and the response, to see if it improves the model. Press CTRL-E to recall the binary logistic regression dialog box, then press the "Model" button. To add the polynomial terms, select Age and Distance in the Predictors window, make sure that "2" appears in the “Terms through order:” drop-down, and press "Add" to add those polynomial terms to the model. An order 2 polynomial is the square of the predictor.

binary logistic regression dialog 4

You may have noticed that we did not select “Mobility” above. Why? Because that categorical variable is coded with 1’s and 0’s, so the polynomial term would be identical to the term that is already in the model.

Now press OK all the way out to have Minitab evaluate models that include the polynomial terms. Minitab generates the following output: 

binary logistic regression final model

So far, so good—all model terms are significant, and the adjusted R2 indicates that the new model accounts for 51 percent of the observed variation in the response, compared to the initial model’s 40 percent.

However, the VIFs for Mobility and the Distance*Mobility interaction remain higher than desirable. These terms are moderately correlated, but since none of the variance inflation factors exceed 10, the correlation probably isn't strong enough to make the regression results unreliable:

VIF

binary-logistic-regression-model-VIF

The goodness-of-fit tests for this model also look good—the lack of p-values below 0.05 indicates that these tests do not suggest the model is a poor fit for the observed data.

final-binary-logistic-regression-model-goodness-of-fit-tests

The Binary Logistic Regression Equations

This model seems like the best option for predicting the probability of patient participation in the program. Based on the available data, Minitab has calculated the following regression equations, one that predicts the probability of attendance for people who have access to their own transportation, and one for those who do not:

regression equations
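The fitted coefficients appear in the output above. In general form, a binary logistic model combines the predictors into a linear predictor and then converts it to a probability with the logistic (inverse logit) function. The sketch below uses placeholder coefficients for two of the predictors, not the values Minitab estimated for this model.

```python
import numpy as np

def participation_probability(age, distance, b0=-1.0, b_age=0.03, b_dist=-0.08):
    """Placeholder coefficients, not the fitted values from the model above."""
    linear_predictor = b0 + b_age * age + b_dist * distance
    return 1.0 / (1.0 + np.exp(-linear_predictor))  # logistic function

# A 60-year-old living 10 miles away (illustrative numbers only)
print(round(participation_probability(age=60, distance=10), 3))
```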

In the next post, we'll complete this process by using this model to make predictions about the probability of participation in the rehabilitation program and how much we can afford to invest in transportation to help more cardiac patients. 

A Six Sigma Healthcare Project, part 4: Predicting Patient Participation with Binary Logistic ...


By looking at the data we have about 500 cardiac patients, we've learned that easy access to the hospital and good transportation are key factors influencing participation in a rehabilitation program.

Past data shows that each month, about 15 of the patients discharged after cardiac surgery do not have a car. Providing transportation to the hospital might make these patients more likely to join the rehabilitation program, but the costs of such a service can't exceed the potential revenue from participation.

We can use the binary logistic regression model developed in part 3 to predict probabilities of participation, to identify where transportation assistance might make the biggest impact, and to develop an estimate of how much we could invest in such assistance. 

Download the data set to follow along and try these analyses yourself. If you don't already have Minitab, you can download and use our statistical software free for 30 days.

Using the Regression Model to Predict Patient Participation

We want to develop some estimates of the probability of participation based on whether or not a patient has access to transportation. The first step is to make some mesh data representing our population. In Minitab, go to Calc > Create Mesh Data..., and complete the dialog box as shown below. (The maximum and minimum ranges for Age and Distance are drawn directly from the descriptive statistics for the sample data we used to create our regression model.) 

Make Mesh Data Dialog

When you press OK, Minitab adds 2 new columns to the worksheet that contain the 200 different combinations of the levels of these factors. Next we'll add two more columns, one representing patients who have access to a car and one representing those who don't, so the worksheet includes four columns of data as shown:

mesh data in worksheet
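For the curious, Create Mesh Data is simply crossing evenly spaced levels of the two continuous predictors. The sketch below assumes a 20 by 10 split, illustrative ranges, and a 0/1 coding for the scenario columns; the actual ranges come from the sample's descriptive statistics.

```python
import numpy as np
import pandas as pd

# Evenly spaced levels spanning the observed ranges (illustrative values)
age_levels = np.linspace(35, 85, 20)        # 20 levels of Age
distance_levels = np.linspace(0.5, 25, 10)  # 10 levels of Distance

# Cross every Age level with every Distance level: 20 x 10 = 200 rows
age_grid, dist_grid = np.meshgrid(age_levels, distance_levels)
mesh = pd.DataFrame({"Age": age_grid.ravel(), "Distance": dist_grid.ravel()})

# Two scenario columns, as in the worksheet (the 0/1 coding is an assumption)
mesh["Car"] = 1
mesh["NoCar"] = 0

print(mesh.shape)  # (200, 4)
```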

Now we'll go to Stat > Regression > Binary Logistic Regression > Predict...  Minitab remembers the last regression model that was run; to make sure it's the right one, click the "View Model..." button...

view model

and confirm that the model displayed is the correct one.

view model

Next, press the "Predict" button and complete the dialog box using the mesh variables we created, as shown. We can also press the "Storage" button to tell Minitab to store the Fits (the predicted probabilities) for each data point in the worksheet. Note that the column selected for the Mobility term is "Car," so all of these predictions will be based on the equation for patients who have access to a vehicle. 

regression prediction dialog

When you click OK through all dialogs, Minitab will add a column of data that shows the predicted probability of participation for patients, assuming they have a vehicle. 

Now we'll create the predictions for individuals who don't have cars. Press CTRL-E to edit the previous dialog box. This time, for the Mobility column, select "NoCar."

no car

When you press OK, Minitab recalculates the probabilities for the patients, this time using the equation that assumes they do not have a vehicle. The probabilities of participation for each data point are stored in two columns in the worksheet, which I've renamed PFITS-Car and PFITS-NoCar.  

pfits

Where Can Providing Transportation Make an Impact?

Now we have estimated probabilities of participation for patients with the same age and distance characteristics, both with and without access to a vehicle. It would be helpful to visualize the differences in these probabilities to see where offering transportation might make the biggest impact in increasing participation rates.

First, we'll use Minitab's calculator to compute the difference in probabilities between having and not having a car. Go to Calc > Calculator... and complete the dialog as shown: 

calculator

Now we have a column of data named "Car - NoCar" that contains the probability difference for patients with the same age and distance characteristics both with and without a vehicle. We can use that column to create a contour plot that offers additional insight into the relationships between the likelihood of participation in the rehabilitation program and a patient's age, distance, and mobility. Select Graph > Contour Plot... and complete the dialog as shown: 

contour plot dialog box

Minitab produces this contour plot (we have edited the range of colors from the default):

contour plot

From this plot we can see the patients for whom transportation assistance is likely to make the most impact. These are the patients whose age and distance characteristics fall within the dark-red-colored area, where access to a vehicle raises the probability of participation by more than 40 percent.

The hospital could use this information to carefully target potential recipients of transportation assistance, but doing so would raise many ethical issues. Instead, the hospital will offer transportation assistance to any potential participant who needs it. The project team decides to calculate the average probability of participation for all patients without access to a vehicle.

To obtain that average, select Stat > Basic Statistics > Display Descriptive Statistics... in Minitab, and choose "PFITS-NoCar" as the variable. Click on the "Statistics" button to make sure the Mean is among the descriptive statistics being calculated, and click OK. Minitab will display the descriptive statistics you've selected in the Session Window. 

descriptive statistics

According to our binary logistic regression model, the average probability of participation for all patients without a car equals 0.1695, which we will round up to .17.  Now we can easily calculate an estimated break-even point for ensuring transport for patients who need it. We have the following information on hand: 

Patients per month without a car: 15
Average probability of participation without a car: .17
Average number of sessions per participant: 29
Revenue per session: $23

Based on these figures, a per-patient maximum for transportation can be calculated as:

.17 probability of participation x 29 sessions x $23 per session = $113.39

Since about 15 discharged cardiac patients each month do not have a car, we can invest at most 15 x $113.39 = $1700.85/month in transportation assistance. 
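If you want to re-run this break-even arithmetic with different assumptions, it is only a few lines of code; the inputs below are the figures listed above.

```python
patients_without_car = 15       # discharged per month without a car
p_participation = 0.17          # average predicted probability without a car
sessions_per_participant = 29
revenue_per_session = 23        # dollars

per_patient_budget = p_participation * sessions_per_participant * revenue_per_session
monthly_budget = patients_without_car * per_patient_budget

print(round(per_patient_budget, 2), round(monthly_budget, 2))  # 113.39 1700.85
```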

Implementing Transportation Assistance for Patient Participation

As described in the article that inspired this series of posts, the project team evaluated potential improvement options against this economic calculation and developed a process that brought together patients with cars and those without to carpool to sessions. A pilot test of the process proved successful, and most of the car-less patients noted that they would not have participated in the rehabilitation program without the service. 

After implementing the new carpool process, the project team revisited the key measures they had considered at the start of the initiative: the number of patients enrolling in the program each month, and the average number of sessions participants attended.

The average number of sessions attended remained constant at 29 after the change, but patient participation rose from 33 to 45 per month, exceeding the project goal of increasing participation to 36 patients per month. The additional revenue came to approximately $96,000 annually.

Take-Away Lessons from This Project Study

If you've read all four parts of this series, you may recall that at the start of the  Six Sigma project, several stakeholders believed that the problem of low participation could be addressed by creating a nicer brochure for the program, and by encouraging surgeons to tell their patients about it at an earlier point in their treatment. 

None of those initial ideas wound up being implemented, but the project team succeeded in meeting the project goals by enacting improvements that were supported by their data analysis. For me, this is a core takeaway from this article. 

As the authors note, "Often people’s ideas on processes are incorrect, but improvement actions based on these are still being implemented. These actions cause frustrated employees, may not be cost effective, and in the end do not solve the problem."

Thus, the article makes a compelling case for the value of applying data analysis to improve processes in healthcare. "Even when a somewhat more advanced technique like logistic regression modeling is required," the authors write, "exploratory graphics such as boxplots and bar charts point the direction toward a valuable solution."

Those 10 Simple Rules for Using Statistics? They're Not Just for Research


Earlier this month, PLOS.org published an article titled "Ten Simple Rules for Effective Statistical Practice." The 10 rules are good reading for anyone who draws conclusions and makes decisions based on data, whether you're trying to extend the boundaries of scientific knowledge or make good decisions for your business. 

Carnegie Mellon University's Robert E. Kass and several co-authors devised the rules in response to the increased pressure on scientists and researchers—many, if not most, of whom are not statisticians—to present accurate findings based on sound statistical methods. 

Since the paper and the discussions it has prompted focus on scientists and researchers, it seems worthwhile to consider how the rules might apply to quality practitioners or business decision-makers as well. In this post, I'll share the 10 rules, some with a few modifications to make them more applicable to the wider population of people who use data to inform their decisions. 

1. Statistical Methods Should Enable Data to Answer Specific Questions

As the article points out, new or infrequent users of statistics tend to emphasize finding the "right" method to use—often focusing on the structure or format of their data, rather than thinking about how the data might answer an important question. But choosing a method based on the data is putting the cart before the horse. Instead, we should start by clearly identifying the question we're trying to answer. Then we can look for a method that uses the data to answer it. If you haven't already collected your data, so much the better—you have the opportunity to identify and obtain the data you'll need.

2. Signals Always Come With Noise

If you're familiar with control charts used in statistical process control (SPC) or the Control phase of a Six Sigma DMAIC project, you know that they let you distinguish process variation that matters (special-cause variation) from normal process variation that doesn't need investigation or correction.

control chart
Control charts are one common tool used to distinguish "noise" from "signal." 

The same concept applies here: whenever we gather and analyze data, some of what we see in the results will be due to inherent variability. Measures of probability for analyses, such as confidence intervals, are important because they help us understand and account for this "noise." 

3. Plan Ahead, Really Ahead

Say you're starting a DMAIC project. Carefully considering and developing good questions right at the start of a project—the DEFINE stage—will help you make sure that you're getting the right data in the MEASURE stage. That, in turn, should result in a much smoother and stress-free ANALYZE phase—and probably more successful IMPROVE and CONTROL phases, too. The alternative? You'll have to complete the ANALYZE phase with the data you have, not the data you wish you had. 

4. Worry About Data Quality

gauge"Can you trust your data?" My Six Sigma instructor asked us that question so many times, it still flashes through my mind every time I open Minitab. That's good, because he was absolutely right: if you can't trust your data, you shouldn't do anything with it. Many people take it for granted that the data they get is precise and accurate, especially when using automated measuring instruments and similar technology. But how do you know they're measuring precisely and accurately? How do you know your instruments are calibrated properly? If you didn't test it, you don't know. And if you don't know, you can't trust your data. Fortunately, with measurement system analysis methods like gage R&R and attribute agreement analysis, we never have to trust data quality to blind faith. 

5. Statistical Analysis Is More Than a Set of Computations

Statistical techniques are often referred to as "tools," and that's a very apt metaphor. A saw, a plane, and a router all cut wood, but they aren't interchangeable—the end product defines which tool is appropriate for a job. Similarly, you might apply ANOVA, regression, or time series analysis to the same data set, but the right tool depends on what you want to understand. To extend the metaphor further, just as we have circular saws, jigsaws, and miter saws for very specific tasks, each family of statistical methods also includes specialized tools designed to handle particular situations. The point is that we select a tool to assist our analysis, not to define it. 

6. Keep it Simple

Many processes are inherently messy. If you've got dozens of input variables and multiple outcomes, analyzing them could require many steps, transformations, and some thorny calculations. Sometimes that degree of complexity is required. But a more complicated analysis isn't always better—in fact, overcomplicating it may make your results less clear and less reliable, and it potentially makes the analysis more difficult than necessary. You may not need a complex process model that includes 15 factors if you can improve your output by optimizing the three or four most important inputs. If you need to improve a process that includes many inputs, a short screening experiment can help you identify which factors are most critical, and which are not so important. 

7. Provide Assessments of Variability

No model is perfect. No analysis accounts for all of the observed variation. Every analysis includes a degree of uncertainty. Thus, no statistical finding is 100% certain, and that degree of uncertainty needs to be considered when using statistical results to make decisions. If you're the decision-maker, be sure that you understand the risks of reaching a wrong conclusion based on the analysis at hand. If you're sharing your results with stakeholders and executives, especially if they aren't statistically inclined, make sure you've communicated that degree of risk to them by offering and explaining confidence intervals, margins of error, or other appropriate measures of uncertainty. 

8. Check Your Assumptions

Different statistical methods are based on different assumptions about the data being analyzed. For instance, many common analyses assume that your data follow a normal distribution. You can check most of these assumptions very quickly using functions like a normality test in your statistical software, but it's easy to forget (or ignore) these steps and dive right into your analysis. However, failing to verify those assumptions can yield results that aren't reliable and shouldn't be used to inform decisions, so don't skip that step. If you're not sure about the assumptions for a statistical analysis, Minitab's Assistant menu explains them, and can even flag violations of the assumptions before you draw the wrong conclusion from an errant analysis. 
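Outside of Minitab, a quick normality check might look like the hedged sketch below, which runs the Shapiro-Wilk test from SciPy on made-up data; the Assistant's built-in checks are the more thorough option.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.normal(loc=10, scale=2, size=50)  # made-up sample

stat, p_value = stats.shapiro(data)
# A small p-value (for example, below 0.05) suggests a departure from normality
print(f"Shapiro-Wilk statistic = {stat:.3f}, p-value = {p_value:.3f}")
```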

9. When Possible, Verify Success!

In science, replication of a study—ideally by another, independent scientist—is crucial. It indicates that the first researcher's findings weren't a fluke, and provides more evidence in support of the given hypothesis. Similarly, when a quality project results in great improvements, we can't take it for granted that those benefits will be sustained—they need to be verified and confirmed over time. Control charts are probably the most common tool for making sure a project's benefits endure, but depending on the process and the nature of the improvements, hypothesis tests, capability analysis, and other methods also can come into play.  

10. Share How You Did It

In the original 10 Simple Rules article, the authors suggest scientists share their data and explain how they analyzed it so that others can make sure they get the same results. This idea doesn't translate so neatly to the business world, where your data may be proprietary or private for other reasons. But just as science benefits from transparency, the quality profession benefits when we share as much information as we can about our successes. Of course you can't share your company's secret-sauce formulas with competitors—but if you solved a quality challenge in your organization, chances are your experience could help someone facing a similar problem. If a peer in another organization already solved a problem like the one you're struggling with now, wouldn't you like to see if a similar approach might work for you? Organizations like ASQ and forums like iSixSigma.com help quality practitioners network and share their successes so we can all get better at what we do. And here at Minitab, we love sharing case studies and examples of how people have solved problems using data analysis, too. 

How do you think these rules apply to the world of quality and business decision-making? What are your guidelines when it comes to analyzing data? 

 

Applying DOE for Great Grilling, part 1


Design of Experiments (DOE) has a reputation for difficulty, and to an extent, this statistical method deserves that reputation. While it's easy to grasp the basic idea—acquire the maximum amount of information from the fewest number of experimental runs—practical application of this tool can quickly become very confusing. 

Even if you're a long-time user of designed experiments, it's still easy to feel uncertain if it's been a while since you last looked at split-plot designs or needed to choose the appropriate resolution for a fractional factorial design.

But DOE is an extremely powerful and useful tool, so when we launched Minitab 17, we added a DOE tool to the Assistant to make designed experiments more accessible to more people.

Since summer is here at Minitab's world headquarters, I'm going to illustrate how you can use the Assistant's DOE tool to optimize your grilling method.  

If you're not already using it and you want to play along, you can download the free 30-day trial version of Minitab Statistical Software.

Two Types of Designed Experiments: Screening and Optimizing

To create a designed experiment using the Assistant, open Minitab and select Assistant > DOE > Plan and Create. You'll be presented with a decision tree that helps you take a sequential approach to the experimentation process by offering a choice between a screening design and a modeling design.

DOE Assistant

A screening design is important if you have a lot of potential factors to consider and you want to figure out which ones are important. The Assistant guides you through the process of testing and analyzing the main effects of 6 to 15 factors, and identifies the factors that have greatest influence on the response.

Once you've identified the critical factors, you can use the modeling design. Select this option, and the Assistant guides you through testing and analyzing 2 to 5 critical factors and helps you find optimal settings for your process.

Even if you're an old hand at analyzing designed experiments, you may want to use the Assistant to create designs since the Assistant lets you print out easy-to-use data collection forms for each experimental run. After you've collected and entered your data, the designs created in the Assistant can also be analyzed using Minitab's core DOE tools available through the Stat > DOE menu.

Creating a DOE to Optimize How We Grill Steaks

For grilling steaks, there aren't that many variables to consider, so we'll use the Assistant to plan and create a modeling design that will optimize our grilling process. Select Assistant > DOE > Plan and Create, then click the "Create Modeling Design" button. 

Minitab brings up an easy-to-follow dialog box; all we need to do is fill it in. 

First we enter the name of our Response and the goal of the experiment.  Our response is "Flavor," and the goal is "Maximize the response." Next, we enter our factors. We'll look at three critical variables:

  • Number of turns, a continuous variable with a low value of 1 and high value of 3.
  • Type of grill, a categorical variable with Gas or Charcoal as options. 
  • Type of seasoning, a categorical variable with Salt-Pepper or Montreal steak seasoning as options. 

If we wanted to, we could select more than 1 replicate of the experiment.  A replicate is simply a complete set of experimental runs, so if we did 3 replicates, we would repeat the full experiment three times. But since this experiment has 16 runs, and neither our budget nor our stomachs are limitless, we'll stick with a single replicate. 
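To see where the runs come from, here is a minimal sketch that enumerates and randomizes the basic factor-level combinations for these three factors. This is not the Assistant's design engine; its 16-run modeling design includes additional runs beyond the 8 corner combinations shown here.

```python
import random
from itertools import product

turns = [1, 3]                                  # low and high settings
grills = ["Gas", "Charcoal"]
seasonings = ["Salt-Pepper", "Montreal"]

runs = [{"Turns": t, "Grill": g, "Seasoning": s}
        for t, g, s in product(turns, grills, seasonings)]

random.seed(7)
random.shuffle(runs)                            # randomize the run order

for i, run in enumerate(runs, start=1):
    print(i, run)
```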

When we click OK, the Assistant first asks if we want to print out data collection forms for this experiment: 

Choose Yes, and you can print a form that lists each run, the variables and settings, and a space to fill in the response:

Alternatively, you can just record the results of each run in the worksheet the Assistant creates, which you'll need to do anyway. But having the printed data collection forms can make it much easier to keep track of where you are in the experiment, and exactly what your factor settings should be for each run. 

If you've used the Assistant in Minitab for other methods, you know that it seeks to demystify your analysis and make it easy to understand. When you create your experiment, the Assistant gives you a Report Card and Summary Report that explain the steps of the DOE and important considerations, and a summary of your goals and what your analysis will show. 

Now it's time to cook some steaks, and rate the flavor of each. If you want to do this for real and collect your own data, please do so!  Tomorrow's post will show how to analyze your data with the Assistant. 

Applying DOE for Great Grilling, part 2


grill

Design of Experiments is an extremely powerful statistical method, and we added a DOE tool to the Assistant in Minitab 17  to make it more accessible to more people.

Since it's summer grilling season, I'm applying the Assistant's DOE tool to outdoor cooking. Earlier, I showed you how to set up a designed experiment that will let you optimize how you grill steaks. 

If you're not already using it and you want to play along, you can download the free 30-day trial version of Minitab Statistical Software.

Perhaps you are following along, and you've already grilled your steaks according to the experimental plan and recorded the results of your experimental runs. Otherwise, feel free to download my data here for the next step: analyzing the results of our experiment. 

Analyzing the Results of the Steak Grilling Experiment 

After collecting your data and entering it into Minitab, you should have an experimental worksheet that looks like this: 

With your results entered in the worksheet, select Assistant > DOE > Analyze and Interpret. As you can see below, the only button you can click is "Fit Linear Model." 

As you might gather from the flowchart, when it analyzes your data, the Assistant first checks to see if the response exhibits curvature. If it does, the Assistant will prompt you to gather more data so it can fit a quadratic model. Otherwise, the Assistant will fit the linear model and provide the following output.  

When you click the "Fit Linear Model" button, the Assistant automatically identifies your response variable.

All you need to do is confirm your response goal—maximizing flavor, in this case—and press OK. The Assistant performs the analysis, and provides you the results in a series of easy-to-interpret reports. 

Understanding the DOE Results

First, the Assistant offers a summary report that gives you the bottom-line results of the analysis. The Pareto Chart of Effects in the top left shows that Turns, Grill type, and Seasoning are all statistically significant, and there's a significant interaction between Turns and Grill type, too. 

The summary report also shows that the model explains a very high proportion of the variation in flavor, with an R2 value of 95.75 percent. And the "Comments" window in the lower right corner puts things in plain language: "You can conclude that there is a relationship between Flavor and the factors in the model..."

The Assistant's Effects report, shown below, tells you more about the nature of the relationship between the factors in the model and Flavor, with both Interaction Plots and Main Effects plots that illustrate how different experimental settings affect the Flavor response. 

And if we're looking to make some changes as a result of our experimental results—like selecting an optimal method for grilling steaks in the future—the Prediction and Optimization report gives us the optimal solution (1 turn on a charcoal grill, with Montreal seasoning) and its predicted Flavor response (8.425). 

It also gives us the Top 5 alternative solutions, shown in the bottom right corner, so if there's some reason we can't implement the optimal solution—for instance, if we only have a gas grill—we can still choose the best solution that suits our circumstances. 
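If you ever want to sanity-check this kind of result outside the Assistant, the same idea can be sketched by fitting a linear model with the significant terms and ranking the predicted Flavor across candidate settings. The data, coefficients, and formula below are illustrative only; this is not the Assistant's analysis.

```python
from itertools import product

import pandas as pd
import statsmodels.formula.api as smf

# Made-up runs and Flavor scores, for illustration only; in practice this
# would be the worksheet data collected during the experiment
runs = list(product([1, 3], ["Gas", "Charcoal"], ["Salt-Pepper", "Montreal"]))
doe = pd.DataFrame(runs, columns=["Turns", "Grill", "Seasoning"])
doe["Flavor"] = [6.1, 7.0, 7.9, 8.4, 5.2, 6.0, 6.8, 7.3]

# Main effects plus the Turns x Grill interaction, echoing the significant terms
model = smf.ols("Flavor ~ Turns * C(Grill) + C(Seasoning)", data=doe).fit()

# Predict Flavor for candidate settings and rank them, best first
grid = pd.DataFrame(list(product([1, 2, 3], ["Gas", "Charcoal"],
                                 ["Salt-Pepper", "Montreal"])),
                    columns=["Turns", "Grill", "Seasoning"])
grid["PredictedFlavor"] = model.predict(grid)
print(grid.sort_values("PredictedFlavor", ascending=False).head())
```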

I hope this example illustrates how easy a designed experiment can be when you use the Assistant to create and analyze it, and that designed experiments can be very useful not just in industry or the lab, but also in your everyday life.  

Where could you benefit from analyzing process data to optimize your results? 


Can Regression and Statistical Software Help You Find a Great Deal on a Used Car?


You need to consider many factors when you’re buying a used car. Once you narrow your choice down to a particular car model, you can get a wealth of information about individual cars on the market through the Internet. How do you navigate through it all to find the best deal?  By analyzing the data you have available.  

Let's look at how this works using the Assistant in Minitab 17. With the Assistant, you can use regression analysis to calculate the expected price of a vehicle based on variables such as year, mileage, whether or not the technology package is included, and whether or not a free Carfax report is included.

And it's probably a lot easier than you think. 

A search of a leading Internet auto sales site yielded data about 988 vehicles of a specific make and model. After putting the data into Minitab, we choose Assistant > Regression…

At this point, if you aren’t very comfortable with regression, the Assistant makes it easy to select the right option for your analysis.

A Decision Tree for Selecting the Right Analysis

We want to explore the relationships between the price of the vehicle and four factors, or X variables. Since we have more than one X variable, and since we're not looking to optimize a response, we want to choose Multiple Regression.

This data set includes five columns: mileage, the age of the car in years, whether or not it has a technology package, whether or not it includes a free CARFAX report, and, finally, the price of the car.

We don’t know which of these factors may have a significant relationship to the cost of the vehicle, and we don’t know whether there are significant two-way interactions between them, or if there are quadratic (nonlinear) terms we should include—but we don’t need to. Just fill out the dialog box as shown. 

Press OK and the Assistant assesses each potential model and selects the best-fitting one. It also provides a comprehensive set of reports, including a Model Building Report that details how the final model was selected and a Report Card that notifies you to potential problems with the analysis, if there are any.

Interpreting Regression Results in Plain Language

The Summary Report tells us in plain language that there is a significant relationship between the Y and X variables in this analysis, and that the factors in the final model explain 91 percent of the observed variation in price. It confirms that all of the variables we looked at are significant, and that there are significant interactions between them. 

The Model Equations Report contains the final regression models, which can be used to predict the price of a used vehicle. The Assistant provides 2 equations, one for vehicles that include a free CARFAX report, and one for vehicles that do not.

We can tell several interesting things about the price of this vehicle model by reading the equations. First, the average cost for vehicles with a free CARFAX report is about $200 more than the average for vehicles with a paid report ($30,546 vs. $30,354).  This could be because these cars probably have a clean report (if not, the sellers probably wouldn’t provide it for free).

Second, each additional mile added to the car decreases its expected price by roughly 8 cents, while each year added to the car's age decreases the expected price by $2,357.

The technology package adds, on average, $1,105 to the price of vehicles that have a free CARFAX report, but the package adds $2,774 to vehicles with a paid CARFAX report. Perhaps the sellers of these vehicles hope to use the appeal of the technology package to compensate for some other influence on the asking price. 

Residuals versus Fitted Values

While these findings are interesting, our goal is to find the car that offers the best value. In other words, we want to find the car that has the largest difference between the asking price and the expected asking price predicted by the regression analysis.

For that, we can look at the Assistant’s Diagnostic Report. The report presents a chart of Residuals vs. Fitted Values.  If we see obvious patterns in this chart, it can indicate problems with the analysis. In that respect, this chart of Residuals vs. Fitted Values looks fine, but now we’re going to use the chart to identify the best value on the market.

In this analysis, the “Fitted Values” are the prices predicted by the regression model. “Residuals” are what you get when you subtract the predicted asking price from the actual asking price—exactly the information you’re looking for, since a large negative residual means a car is listed for much less than the model expects! The Assistant marks large residuals in red, making them very easy to find. And three of those residuals—which appear in light blue above because we’ve selected them—appear to be very far below the asking price predicted by the regression analysis.

Selecting these data points on the graph reveals that these are vehicles whose data appears in rows 357, 359, and 934 of the data sheet. Now we can revisit those vehicles online to see if one of them is the right vehicle to purchase, or if there’s something undesirable that explains the low asking price. 

Sure enough, the records for those vehicles reveal that two of them have severe collision damage.

But the remaining vehicle appears to be in pristine condition, and is several thousand dollars less than the price you’d expect to pay, based on this analysis!

With the power of regression analysis and the Assistant, we’ve found a great used car—at a price you know is a real bargain.
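If you want to try the same residual hunt on your own data outside the Assistant, here is a rough sketch. The data are made up and the formula uses main effects only; the Assistant's selected model also considers interactions and squared terms.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up listings; in practice this would be the scraped data set
rng = np.random.default_rng(0)
n = 200
cars = pd.DataFrame({
    "Mileage": rng.uniform(5_000, 90_000, n),
    "Age": rng.integers(1, 8, n),
    "TechPackage": rng.integers(0, 2, n),
    "FreeCarfax": rng.integers(0, 2, n),
})
cars["Price"] = (33_000 - 0.08 * cars["Mileage"] - 2_300 * cars["Age"]
                 + 1_500 * cars["TechPackage"] + 200 * cars["FreeCarfax"]
                 + rng.normal(0, 800, n))

# A simplified model with main effects only
model = smf.ols("Price ~ Mileage + Age + TechPackage + FreeCarfax", data=cars).fit()

# Residual = actual asking price minus predicted price;
# the most negative residuals point to the biggest potential bargains
cars["Residual"] = cars["Price"] - model.fittedvalues
print(cars.sort_values("Residual").head(3))
```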

 

When Should You Mistrust Statistics?


Figures lie, so they say, and liars figure. A recent post at Ben Orlin's always-amusing mathwithbaddrawings.com blog nicely encapsulates why so many people feel wary about anything related to statistics and data analysis. Do take a moment to check it out, it's a fast read.

In all of the scenarios Orlin offers in his post, the statistical statements are completely accurate, but the person offering the statistics is committing a lie of omission by not putting the statement in context. Holding back critical information prevents an audience from making an accurate assessment of the situation.

Ethical data analysts know better.

Unfortunately, unethical data analysts know how to spin outcomes to put them in the most flattering, if not the most direct, light. Done deliberately, that's the sort of behavior that leads many people to mistrust statistics completely.

Lessons for People Who Consume Statistics

So, where does this leave us as consumers of statistics? Should we mistrust statistics? The first question to ask is whether we trust the people who deliver statistical pronouncements. I believe most people try to do the right thing.

However, we all know that it's easy—all too easy—for humans to make mistakes. And since statistics can be confusing, and not everyone who wants or needs to analyze data is a trained statistician, great potential exists for erroneous conclusions and interpretive blunders.

Bottom line: whether their intentions are good or bad, people often cite statistics in ways that may be statistically correct, but practically misleading. So how can you avoid getting fooled?

The solution is simple, and it's one most statisticians internalized long ago, but doesn't necessarily occur to people who haven't spent much time in the data trenches:

Always look at the underlying distribution of the data.

Especially if the statistic in question pertains to something extremely important to you—like mean salary at your company, for example—ask about the distribution of the data if those details aren't volunteered. If you're told the mean or median as a number, are you also given a histogram, boxplot, or individual value plot that lets you see how the data are arranged? My colleague Michelle Paret wrote an excellent post about this. 

If someone is trying to keep the distribution of the data a mystery, then the ultimate meaning of parameters like mean, median, or mode is also unknown...and your mistrust is warranted.
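A quick, made-up illustration of why the shape of the distribution matters: a few large values can drag the mean far from what a typical person experiences, while the median barely moves.

```python
import numpy as np

# Made-up salaries: most people near $50k, plus two much larger values
salaries = np.array([48_000, 50_000, 52_000, 51_000, 49_000,
                     47_000, 53_000, 50_000, 250_000, 400_000])

print(f"Mean:   ${salaries.mean():,.0f}")      # $105,000, pulled up by two outliers
print(f"Median: ${np.median(salaries):,.0f}")  # $50,500, closer to a typical salary
```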

Lessons for People Who Produce Statistics

As purveyors and producers of statistics, who need to communicate results with people who aren't statistically savvy, what lessons can we take from this? After reading the Math with Bad Drawings blog, I thought about it and came up with two rules of thumb.

1. Don't use statistics to obscure or deflect attention from a situation.

Most people do not deliberately set out to distort the truth or mislead others. Most people would never use the mean to support one conclusion when they know the median supports a far different story. Our conscience rebels when we set out to deceive others. I'm usually willing to ascribe even the most horrendous analysis to gross incompetence rather than outright malice. On the other hand, I've read far too many papers and reports that torture language to mischaracterize statistical findings.

Sometimes we don't get the outcomes we expected. Statisticians aren't responsible for what the data show—but we are responsible for making sure we've performed appropriate analyses, satisfied checks and assumptions, and that we have trustworthy data. It should go without saying that we are ethically compelled to report our results honestly, and... 

2. Provide all of the information the audience needs to make informed decisions.

When we present the results of an analysis, we need to be thorough. We need to offer all of the information and context that will enable our audience to reach confident conclusions. We need to use straightforward language that helps people tune in, and avoid jargon that makes listeners turn off.

That doesn't mean that every presentation we make needs to be laden with formulas and extended explanations of probability theory; often the bottom line is all a situation requires. When you're addressing experts, you don't need to cover the introductory material. But if we suspect an audience needs some background to fully appreciate the results of an analysis, we should provide it. 

There are many approaches to communicating statistical results clearly. One of the easiest ways to present the full context of an analysis in plain language is to use the Assistant in Minitab. As many expert statisticians have told us, the Assistant doesn't just guide you through an analysis, it also explains the output thoroughly and without resorting to jargon.

And when statistics are clear, they're easier to trust.

 

Bad drawing by Ben Orlin, via www.mathwithbaddrawings.com

 

Data Not Normal? Try Letting It Be, with a Nonparametric Hypothesis Test


So the data you nurtured, that you worked so hard to format and make useful, failed the normality test.

not-normal

Time to face the truth: despite your best efforts, that data set is never going to measure up to the assumption you may have been trained to fervently look for.

Your data's lack of normality seems to make it poorly suited for analysis. Now what?

Take it easy. Don't get uptight. Just let your data be what they are, go to the Stat menu in Minitab Statistical Software, and choose "Nonparametrics."

nonparametrics menu

If you're stymied by your data's lack of normality, nonparametric statistics might help you find answers. And if the word "nonparametric" looks like five syllables' worth of trouble, don't be intimidated—it's just a big word that usually refers to "tests that don't assume your data follow a normal distribution."

In fact, nonparametric statistics don't assume your data follow any distribution at all. The following table lists common parametric tests, their equivalent nonparametric tests, and the main characteristics of each.

correspondence table for parametric and nonparametric tests

Nonparametric analyses free your data from the straitjacket of the normality assumption. So choosing a nonparametric analysis is sort of like removing your data from a stifling, conformist environment, and putting it into a judgment-free, groovy idyll, where your data set can just be what it is, with no hassles about its unique and beautiful shape. How cool is that, man? Can you dig it?

Of course, it's not quite that carefree. Just like the 1960s encompassed both Woodstock and Altamont, so nonparametric tests offer both compelling advantages and serious limitations.

Advantages of Nonparametric Tests

Both parametric and nonparametric tests draw inferences about populations based on samples, but parametric tests focus on population parameters like the mean and the standard deviation, and make various assumptions about your data—for example, that it follows a normal distribution, and that samples include a minimum number of data points.

In contrast, nonparametric tests are unaffected by the distribution of your data. Nonparametric tests also accommodate many conditions that parametric tests do not handle, including small sample sizes, ordered outcomes, and outliers.

Consequently, they can be used in a wider range of situations and with more types of data than traditional parametric tests. Many people also feel that nonparametric analyses are more intuitive.

Drawbacks of Nonparametric Tests

But nonparametric tests are not completely free from assumptions—they do require data to be an independent random sample, for example.

And nonparametric tests aren't a cure-all. For starters, they typically have less statistical power than parametric equivalents. Power is the probability that you will correctly reject the null hypothesis when it is false. That means you have an increased chance of making a Type II error with these tests.

In practical terms, that means nonparametric tests are less likely to detect an effect or association when one really exists.

So if you want to draw conclusions with the same confidence level you'd get using an equivalent parametric test, you will need larger sample sizes. 

Nonparametric tests are not a one-size-fits-all solution for non-normal data, but they can yield good answers in situations where parametric statistics just won't work.

Is Parametric or Nonparametric the Right Choice for You?

I've briefly outlined differences between parametric and nonparametric hypothesis tests, looked at which tests are equivalent, and considered some of their advantages and disadvantages. If you're waiting for me to tell you which direction you should choose...well, all I can say is, "It depends..." But I can give you some established rules of thumb to consider when you're looking at the specifics of your situation.

Keep in mind that nonnormal data does not immediately disqualify your data for a parametric test. What's your sample size? As long as a certain minimum sample size is met, most parametric tests will be robust to the normality assumption. For example, the Assistant in Minitab (which uses Welch's t-test) points out that while the 2-sample t-test is based on the assumption that the data are normally distributed, this assumption is not critical when the sample sizes are at least 15. And Bonett's 2-sample standard deviation test performs well for nonnormal data even when sample sizes are as small as 20. 

In addition, while they may not require normal data, many nonparametric tests have other assumptions that you can’t disregard. For example, the Kruskal-Wallis test assumes your samples come from populations that have similar shapes and equal variances. And the 1-sample Wilcoxon test does not assume a particular population distribution, but it does assume the distribution is symmetrical. 
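As a concrete side-by-side, here is a hedged sketch in Python that runs a 2-sample t-test and its nonparametric counterpart, the Mann-Whitney test, on made-up skewed samples; it is meant only to show the pairing, not to recommend either test for any particular data set.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_a = rng.exponential(scale=2.0, size=18)  # small, skewed samples
group_b = rng.exponential(scale=3.0, size=18)

# Parametric: Welch's 2-sample t-test, which compares means
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=False)

# Nonparametric counterpart: the Mann-Whitney test, no normality assumption
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test p-value = {t_p:.3f}, Mann-Whitney p-value = {u_p:.3f}")
```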

In most cases, your choice between parametric and nonparametric tests ultimately comes down to sample size, and whether the center of your data's distribution is better reflected by the mean or the median.

  • If the mean accurately represents the center of your distribution and your sample size is large enough, a parametric test offers you better accuracy and more power. 
  • If your sample size is small, you'll likely need to go with a nonparametric test. But if the median better represents the center of your distribution, a nonparametric test may be a better option even for a large sample.

 

Control Chart Tutorials and Examples


The other day I was talking with a friend about control charts, and I wanted to share an example one of my colleagues wrote on the Minitab Blog.  Looking back through the index for "control charts" reminded me just how much material we've published on this topic.

Whether you're just getting started with control charts, or you're an old hand at statistical process control, you'll find some valuable information and food for thought in our control-chart related posts. 

Different Types of Control Charts

One of the first things you learn in statistics is that when it comes to data, there's no one-size-fits-all approach. To get the most useful and reliable information from your analysis, you need to select the type of method that best suits the type of data you have.

The same is true with control charts. While there are a few charts that are used very frequently, a wide range of options is available, and selecting the right chart can make the difference between actionable information and false (or missed) alarms.

What Control Chart Should I Use?  offers a brief overview of the most common charts and a discussion of how to use the Assistant to help you choose the right one for your situation. And if you're a control chart neophyte and you want more background on why we use them, check out

We extol the virtues of a less commonly used chart in , and explain how to use control charts to track rare events in .

In , Dawn Keller discusses the distinction between P' charts and their cousins, described by Tammy Serensits in .

And it's good to remember that things aren't always as complicated as they seem, and sometimes a simple solution can be just as effective as a more complicated approach. See why in

Control Chart Tutorials

Many of our Minitab bloggers have talked about the process of choosing, creating, and interpreting control charts under specific conditions. If you have data that can't be collected in subgroups, you may want to learn about

If you do have data collected in subgroups, you'll want to understand why, when it comes to .

It's often useful to look at control chart data in calendar-based increments, and taking the monthly approach is discussed in the series and .

If you want to see the difference your process improvements have made, check out and

While the basic idea of control charting is very simple, interpreting real-world control charts can be a little tricky. If you're using Minitab 17, be sure to check out this post about a great new feature in the Assistant:

Finally, one of our expert statistical trainers offers his suggestions about .

Control Chart Examples

Control charts are most frequently used for quality improvement and assurance, but they can be applied to almost any situation that involves variation.

My favorite example of applying the lessons of quality improvement in business to your personal life involves Bill Howell, who applied his Six Sigma expertise to the (successful) management of his diabetes. Find out how he uses .

Some of our bloggers have applied control charts to their personal passions, including holiday candies in and bicycling in .

If you're into sports, see how control charts can reveal Or look to the cosmos to consider . And finally, compulsive readers like myself might be interested to see how relevant control charts are to literature, too, as Cody Steele illustrates in

How are you using control charts?

 

When to Use a Pareto Chart


I confess: I'm not a natural-born decision-maker. Some people—my wife, for example—can assess even very complex situations, consider the options, and confidently choose a way forward. Me? I get anxious about deciding what to eat for lunch. So you can imagine what it used to be like when I needed to confront a really big decision or problem. My approach, to paraphrase the Byrds, was "Re: everything, churn, churn, churn."

Thank heavens for Pareto charts.

What Is a Pareto Chart, and How Do You Use It?

A Pareto chart is a basic quality tool that helps you identify the most frequent defects, complaints, or any other factor you can count and categorize. The chart takes its name from Vilfredo Pareto, originator of the "80/20 rule," which postulates that, roughly speaking, 20 percent of the people own 80 percent of the wealth. Or, in quality terms, 80 percent of the losses come from 20 percent of the causes.

You can use a Pareto chart any time you have data that are broken down into categories, and you can count how often each category occurs. As children, most of us learned how to use this kind of data to make a bar chart:

bar chart

A Pareto chart is just a bar chart that arranges the bars (counts) from largest to smallest, from left to right. The categories or factors symbolized by the bigger bars on the left are more important than those on the right.

Pareto Chart

By ordering the bars from largest to smallest, a Pareto chart helps you visualize which factors comprise the 20 percent that are most critical—the "vital few"—and which are the "trivial many."

A cumulative percentage line helps you judge the added contribution of each category. If a Pareto effect exists, the cumulative line rises steeply for the first few defect types and then levels off. In cases where the bars are approximately the same height, the cumulative percentage line makes it easier to compare categories.

It's common sense to focus on the ‘vital few’ factors. In the quality improvement arena, Pareto charts help teams direct their efforts where they can make the biggest impact. By taking a big problem and breaking it down into smaller pieces, a Pareto chart reveals where our efforts will create the most improvement.

If a Pareto chart seems rather basic, well, it is. But like a simple machine, its very simplicity makes the Pareto chart applicable to a very wide range of situations, both within and beyond quality improvement.

Use a Pareto Chart Early in Your Quality Improvement Process

At the leadership or management level, Pareto charts can be used at the start of a new round of quality improvement to figure out what business problems are responsible for the most complaints or losses, and dedicate improvement resources to those. Collecting and examining data like that can often result in surprises and upend an organization's "conventional wisdom." For example, leaders at one company believed that the majority of customer complaints involved product defects. But when they saw the complaint data in a Pareto chart, it showed that many more people complained about shipping delays. Perhaps the impression that defects caused the most complaints arose because the relatively few people who received defective products tended to complain very loudly—but since more customers were affected by shipping delays, the company's energy was better devoted to solving that problem.

Use a Pareto Chart Later in Your Quality Improvement Process

Once a project has been identified, and a team assembled to improve the problem, a Pareto chart can help the team select the appropriate areas to focus on. This is important because most business problems are big and multifaceted. For instance, shipping delays may occur for a wide variety of reasons, from mechanical breakdowns and accidents to data-entry mistakes and supplier issues. If there are many possible causes a team could focus on, it's smart to collect data about which categories account for the biggest number of incidents. That way, the team can choose a direction based on the numbers and not the team's "gut feeling."

Use a Pareto Chart to Build Consensus

Pareto charts also can be very helpful in resolving conflicts, particularly if a project involves many moving parts or crosses over many different units or work functions. Team members may have sharp disagreements about how to proceed, either because they wish to defend their own departments or because they honestly believe they know where the problem lies. For example, a hospital project improvement team was stymied in reducing operating room delays because the anesthesiologists blamed the surgeons, while the surgeons blamed the anesthesiologists. When the project team collected data and displayed it in a Pareto chart, it turned out that neither group accounted for a large proportion of the delays, and the team was able to stop finger-pointing. Even if the chart had indicated that one group or the other was involved in a significantly greater proportion of incidents, showing the team members which types of delays were most 'vital' would still have helped build consensus.

Use Pareto Charts Outside of Quality Improvement Projects

Their simplicity also makes Pareto charts a valuable tool for making decisions beyond the world of quality improvement. By helping you visualize the relative importance of various categories, you can use them to prioritize customer needs, opportunities for training or investment—even your choices for lunch.

How to Create a Pareto Chart

Creating a Pareto chart is not difficult, even without statistical software. Of course, if you're using Minitab, the software will do all this for you automatically—create a Pareto chart by selecting Stat > Quality Tools > Pareto Chart... or by selecting Assistant > Graphical Analysis > Pareto Chart. You can collect raw data, in which each observation is recorded in a separate row of your worksheet, or summary data, in which you tally observation counts for each category.

1. Gather Raw Data about Your Problem

Be sure you collect a random sample that fully represents your process. For example, if you are counting the number of items returned to an electronics store in a given month, and you have multiple locations, you should not gather data from just one store and use it to make decisions about all locations. (If you want to compare the most important defects for different stores, you can show separate charts for each one side-by-side.)

2. Tally Your Data

Add up the observations in each of your categories.

3. Label your horizontal and vertical axes.

Make the widths of all your horizontal bars the same and label the categories in order from largest to smallest. On the vertical axis, use round numbers that slightly exceed your top category count, and include your measurement unit.

4. Draw your category bars.

Using your vertical axis, draw bars for each category that correspond to their respective counts. Keep the width of each bar the same.

5. Add cumulative counts and lines.

As a final step, you can list the cumulative counts along the horizontal axis and make a cumulative line over the top of your bars. Each category's cumulative count is the count for that category PLUS the total count of the preceding categories. If you want to add a line, draw a right axis and label it from 0 to 100%, lined up with the grand total on the left axis. Above the right edge of each category, mark a point at the cumulative total, then connect the points.
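And if you'd rather script the chart than draw it by hand (outside of Minitab, which builds it for you), here is a minimal sketch in Python using made-up complaint counts.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up complaint counts by category
counts = pd.Series({"Shipping delay": 120, "Defect": 45, "Billing error": 30,
                    "Wrong item": 20, "Other": 10}).sort_values(ascending=False)
cumulative_pct = counts.cumsum() / counts.sum() * 100

fig, ax = plt.subplots()
ax.bar(counts.index, counts.values)              # bars, largest to smallest
ax.set_ylabel("Count")
ax.tick_params(axis="x", rotation=30)

ax2 = ax.twinx()                                 # cumulative percentage line
ax2.plot(counts.index, cumulative_pct.values, marker="o", color="red")
ax2.set_ylabel("Cumulative %")
ax2.set_ylim(0, 105)

plt.tight_layout()
plt.show()
```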
