Evaluation Approaches and Designs

Create a table with the 15 elements of the formative evaluation and the 6 elements of the process evaluation located in Tables 14.1 and 14.2 of the textbook. Clearly explain how each element will be addressed in your program.


mental health and stress reduction

PH 575 Evaluation Part 2





Process Evaluation

Addresses the timeframe of events, procedures and policies; reach of the program

Impact Evaluation

Measures the recipients’ experience

Outcome Evaluation

Determines the extent of which type of change


Objectives must correlate with each level of evaluation!

This is a basic overview of the information covered in each type of evaluation.


How Important is an Outcome Evaluation?

Focuses on an ultimate goal or product of a program or treatment, generally measured in the health field by mortality or morbidity data in a population. Also may include:

Vital measures


Signs or physiological indicators on individuals

Outcome evaluation is long-term in nature and generally takes more time and resources to conduct than impact evaluation. Ultimately, it determines the effect of a program or policy on its beneficiaries.


You need to examine the scope of your program and the feasibility of conducting an outcome evaluation. This also depends on the overall needs of your stakeholders.



Characteristics of the “Right Question”

Relevant data can be collected

More than one answer is possible

Produces information that decision makers want and feel they need



Three Levels of Intervention Effect Evaluations

| | Outcome Documentation | Outcome Assessment | Outcome Evaluation |
| | “Process” | “Impact” | “Outcome” |
| Purpose | Show that outcome and impact objectives were met | Determine whether participants in the program experienced any change/benefit | Determine whether participating in the program caused a change or benefit for participants |
| Level of Rigor | Minimal | Moderate | Maximum |
| Data Collection | Data type and collection timing based on objectives being measured | Data type based on program effect theory; timing based on feasibility | Data type based on program effect theory; |



Differences Between Evaluation and Research

| Characteristic | Research | Evaluation |
| Goal or purpose | Generates new knowledge for prediction | Social accountability and program or policy decision making |
| Questions addressed | Scientists’ own questions | Questions derived from program goals and impact objectives |
| Nature of problem | Areas where knowledge is lacking | Outcomes and impacts related to program |
| Guiding theory | Theory used as basis for hypothesis testing | Theory underlying the program interventions for behavior change |
| Appropriate techniques | Sampling, statistics, hypothesis testing, and so on | Whichever research techniques fit with the problem |
| Setting | Anywhere appropriate to the question | Any setting where evaluators can access the program recipients |
| Dissemination | Scientific journals | Internal and externally viewed program reports; scientific journals |
| Allegiance | Scientific community | Funding source, policy preference, scientific community |

Differences Between Evaluation and Research:

Review these differences so that you understand the characteristics of each.


Factors That Influence the Outcome Evaluation Process

Factors that cannot be controlled

Level of funding and the resources allocated for the evaluation

Personnel support

The time line


Access to the study population and to program data

Factors that can be controlled

Level and type of expertise on the evaluation team

Management of the evaluation process



Frequent Questions for Outcome Evaluations

Did the program make a difference and produce a change in those affected directly by it?

What results were produced by the initiative over time?

What changes in knowledge, attitudes, beliefs, or behavior were produced and maintained in those who participated?

What changes occurred in the environment as a result of the initiative?

What trends occurred in the incidence of the public health problem over time?

Review these questions that could be answered through outcome evaluations.


Practical Problems or Barriers in Evaluation

Failure to plan for evaluation

Inadequate resources

Organizational restrictions

Effects hard to detect; small, slow coming, or don’t last

Time allocated to evaluation

Restrictions in data collection

Difficult to distinguish between cause & effect

Difficult to evaluate multi-strategy interventions

Conflict between professional standards & do-it-yourselfers over appropriate design

Sometimes people’s motives get in the way

Intervention not delivered as intended

Failure to plan for evaluation- need to plan for evaluation from the beginning. Failure to do so impedes your ability to evaluate effectively

Inadequate resources- evaluation may not be feasible if resources are not adequate to deliver program as intended.

Organizational restrictions

Effects hard to detect; small, slow coming, or don’t last- may have to do with plan or dose

Time allocated to evaluation- may not have time to measure behavior change

Restrictions in data collection- may not be able to obtain information that you need

Difficult to distinguish between cause & effect

Difficult to evaluate multi-strategy interventions- where were the objectives impacted?

Conflict between professional standards & do-it-yourselfers over appropriate design

Sometimes people’s motives get in the way

Intervention not delivered as intended- process

Who will conduct the evaluation?

Internal evaluation – Advantages

More familiar with organization & program

Knows decision making style of organization

Present to remind people of results

Able to communicate results more frequently & clearly


External evaluation – Advantages

More objective; fresh outlook

Can ensure unbiased evaluation outcome

Brings global knowledge

Typically brings more breadth & depth of technical expertise


Combination of internal & external

Must weigh the advantages and disadvantages to determine best choice for evaluator.

Your Evaluation Results

Who will receive the results of the evaluation?


In what form will they be delivered?


Different stakeholders may want different questions answered


The planning for the evaluation should include a determination of how the results will be used.


These questions need to be considered during the planning process. You may stress different aspects of the evaluation to different groups. Must include how the findings are going to be used.

Part 2 Outcome Evaluation Approaches and Designs



Experimental, Control, & Comparison Groups

Experimental group – the group of individuals who receive the intervention


Control group – should be similar to the experimental group, but the individuals in this group do not receive the intervention; given this label when individuals are randomly assigned to this group

Conner’s premises for control group use: 1) right to status quo, 2) informed of purpose, 3) right to new services, & 4) not subjected to ineffective or harmful programs


Comparison group – when individuals cannot be randomly assigned to an experimental or control group, this nonequivalent group may be formed
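The difference between a control group and a comparison group comes down to random assignment. Below is a minimal sketch of random assignment in Python. The participant IDs, the even split, and the function name `randomly_assign` are assumptions made for illustration only; a real evaluation would follow its own sampling and consent procedures.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size.

    The seed is optional and is only used so the assignment can be reproduced.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    experimental_group = shuffled[:half]  # will receive the intervention
    control_group = shuffled[half:]       # will not receive the intervention
    return experimental_group, control_group


# Hypothetical participant IDs, P01 through P20
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(participants, seed=42)

print("Experimental:", sorted(experimental))
print("Control:     ", sorted(control))
```

If participants cannot be shuffled this way (for example, intact classrooms or worksites), the second group is a comparison group rather than a control group.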

Outcome evaluation:

Review the different types of groups that may be included in an outcome evaluation

Allows systematic data collection

CDC suggests evaluation design be developed in PP Stage 3

There is no perfect evaluation design, but much thought should go into choosing the best design for your program and situation.




type of data needed



Evaluation Designs – 1

Experimental design: uses a pretest to ensure that groups are similar prior to intervention.

Challenges: requires large numbers of participants, requires random assignment

Advantages: most control and validity, comparison relates most closely to the intervention

Quasi-experimental- functions when random assignment is not possible

Non-experimental- may use when a control group cannot be assigned, not as statistically powerful as experimental or quasi-experimental

Evaluation Designs

Experimental design – greatest control over confounding variables; involves randomization to experimental & control groups & measurement of each; most interpretable & defensible evidence of effectiveness


Quasi-experimental – results in interpretable & defensible evidence of effectiveness; no randomization; comparisons are made on experimental & comparison (usually intact) groups; some control over confounding variables


Nonexperimental – without use of a comparison or control group; little control over confounding variables

Review advantages and disadvantages of each

O1 X O2


O1 = pretest measurement, such as the average number of cigarettes smoked in 24 hours; measured via self-report one month prior


X = intervention (independent variable), such as the four week, eight session, smoking cessation program


O2 = posttest measurement, such as the average number of cigarettes smoked in 24 hours; measured via self-report one month after the last session

Let’s look at a basic evaluation design…

Pretest-Posttest design
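To make the O1 X O2 notation concrete, here is a small Python sketch that computes the average change from pretest to posttest and a paired (dependent) t statistic. The cigarette counts are made-up numbers for six hypothetical participants, not data from any real program, and the variable names are illustrative.

```python
import math
import statistics

# Hypothetical average cigarettes smoked in 24 hours for six participants
pretest = [20, 15, 30, 10, 25, 18]   # O1: measured before the intervention (X)
posttest = [12, 15, 22, 8, 20, 10]   # O2: measured after the intervention

# Change score for each participant (negative means fewer cigarettes)
changes = [o2 - o1 for o1, o2 in zip(pretest, posttest)]
mean_change = statistics.mean(changes)

# Paired t statistic: mean change divided by its standard error
standard_error = statistics.stdev(changes) / math.sqrt(len(changes))
t_statistic = mean_change / standard_error

print(f"Mean change (O2 - O1): {mean_change:.2f}")
print(f"Paired t statistic:    {t_statistic:.2f}")
```

With only one group and no control or comparison group, a change like this cannot by itself be attributed to the intervention; it only documents that a change occurred.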

Threats to Quality of Data (Validity)

Missing data or information


Internal validity – the degree to which the program (intervention, treatment, independent variable) & not extraneous factors (confounding variables) cause the change that was measured


External validity – the extent to which the program (intervention, treatment, independent variable) can be expected to produce similar effects in other populations (generalizability)


Which type of validity is most important in program evaluation?


Validity: measure of how sound your research is, findings truly represent what you are trying to measure




Threats to Internal Validity



Testing (e.g., pre-testing)


Statistical regression



Diffusion or imitation of interventions

Compensatory equalization or rivalry

Interaction of several threats


O1 X O2



History: event happens between pre and post test that changes results, but is not related to intervention (participant views PSA, etc)

Maturation: participants show pre to post changes as a result of their changes in maturity (strength, development, etc)

Testing: participants get used to the test (change forms of the test to combat)

Instrumentation: change in measurement

Selection: differences between experimental and control group

Attrition: participants drop out of the intervention


Best way to control is through proper planning and development of evaluation plan. Best to control through randomization of groups, if possible.

Threats to External Validity (Generalizability)

Reactive effects:

Social desirability

Expectancy effect

Hawthorne effect

Placebo effect

Social desirability- participants falsely report behavior that they feel is more socially desirable (e.g., teeth brushing)

Expectancy effect- attitudes projected onto individuals cause them to act in a certain way

Hawthorne effect- behavior change because of the special effect of being watched

Placebo effect- change due to the participants’ belief in the treatment




Choosing A Data-Collection Method

Should incorporate both qualitative and quantitative data collection methods whenever possible

Design should be simple, flexible and responsive to the changing needs of the project.

Consider resources that are available

Sensitivity to the respondents/participants in the project


Is the tool valid?

How reliable is it?

Is it suitable for the population?

Can it detect the important issues?

Importance of the Information



Choosing A Data-Collection Approach (cont’d)

Surveys – advantages

Constructing items can be a difficult and time-consuming exercise BUT once developed, can be used over and over

Relatively easy to administer; collect a lot of info in a short time

Can be evaluated for reliability and validity

Generally less expensive than other research instruments

Can be administered in groups, individually, by mail, by telephone, or web-based



Selecting Outcomes

What is the difference between Program Output versus Outcome?

Output is usually measured in terms of numbers of things, events, or persons (proportions)


Outcomes refer to events or conditions of direct importance to the individual or community that are external to the program (percent of people or community)


Might be immediate (<1 year), intermediate (2-3 years) or long term (3 – 5 years)




Example of logic models for you to review. This is a little different, but is worth reviewing for you to gain understanding of what information goes in each column.


Another example of logic models for you to review. This is a little different, but is worth reviewing for you to gain understanding of what information goes in each column.



Part 3 Data Analysis

This is just a very superficial overview of data analysis.


Data Analysis

Begins with being able to identify the variables


Variables – a characteristic or attribute that can be measured or observed (Creswell, 2002)


Types of variables: independent (controlled; cause or exert some influence) & dependent (the outcome variables being studied)


Also, the level(s) of data collected are important







Data Analysis – 2

Descriptive statistics – used to organize, summarize & describe characteristics


Inferential statistics – concerned with relationships & causality to make generalizations about a population based on a sample



Univariate (1 variable)

Bivariate (2 variables)

Multivariate (More than 2 variables)


Univariate Data Analyses

Analyze and assess ONE variable at a time


Provides summary counts (frequency distributions)


Examine measures of central tendency – e.g., mean, median, & mode


Examine measures of spread or variation – e.g., range, standard deviation, variance
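As a quick illustration of univariate analysis, the sketch below summarizes a single variable using only Python's standard library. The scores are hypothetical test scores invented for this example.

```python
import statistics
from collections import Counter

# Hypothetical knowledge-test scores for eight participants
scores = [70, 85, 85, 90, 60, 75, 85, 95]

# Summary counts (frequency distribution)
frequency = Counter(scores)

# Measures of central tendency
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Measures of spread or variation (sample formulas, dividing by n - 1)
score_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)
sample_variance = statistics.variance(scores)

print("Frequencies:", dict(sorted(frequency.items())))
print("Mean:", mean_score, "Median:", median_score, "Mode:", mode_score)
print("Range:", score_range)
print(f"Standard deviation: {sample_sd:.2f}  Variance: {sample_variance:.2f}")
```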


Bivariate Analyses

Can be non-statistical comparisons

Example of non-statistical comparisons (eyeballing the data)


| | Female | Male |
| Yes | 62 | 36 |
| No | 46 | 50 |
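"Eyeballing" is easier when the counts are turned into percentages within each group. The sketch below does that for the Female/Male by Yes/No counts above. The notes do not say what the Yes/No question was, so the labels are kept generic.

```python
# Counts from the table above, keyed by group and response
counts = {
    "Female": {"Yes": 62, "No": 46},
    "Male": {"Yes": 36, "No": 50},
}

percent_yes = {}
for group, responses in counts.items():
    total = responses["Yes"] + responses["No"]
    percent_yes[group] = 100 * responses["Yes"] / total
    print(f"{group}: {responses['Yes']} of {total} said Yes "
          f"({percent_yes[group]:.1f}%)")
```

About 57.4% of females and 41.9% of males answered Yes. Whether that gap is larger than chance would be expected to produce is a question for a statistical test, not for eyeballing.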




Bivariate Analyses – 2



Type I error – rejecting the null hypothesis when it is true

Type II error – failing to reject when the null hypothesis is not true

Level of significance (alpha level) – probability of making a type I error; e.g., p<.01



Statistical significance – refers to whether the observed differences between the two or more groups are real or not, or whether they are chance occurrences

Practical significance – measures the meaningfulness of the program regardless of statistical significance
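One way to see what the alpha level means is to simulate many studies in which the program truly has no effect and count how often a test still comes out "significant." Each such result is a Type I error (rejecting a null hypothesis that is actually true). The sketch below is an illustration under simplifying assumptions: both groups are drawn from the same normal distribution with a known standard deviation of 1, and a two-sided z-test is used. The function name and all settings are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_type_i_rate(alpha=0.05, trials=4000, n=30, seed=0):
    """Estimate how often a true null hypothesis is rejected at a given alpha."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Both groups come from the SAME population, so the null is true
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        difference = sum(group_a) / n - sum(group_b) / n
        # Standard error of the difference when each group has SD = 1
        z = difference / (2 / n) ** 0.5
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # a "significant" result that is really chance
    return rejections / trials


type_i_rate = simulate_type_i_rate()
print(f"Share of no-effect studies declared significant: {type_i_rate:.3f}")
```

The share should land close to the chosen alpha, which is the sense in which alpha is the probability of making a Type I error.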


Bivariate Analyses – 3

Statistical tests

Chi-square (nominal/ordinal data)


t-test (numerical)

Dependent (one group twice)

Independent (two groups once)


ANOVA (numerical) – means of ≥ 2 groups


Correlations (numerical) – strength of relationship
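As one worked example of these tests, the sketch below computes a chi-square statistic by hand for the Female/Male by Yes/No counts shown earlier in these notes (62 and 36 Yes; 46 and 50 No). It is a minimal illustration using only Python's standard library and no continuity correction. The p-value shortcut relies on the fact that a chi-square with 1 degree of freedom is a squared standard normal, so it applies to 2x2 tables only. In practice, a statistics package would normally be used.

```python
from statistics import NormalDist


def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table of counts (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, chi-square equals a squared z score
    p_value = 2 * (1 - NormalDist().cdf(chi2 ** 0.5))
    return chi2, p_value


# Rows are Yes / No; columns are Female / Male
observed = [[62, 36], [46, 50]]
chi2, p_value = chi_square_2x2(observed)

print(f"Chi-square = {chi2:.2f}, p = {p_value:.3f}")
# The critical value for 1 degree of freedom at alpha = .05 is 3.841
print("Significant at .05:", chi2 > 3.841)
```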




Multivariate Data Analyses

Analyses to study three or more variables


Multiple regression – predicting by using several variables (numerical)



General linear


Examples of Evaluation Questions Answered Using Univariate, Bivariate, and Multivariate Data Analysis

| Univariate Analysis | Bivariate Analysis | Multivariate Analysis |
| What was the average score on the cholesterol knowledge test? | Is there a difference in smoking behavior between the individuals in the experimental and control groups after the healthy lifestyle program? | Can the risk of heart disease be predicted using smoking, exercise, diet, and heredity? |
| How many participants at the worksite attended the healthy lifestyle presentation? | Is peer education or classroom instruction more effective in increasing knowledge about the effects of drug abuse? | Can mortality risk among motorcycle riders be predicted from helmet use, time of day, weather conditions, and speed? |
| What percentage of the participants in the corporate fitness program met their target goal? | Do students’ attitudes about bicycle helmets differ in rural and urban settings? | Which of the following most accurately predict successful management of stress among program participants: physical activity, diet, meditation, anger management, yoga, or deep breathing? |

Measurement Considerations

Units of Observation

The unit that is observed or measured must match the level at which the program is targeted and delivered


Types of Variables

Nominal variables – simplest, “dichotomous”, categorical

Ordinal variables – indicate a level, order, sequence or rank

Interval variables – continuous, discrete, most complex


For every area of health and wellbeing, health behavior, etc., variables can be constructed at each of the three levels.




Advantages & Disadvantages of Using Each Type of Variable

| Type | Examples | Advantage | Disadvantage |
| Nominal (categorical) | ZIP code; race; yes/no | Easy to understand | Limited information from the data |
| Ordinal (rank) | Social class; Likert-type scale; “top 10” list (worst to best) | Gives considerable information; can collapse into nominal categories | Sometimes statistically treated as a nominal variable; ranking can be a difficult task for respondents |
| Interval (continuous) | Temperature, age, IQ, distances, dollars, dates of birth | Gives most information; can collapse into nominal or ordinal categories; can use as a continuous variable | Can be difficult to construct valid and reliable interval variables |



Examples of Nominal, Ordinal and Continuous Variables

| Dependent Outcome Variable | Nominal | Ordinal | Interval/Continuous |
| Physical Abuse | Yes/No have experienced physical abuse; type of abuse | Level of abuse is same, more, or less than last month | Rate of physical abuse in a county; number of times abused in past 6 months |
| Workplace injury | ICD-10 code for injury | Level of severity of injury | Number of disability days per year in company or construction industry |
| Understand how alcohol affects judgment | Agree/Disagree | Three most common ways alcohol affects judgment | Score on test of knowledge about effects of alcohol |
| Smoking | Yes/No smoke | Light, moderate, or heavy smoker | Nicotine levels; number of cigarettes in past 24 hours |
| Readiness for change | Yes/No likely to change next week | Stage of change in terms of readiness | How likely (on 100 point scale) will change in next week |
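The smoking row shows why interval data give the most information: a count can always be collapsed into ordinal or nominal categories afterward, but not the other way around. The sketch below illustrates this with hypothetical cigarette counts. The cutoffs for light, moderate, and heavy smoking are assumptions chosen only for the example and are not standard definitions.

```python
def smoker_level(cigarettes_per_day):
    """Collapse an interval count into an ordinal category (hypothetical cutoffs)."""
    if cigarettes_per_day == 0:
        return "non-smoker"
    if cigarettes_per_day <= 10:
        return "light"
    if cigarettes_per_day <= 20:
        return "moderate"
    return "heavy"


def smokes(cigarettes_per_day):
    """Collapse the same count into a nominal yes/no variable."""
    return "yes" if cigarettes_per_day > 0 else "no"


# Hypothetical interval data: cigarettes smoked in the past 24 hours
cigarette_counts = [0, 4, 12, 25, 0, 18]

ordinal_levels = [smoker_level(c) for c in cigarette_counts]
nominal_smokes = [smokes(c) for c in cigarette_counts]

for count, level, yes_no in zip(cigarette_counts, ordinal_levels, nominal_smokes):
    print(f"{count:>2} cigarettes -> {level:<10} -> smokes: {yes_no}")
```

If only the yes/no answer had been collected, the light/moderate/heavy levels and the counts could not be recovered later.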



Final Thoughts

Evaluation design should be considered early in the planning process. Evaluators need to identify what measurements will be taken as well as when and how. In doing so, a design should be selected that controls for both internal and external validity.


This section of the lecture may be difficult if you have not had a course in data analysis or evaluation. If you remember nothing else, please have someone on board with expertise in this area BEFORE you complete the planning. It may involve hiring or contracting with someone but remember that all your hard work of planning a program is worth nothing if you do not have a decent evaluation plan and credible numbers to show that your intervention works!


Satisfaction surveys are NOT ENOUGH!


