Evaluation Approaches and Designs
PH 575 Evaluation Part 2
OUTCOME EVALUATION
FIRST, A REVIEW….
Process Evaluation
Addresses the timeframe of events, procedures, and policies, as well as the reach of the program
Impact Evaluation
Measures the more immediate effects of the program on its recipients (e.g., knowledge, attitudes, behavior)
Outcome Evaluation
Determines the extent to which the program changed ultimate outcomes (e.g., morbidity, mortality)
Objectives must correlate with each level of evaluation!
This is a basic overview of the information covered in each type of evaluation.
How Important Is an Outcome Evaluation?
Focuses on an ultimate goal or product of a program or treatment, generally measured in the health field by mortality or morbidity data in a population. Also may include:
Vital measures
Symptoms
Signs or physiological indicators on individuals
Outcome evaluation is long-term in nature and generally takes more time and resources to conduct than impact evaluation. Ultimately, it determines the effect of a program or policy on its beneficiaries.
You need to examine the scope of your program and the feasibility of conducting an outcome evaluation. This also depends on the overall needs of your stakeholders.
DEVELOPING THE EVALUATION QUESTIONS
Characteristics of the “Right Question”
Relevant data can be collected
More than one answer is possible
Produces information that decision makers want and feel they need
Three Levels of Intervention Effect Evaluations
| | Outcome Documentation ("Process") | Outcome Assessment ("Impact") | Outcome Evaluation ("Outcome") |
| --- | --- | --- | --- |
| Purpose | Show that outcome and impact objectives were met | Determine whether participants in the program experienced any change/benefit | Determine whether participating in the program caused a change or benefit for participants |
| Level of Rigor | Minimal | Moderate | Maximum |
| Data Collection | Data type and collection timing based on objectives being measured | Data type based on program effect theory; timing based on feasibility | Data type based on program effect theory; … |
Differences Between Evaluation and Research
| Characteristic | Research | Evaluation |
| --- | --- | --- |
| Goal or purpose | Generates new knowledge for prediction | Social accountability and program or policy decision making |
| Questions addressed | Scientists' own questions | Questions derived from program goals and impact objectives |
| Nature of problem | Areas where knowledge is lacking | Outcomes and impacts related to the program |
| Guiding theory | Theory used as basis for hypothesis testing | Theory underlying the program interventions for behavior change |
| Appropriate techniques | Sampling, statistics, hypothesis testing, and so on | Whichever research techniques fit the problem |
| Setting | Anywhere appropriate to the question | Any setting where evaluators can access the program recipients |
| Dissemination | Scientific journals | Internally and externally viewed program reports; scientific journals |
| Allegiance | Scientific community | Funding source, policy preference, scientific community |
These characteristics are important to review in order to understand how evaluation and research differ.
Factors That Influence the Outcome Evaluation Process
Factors that cannot be controlled
Level of funding and the resources allocated for the evaluation
Personnel support
The timeline
Access to the study population and to program data
Factors that can be controlled
Level and type of expertise on the evaluation team
Management of the evaluation process
Frequent Questions for Outcome Evaluations
Did the program make a difference and produce a change in those affected directly by it?
What results were produced by the initiative over time?
What changes in knowledge, attitudes, beliefs, or behavior were produced and maintained in those who participated?
What changes occurred in the environment as a result of the initiative?
What trends occurred in the incidence of the public health problem over time?
Review these questions that could be answered through outcome evaluations.
Practical Problems or Barriers in Evaluation
Failure to plan for evaluation
Inadequate resources
Organizational restrictions
Effects hard to detect: small, slow to appear, or short-lived
Time allocated to evaluation
Restrictions in data collection
Difficult to distinguish between cause & effect
Difficult to evaluate multi-strategy interventions
Conflict between professional standards & do-it-yourselfers over appropriate design
Sometimes people’s motives get in the way
Intervention not delivered as intended
Failure to plan for evaluation – plan for evaluation from the beginning; failing to do so impedes your ability to evaluate effectively.
Inadequate resources – evaluation may not be feasible if resources are not adequate to deliver the program as intended.
Organizational restrictions
Effects hard to detect – effects that are small, slow to appear, or short-lived may reflect the plan or the dose.
Time allocated to evaluation – there may not be enough time to measure behavior change.
Restrictions in data collection – you may not be able to obtain the information you need.
Difficult to distinguish between cause & effect
Difficult to evaluate multi-strategy interventions – which strategy impacted the objectives?
Conflict between professional standards & do-it-yourselfers over the appropriate design
Sometimes people's motives get in the way
Intervention not delivered as intended – a process issue.
Who will conduct the evaluation?
Internal evaluation – Advantages
More familiar with organization & program
Knows decision making style of organization
Present to remind people of results
Able to communicate results more frequently & clearly
External evaluation – Advantages
More objective; fresh outlook
Can ensure unbiased evaluation outcome
Brings global knowledge
Typically brings more breadth & depth of technical expertise
Combination of internal & external
Must weigh the advantages and disadvantages to determine best choice for evaluator.
Your Evaluation Results
Who will receive the results of the evaluation?
In what form will they be delivered?
Different stakeholders may want different questions answered
The planning for the evaluation should include a determination of how the results will be used.
These questions need to be considered during the planning process. You may stress different aspects of the evaluation to different groups, and the plan must specify how the findings will be used.
Part 2: Outcome Evaluation Approaches and Designs
Experimental, Control, & Comparison Groups
Experimental group – the group of individuals who receive the intervention
Control group – should be similar to the experimental group, but the individuals in this group do not receive the intervention; given this label when individuals are randomly assigned to this group
Conner's premises for control group use: 1) right to the status quo, 2) informed of purpose, 3) right to new services, & 4) not subjected to ineffective or harmful programs
Comparison group – when individuals cannot be randomly assigned to an experimental or control group, this nonequivalent group may be formed
Outcome evaluation:
Review the different types of groups that may be included in an outcome evaluation.
An evaluation design allows systematic data collection.
The CDC suggests that the evaluation design be developed in PP Stage 3.
There is no perfect evaluation design, but much thought should go into the best design for your program and situation. Questions to weigh: money, time, the type of data needed, and validity.
Evaluation Designs – 1
Experimental design – uses a pretest to ensure that groups are similar prior to the intervention.
Challenges: requires large numbers of participants; requires random assignment.
Advantages: most control and validity; the comparison relates most closely to the intervention.
Quasi-experimental – used when random assignment is not possible.
Non-experimental – may be used when a control group cannot be assigned; not as statistically powerful as experimental or quasi-experimental designs.
Evaluation Designs
Experimental design – greatest control over confounding variables; involves randomization to experimental & control groups & measurement of each; most interpretable & defensible evidence of effectiveness
Quasi-experimental – results in interpretable & defensible evidence of effectiveness; no randomization; comparisons are made on experimental & comparison (usually intact) groups; some control over confounding variables
Nonexperimental – without use of a comparison or control group; little control over confounding variables
Review the advantages and disadvantages of each; a small random-assignment sketch follows.
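Random assignment is the feature that separates a true control group from a comparison group, and it is simple to demonstrate in code. Below is a minimal sketch using only Python's standard library and hypothetical participant IDs; it is an illustration, not part of the textbook.

```python
# Minimal sketch of random assignment to experimental and control groups.
# The participant IDs are hypothetical.
import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical IDs

random.seed(42)               # fixed seed so the example is reproducible
random.shuffle(participants)  # randomize the order

experimental = participants[:10]  # receives the intervention
control = participants[10:]       # does not; random assignment is what
                                  # justifies calling this a "control" group

print("Experimental:", experimental)
print("Control:", control)
```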
O1 X O2
Where:
O1 = pretest measurement, such as the average number of cigarettes smoked in 24 hours, measured via self-report one month prior
X = intervention (independent variable), such as the four-week, eight-session smoking cessation program
O2 = posttest measurement, such as the average number of cigarettes smoked in 24 hours, measured via self-report one month after the last session
Let’s look at a basic evaluation design…
Pretest-Posttest design
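As a hedged illustration, a one-group pretest-posttest design like this is often analyzed with a paired t-test, since each participant is measured twice. The sketch below uses hypothetical smoking data and scipy; it is one possible analysis, not the textbook's prescribed method.

```python
# Sketch of analyzing a one-group pretest-posttest design (O1 X O2).
# Hypothetical data: cigarettes smoked per 24 hours, self-reported one
# month before (O1) and one month after (O2) the cessation program.
from scipy import stats

pre = [22, 18, 30, 25, 15, 20, 28, 24]    # O1: pretest measurements
post = [14, 10, 25, 20, 12, 15, 22, 18]   # O2: posttest measurements

# A paired (dependent) t-test compares each participant with themselves,
# matching a design in which one group is measured twice.
t_stat, p_value = stats.ttest_rel(pre, post)

mean_change = sum(post) / len(post) - sum(pre) / len(pre)
print(f"mean change: {mean_change:+.1f} cigarettes per 24 hours")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Note that with no control group, a significant pre-to-post change still cannot rule out the threats to validity discussed next.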
Threats to Quality of Data (Validity)
Missing data or information
Internal validity – the degree to which the program (intervention, treatment, independent variable) & not extraneous factors (confounding variables) cause the change that was measured
External validity – the extent to which the program (intervention, treatment, independent variable) can be expected to produce similar effects in other populations (generalizability)
Which type of validity is most important in program evaluation?
Validity: a measure of how sound your research is; that is, whether the findings truly represent what you are trying to measure.
Threats to Internal Validity
History
Maturation
Testing (e.g., pre-testing)
Instrumentation
Statistical regression
Selection
Attrition
Diffusion or imitation of interventions
Compensatory equalization or rivalry
Interaction of several threats
O1 X O2
History: an event happens between the pretest and posttest that changes the results but is not related to the intervention (e.g., a participant views a PSA)
Maturation: participants show pre-to-post changes as a result of their own maturation (strength, development, etc.)
Testing: participants get used to the test (use alternate forms of the test to combat this)
Instrumentation: a change in the measurement instrument
Selection: differences between the experimental and control groups
Attrition: participants drop out of the intervention
The best way to control these threats is through proper planning and development of the evaluation plan, ideally with randomization of groups when possible.
Threats to External Validity (Generalizability)
Reactive effects:
Social desirability
Expectancy effect
Hawthorne effect
Placebo effect
Social desirability – participants falsely report behavior that they feel is more socially desirable (e.g., tooth brushing)
Expectancy effect – attitudes projected onto individuals cause them to act in a certain way
Hawthorne effect – behavior changes because of the special attention that comes with being observed
Placebo effect – change due to the participants' belief in the treatment
Choosing A Data-Collection Method
Should incorporate both qualitative and quantitative data collection methods whenever possible
Design should be simple, flexible and responsive to the changing needs of the project.
Consider resources that are available
Sensitivity to the respondents/participants in the project
Credibility
Is the tool valid?
How reliable is it?
Is it suitable for the population?
Can it detect the important issues?
Importance of the Information
Choosing A Data-Collection Approach (cont’d)
Surveys – advantages
Constructing items can be difficult and time-consuming, BUT once developed, items can be used over and over
Relatively easy to administer; collect a lot of info in a short time
Can be evaluated for reliability and validity (see the reliability sketch after this list)
Generally less expensive than other research instruments
Can be administered in groups, individually, by mail, by telephone, or web-based
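To make the reliability bullet concrete, here is a minimal sketch of one common reliability statistic, Cronbach's alpha, computed from hypothetical survey ratings. The formula is standard; the data and layout are invented for illustration.

```python
# Cronbach's alpha: internal-consistency reliability of a survey scale.
# Rows are respondents, columns are items (hypothetical 5-point ratings).
import numpy as np

items = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
])

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' totals

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")   # closer to 1 = more reliable
```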
Selecting Outcomes
What is the difference between program output and program outcome?
Output is usually measured in terms of numbers of things, events, or persons (proportions)
Outcome refers to events or conditions of direct importance to the individual or community that are external to the program (percent of people or community)
Outcomes might be immediate (<1 year), intermediate (2–3 years), or long term (3–5 years)
An example logic model for you to review. It is formatted a little differently, but it is worth reviewing to understand what information goes in each column.
Another example of a logic model, also formatted a little differently and worth reviewing for the same reason.
Part 3: Data Analysis
This is only a brief, high-level overview of data analysis.
Data Analysis
Begins with being able to identify the variables
Variables – characteristics or attributes that can be measured or observed (Creswell, 2002)
Types of variables: independent (manipulated, or presumed to cause or exert some influence) & dependent (the outcome variables being studied)
Also, the level(s) of data collected are important
Nominal
Ordinal
Interval
Ratio
Data Analysis – 2
Descriptive statistics – used to organize, summarize & describe characteristics
Inferential statistics – concerned with relationships & causality to make generalizations about a population based on a sample
Analyses
Univariate (1 variable)
Bivariate (2 variables)
Multivariate (More than 2 variables)
Univariate Data Analyses
Analyze and assess ONE variable at a time
Provides summary counts (frequency distributions)
Examine measures of central tendency – e.g., mean, median, & mode
Examine measures of spread or variation – e.g., range, standard deviation, variance
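A minimal univariate sketch, using only Python's standard library and hypothetical posttest scores, shows how these summary measures are computed in practice.

```python
# Univariate analysis: summarize ONE variable at a time.
# Hypothetical posttest scores on a knowledge test.
import statistics

scores = [70, 75, 75, 80, 82, 85, 85, 85, 90, 95]

print("n:       ", len(scores))
print("mean:    ", statistics.mean(scores))    # central tendency
print("median:  ", statistics.median(scores))
print("mode:    ", statistics.mode(scores))
print("range:   ", max(scores) - min(scores))  # spread / variation
print("stdev:   ", round(statistics.stdev(scores), 2))
print("variance:", round(statistics.variance(scores), 2))
```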
Bivariate Analyses
Comparisons can be non-statistical
Example of a non-statistical comparison ("eyeballing" the data); the sketch below shows the corresponding statistical test:

| | Female | Male |
| --- | --- | --- |
| Yes | 62 | 36 |
| No | 46 | 50 |
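Moving from eyeballing to an actual test: the sketch below runs a chi-square test of independence on the exact counts in the table, using scipy. This is an illustration of the bivariate step, not a prescribed procedure.

```python
# Chi-square test of independence on the 2x2 table above.
from scipy.stats import chi2_contingency

table = [
    [62, 36],  # Yes: female, male
    [46, 50],  # No:  female, male
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
# A p-value below the chosen alpha (e.g., .05) suggests the female/male
# difference is unlikely to be a chance occurrence.
```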
Bivariate Analyses – 2
Hypotheses
Null
Type I error – rejecting the null hypothesis when it is actually true (the simulation after this list makes this concrete)
Type II error – failing to reject the null hypothesis when it is actually false
Level of significance (alpha level) – probability of making a type I error; e.g., p<.01
Alternative
Statistical significance – refers to whether the observed differences between two or more groups are real or merely chance occurrences
Practical significance – measures the meaningfulness of the program regardless of statistical significance
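A small simulation makes the Type I error and the alpha level concrete: when the null hypothesis is actually true, a test at alpha = .05 will still reject about 5% of the time. The data are simulated purely for illustration.

```python
# Simulate many studies in which the null hypothesis is TRUE (both groups
# come from the same population) and count how often the test rejects it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
alpha = 0.05
trials = 2000
false_positives = 0

for _ in range(trials):
    a = rng.normal(100, 15, size=30)  # both groups drawn from the SAME
    b = rng.normal(100, 15, size=30)  # distribution, so the null is true
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1          # rejecting a true null = Type I error

print(f"observed Type I error rate: {false_positives / trials:.3f}")
print(f"expected rate: about {alpha}")
```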
Bivariate Analyses – 3
Statistical tests
Chi-square (nominal/ordinal data)
t-test (numerical)
Dependent (one group twice)
Independent (two groups once)
ANOVA (numerical) – means of ≥ 2 groups
Correlations (numerical) – strength of a relationship (see the sketch below)
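The sketch below pairs two of these tests with hypothetical data to show how the test follows from the data type: ANOVA for comparing means across three groups, and a Pearson correlation for the strength of a relationship between two numerical variables.

```python
# Matching the statistical test to the data; all values are hypothetical.
from scipy import stats

# ANOVA: compare mean minutes of weekly exercise across three worksites.
site_a = [120, 150, 90, 140, 110]
site_b = [200, 180, 160, 210, 190]
site_c = [130, 125, 150, 140, 135]
f_stat, p_anova = stats.f_oneway(site_a, site_b, site_c)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.3f}")

# Correlation: strength of the relationship between two numerical variables.
sessions = [1, 2, 3, 4, 5, 6, 7, 8]         # sessions attended
scores = [55, 60, 58, 70, 72, 80, 78, 88]   # knowledge test scores
r, p_corr = stats.pearsonr(sessions, scores)
print(f"Correlation: r = {r:.2f}, p = {p_corr:.3f}")
```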
Multivariate Data Analyses
Analyses to study three or more variables
Multiple regression – predicting by using several variables (numerical); see the sketch after this list
Stepwise
Logistic
General linear
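As a hedged multivariate illustration, the sketch below fits a logistic regression predicting a yes/no outcome (quitting smoking) from several predictors at once. The data, variable names, and predictor choices are all hypothetical; scikit-learn is one common tool for this.

```python
# Multivariate sketch: logistic regression with several predictors.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Predictors: sessions attended, baseline cigarettes/day, years smoked.
X = np.array([
    [8, 20, 10], [3, 30, 20], [7, 15, 5], [2, 25, 15], [8, 10, 3],
    [4, 35, 25], [6, 18, 8], [1, 40, 30], [5, 22, 12], [6, 28, 18],
])
y = np.array([1, 0, 1, 0, 1, 1, 0, 0, 1, 0])  # 1 = quit, 0 = did not quit

model = LogisticRegression().fit(X, y)
print("coefficients:", model.coef_[0])        # one per predictor

# Predicted probability of quitting for a new, hypothetical participant.
new_participant = [[5, 20, 10]]
print("P(quit):", round(model.predict_proba(new_participant)[0, 1], 2))
```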
Examples of Evaluation Questions Answered Using Univariate, Bivariate, and Multivariate Data Analysis
| Univariate Analysis | Bivariate Analysis | Multivariate Analysis |
| --- | --- | --- |
| What was the average score on the cholesterol knowledge test? | Is there a difference in smoking behavior between the individuals in the experimental and control groups after the healthy lifestyle program? | Can the risk of heart disease be predicted using smoking, exercise, diet, and heredity? |
| How many participants at the worksite attended the healthy lifestyle presentation? | Is peer education or classroom instruction more effective in increasing knowledge about the effects of drug abuse? | Can mortality risk among motorcycle riders be predicted from helmet use, time of day, weather conditions, and speed? |
| What percentage of the participants in the corporate fitness program met their target goal? | Do students' attitudes about bicycle helmets differ in rural and urban settings? | Which of the following most accurately predict successful management of stress among program participants: physical activity, diet, meditation, anger management, yoga, or deep breathing? |
Measurement Considerations
Units of Observation
The unit that is observed or measured must match the level at which the program is targeted and delivered
Types of Variables
Nominal variables – simplest; "dichotomous" or categorical
Ordinal variables – indicate a level, order, sequence, or rank
Interval variables – continuous or discrete; most complex
For every area of health and wellbeing, health behavior, etc., variables can be constructed at each of the three levels above.
Advantages & Disadvantages of Using Each Type of Variable
| Type | Examples | Advantage | Disadvantage |
| --- | --- | --- | --- |
| Nominal (categorical) | ZIP code; race; yes/no | Easy to understand | Limited information from the data |
| Ordinal (rank) | Social class; Likert-type scale; "top 10" list (worst to best) | Gives considerable information; can collapse into nominal categories | Sometimes statistically treated as a nominal variable; ranking can be a difficult task for respondents |
| Interval (continuous) | Temperature, age, IQ, distances, dollars, dates of birth | Gives most information; can collapse into nominal or ordinal categories; can use as a continuous variable | Can be difficult to construct valid and reliable interval variables |
Examples of Nominal, Ordinal and Continuous Variables
| Dependent Outcome Variable | Nominal | Ordinal | Interval/Continuous |
| --- | --- | --- | --- |
| Physical abuse | Yes/No have experienced physical abuse; type of abuse | Level of abuse is the same as, more than, or less than last month | Rate of physical abuse in a county; number of times abused in past 6 months |
| Workplace injury | ICD-10 code for injury | Level of severity of injury | Number of disability days per year in a company or in the construction industry |
| Understand how alcohol affects judgment | Agree/Disagree | Three most common ways alcohol affects judgment | Score on a test of knowledge about the effects of alcohol |
| Smoking | Yes/No smoke | Light, moderate, or heavy smoker | Nicotine levels; number of cigarettes in past 24 hours |
| Readiness for change | Yes/No likely to change next week | Stage of change in terms of readiness | How likely (on a 100-point scale) to change in the next week |
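Following the smoking row of the table, here is a minimal sketch of how one outcome can be coded at each measurement level. The five participants and their values are hypothetical; pandas is one convenient tool for this kind of coding.

```python
# Coding the "smoking" outcome at three measurement levels.
import pandas as pd

df = pd.DataFrame({
    # Nominal: unordered categories
    "smokes": ["yes", "yes", "no", "yes", "no"],
    # Ordinal: categories with a meaningful order (missing for nonsmokers)
    "level": pd.Categorical(
        ["light", "heavy", None, "moderate", None],
        categories=["light", "moderate", "heavy"], ordered=True),
    # Interval/continuous: cigarettes smoked in the past 24 hours
    "cigs_per_24h": [5, 25, 0, 12, 0],
})

print(df["smokes"].value_counts())       # nominal: frequencies only
print(df.sort_values("level")["level"])  # ordinal: can be meaningfully ranked
print(df["cigs_per_24h"].mean())         # continuous: means, SDs, etc.
```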
Final Thoughts
Evaluation design should be considered early in the planning process. Evaluators need to identify what measurements will be taken, as well as when and how. In doing so, a design should be selected that controls for threats to both internal and external validity.
This section of the lecture may be difficult if you have not had a course in data analysis or evaluation. If you remember nothing else, remember this: have someone with expertise in this area on board BEFORE you complete the planning. It may mean hiring or contracting with someone, but all your hard work of planning a program is worth nothing if you do not have a sound evaluation plan and credible numbers to show that your intervention works!
Satisfaction surveys are NOT ENOUGH!