POPULATIONS AND SAMPLING

 

Populations

Definition - a complete set of elements (persons or objects) that possess some common characteristic defined by the sampling criteria established by the researcher

Composed of two groups - target population & accessible population

 

Target population (universe)

The entire group of people or objects to which the researcher wishes to generalize the study findings

Meet set of criteria of interest to researcher

Examples

All institutionalized elderly with Alzheimer's

All people with AIDS

All low birth weight infants

All school-age children with asthma

All pregnant teens

Accessible population

the portion of the population to which the researcher has reasonable access; may be a subset of the target population

May be limited to region, state, city, county, or institution

Examples

All institutionalized elderly with Alzheimer's in St. Louis county nursing homes

All people with AIDS in the metropolitan St. Louis area

All low birth weight infants admitted to the neonatal ICUs in St. Louis city & county

All school-age children with asthma treated in pediatric asthma clinics in university-affiliated medical centers in the Midwest

All pregnant teens in the state of Missouri

 

Samples

Terminology used to describe samples and sampling methods

Sample = the selected elements (people or objects) chosen for participation in a study; people are referred to as subjects or participants

Sampling = the process of selecting a group of people, events, behaviors, or other elements with which to conduct a study

Sampling frame = a list of all the elements in the population from which the sample is drawn

Could be extremely large if population is national or international in nature

Frame is needed so that everyone in the population is identified so they will have an equal opportunity for selection as a subject (element)

Examples

A list of all institutionalized elderly with Alzheimer's in St. Louis county nursing homes affiliated with BJC

A list of all people with AIDS in the metropolitan St. Louis area who are members of the St. Louis Effort for AIDS

A list of all low birth weight infants admitted to the neonatal ICUs in St. Louis city & county in 1998

A list of all school-age children with asthma treated in pediatric asthma clinics in university-affiliated medical centers in the Midwest

A list of all pregnant teens in the Henderson school district

Randomization = each individual in the population has an equal opportunity to be selected for the sample

Representativeness = sample must be as much like the population in as many ways as possible

Sample reflects the characteristics of the population, so those sample findings can be generalized to the population

Most effective way to achieve representativeness is through randomization; random selection or random assignment

Parameter = a numerical value or measure of a characteristic of the population; remember P for parameter & population

Statistic = numerical value or measure of a characteristic of the sample; remember S for sample & statistic

Precision = the accuracy with which the population parameters have been estimated; remember that population parameters often are based on the sample statistics

 

Types of Sampling Methods - probability & non-probability

 

Probability Sampling Methods

Also called random sampling

  • Every element (member) of the population has a probability greater than zero of being selected for the sample
  • Everyone in the population has equal opportunity for selection as a subject
  • Increases sample's representativeness of the population
  • Decreases sampling error and sampling bias

Types of probability sampling - see table in course materials for details

 

Simple random

  • Elements selected at random
  • Assign each element a number
  • Select elements for study by:

 

  1. Using a table of random numbers in book

A table displaying hundreds of digits from 0 to 9 set up in such a way that each number is equally likely to follow any other

See text for random sampling details & table of random numbers

  2. Using a computer-generated table of random numbers
  3. Drawing numbers from a box (hat)
  4. Drawing bingo numbers
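The procedure above (number every element, then pick numbers at random) can be sketched in Python. This is only an illustration, not part of the course text: the frame of 2000 numbered elements, the sample size of 50, and the seed are assumptions chosen for the example.

```python
import random

# Hypothetical sampling frame: elements numbered 1..2000 (an assumption).
sampling_frame = list(range(1, 2001))

# A fixed seed makes the example repeatable; omit it in real use.
rng = random.Random(42)

# sample() draws without replacement, so every element has the same
# chance of selection and no element can be chosen twice.
simple_random_sample = rng.sample(sampling_frame, k=50)

print(sorted(simple_random_sample)[:5])
```

The computer plays the role of the random numbers table here; the numbering step is still what ties each draw back to a person or object in the frame.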

Stratified random

    Population is divided into subgroups, called strata, according to some variable or variables of importance to the study

    Variables often used include: age, gender, ethnic origin, SES, diagnosis, geographic region, institution, or type of care

    Two approaches to stratification - proportional & disproportional

     

    Proportional

    Subgroup sample sizes equal the proportions of the subgroup in the population

    Example: A high school population has

    15% seniors

    25% juniors

    25% sophomores

    35% freshmen

    With proportional sample the sample has the same proportions as the population

    Disproportional

    Subgroup sample sizes are not equal to the proportion of the subgroup in the population

    Example

    Class         Population    Sample
    Seniors       15%           25%
    Juniors       25%           25%
    Sophomores    25%           25%
    Freshmen      35%           25%

    With disproportional sample the sample does not have the same proportions as the population
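The two approaches to stratification can be compared in a short Python sketch. It reuses the high school percentages from the examples above; the school size of 1000, the total sample of 100, and the student labels are assumptions made up for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical school of 1000 students matching the percentages above.
strata_sizes = {"Seniors": 150, "Juniors": 250, "Sophomores": 250, "Freshmen": 350}
strata = {
    name: [f"{name}-{i}" for i in range(1, size + 1)]
    for name, size in strata_sizes.items()
}

def stratified_sample(strata, allocation):
    """Draw a simple random sample of allocation[name] elements from each stratum."""
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

total_n = 100
population_n = sum(strata_sizes.values())

# Proportional: each stratum's share of the sample equals its share of the population.
proportional = {name: round(total_n * size / population_n)
                for name, size in strata_sizes.items()}

# Disproportional: equal numbers from each stratum, regardless of population share.
disproportional = {name: total_n // len(strata_sizes) for name in strata_sizes}

prop_sample = stratified_sample(strata, proportional)
disprop_sample = stratified_sample(strata, disproportional)

print(proportional)     # 15 / 25 / 25 / 35
print(disproportional)  # 25 in every class
```

In both cases selection within each stratum is still random; only the number taken from each stratum differs.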

    Cluster random sampling

    A random sampling process that involves stages of sampling

    The population is first listed by clusters or categories

    Procedure

    Randomly select 1 or more clusters and take all of their elements (single stage cluster sampling); e.g. Midwest region of the US

    Or, in a second stage, randomly select clusters from within the first-stage clusters; e.g. 3 states within the Midwest region

    In a third stage, randomly select elements from the second stage of clusters; e.g. 30 county health dept. nursing administrators from each state
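The three stages described above can be sketched as follows. The region names, the six placeholder states per region, and the 100 administrators per state are invented so the code has something to sample from; they are not real data.

```python
import random

rng = random.Random(1)  # fixed seed so the example is repeatable

# Hypothetical frame: region -> state -> list of nursing administrators.
region_names = ["Midwest", "Northeast", "South", "West"]
frame = {
    region: {
        f"{region} state {s}": [f"{region} state {s} admin {a}" for a in range(1, 101)]
        for s in range(1, 7)
    }
    for region in region_names
}

# Stage 1: randomly select one region (a cluster).
chosen_region = rng.choice(region_names)

# Stage 2: randomly select 3 states within the chosen region.
chosen_states = rng.sample(sorted(frame[chosen_region]), 3)

# Stage 3: randomly select 30 administrators within each chosen state.
cluster_sample = {
    state: rng.sample(frame[chosen_region][state], 30) for state in chosen_states
}

print(chosen_region, chosen_states)
```

Stopping after stage 1 and taking every element in the chosen region would be the single-stage version.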

     

    Systematic

    A random sampling process in which every kth element or member of the population (e.g. every 5th) is selected for the sample after a random start is determined

    Example

    Population (N) = 2000, sample size (n) = 50, k = N/n, so k = 2000 / 50 = 40

    Use a table of random numbers to determine the starting point for selecting every 40th subject

    With list of the 2000 subjects in the sampling frame, go to the starting point, and select every 40th name on the list until the sample size is reached. Probably will have to return to the beginning of the list to complete the selection of the sample.
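The worked example above (N = 2000, n = 50, k = 40) can be written as a short Python sketch. The numbered list standing in for the sampling frame and the seed are assumptions; the computer's random number generator replaces the table of random numbers.

```python
import random

rng = random.Random(7)  # fixed seed so the example is repeatable

N, n = 2000, 50
k = N // n  # sampling interval: 2000 / 50 = 40

frame = list(range(1, N + 1))  # hypothetical numbered list of subjects

# Pick a random starting position, then take every kth element,
# wrapping back to the beginning of the list when the end is reached.
start = rng.randrange(N)
systematic_sample = [frame[(start + i * k) % N] for i in range(n)]

print(k, systematic_sample[:3])
```

Because k times n equals N here, wrapping around the list never selects the same subject twice.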

     

    Non-probability sampling methods

    Characteristics

    Not every element of the population has the opportunity for selection in the sample

    No sampling frame

    Population parameters may be unknown

    Non-random selection

    More likely to produce a biased sample

    Restricts generalization

    Historically, used in most nursing studies

    Types of non-probability sampling methods

     

    Convenience - aka chunk, accidental & incidental sampling

    Selection of the most readily available people or objects for a study

    No way to determine representativeness

    Saves time and money

     

    Quota

    Selection of sample to reflect certain characteristics of the population

    Similar to stratified but does not involve random selection

    Quotas for subgroups (proportions) are established

    E.g. 50 males & 50 females; recruit the first 50 men and first 50 women that meet inclusion criteria
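The quota example above can be sketched in Python to show where it departs from stratified sampling: subjects are taken in order of arrival until each quota is full, with no random selection. The stream of 300 volunteers and the rule assigning their sex are invented for illustration.

```python
quotas = {"male": 50, "female": 50}
recruited = {group: [] for group in quotas}

# Hypothetical eligible volunteers in order of arrival (every third one is male).
arrivals = [(f"volunteer-{i}", "male" if i % 3 == 0 else "female")
            for i in range(1, 301)]

for person, sex in arrivals:
    # Accept the volunteer only if that subgroup's quota is not yet filled.
    if len(recruited[sex]) < quotas[sex]:
        recruited[sex].append(person)
    # Stop recruiting once every quota is met.
    if all(len(recruited[g]) == quotas[g] for g in quotas):
        break

print({group: len(people) for group, people in recruited.items()})
```

Anyone who arrives after a quota is filled has no chance of selection, which is why representativeness cannot be assumed.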

     

    Purposive - aka judgmental or expert's choice sampling

    Researcher uses personal judgement to select subjects that are considered to be representative of the population

    Handpicked subjects

    Typical subjects experiencing problem being studied

     

    Snowball

    Also known as network sampling

    Subjects refer the researcher to others who might be recruited as subjects

     

    Time Frame for Studying the Sample

    See design notes on longitudinal & cross-sectional studies

    Longitudinal

    Cross-sectional

     

    Sample Size

    General rule - as large as possible to increase the representativeness of the sample

    Increased size decreases sampling error

    Relatively small samples in qualitative, exploratory, case studies, experimental and quasi-experimental studies

    Descriptive studies need large samples; e.g. 10 subjects for each item on the questionnaire or interview guide

    As the number of variables studied increases, the sample size also needs to increase in order to detect significant relationships or differences

    A minimum of 30 subjects is needed for use of the central limit theorem (statistics based on the mean)

    Large samples are needed if:

    There are many uncontrolled variables

    Small differences are expected in the sample/population on variables of interest

    The sample is divided into subgroups

    Dropout rate (mortality) is expected to be high

    Statistical tests used require minimum sample or subgroup size
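The general rule above, that a larger sample decreases sampling error, can be illustrated with the standard error of the mean (the population standard deviation divided by the square root of the sample size). This is a minimal sketch: the standard deviation of 15 is an assumed value, and the formula applies to simple random samples.

```python
import math

sigma = 15.0  # assumed population standard deviation (illustrative only)

def standard_error(sigma, n):
    """Standard error of the sample mean for a simple random sample of size n."""
    return sigma / math.sqrt(n)

for n in (25, 100, 400):
    print(n, standard_error(sigma, n))
```

Each fourfold increase in sample size only halves the standard error, so the gain from adding subjects shrinks as the sample grows.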

     

    Power Analysis

    Power analysis = a procedure for estimating either the likelihood of committing a Type II error or the sample size requirements of a study

     

    Background Information for Understanding Power Analysis:

    Type I and Type II errors

    Type I error

    Based on the statistical analysis of data, the researcher wrongly rejects a true null hypothesis; and therefore, accepts a false alternative hypothesis

    Probability of committing a Type I error is controlled by the researcher with the level of significance, alpha (α).

    Alpha (α) is the probability that a Type I error will occur

    Alpha (α) is established by the researcher; usually α = .05 or .01

    α = .05 means there is a 5% chance of rejecting a true null hypothesis; OR out of 100 samples, a true null hypothesis would be expected to be rejected 5 times and accepted 95 times.

    α = .01 means there is a 1% chance of rejecting a true null hypothesis; OR out of 100 samples, a true null hypothesis would be expected to be rejected 1 time and accepted 99 times.
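The meaning of alpha = .05 can be checked with a small simulation: draw many samples from a population in which the null hypothesis really is true and count how often a two-tailed z-test rejects it. The population mean of 100, standard deviation of 15, sample size of 30, and 4000 trials are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

rng = random.Random(123)  # fixed seed so the example is repeatable

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value, about 1.96

# The null hypothesis is TRUE here: every sample really comes from mean 100.
mu0, sigma, n = 100.0, 15.0, 30
trials = 4000

rejections = 0
for _ in range(trials):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:
        rejections += 1  # Type I error: a true null hypothesis was rejected

type_i_rate = rejections / trials
print(type_i_rate)
```

The printed rate should land close to .05, varying a little from run to run when the seed is changed.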

    Type II error

    Based on the statistical analysis of data, the researcher wrongly accepts a false null hypothesis; and therefore, rejects a true alternate hypothesis

    Probability of committing a Type II error can be reduced by using a power analysis to plan an adequate sample size

    Probability of a Type II error is called beta (β)

    Power, or 1 - β, is the probability of correctly rejecting a false null hypothesis, i.e. of obtaining a statistically significant result when a real effect exists

     

    Type I & Type II Errors

    Rows show the researcher's conclusion based on statistical analysis; columns show the actual situation in the real world.

    Researcher's conclusion                 Null is actually true                   Null is actually false
    Null accepted                           Correct decision: true null accepted    Type II error: false null accepted
    Null rejected & alternate accepted      Type I error: true null rejected        Correct decision: false null rejected & alternate accepted

    Background Information for Understanding Power Analysis:

    Population Effect Size - Gamma (γ)

    Gamma (γ) measures how wrong the null hypothesis is; it measures how strong the effect of the IV is on the DV; and it is used in performing a power analysis

    Gamma (γ) is calculated based on population data from prior research studies, or determined several different ways depending on the nature of the data and the statistical tests to be performed

    The textbook discusses 4 ways to estimate gamma (population effect size) based upon:

    Testing the difference between 2 means (t-test)

    Testing the difference between 3 or more means (ANOVA)

    Testing bivariate correlation (relationship) between 2 variables (Pearson's r)

    Testing the difference in proportions between 2 groups (chi-square)

    If there is no relevant research on the topic from which to estimate the population effect size (gamma), then use the conventional guidelines for gamma (γ) or its equivalent

    Testing the difference between 2 means (t-test) - gamma (γ) for small effects γ = .20; medium effects γ = .50; large effects γ = .80

    Testing the difference between 3 or more means (ANOVA) - eta squared (η²) for small effects η² = .01; medium effects η² = .06; large effects η² = .14

    Testing bivariate correlation (relationship) between 2 variables (Pearson's r) - gamma (γ) for small effects γ = .10; medium effects γ = .30; large effects γ = .50

    Testing the difference in proportions between 2 groups (chi-square) - no conventions for unknown populations
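For the two-means case, the effect size can be estimated from prior research as a standardized difference. The sketch below assumes the usual definition (difference between the two group means divided by the standard deviation) and labels the result using the .20 / .50 / .80 conventions listed above. The means of 52 and 47 and the standard deviation of 10 are invented values, and the function names are hypothetical.

```python
def gamma_two_means(mean_1, mean_2, sd):
    """Effect size for two means: the difference expressed in standard deviation units."""
    return abs(mean_1 - mean_2) / sd

def label_effect(gamma):
    """Label an effect size using the .20 / .50 / .80 conventions for two means."""
    if gamma >= 0.80:
        return "large"
    if gamma >= 0.50:
        return "medium"
    if gamma >= 0.20:
        return "small"
    return "below the small convention"

# Hypothetical values from a prior study (assumptions for illustration).
gamma = gamma_two_means(52.0, 47.0, 10.0)
print(gamma, label_effect(gamma))
```

Treating the conventions as cut-off points is a simplification made for this example; they are rough guides rather than strict boundaries.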

    Determining Sample Size through Power Analysis

    Need to have the following data:

    Level of significance criterion = alpha (α); use .05 for most nursing studies and your calculations

    Power = 1 - β (beta); if beta is not known, the standard power is .80, so use this when you are determining sample size

    Population effect size = gamma (γ) or its equivalent, e.g. eta squared (η²); use the recommended values for a small, medium, or large effect for the statistical test you plan to use to answer research questions or test hypotheses

    Use tables on pages 455-459 of Polit & Hungler or other reference

    Mathematical formulas and computer programs can also be used for calculation of sample size
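As one example of such a formula, the sketch below estimates the number of subjects needed per group for a two-tailed comparison of two means, given alpha, power, and gamma. It uses a normal-curve approximation, which is an assumption of this example; published tables based on the t distribution may give slightly larger numbers. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(gamma, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-tailed test of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # value corresponding to the desired power
    return math.ceil(2 * ((z_alpha + z_power) / gamma) ** 2)

for label, gamma in (("small", 0.20), ("medium", 0.50), ("large", 0.80)):
    print(label, gamma, n_per_group(gamma))
```

With alpha = .05 and power = .80 this gives about 393, 63, and 25 subjects per group for small, medium, and large effects, which shows how sharply the required sample grows as the expected effect shrinks.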

     

    Sampling Error and Sampling Bias

    Sampling error = The difference between the sample statistic (e.g. sample mean) and the population parameter (e.g. population mean) that is due to the random fluctuations in data that occur when the sample is selected

    Sampling bias

    Also called systematic bias or systematic variance

    The difference between sample data and population data that can be attributed to faulty sampling of the population

    Consequence of selecting subjects whose characteristics (scores) are different in some way from the population they are supposed to represent

    This usually occurs when randomization is not used
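The difference between sampling error and sampling bias can be illustrated with a made-up population of 100 scores. A random sample misses the population mean by a chance amount, while a sample taken only from the most readily available end of the list misses it systematically. The scores, the sample size of 20, and the seed are assumptions for illustration.

```python
import random

rng = random.Random(5)  # fixed seed so the example is repeatable

population = list(range(1, 101))                # hypothetical scores 1..100
parameter = sum(population) / len(population)   # population mean = 50.5

# Random selection: the gap from the parameter is sampling error (chance).
random_sample = rng.sample(population, 20)
random_error = sum(random_sample) / 20 - parameter

# Non-random selection: taking only the first 20 scores on the list
# produces sampling bias, a systematic gap that chance does not explain.
biased_sample = population[:20]
biased_error = sum(biased_sample) / 20 - parameter

print(random_error, biased_error)
```

Drawing a new random sample changes the size and direction of the sampling error, but the biased sample is off by the same amount every time.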

     

    Randomization Procedures in Research

    Randomization = each individual in the population has an equal opportunity to be selected for the sample

    Random selection = from all people who meet the inclusion criteria, a sample is randomly chosen

    Random assignment

    The assignment of subjects to treatment conditions in a random manner.

    It has no bearing on how the subjects participating in an experiment are initially selected.

    See Polit & Hungler, pp. 160-162, for random assignment of subjects to groups and random assignment of groups to treatment using a random numbers table
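Random assignment can also be done by computer rather than with a random numbers table. The sketch below is an illustration only, not the textbook's procedure: it shuffles 20 hypothetical subjects, however they were originally recruited, and splits them into two equal treatment conditions.

```python
import random

rng = random.Random(11)  # fixed seed so the example is repeatable

# 20 hypothetical subjects; how they were selected is a separate question.
subjects = [f"S{i:02d}" for i in range(1, 21)]

# Shuffle a copy so the original enrollment order is left untouched.
shuffled = subjects[:]
rng.shuffle(shuffled)

# The first half of the shuffled list gets the treatment; the rest are controls.
half = len(shuffled) // 2
groups = {"treatment": shuffled[:half], "control": shuffled[half:]}

print(groups["treatment"])
```

Every subject ends up in exactly one group, and chance alone decides which.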

     
