Data collection methods

Data Collection Methods: Pros and Cons of Primary and Secondary Data

Transcript of Data collection methods


Where do data come from?
We’ve seen our data for this lab, all nice and collated in a database, from:
– Insurance companies (claims, medications, procedures, diagnoses, etc.)
– Firms (demographic data, productivity data, etc.)

Where do data come from?
Take a step back: if we’re starting from scratch, how do we collect / find data?
– Secondary data
– Primary data

Secondary Data
Secondary data: data someone else has collected
– This is what you were looking for in your assignment.

Secondary Data – Examples of Sources
County health departments
Vital Statistics: birth, death certificates
Hospital, clinic, school nurse records
Private and foundation databases
City and county governments
Surveillance data from state government programs
Federal agency statistics - Census, NIH, etc.

Secondary Data – Limitations
What did you find on the frustrating side as you looked for data on the state’s websites?

Secondary Data – Limitations
When was it collected? For how long?
– May be out of date for what you want to analyze.
– May not have been collected long enough for detecting trends.
– E.g. Have new anticorruption laws impacted Russia’s government accountability ratings?

Secondary Data – Limitations
Is the data set complete?
– There may be missing information on some observations.
– Unless such missing information is caught and corrected for, analysis will be biased.
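
A first step toward catching missing information is simply counting it. The following is a minimal sketch, not a prescribed method: the records, the field names (`age`, `claims`), and the helper `missing_share` are all hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical records for illustration; None marks a missing value.
records = [
    {"id": 1, "age": 34, "claims": 2},
    {"id": 2, "age": None, "claims": 0},
    {"id": 3, "age": 51, "claims": None},
    {"id": 4, "age": 29, "claims": 1},
]


def missing_share(rows, fields):
    """Return, per field, the fraction of rows where the value is missing (None)."""
    return {f: sum(r.get(f) is None for r in rows) / len(rows) for f in fields}


# Share of missing values in each variable of interest.
shares = missing_share(records, ["age", "claims"])

# Rows with no missing values at all ("complete cases").
complete = [r for r in records if all(v is not None for v in r.values())]

print(shares)
print(len(complete), "of", len(records), "rows are complete")
```

Counting only shows how much is missing. Whether dropping incomplete rows biases the analysis depends on why the values are missing, which the counts alone cannot tell you.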

Secondary Data – Limitations
Are there confounding problems?
– Sample selection bias?
– Source choice bias?
– In time series, did some observations drop out over time?
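
The drop-out question can be checked directly when each observation carries an identifier. This is a rough sketch under assumed data: the `waves` dictionary (year mapped to the set of ids observed that year) and the helper `attrition` are hypothetical.

```python
# Hypothetical panel: which observation ids were seen in each year.
waves = {
    2019: {1, 2, 3, 4, 5},
    2020: {1, 2, 3, 5},
    2021: {1, 3, 5},
}


def attrition(panel):
    """For each period after the first, list baseline ids that are absent."""
    periods = sorted(panel)
    baseline = panel[periods[0]]
    return {p: sorted(baseline - panel[p]) for p in periods[1:]}


# Which of the original observations have dropped out by each later year?
dropped = attrition(waves)

# Share of the first-year observations still present in each year.
first = waves[min(waves)]
retention = {p: len(ids & first) / len(first) for p, ids in waves.items()}

print(dropped)
print(retention)
```

If the observations that drop out differ systematically from those that stay, trends computed on the survivors may be misleading.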

Secondary Data – Limitations
Are the data consistent/reliable?
– Did variables drop out over time?
– Did variables change in definition over time? E.g. number of years of education versus highest degree obtained.

Secondary Data – Limitations
Is the information exactly what you need?
– In some cases, may have to use “proxy variables”: variables that may approximate something you really wanted to measure. Are they reliable? Is there correlation to what you actually want to measure?
– E.g. gauging student interest in U.W. by their ranking on FAFSA, which is subject to gamesmanship.
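
One way to probe the correlation question is to compute a correlation coefficient on a small subsample where both the proxy and the thing you really want were measured. The sketch below assumes such a subsample exists. The numbers are invented for illustration, and `pearson` is a hand-written helper, not part of any dataset or library discussed here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical validation subsample: proxy values and the directly
# measured quantity for the same five observations.
proxy = [1, 2, 3, 4, 5]
target = [2, 4, 5, 4, 5]

r = pearson(proxy, target)
print(round(r, 3))
```

A high correlation in the subsample is encouraging but not proof: if people can game the proxy, as in the FAFSA example, the relationship may not hold in the full data.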

Secondary Data – Advantages
No need to reinvent the wheel.
– If someone has already found the data, take advantage of it.

Secondary Data – Advantages
It will save you money.
– Even if you have to pay for access, often it is cheaper in terms of money than collecting your own data. (More on this later.)

Secondary Data – Advantages
It will save you time.
– Primary data collection is very time consuming. (More on this later, too!)

Secondary Data – Advantages
It may be very accurate.
– Especially when a government agency has collected the data, incredible amounts of time and money went into it. It’s probably highly accurate.

Secondary Data – Advantages
It has great exploratory value.
– Exploring research questions and formulating hypotheses to test.

Primary Data
Primary data: data you collect

Primary Data - Examples
Surveys
Focus groups
Questionnaires
Personal interviews
Experiments and observational study

Primary Data - Limitations
Do you have the time and money for:
– Designing your collection instrument?
– Selecting your population or sample?
– Pretesting/piloting the instrument to work out sources of bias?
– Administration of the instrument?
– Entry/collation of data?

Primary Data - Limitations
Uniqueness
– May not be able to compare to other populations

Primary Data - Limitations
Researcher error
– Sample bias
– Other confounding factors

Data collection choice
What you must ask yourself:
– Will the data answer my research question?

Data collection choice
To answer that:
– You must first decide what your research question is.
– Then you need to decide what data/variables are needed to scientifically answer the question.

Data collection choice
If those data exist in secondary form, then use them to the extent you can, keeping in mind limitations.
But if they do not, and you are able to fund primary collection, then it is the method of choice.