The need for data
Most research projects need data in order to address a proposed research problem. The data that need to be acquired, and the sources of such data, must be identified as a matter of utmost importance. No amount or depth of subsequent data analysis can make up for an original lack of data quantity or quality.
Research problems and objectives (or hypotheses) need to be very carefully constructed and clearly defined, as they dictate the data that need to be obtained and analyzed in order to address those objectives successfully. In addition, the quantity of the data, their quality, and how they are sampled and measured all have implications for the choice and effectiveness of the techniques used in the subsequent analysis.
Fundamental questions to be asked (and hopefully answered) with respect to the proposed research and data include:
What data are needed?
What data need to be measured or obtained? What are the required characteristics of the data in terms of their quantities and qualities?
Do the data already exist and can they be obtained?
If so, what are the sources of the data? How were the data measured? What are the characteristics of the data in terms of their type, quality, resolution, precision, accuracy, and coverage? Is the quantity of data sufficient? Are their characteristics suited to, and sufficient for, the study? How will you actually assess their suitability?
If the data do not exist, what data need to be generated?
What data characteristics are required in terms of data type, quality, quantity, resolution, precision, accuracy, and coverage, in order to properly address the research objectives? What variables will be measured? How will measurements be made? What sampling scheme will be employed, and why? What logistical problems (e.g., accessibility) need to be considered? At what scale(s) will measurements be made? How will you ensure that you are measuring what you think you are measuring (a tricky one!)?
What implications are there for the subsequent analysis?
How does sample size constrain the effectiveness (e.g., power) of statistical tests? Are replicate observations needed? Is there a spatial dimension to your data, and if so, have you worked out what the distance between your samples should be? Have you over-sampled or under-sampled, and can this be remedied beforehand? Are the data "representative", and how do you know? Are the data "random" or "stratified" or "nested", and does this matter? Does the type of data (ratio scale, interval, ordinal, discrete, nominal, closed, directional) have implications for data analysis? (Yes!)
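To make the question about sample size and power concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommended procedure: it assumes a two-sided test comparing the means of two equal-sized groups, uses a normal (z-test) approximation rather than an exact t-test calculation, and expresses the difference between groups as a standardized effect size (difference in means divided by the common standard deviation). The function names approx_power and min_n_per_group are hypothetical, chosen only for this example.

```python
import math
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of means.

    Assumption: normal (z-test) approximation with equal group sizes;
    effect_size is the standardized difference between the group means.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis
    shift = effect_size * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def min_n_per_group(effect_size, target_power=0.8, alpha=0.05, n_max=100_000):
    """Smallest per-group sample size reaching target_power, or None."""
    for n in range(2, n_max + 1):
        if approx_power(effect_size, n, alpha) >= target_power:
            return n
    return None


if __name__ == "__main__":
    # How power grows with sample size for a medium-sized effect (0.5)
    for n in (10, 20, 40, 80, 160):
        print(f"n per group = {n:4d}  approx. power = {approx_power(0.5, n):.2f}")
    print("n per group needed for 80% power:", min_n_per_group(0.5))
```

Running a calculation of this kind before any data are collected shows whether the planned sample size can plausibly detect an effect of the size you care about. If it cannot, that is the moment to revise the sampling design, not after the measurements have been made.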
- Most research requires data and data analysis.
- Data acquisition is of utmost importance and considerable effort should be made to obtain or generate good data.
- Good data are data whose characteristics enable the research objectives to be met.
- Data of poor quality or insufficient quantity will lead to unsatisfactory data analysis and vague results.
- The characteristics of the data, particularly their type, quantity, and how they were sampled, constrain the choice of data analysis techniques that can be applied to them.
- Data analysis can only be as good as the original data allow.