Glossary

Anonymization: the process of removing or transforming identifying elements of data such that the data cannot be used by a third party, or by the data custodian, to (re)identify an individual, either directly or in combination with other extant data.

Artifact: an artificial object (e.g., computer program, document, method, model, practice, technique, template).

Chain-of-evidence: (in qualitative research) a mapping of raw data (e.g., quotations) to theoretical concepts (e.g., themes, categories), typically with one or more intermediate steps (e.g., codes, labels, subcategories), sometimes presented as a table.

Credibility: the extent to which conclusions are supported by rich, multivocal evidence.

Construct validity: Do the measures support the research objective? The questionnaire items (questions) and related response scales should accurately represent the research aims.

Deception: A situation in which some aspects of a study are intentionally concealed from participants to permit aspects of the research that would not be possible if full information were given.

De-identification: the process of removing identifying elements of data; often used synonymously with both Anonymization and Pseudonymization even though these processes have important differences. It is usually better to specify whether data is anonymous or pseudonymous.

External validity: Can the conclusions be generalized to the target population? The characteristics and size of the sample should represent the extended population.

Gatekeeper: A person or organization who controls access to individuals (e.g., employees) or data being researched. Gatekeepers may influence individuals’ participation decisions (e.g., employers may coerce employees to participate), affect research outcomes (e.g., by controlling whose voices are heard) or be affected by research outcomes (e.g., lose credibility with employees).

Generalizability: See External validity.

Informed consent: A freely made decision by an individual to participate in research in the light of full information about the purposes, benefits, and risks to them of participating in the research, and the options available to them to withdraw themselves or their data during or after the study. The freedom to consent/withdraw may be affected by pre-existing relationships with the researchers (e.g., family/friendship) or gatekeepers.

Internal validity: Are the relationships between the investigated factors examined? In survey research, it is difficult to control the conditions under which the factors are studied and to account for potential confounding factors, so low internal validity is expected.

Multivocal: The property of being based on—and recognizing differences between—people with different opinions and backgrounds (including gender, culture, education, and class).

Pseudonymization: The process of removing directly identifying elements of data and creating a separate map between the identifiable aspects of the data and the remainder. The map may be explicit (e.g., allocating a random number to a participant’s record in place of their name) or implicit (e.g., relying on timestamps to resolve logs to sign-on information). _Pseudo_nymizing differs from _anon_ymizing in that individuals can be re-identified using the map.
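
The explicit-map case can be illustrated with a minimal sketch. The function name `pseudonymize`, the `participant_id` field and the sample records are hypothetical and chosen only for illustration; a real study would also need to decide where the map is stored and who may access it.

```python
import secrets


def pseudonymize(records, id_field="name"):
    """Replace a directly identifying field with a random token.

    Returns (pseudonymized_records, key_map). The key_map is the explicit
    map described above and should be stored separately from the data.
    """
    key_map = {}    # token -> original identifier
    token_for = {}  # original identifier -> token, so repeats reuse a token
    pseudonymized = []
    for record in records:
        identifier = record[id_field]
        if identifier not in token_for:
            token = secrets.token_hex(8)  # random, not derived from the name
            token_for[identifier] = token
            key_map[token] = identifier
        # Copy every field except the identifying one, then attach the token.
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["participant_id"] = token_for[identifier]
        pseudonymized.append(clean)
    return pseudonymized, key_map


# Hypothetical sample data.
records = [
    {"name": "Alice", "role": "developer", "years": 4},
    {"name": "Bob", "role": "tester", "years": 7},
    {"name": "Alice", "role": "developer", "years": 4},
]
pseudo, key_map = pseudonymize(records)
```

Anyone holding `key_map` can reverse the substitution, which is why the result is pseudonymous rather than anonymous. Destroying the map does not by itself make the data anonymous, because the remaining fields may still identify someone in combination.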

Objectivity: Are the results free from the bias of the researchers? This can be achieved through standardization of the procedures for data collection, analysis, and interpretation.

Recoverability: A study is recoverable when readers can understand how the work was done and why it was done that way. All research should be recoverable.

Reflexivity: the extent to which authors reflect on their potential biases and interactions with the team, organization or community, especially possible negative impacts on some participants or stakeholders.

Re-identification: the process of connecting ostensibly anonymous or pseudonymous data back to an individual, typically by combining field values from a single dataset, or by combining elements from multiple datasets.
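
Combining elements from multiple datasets can be sketched as a simple linkage on shared fields. The function `link_records`, the field names and both datasets below are hypothetical; real linkage attacks often use fuzzier matching than the exact comparison assumed here.

```python
def link_records(deidentified, auxiliary, quasi_identifiers):
    """Return {record index: name} for de-identified records whose
    quasi-identifier values match exactly one entry in the auxiliary data."""
    matches = {}
    for index, record in enumerate(deidentified):
        key = tuple(record[q] for q in quasi_identifiers)
        candidates = [
            entry["name"]
            for entry in auxiliary
            if tuple(entry[q] for q in quasi_identifiers) == key
        ]
        # Only a unique match singles out an individual.
        if len(candidates) == 1:
            matches[index] = candidates[0]
    return matches


# Hypothetical survey responses with names removed.
survey = [
    {"team": "payments", "role": "tester", "years": 7, "answer": "burned out"},
    {"team": "payments", "role": "developer", "years": 4, "answer": "satisfied"},
    {"team": "search", "role": "developer", "years": 2, "answer": "neutral"},
]
# Hypothetical second dataset that an attacker might already hold.
staff_directory = [
    {"name": "Bob", "team": "payments", "role": "tester", "years": 7},
    {"name": "Alice", "team": "payments", "role": "developer", "years": 4},
    {"name": "Carol", "team": "payments", "role": "developer", "years": 4},
    {"name": "Dan", "team": "search", "role": "developer", "years": 2},
]
reidentified = link_records(survey, staff_directory, ["team", "role", "years"])
```

In this sketch the first and third responses are linked to a single person each, while the second remains ambiguous because two staff members share the same team, role and tenure.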

Reliability: The degree to which a measure can be applied consistently across time (test-retest), or when applied by different people (inter-rater).
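
Inter-rater consistency is often quantified with an agreement statistic. The sketch below computes Cohen's kappa for two raters, which is one common choice and not one prescribed by this glossary; the function name and the sample labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels assigned by two raters to six issue reports.
rater_a = ["bug", "bug", "feature", "bug", "feature", "feature"]
rater_b = ["bug", "feature", "feature", "bug", "feature", "bug"]
kappa = cohens_kappa(rater_a, rater_b)  # agreement 4/6, chance 0.5, kappa = 1/3
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. Test-retest reliability could be examined the same way by treating two rating sessions by one person as the two inputs.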

Replicability: A study is replicable when the data collection and analysis (on new data) can be repeated by an independent researcher. Positivist research should be replicable; interpretivists and postmodernists reject the notion that social science is replicable. Qualitative research is typically not replicable.

Reproducibility: A study is reproducible when an independent researcher can precisely recreate the results using the original study’s data and source code. Interpretivists and postmodernists reject the notion that social science is reproducible, and qualitative research is typically not reproducible. Much positivist research is not reproducible because it is impractical or unethical to publish the dataset.

Resonance: the extent to which a study’s conclusions make sense to (i.e., resonate with) participants.

Rigor: the extent to which theory, data collection, and data analysis are sufficient, appropriate and not oversimplified.

Site: the conceptual space within which a study’s data collection occurs. In qualitative and especially case study research, the concept of site is not limited to physical location, but rather defines the boundaries within which the research takes place.

Theoretical sampling: choosing which data to collect based on the emerging theory, concepts or categories; typically used in qualitative research, especially Grounded Theory.

Transferability: the extent to which a study’s results could plausibly apply to other sites, people or circumstances.

Usefulness: the extent to which a study provides actionable recommendations to researchers, practitioners or educators.