The Genealogical Research Process


Question Asking

The research process begins with a focused research question.

Information Gathering

Armed with an appropriate question, a researcher creates a search plan. This plan identifies all of the sources that need to be searched to satisfy the “reasonably exhaustive search” requirement of the Genealogical Proof Standard (GPS), i.e., all of the sources that may contain information suggesting answers to the research question. As the researcher executes this search plan, he should avoid forming answers to the question (i.e., no hypotheses) and focus only on potential evidence (i.e., information suggesting tentative answers to the research question).

Hypothesis Testing

When the researcher is done executing his search plan, he is presumably in possession of all “reasonably available” evidence. The researcher then uses his gathered evidence to form hypotheses (i.e., tentative answers to the research question based on all gathered evidence) and subjects these hypotheses to testing. A thorough researcher will consider all hypotheses that can be reasonably construed from the evidence gathered, testing each of them.

Conclusion Accepting

If a sole hypothesis passes testing and all conflicting evidence can be resolved, this sole hypothesis is called a conclusion.
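The acceptance rule above can be read as a small decision procedure. The following is a minimal sketch in Python, not part of GEDCOM X: the names `HypothesisResult` and `accept_conclusion` are hypothetical, and the sketch assumes each hypothesis has already been tested and its conflicting evidence assessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HypothesisResult:
    """Hypothetical record of one hypothesis after testing."""
    statement: str
    passed_testing: bool
    conflicts_resolved: bool  # True when all conflicting evidence has been resolved


def accept_conclusion(results: List[HypothesisResult]) -> Optional[HypothesisResult]:
    """Return the sole hypothesis that qualifies as a conclusion, else None."""
    survivors = [r for r in results if r.passed_testing]
    # A conclusion requires exactly one surviving hypothesis with no open conflicts.
    if len(survivors) == 1 and survivors[0].conflicts_resolved:
        return survivors[0]
    return None
```

If two hypotheses pass testing, or the single survivor still has unresolved conflicting evidence, the function returns `None` and no conclusion is reached.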

Proof Explained

If the researcher goes on to explain his conclusion by demonstrating the five GPS elements, the conclusion becomes proven.

Essential Data Concepts from the Genealogical Research Process

The GEDCOM X project identifies the following as the essential data concepts from the genealogical research process (represented here by the following data domain diagram and bulleted list of definitions):

Genealogical Research Process - Data Domain
  • Source: a container of Information [1]
  • Information: statements based on experience, fabrication, hearsay, intuition, observation, reading, research, or some other means; a Source’s surface content, including its physical characteristics; what we see or hear when we examine a source, not what we interpret [2]
  • Question: a question that research aims to answer; in genealogy, a focused question that seeks unknown Information about a documented person and that helps frame research scope, lead to relevant Information, and identify Evidence [3]
  • Evidence: a tentative Answer to a research Question that is the product of using Information to answer a research Question [4]
  • Analysis: notes or narrative text about the result of two processes: (a) recognizing the Information items a Source contains that are likely to answer a research question; (b) considering the characteristics, purpose, and history of a Source and its relevant Information items in order to determine their likely accuracy [5]
  • Hypothesis: a tentative Answer to a research Question resulting from correlating two or more independent items of Evidence
  • Conclusion: an accepted Answer to a research Question; a Hypothesis that has passed testing and for which conflicts can be resolved
  • Proof: a Conclusion explained in writing; an explanation that demonstrates the five GPS elements
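The definitions above describe a layered data model in which each concept builds on the one before it. The sketch below expresses that layering as Python dataclasses. It is an illustration only: the class names mirror the concept names, but every field name is an assumption, and these classes are not the GEDCOM X data types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Information:
    statement: str  # what the source says, before any interpretation


@dataclass
class Source:
    title: str
    information: List[Information] = field(default_factory=list)  # a Source contains Information


@dataclass
class Question:
    text: str


@dataclass
class Analysis:
    source: Source
    notes: str  # narrative about relevance and likely accuracy


@dataclass
class Evidence:
    question: Question
    answer: str  # tentative Answer drawn from Information
    information: List[Information]


@dataclass
class Hypothesis:
    question: Question
    answer: str
    evidence: List[Evidence]

    def __post_init__(self) -> None:
        # The definition calls for two or more independent items of Evidence;
        # only the count is checked here, independence is left to the researcher.
        if len(self.evidence) < 2:
            raise ValueError("a Hypothesis correlates at least two items of Evidence")


@dataclass
class Conclusion:
    hypothesis: Hypothesis  # a Hypothesis that passed testing, with conflicts resolved


@dataclass
class Proof:
    conclusion: Conclusion
    explanation: str  # written explanation demonstrating the five GPS elements
```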

Modeling Data Concepts from the Genealogical Research Process in GEDCOM X


Questions

At this time, the core GEDCOM X model does not include a provision for modeling Questions (research goals), but such a provision might be a good candidate for a future extension.


Answers

Use specializations of Subject to record Answers to research Questions.


Information

Use specializations of Subject to record Information found in a single Source. Configure the Subject to adhere to the Extracted Conclusion Constraints.
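As an illustration, a Person (one specialization of Subject) holding Information from a single Source might be built as follows. The property names (`extracted`, `sources`, `description`, `names`, `nameForms`, `fullText`) follow the GEDCOM X JSON serialization; the helper functions, ids, and the sample name are hypothetical placeholders. The Extracted Conclusion Constraints are not reproduced in this text, so `cites_single_source` checks only an assumed minimum: the `extracted` flag is set and exactly one source description is cited.

```python
import json


def extracted_person(person_id: str, source_description_id: str, full_text: str) -> dict:
    """Build a Person holding only what one Source says (placeholder values)."""
    return {
        "id": person_id,
        "extracted": True,  # marks the Subject as extracted from a single source
        "sources": [{"description": "#" + source_description_id}],
        "names": [{"nameForms": [{"fullText": full_text}]}],
    }


def cites_single_source(subject: dict) -> bool:
    """Assumed check: flagged as extracted and citing exactly one source description."""
    cited = {ref["description"] for ref in subject.get("sources", [])}
    return subject.get("extracted") is True and len(cited) == 1


person = extracted_person("P1", "S1", "Jon Smyth")
print(json.dumps(person, indent=2))
```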


Evidence

Use specializations of Subject to represent Evidence. Configure the Subject such that its evidence property contains only references to Information.

Hypotheses and Conclusions

Use specializations of Subject to represent Hypotheses. Configure the Subject such that its evidence property contains only references to Evidence. Conclusions are Hypotheses that have been successfully tested and accepted (the testing is presumably discussed in the associated Analysis or Proof).
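The two rules for the evidence property (Evidence refers only to Information; Hypotheses refer only to Evidence) imply that a Subject's layer can be inferred from what it references. The sketch below shows one way to check this. It assumes that Information subjects are the ones flagged `extracted`, that each evidence reference is a same-document fragment such as `#I1` in its `resource` property, and that references contain no cycles. The `layer` function and the sample ids are hypothetical.

```python
from typing import Dict


def layer(subject_id: str, subjects: Dict[str, dict]) -> str:
    """Classify a Subject by what its evidence property points at."""
    subject = subjects[subject_id]
    if subject.get("extracted"):
        return "information"
    # Resolve local references like "#I1" to the ids of the referenced Subjects.
    targets = [ref["resource"].lstrip("#") for ref in subject.get("evidence", [])]
    layers = {layer(target, subjects) for target in targets}
    if layers == {"information"}:
        return "evidence"
    if layers == {"evidence"}:
        return "hypothesis"
    return "unclassified"  # mixed, empty, or otherwise non-conforming references


subjects = {
    "I1": {"extracted": True},
    "I2": {"extracted": True},
    "E1": {"evidence": [{"resource": "#I1"}]},
    "E2": {"evidence": [{"resource": "#I2"}]},
    "H1": {"evidence": [{"resource": "#E1"}, {"resource": "#E2"}]},
    "X1": {"evidence": [{"resource": "#I1"}, {"resource": "#E2"}]},
}
print({subject_id: layer(subject_id, subjects) for subject_id in subjects})
```

Here `X1` mixes a reference to Information with a reference to Evidence, so it fits neither rule and is reported as unclassified.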


Sources

Use a SourceDescription to describe a Source. Then reference that description from the data entities of the Subject that represents the Information from that Source, adding instances of SourceReference as appropriate.
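A rough sketch of this linkage, using the GEDCOM X JSON property names (`sourceDescriptions`, `titles`, `citations`, `sources`, `description`): a source is described once, and each data entity drawn from it (here a name and a birth fact) carries its own SourceReference. The helper `add_source_reference`, the ids, and all titles, citations, names, and dates are hypothetical placeholders.

```python
import json


def add_source_reference(entity: dict, description_id: str) -> dict:
    """Append a SourceReference pointing at a SourceDescription in the same document."""
    entity.setdefault("sources", []).append({"description": "#" + description_id})
    return entity


source_description = {
    "id": "S1",
    "titles": [{"value": "Parish register (placeholder title)"}],
    "citations": [{"value": "Placeholder citation text"}],
}

name = {"nameForms": [{"fullText": "Jon Smyth"}]}
birth = {"type": "http://gedcomx.org/Birth", "date": {"original": "3 May 1820"}}

person = {
    "id": "P1",
    "extracted": True,
    "names": [add_source_reference(name, "S1")],
    "facts": [add_source_reference(birth, "S1")],
}

document = {"sourceDescriptions": [source_description], "persons": [person]}
print(json.dumps(document, indent=2))
```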

Analysis and Proof

Use a Document of type to record analyses and/or proof arguments. To support more complex narrative layout needs (e.g., footnotes, tables, and images), the Document class includes an option for XHTML text (see Document.textType). Associate an analysis document with the entities being analyzed via the analysis property (found in EvidenceReference, SourceDescription, and all specializations of Conclusion).
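The following sketch shows how an analysis document and its association might look in the GEDCOM X JSON serialization. The property names (`type`, `textType`, `text`, `analysis`, `resource`) come from that serialization. The type URI `http://gedcomx.org/Analysis` is one of the document types defined by the GEDCOM X specification and is an assumption about what is intended here, since the text above does not name the type. The helper functions, ids, and narrative text are hypothetical.

```python
def analysis_document(doc_id: str, xhtml: str) -> dict:
    """Build a Document carrying an analysis or proof argument as XHTML text."""
    return {
        "id": doc_id,
        "type": "http://gedcomx.org/Analysis",  # assumed document type, see lead-in
        "textType": "xhtml",                    # the alternative value is "plain"
        "text": xhtml,
    }


def attach_analysis(entity: dict, doc_id: str) -> dict:
    """Point an entity's analysis property at a Document in the same file."""
    entity["analysis"] = {"resource": "#" + doc_id}
    return entity


doc = analysis_document("D1", "<div><p>Placeholder proof argument.</p></div>")
person = attach_analysis({"id": "P1"}, "D1")
```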

Further Reading

If you are not familiar with the genealogical research process as it has been touched upon here, you might consider the following resources for a more in-depth study of the topic:

  • Jones, Thomas W. Mastering Genealogical Proof. Arlington, VA: National Genealogical Society, 2013.
  • Board for Certification of Genealogists. Genealogy Standards. 50th anniversary edition. Nashville, Tennessee, 2014.
  • Rose, Christine. Genealogical Proof Standard: Building a Solid Case. 3rd revised edition. San Jose, California: CR Publications, 2009.

1. Thomas W. Jones, Mastering Genealogical Proof (Arlington, VA: National Genealogical Society, 2013), 139.
2. Jones, Mastering Genealogical Proof, 136.
3. Jones, Mastering Genealogical Proof, 138-9.
4. Jones, Mastering Genealogical Proof, 13-14, 134.
5. Jones, Mastering Genealogical Proof, 133.