The GREnDL 2.0 Data Model

"GREnDL 2.0 - Main Class Diagram"
(Illustration provided by Tim Forsythe)

I developed version 1 of the Genealogical Record Exchange and Description Language (GREnDL) specification in 2004 as an alternative to the poorly written GEDCOM standard. It included many improvements over GEDCOM, and it moved away from the lineage-linked grammar developed by The Church of Jesus Christ of Latter-day Saints.

GREnDL 2.0 is an improved data model. I designed it around my own understanding of genealogy data paradigms, specifically to promote an evidence-based genealogy model while still supporting conclusion-based models. The main class diagram above covers the core of the GREnDL model. Utility classes are shown in the common classes diagram below, and a simple extension to handle groups is shown in the group class diagram further down.

"GREnDL 2.0 - Common Classes"
(Illustration provided by Tim Forsythe)

As the upper left of the main class diagram shows, evidence-based genealogy begins with a repository, the location where a source is found. A repository need not be a physical entity like a library or archive, though it could be. A repository may also be a website, your file cabinet, or wherever else the source may be located. The purpose of declaring a repository is to tell anyone who wants to get hold of the source where to look. Repositories are not critical to genealogical research and are often omitted. If a repository is added, it may reference any number of sources, each by a reference number. The reference number may be a call number, a file number, a textual description, or whatever else is needed to help locate the source in the repository. In many cases the reference number may be omitted. A source may also be located in more than one repository.

The "*" shown on the diagram, the symbol for "many", represents this. Likewise, any attribute typed as an IDList indicates a "many" relationship and is usually implemented as a vector of IDs. An ID attribute without the word "list" indicates a "1" relationship and is shown on the chart as such. A number of the class relationship arrows have been left off the diagrams because of the great number of connections. The "many" and "1" notations just described should allow you to infer those relationships.
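The repository-to-source relationship described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and attribute names (`Repository`, `Source`, `holdings`, `repository_ids`) are hypothetical and are not taken from the GREnDL diagrams. The sketch shows the "many" side as a list of IDs and the optional reference number as a value that may be left empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Source:
    id: str
    title: str
    # IDList-style attribute (hypothetical name): a source may be held
    # in many repositories, so we keep a list of repository IDs.
    repository_ids: List[str] = field(default_factory=list)


@dataclass
class Repository:
    id: str
    name: str
    # Maps source ID -> reference number (call number, file number,
    # free text). None means the reference number was omitted.
    holdings: Dict[str, Optional[str]] = field(default_factory=dict)

    def add_source(self, source: Source, reference: Optional[str] = None) -> None:
        """Record that this repository holds the given source."""
        self.holdings[source.id] = reference
        if self.id not in source.repository_ids:
            source.repository_ids.append(self.id)


library = Repository("R1", "County Library")
cabinet = Repository("R2", "My file cabinet")
census = Source("S1", "Census transcript")

library.add_source(census, reference="929.3 CEN")
cabinet.add_source(census)  # reference number omitted
```

After these calls the same source is listed under both repositories, with a call number in one and no reference number in the other.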

For many people, the source record is their entry point into evidence-based data entry. The source record has a number of fields, most of them common enough that I won't detail them here. The restriction type field appears in any record that might conceivably need restricted access, for example because it holds private or sensitive information, copyrighted material, or local data that is not accessible globally. The Authority, Concurrency, and Association fields are used to categorize source records. Each source can have any number of quotations, but each quotation can be associated with only one source. Quotations are just that: actual text copied from the source.
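As a rough sketch of the one-to-many link between sources and quotations, the following Python uses a single `source_id` on the quotation and a list of quotation IDs on the source. The names here, including the `Restriction` values, are assumptions made for illustration and do not reproduce the actual GREnDL field names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Restriction(Enum):
    # Hypothetical values, loosely following the cases named in the text.
    NONE = "none"
    PRIVATE = "private"
    COPYRIGHTED = "copyrighted"
    LOCAL = "local"


@dataclass
class Quotation:
    id: str
    source_id: str  # a single ID: each quotation belongs to exactly one source
    text: str       # actual text copied from the source


@dataclass
class Source:
    id: str
    title: str
    restriction: Restriction = Restriction.NONE
    quotation_ids: List[str] = field(default_factory=list)  # "many" side

    def quote(self, quotation_id: str, text: str) -> Quotation:
        """Create a quotation tied to this source and remember its ID."""
        self.quotation_ids.append(quotation_id)
        return Quotation(quotation_id, self.id, text)


register = Source("S1", "Parish register", restriction=Restriction.COPYRIGHTED)
q1 = register.quote("Q1", "John, son of Thomas, was baptised.")
q2 = register.quote("Q2", "Thomas was buried.")
```

Because the quotation stores one source ID rather than a list, it cannot be attached to a second source by accident.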

Quotations can generate more than one claim, and likewise a claim may be generated by more than one quotation. When we talk about claims, we are talking about claims made about people. Claims made about other things, such as locations, are not genealogical in nature and are handled differently. GREnDL recognizes four basic claim types: Names, Gender, Relationships, and Properties, with Properties being the catch-all for all types of events and attributes.

The only relationship type built into GREnDL is that of a Biological Child. All other relationship types are supported as user-defined types, of which any number can be defined. The single occurrence flag marks a type for which only one instance can be valid, as with a paternal grandfather.

Several property types are built into GREnDL, including all vital types such as birth and death. Field definitions describe any number of specific fields for a type, such as a cause of death or a census enumeration type. Any number of user-defined types may be defined as well, along with living and flourishing flags and field definition mapping. These user types and field definition mappings allow GREnDL to be extended to handle virtually any attribute or event that third-party vendors can dream up.

All claim types support things like dates, locations, and ages. Ordinals, such as order of birth or marriage, can apply to multiple claim types. When multiple claims of the same type are made, the primary flag indicates which one might be considered paramount, although this can be overridden based upon reliability assessments. Reliability assessments are supported for every claim. They can be entered directly or calculated from source categories. The voting record is an extension that need not be supported, but it can be implemented on multi-user sites to allow social voting by subscribers.
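One possible reading of how the primary flag and reliability assessments interact is sketched below in Python. Everything in it is an assumption for illustration: the `Claim` fields, the 0.0 to 1.0 reliability scale, and the rule that an assessed reliability outranks the primary flag are my interpretation of the description, not part of the GREnDL specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ClaimKind(Enum):
    NAME = "name"
    GENDER = "gender"
    RELATIONSHIP = "relationship"
    PROPERTY = "property"


@dataclass
class Claim:
    id: str
    kind: ClaimKind
    type_name: str  # e.g. "birth", "death", or a user-defined type
    value: str
    quotation_ids: List[str] = field(default_factory=list)  # many-to-many
    primary: bool = False
    reliability: Optional[float] = None  # hypothetical 0.0 to 1.0 scale


def preferred(claims: List[Claim]) -> Claim:
    """Pick one claim from a non-empty list of competing claims.

    Assessed reliability wins; otherwise fall back to the primary flag,
    and finally to the first claim in the list.
    """
    assessed = [c for c in claims if c.reliability is not None]
    if assessed:
        return max(assessed, key=lambda c: c.reliability)
    for claim in claims:
        if claim.primary:
            return claim
    return claims[0]


births = [
    Claim("C1", ClaimKind.PROPERTY, "birth", "1850", ["Q1"], primary=True, reliability=0.4),
    Claim("C2", ClaimKind.PROPERTY, "birth", "1851", ["Q1", "Q2"], reliability=0.9),
]
best = preferred(births)  # C2: higher reliability overrides the primary flag
```

A real implementation might instead derive the reliability from the source categories; that calculation is left out here because the text does not specify it.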

Lastly, each claim can be associated with only one person, but a person can have any number of claims. When a claim appears to involve multiple persons, it should be separated into individual claims. Groups can be used to assign claims to a group entity, but not to its members. GREnDL has no built-in group types, but as with some of the claim types, any number of user-defined group types can be invented, such as persons present at a baptism, or soccer teams.
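The one-subject-per-claim rule can be illustrated with a small Python sketch. The names (`Claim`, `Group`, `split_claim`, `subject_id`) are hypothetical, and treating a marriage as a claim that involves two persons is my own example rather than one taken from the GREnDL documentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    id: str
    type_name: str
    subject_id: str  # exactly one person ID or one group ID, never several


@dataclass
class Group:
    id: str
    type_name: str  # user-defined; no group types are built in
    member_ids: List[str] = field(default_factory=list)


def split_claim(prefix: str, type_name: str, person_ids: List[str]) -> List[Claim]:
    """Turn one multi-person statement into one claim per person."""
    return [
        Claim(f"{prefix}-{n}", type_name, person_id)
        for n, person_id in enumerate(person_ids, start=1)
    ]


# A statement involving two people becomes two individual claims.
marriage = split_claim("C7", "marriage", ["P1", "P2"])

# A claim about the group itself points at the group, not at its members.
witnesses = Group("G1", "persons present at a baptism", ["P3", "P4"])
group_claim = Claim("C8", "attended baptism", witnesses.id)
```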

"GREnDL 2.0 - Group Class Diagram"
(Illustration provided by Tim Forsythe)

What I have described here I refer to as a top-down approach to evidence-based genealogy: Repository->Source->Quote->Claim->Person. The process could also be reversed by adding persons first, then adding claims, finding any quotations in a source that support each claim, adding the source, and finally adding a repository. The disadvantage of the bottom-up approach is that it is too easy to leave the thread incomplete. A person could be added along with claims but no sources, which is bad practice, yet nothing in the GREnDL model prevents you from doing so. Some conclusion-based genealogists who are not used to including their sources will find this comforting. In the end it will be up to importing utilities and data entry forms to enforce top-down processing if that is determined to be the best approach … or not.
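An importing utility or data entry form that wanted to flag incomplete threads might run a check along these lines. This is a minimal sketch, assuming the data is available as two plain dictionaries; the function name `unsourced_claims` and the dictionary layout are hypothetical. Repositories are not checked, since the model treats them as optional.

```python
from typing import Dict, List


def unsourced_claims(
    claim_quotes: Dict[str, List[str]],
    quote_source: Dict[str, str],
) -> List[str]:
    """Return IDs of claims that cannot be traced back to any source.

    claim_quotes maps claim ID -> quotation IDs supporting the claim.
    quote_source maps quotation ID -> the one source it was copied from.
    """
    missing = []
    for claim_id, quotation_ids in claim_quotes.items():
        if not any(q in quote_source for q in quotation_ids):
            missing.append(claim_id)
    return missing


claim_quotes = {
    "C1": ["Q1"],  # complete thread: claim -> quotation -> source
    "C2": [],      # entered bottom-up, no quotation yet
    "C3": ["Q9"],  # refers to a quotation with no known source
}
quote_source = {"Q1": "S1"}

incomplete = unsourced_claims(claim_quotes, quote_source)
print(incomplete)  # ['C2', 'C3']
```

Whether such a check should block data entry or only warn is a policy decision for the tool, which matches the point that the model itself does not enforce either approach.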
