Bonkers: The GEDCOM Sanity Checker

Sooner or later, we all get to the point where we realize that there is a lot of data in our database that is just completely bonkers.

Bonkers will help you quickly locate questionable data by identifying groups of claims in your database that are inconsistent with each other. Bonkers works by running your GEDCOM file through a battery of recursive, multipass, deep scanning algorithms designed to rat out even the most elusive errors.

Configuration Options

Bonkers lets you fine-tune your results. Specifically, you can enable options that include additional items in your report, and you can configure common age and date thresholds to best match your data. All selections are preserved between sessions.

  • +Options

    Select the options you prefer. Each option listed below controls which additional items appear in your report.

    Assume Flourishing For All Vendor Specific Events
    Show Misformatted Dates
    Show Persons Missing One Parent
    Show Persons Missing Both Parents
    Show Unmappable Locations
    Show Impossibilities Only (hides improbabilities)
    Use Estimated Birthdates For Probabilities
    Match Alternate Given Names For Duplicate Children
    Ignore Non-overlapping Timelines For Duplicate Children

     

    Assume Flourishing For All Vendor Specific Event Types tells Bonkers to treat any event of an unknown event type that occurs while a person is living as having occurred while the person was flourishing. This is useful if you've added new event types to your database. Do not check this option if you use these new event types for children.

  • +Age Thresholds

    Set the age thresholds for the following items to best match your GEDCOM data. By default, the most common ages are defined.

     

    Maximum Child Baptism Age is the maximum age at which a person was baptized or christened as a child. Adult christenings are not considered in this context.

    Minimum Marriage Age is the minimum age that a person could have been married. Betrothals and Contracts are not considered in this context.

    Maximum Wife's Marriage Age is the maximum age at which a woman could have been married.

    Maximum Age Between Spouses is the maximum number of years of age between spouses.

    Minimum Child Bearing Age and Maximum Child Bearing Age are the range of ages within which a woman could bear children.

    Minimum Flourishing Age and Maximum Flourishing Age define the range of ages during which a person is considered to be an adult and would be expected to have adult attributes and events such as Nobility Title or Occupation.

    Maximum Age (Life Span) is the maximum age at death of a person.

    Maximum Event Age Error is the maximum error expected between the age recorded for a person at an event and their actual age. It is common in genealogical records for a person to be listed at the time of an event with an age that is not accurate. Making this value too narrow may prevent Vivify from resolving conflicting dates, which in turn prevents it from establishing a clean estimate.

  • +Date Thresholds

    Set the date thresholds for the following items to best match your GEDCOM data. By default, the most common dates are defined.

     

    Dead if Born Before Year is the earliest year in which a living person could have been born. A living person born before this year is considered improbable.

    Dead if Married Before Year is the earliest year in which a living person could have been married. A living person married before this year is considered improbable.
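
As a rough illustration of how thresholds like these could drive a report, here is a minimal Python sketch. It is not Bonkers' code: the `Thresholds` class, the `check_mother` function, every default value, and the split into "impossible" and "improbable" are assumptions made for the example, and it compares years only.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Hypothetical defaults for illustration; not Bonkers' actual values.
    min_marriage_age: int = 14
    min_child_bearing_age: int = 15
    max_child_bearing_age: int = 50
    max_life_span: int = 110
    dead_if_born_before_year: int = 1900


def check_mother(birth, death=None, marriage=None, child_births=(), t=None):
    """Return (severity, message) pairs for one woman, using years only."""
    t = t or Thresholds()
    issues = []
    if death is not None:
        if death < birth:
            issues.append(("impossible", "born after they died"))
        elif death - birth > t.max_life_span:
            issues.append(("improbable", "lived too long"))
    if marriage is not None and marriage - birth < t.min_marriage_age:
        issues.append(("improbable", "married too young"))
    for child_birth in child_births:
        age = child_birth - birth
        if age < t.min_child_bearing_age:
            issues.append(("improbable", "bore children too young"))
        elif age > t.max_child_bearing_age:
            issues.append(("improbable", "bore children too old"))
    # No death recorded: apply the "Dead if Born Before Year" style check.
    if death is None and birth < t.dead_if_born_before_year:
        issues.append(("improbable", "living person born too early"))
    return issues


for severity, message in check_mother(1800, death=1860, marriage=1810,
                                      child_births=[1812, 1830]):
    print(severity, message)
```

Widening or narrowing a value in `Thresholds` changes which records are flagged, which is the effect the settings above are meant to give you.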

Description

Bonkers is a free web service. It uses the same GEDCOM parser that is used by VGedX Online, my 100% compatible GEDCOM validator. Bonkers, however, supports a special expanded GEDCOM dictionary that recognizes numerous vendor-specific events and date formats. This gives it a distinct advantage over vendor-specific consistency checkers and makes it usable by virtually anyone with a GEDCOM file.

The following is a list of the conditions that Bonkers currently detects.

  • Persons born after they were baptized
  • Persons born after they were married
  • Persons born after having children
  • Persons born after they died
  • Persons born after they were buried
  • Persons born after their parent's death
  • Persons born after their parent's burial
  • Persons baptized too old
  • Persons baptized after they were married
  • Persons baptized after they died
  • Persons baptized after they were buried
  • Persons married too young
  • Wives married too old
  • Persons who are much older than their spouse
  • Persons married after they died
  • Persons married after they were buried
  • Mothers who bore children too young
  • Mothers who bore children too old
  • Parents who died before having children
  • Persons who died after they were buried
  • Persons who lived too long
  • Persons who were buried before having children
  • Persons with multiple sets of parents
  • Persons with no parents
  • Persons with only 1 parent
  • Persons whose birth dates could not be estimated
  • Misformatted dates
  • Persons with ancestral loops
  • Persons with duplicately named children
  • Unmappable Locations

When birth dates cannot be estimated, the following abbreviated codes will appear in the report to give you an indication as to why they couldn't.

  • fb = father's birth
  • mb = mother's birth
  • pm = parents' marriage
  • b = birth
  • bp = baptism
  • sb = earliest spouse's birth
  • xb = latest sibling's birth
  • m = earliest marriage
  • cb = child's birth (range)
  • cm = earliest child's marriage
  • pd = latest parent's death
  • d = death
  • bu = burial
  • le = living event (range)
  • fe = flourishing event (range)
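
If you want to expand these codes while reading a report, a small lookup table is enough. The sketch below is only an illustration: the `explain` helper is hypothetical, and it assumes the codes appear separated by spaces, which may not match the actual report layout.

```python
# Code meanings copied from the list above.
ESTIMATE_CODES = {
    "fb": "father's birth",
    "mb": "mother's birth",
    "pm": "parents' marriage",
    "b": "birth",
    "bp": "baptism",
    "sb": "earliest spouse's birth",
    "xb": "latest sibling's birth",
    "m": "earliest marriage",
    "cb": "child's birth (range)",
    "cm": "earliest child's marriage",
    "pd": "latest parent's death",
    "d": "death",
    "bu": "burial",
    "le": "living event (range)",
    "fe": "flourishing event (range)",
}


def explain(codes):
    """Expand a space-separated string of codes (assumed format)."""
    return [ESTIMATE_CODES.get(c, "unknown code: " + c) for c in codes.split()]


print(explain("fb pm cb"))
```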

46 Comments
  • Tim Forsythe
    28 Dec 2013

    Colin, to the right of the menu bar are 3 icons: 1 for Google, 1 for Facebook, and 1 that says T4, which allows you to create a local account. Just click on whichever of these you would like to use to authenticate your email address. Google and Facebook pop up their secure dialogs so that you can enter your credentials. T4 will pop up a dialog allowing you to register and, once done, will send you an email with an authentication link for you to click on.

  • Colin
    28 Dec 2013

    How do you sign up / sign in? I can't find a link anywhere. :(

  • Tim Forsythe
    19 Jul 2013

    The generated report is persistent ... I don't delete them very often, so you could just save the link to it and reopen it at any time. You could also right click on your Bonkers report and select save as html file as well. That should save the file onto your hard disk allowing you to return to it at any time. Whatever works for you.

  • Ed
    19 Jul 2013

    Tim
    Thanks for such a prompt reply.
    Seems a very convoluted way but will give it a try
    Cheers. Ed

  • Tim Forsythe
    19 Jul 2013

    Ed,
    Bonkers is strictly an online report. If you need to view it offline, you need to run Adam which will generate a downloadable family tree website in zip format. Part of the resulting website is the Bonkers report. If you wish you can choose to disable other types of pages on Adam's configuration page, but having all the various profile pages makes it so much easier to click thru the report.
    Tim

  • Ed
    19 Jul 2013

    Hi

    Excellent result, but I can't work out how to download the report to my PC so that I can work on checking the items at my leisure.

    cheers Ed

  • chas vigneron
    05 Feb 2013

    Thank you.

  • Tim Forsythe
    29 Jan 2013

    John, I hope I can quote you on that. :)

    Spread the word folks, there are still a few people that haven't had a chance to use it.

  • John Coldwell
    29 Jan 2013

    Tim
    Thanks. That worked fine.
    This is a brilliant program and I think anyone with more than just a few names needs to use it!
    Regards
    John

  • Tim Forsythe
    29 Jan 2013

    John, thanks for the feedback. I'm glad to hear the program worked well for you.

    VGedX is a strict GEDCOM validator so requires a GEDCOM version in the file to know which dictionary to compare it against. Your GEDCOM file does not include the GEDCOM version in the header record, hence "empty". The file should have the following beneath the 0 HEAD line in your file.

    1 GEDC
    2 VERS 5.5

    You can manually do the insertion, and then VGedX should work for you.

    Bonkers, and the rest of the tools, are more relaxed in that they all use an expanded GEDCOM dictionary to compare against, so they ignore the GEDCOM version. This is the reason your file runs correctly in Bonkers.
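
For anyone who wants to check a file for this before uploading it, here is a minimal Python sketch. It is not VGedX's logic; the `gedcom_version` function is hypothetical and only looks for a `2 VERS` line under `1 GEDC` inside the `0 HEAD` record.

```python
def gedcom_version(lines):
    """Return the VERS value found under HEAD > GEDC, or None if it is missing."""
    in_head = False
    in_gedc = False
    for raw in lines:
        # Drop a leading byte order mark and surrounding whitespace.
        parts = raw.lstrip("\ufeff").strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == "0":
            if in_head:
                break  # the header record has ended
            in_head = tag == "HEAD"
            continue
        if not in_head:
            continue
        if level == "1":
            in_gedc = tag == "GEDC"
        elif level == "2" and in_gedc and tag == "VERS":
            return value.strip() or None
    return None


sample = ["0 HEAD", "1 SOUR MyProgram", "2 VERS 1.0", "1 GEDC", "2 VERS 5.5", "0 TRLR"]
print(gedcom_version(sample))
```

Note that the `2 VERS` line under `1 SOUR` is the exporting program's version, not the GEDCOM version, so the sketch skips it.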

  • John Coldwell
    29 Jan 2013

    I ran my gedcom file (16,000 people) and Bonkers gave me four A4 pages of Errors. I went though the list and made the corrections in my family history program. An excellent tool!
    I have generated a new gedcom and want to rerun the file (Coldwell 4.ged) to check if I missed anything and I cannot get a result. VGedX gives the message "E028 198176 Unsupported GEDCOM version detected (empty)". I am puzzled by this because I am using identical programs and methods which worked correctly the first time. Also the gedcom file works correctly when loaded into my family history program.
    I would much appreciate help sorting this out.

    Regards
    John

  • Tim Forsythe
    12 Jan 2013

    I've updated the web services for Bonkers, CENtaR and Vivify to now time out in about 10 minutes and display an error screen. Previously, the service would just stop, giving the user no indication of what happened. It seems that files that are large and complex in several areas just take too long to process. I will take a closer look at these files when I get a chance to find out why they take so long, and try to find a workaround.

  • Tim Forsythe
    12 Jan 2013

    Bonkers has been updated to ver 0.95. Bonkers now shows every statistic checked, even if the total is 0. Previously, when the total number of issues found was 0, the entry was omitted, so you could not be sure that your file passed that test. The report now also shows, at the top, the total number of impossibilities and the total number of improbabilities so you can tell at a glance how well your file performed.

  • Tim Forsythe
    10 Jan 2013

    Sue, The limit is currently set to 50 MB. I recently ran a file for someone with 220K people that was 130 MB and it took 4.5 hours to complete (successfully I might add). I've only recently begun tracking times so have no data on the maximum file run so far. The service times out in 10 minutes though. I generally recommend for users with large databases to export a single ancestral line that they are interested in and run that.

  • cybersuezee
    10 Jan 2013

    Is there a limit to the size of Gedcom file I can run through Bonkers? I use Legacy with currently 205208 individuals?
    BTW I think you can have both "Persons Born After One of Their Parents Died" and "Persons Died Before Having Children". Some babies are born after their father died.

  • Tim Forsythe
    07 Jan 2013

    Colm, would it be possible for you to email me the error? I don't see any entry in my logs.

  • Colm Hughes
    07 Jan 2013

    I tried loading my gedcom file but got an sql error

  • Tim Forsythe
    06 Jan 2013

    Thanks Jean. I have some more free GEDCOM tools coming out soon.

  • Jean M. Simoneau
    06 Jan 2013

    Extraordinary piece of work. Thank you so much. I plan to confer with your web site quite often.

  • Tim Forsythe
    03 Jan 2013

    Bonkers has been updated to revision 0.94 and with it, a new option has been added to set the maximum event age error. Bonkers has also now been converted to use Vivify, the GEDCOM Birth Date Estimator.

  • Tim Forsythe
    19 Dec 2012

    Any of you who read my "HybridAuth MySQL Demo" article will not be surprised that I have added social media authentication to the Bonkers page. The only purpose this serves is to restore your last used configuration options on signing in, and to save them on submit. Granted, there are not that many options used for Bonkers, and few people ever change the defaults anyway, but it is also serving as a foundation for a new web service, Adam, my family tree generator, available in the coming year.

    Signing in is not required to use Bonkers. It will continue to work as it did before.

    If anyone would like to sign in, but cannot use any of the available social media sites listed, please let me know so that I can look into adding support for the site you need.

    Enjoy

  • Tim Forsythe
    19 Dec 2012

    Jay, I don't, but thanks for thinking it worthy.

  • Jay
    19 Dec 2012

    Do you have a logo for BONKERS that I can use on my site to say to visitors that Bonkers has been used on my gedcom ?

  • Tim Forsythe
    18 Dec 2012

    Keith, I think you're probably right. That's pretty funny. I must have gotten carried away when I originally wrote these.

  • Keith Riggle
    18 Dec 2012

    I ran Bonkers on my 4000+ person file with most of the defaults, Assume Flourishing For All Vendor Specific Event Types, Show Misformatted Dates, and Hide Improbabilities (show impossibilities only), and it picked up 25 records to check, and even after having run the file through a different plausibility check, so good job! However, some of the checks seem a bit redundant. For example, isn't "Persons Born After One of Their Parents Died" the same as "Persons Died Before Having Children?" Anyway, now I'm going to unhide the improbabilities and see what it comes up with. Thanks for the great tool!

  • Tim Forsythe
    15 Dec 2012

    Tony, it also takes data qualifiers into consideration, so it must sometimes make assumptions when comparing dates. For example, how do we compare "bef 1900" with "bef 1901", or "bet 1900 and 1910" with "bef 1905"? Is one earlier than the other? This can sometimes skew the results, or make them seem skewed, and produce false positives. If you have an example, I can take a closer look at it.

    I calculate the Julian Day Number for all dates, and my plan is to eventually use these so that exact dates can be compared to within a single day. However, there will always be assumptions made when qualifiers are used.
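
To make the difficulty concrete, here is a minimal Python sketch. It is not how Bonkers compares dates: `year_range` and `compare` are hypothetical helpers that turn a qualified date into a range of years and report "undetermined" when the ranges overlap, instead of making an assumption. The `julian_day_number` function uses the standard integer formula for Gregorian calendar dates.

```python
import math
import re


def year_range(text):
    """Turn 'bef 1900', 'aft 1900', 'bet 1900 and 1910' or '1900' into (low, high)."""
    s = text.strip().lower()
    m = re.fullmatch(r"bet (\d{1,4}) and (\d{1,4})", s)
    if m:
        return (int(m.group(1)), int(m.group(2)))
    m = re.fullmatch(r"(bef|aft) (\d{1,4})", s)
    if m:
        year = int(m.group(2))
        if m.group(1) == "bef":
            return (-math.inf, year - 1)
        return (year + 1, math.inf)
    return (int(s), int(s))


def compare(a, b):
    """Return 'before', 'after' or 'undetermined' for two qualified dates."""
    low_a, high_a = year_range(a)
    low_b, high_b = year_range(b)
    if high_a < low_b:
        return "before"
    if low_a > high_b:
        return "after"
    return "undetermined"


def julian_day_number(year, month, day):
    """Julian Day Number of a Gregorian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


print(compare("bef 1900", "bef 1901"))
print(compare("bet 1900 and 1910", "bef 1905"))
print(julian_day_number(2000, 1, 2) - julian_day_number(2000, 1, 1))
```

Both comparisons from the examples above come back as "undetermined", which is exactly where a checker has to assume something. Exact dates have no such problem: subtracting two Julian Day Numbers gives the difference in days.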

  • Tim Forsythe
    15 Dec 2012

    Tony, Yes, it does not currently resolve to months, or days.

  • tony harris
    15 Dec 2012

    Bonkers comes up with errors if critical dates are close together i.e in the same year. Does it only consider years and not months and days?

  • Tim Forsythe
    13 Dec 2012

    I've updated Bonkers to rev 0.92. This update includes expanded GEDCOM support, which recognizes many more non-standard dates and event records. This will improve Bonkers' ability to identify inconsistencies in your data. This will, of course, only affect those files which include the recognized non-standard types.

  • Tim Forsythe
    13 Dec 2012

    Andy, I agree. Until then I've made this tool available in my ongoing effort to provide free and useful tools to the genealogy community. Genealogists, researchers and family historians (whatever one wishes to call oneself) now will have an opportunity to easily identify some of the most obvious errors in their databases - and fix them once and for all. I have been using the tool myself for about 10 years and have found it to be remarkably accurate and able to pinpoint inconsistencies where I would have never expected them. It is also great for identifying areas that need to be researched further and cleaned up. Happy Bonking.

  • Tim Forsythe
    13 Dec 2012

    Gary, it is available online as a web service so that anyone with the internet can use it.

  • Andy Hatchett
    13 Dec 2012

    Too bad all online trees aren't required to go thru this before being uploaded to a site.
    Think of the junkology that would never make it to the web!

  • Gary Roberts
    13 Dec 2012

    Does this work with Windows 8? Tks, Gary

  • Keith Riggle
    12 Dec 2012

    Outstanding! Looking forward to trying it out

  • Tim Forsythe
    14 Nov 2012

    Fixed a bug in Bonkers that occurred when question marks were part of the date.
    Fixed another bug that could result in birth date estimates that were beyond the individual's death date.
    Fixed an additional bug that could result in birth date estimates in the future.

  • Tim Forsythe
    14 Nov 2012

    The output of Bonkers is now presented to you as a link, which you can access to view your results.

  • Tim Forsythe
    13 Nov 2012

    Here is one more tip while I'm thinking about it.

    The GEDCOM parser that Bonkers uses is the same one that is used by VGedX (http://timforsythe.com/tools/vgedx), my GEDCOM validator. I periodically look through the VGedX reports to find new event types that are being used by different genealogy vendors and add these to the parser. By doing this, Bonkers is able to use these new events in its calculations, increasing the accuracy of its reports. Unfortunately, I don't scan the VGedX reports very often anymore since I am no longer supporting the Ancestors Now Tree Ring. I am more than happy to add new event types on request though, so if you want to improve your Bonkers results, you should run your GEDCOM file through VGedX and post here any new event types that are shown in the VGedX report. I'll add these at my earliest convenience. This will not only improve your results, but also the results of anyone else who uses the same genealogy application that you do. Win, win!

  • Tim Forsythe
    13 Nov 2012

    Here is a strategy that you can use to increase the accuracy of your Bonkers report.

    Bonkers categorizes all claims as being either Single Occurrence Events (SOEs) or otherwise. An SOE is an event that can only occur once per individual, such as their birth or death. Other types of events, such as marriage and graduation, can occur multiple times. When performing calculations that rely on SOEs, Bonkers uses the first record of that event type found in the GEDCOM file. So if your genealogy editor lets you order your claims, and that order is retained when you export your GEDCOM file, you can move your 'best' SOE first to improve Bonkers' accuracy. For instance, if you have multiple birth dates, move the one you have concluded is the most accurate to be first. This is actually a good general rule that can be useful for other types of genealogy applications as well.
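
The "first record wins" rule can be sketched in a few lines of Python. This is an illustration only, not Bonkers' code: `pick_events` is a hypothetical helper, and the set of SOE tags is limited to the two examples named above (birth and death); the full set is an assumption.

```python
# GEDCOM tags for the two SOEs named in the text; the real list is not given.
SINGLE_OCCURRENCE = {"BIRT", "DEAT"}


def pick_events(events):
    """events is a list of (tag, date) pairs in GEDCOM file order.

    Returns (soe, others): the first record of each SOE tag, and all
    records of the remaining event types.
    """
    soe = {}
    others = []
    for tag, date in events:
        if tag in SINGLE_OCCURRENCE:
            soe.setdefault(tag, date)  # keep only the first record seen
        else:
            others.append((tag, date))
    return soe, others


events = [("BIRT", "1850"), ("BIRT", "1852"), ("MARR", "1875"), ("MARR", "1880")]
print(pick_events(events))
```

Swapping the two birth records in the input changes which birth date is used, which is why record order in the exported file matters.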

  • Tim Forsythe
    13 Nov 2012

    Here is another tip for reducing clutter in your Bonkers output.

    If you are an evidence-driven genealogist rather than a conclusion-driven one, you probably have multiple similar events in your database. For instance, you might have several birth claims for an individual, each with a different date and source reference (this is common when multiple census records are entered). Some of these claims might be incorrect, or you might suspect that they are, but you don't want to remove them from your database because they are still valuable references. You can tell Bonkers to ignore them by entering either of the strings "[Disproved]" or "[Not Applicable]" in the claim record's CAUSe field.
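
A filter of this kind might look like the Python sketch below. It is only an illustration: `usable_claims` and the dictionary layout of a claim are hypothetical, and whether Bonkers matches the marker exactly, as a substring, or case-sensitively is not stated here, so the substring match is an assumption.

```python
IGNORE_MARKERS = ("[Disproved]", "[Not Applicable]")


def usable_claims(claims):
    """Drop claims whose CAUS value contains one of the ignore markers."""
    return [
        claim for claim in claims
        if not any(marker in claim.get("CAUS", "") for marker in IGNORE_MARKERS)
    ]


claims = [
    {"tag": "BIRT", "date": "1850", "CAUS": ""},
    {"tag": "BIRT", "date": "1848", "CAUS": "[Disproved]"},
    {"tag": "BIRT", "date": "1851", "CAUS": "[Not Applicable]"},
    {"tag": "DEAT", "date": "1910"},
]
print([claim["date"] for claim in usable_claims(claims)])
```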

  • Tim Forsythe
    12 Nov 2012

    More Hints:

    Flagged items do not all have to be resolved. Sometimes they cannot be. For instance, I might have conflicting birth and baptism claims for the same individual, each from a different equally reliable source, or perhaps better said, equally unreliable sources. I cannot discount one or the other, because I don't know which of the two is incorrect. This isn't necessarily a bad thing. It gives me insight on where to concentrate my research. If you were to look closely at the Improbability List (http://timforsythe.com/tree/tjforsythe/improbs) on my personal tree you would see a lot of these types of issues - they are in limbo waiting to be resolved. The advantage of publishing an Improb List (or Bonkers List) along with your tree is that you cannot be faulted, or faulted as badly, for not informing the public (or so I wish) :).

  • Tim Forsythe
    12 Nov 2012

    Here's another hint for users.

    I initially set the parameters too narrow for my database and then, after addressing any flagged claims, begin to widen them until the false positives increase substantially.

    An example of this is that I may start with a minimum child bearing age of say 12 or 13. Something that is just not possible. Anything flagged should be addressed. Then I'll start dialing it up. Somewhere around 15 or 16, we start to get within the realm of possibility. Anything flagged at 16 years or greater should be scrutinized closely, because it may be a false positive.

  • Tim Forsythe
    12 Nov 2012

    LK>

    Flourishing is the span of time when we expect to find non-vital records associated with a living person, such as graduation, military, etc. Built-in GEDCOM events are categorized as flourishing based on their type, so for instance, First Communion would not be. Among vital records, Marriage is categorized as flourishing, but Baptism is not. You get the point.

    When Deep Scanning is enabled, Bonkers will first attempt to estimate birth dates for any individuals missing them, before embarking on its recursive hunt. We give up speed and accuracy (in some cases), but can gain other insights into problems, especially when birth dates cannot be estimated. These types of problems are not always obvious when doing straight comparisons. Deep Scanning will flag individuals already flagged for other reasons, so you do get some duplicates. I recommend it for anyone patient enough to wait an extra minute or two for the results.

    I generally run several combinations of options, for instance I first run while ignoring improbs and address the impossible issues when I can. Then I'll turn on improbs w/o deep scanning, and after addressing those issues, I turn on deep scanning.

  • Louis Kessler
    12 Nov 2012

    Tim,

    What do you mean by "flourishing"? And what will "deep scanning" do differently when checked and when unchecked?

  • Tim Forsythe
    12 Nov 2012

    Fixed a bug in Bonkers with the max and min flourishing ages being swapped. I apologize to the person I knocked off.