
Bonkers: The GEDCOM Sanity Checker


Sooner or later, we all get to the point where we realize that there is a lot of data in our database that is just completely bonkers.


Bonkers: The GEDCOM Sanity Checker will help you quickly locate this data by identifying groups of claims in your database that are inconsistent with each other. Bonkers works by running your GEDCOM file through a battery of recursive, multipass, deep scanning algorithms designed to rat out even the most elusive errors.

Bonkers is now available to the public as a free web service. Bonkers uses the same GEDCOM parser as VGedX, my 100% compatible GEDCOM validator. Bonkers, however, supports a special expanded GEDCOM dictionary that recognizes numerous vendor-specific events and date formats. This gives it a distinct advantage over vendor-specific consistency checkers and makes it usable by virtually anyone with an exported GEDCOM file.

The following is a list of the various conditions that Bonkers currently detects.

  • Persons born after they were baptized
  • Persons born after they were married
  • Persons born after having children
  • Persons born after they died
  • Persons born after they were buried
  • Persons born after their parent's death
  • Persons born after their parent's burial
  • Persons baptized too old
  • Persons baptized after they were married
  • Persons baptized after they died
  • Persons baptized after they were buried
  • Persons married too young
  • Wives married too old
  • Persons who are much older than their spouse
  • Persons married after they died
  • Persons married after they were buried
  • Mothers who bore children too young
  • Mothers who bore children too old
  • Parents who died before having children
  • Persons who died after they were buried
  • Persons who lived too long
  • Persons who were buried before having children
  • Persons with multiple sets of parents
  • Persons with no parents
  • Persons with only 1 parent
  • Persons whose birth dates could not be estimated
  • Misformatted dates
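
As a rough illustration of the kind of comparison behind several items in this list, here is a minimal Python sketch. It is not Bonkers' actual code: the `Person` record, the `check_person` function, and the 110-year default for the maximum life span are all hypothetical, and it compares years only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical record holding only the event years needed here."""
    name: str
    birth: Optional[int] = None
    baptism: Optional[int] = None
    death: Optional[int] = None
    burial: Optional[int] = None


def check_person(p: Person, max_age: int = 110) -> List[str]:
    """Return the inconsistencies found between p's event years."""
    problems = []
    if p.birth is not None:
        # A birth must not come after any later life event.
        later_events = (
            ("they were baptized", p.baptism),
            ("they died", p.death),
            ("they were buried", p.burial),
        )
        for label, year in later_events:
            if year is not None and p.birth > year:
                problems.append(f"born after {label}")
        # max_age is an assumed, configurable life-span limit.
        if p.death is not None and p.death - p.birth > max_age:
            problems.append("lived too long")
    if p.death is not None and p.burial is not None and p.death > p.burial:
        problems.append("died after they were buried")
    return problems
```

Calling `check_person(Person("A", birth=1900, death=1890))` returns `["born after they died"]`, while a person whose years are all in order returns an empty list.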

Bonkers sometimes needs to make compromises when dealing with qualified dates. This may occasionally result in it flagging claims that turn out to be false positives. To help prevent these, Bonkers uses Vivify, the GEDCOM Birth Date Estimator, to calculate accurate and reliable estimates of missing birth dates when the Deep Scanning option is enabled. See the Vivify page for a discussion of its configurable options. Bonkers has several other configurable options that can also be adjusted to fine-tune your results.

Dead if Born Before Year is the earliest year in which a living person could have been born; a living person born before this year is considered improbable.

Dead if Married Before Year is the earliest year in which a living person could have been married; a living person married before this year is considered improbable.

Bonkers now supports saving and restoring your configuration options when signed in. Processing is relatively fast; however, it can take several minutes to process large or complex databases.

If you have comments, questions or suggestions, please don't hesitate to chime in using the form provided below, and pleeeeease, for the love of coffee, take a moment and give Bonkers an honest review.

I've added a few hints and tips in the comments below that might help you make the most effective use of Bonkers.

Happy Bonking!
Tim Forsythe



  • Assume Flourishing For All Vendor Specific Events
  • Show Misformatted Dates
  • Show Persons Missing 1 Parent
  • Show Persons Missing Both Parents
  • Show Impossibilities Only (hides improbabilities)
  • Deep Scanning (estimates birth dates)

    Maximum Child Baptism Age
    Minimum Marriage Age
    Maximum Wife's Marriage Age
    Maximum Age Between Spouses
    Minimum Child Bearing Age
    Maximum Child Bearing Age
    Minimum Flourishing Age
    Maximum Flourishing Age
    Maximum Age (Life Span)
    Maximum Event Age Error
   
    Dead if Born Before Year
    Dead if Married Before Year



When birth dates cannot be estimated, the following abbreviated codes will appear in the report to give you an indication as to why they couldn't.
  • fb = father's birth
  • mb = mother's birth
  • pm = parents' marriage
  • b = birth
  • bp = baptism
  • sb = earliest spouse's birth
  • xb = latest sibling's birth
  • m = earliest marriage
  • cb = child's birth (range)
  • cm = earliest child's marriage
  • pd = latest parent's death
  • d = death
  • bu = burial
  • le = living event (range)
  • fe = flourishing event (range)



Reviews:

   Genealogy Software Reviews - Bonkers

Mentions:

   Modern Software Experience, Tamura Jones, Nov.12.2012 (update), Online Genealogy Consistency Checks
   Modern Software Experience, Tamura Jones, Dec.30.2012, Genealogy 2012
   Modern Software Experience, Tamura Jones, Dec.31.2012, Genealogy Trends 2012
   reddit, roland19d, Jan.1.2012, Bonkers: The GEDCOM Sanity Checker (free online tool)



    • chas vigneron
      05 Feb 2013

      Thank you.

    • Tim Forsythe
      29 Jan 2013

      John, I hope I can quote you on that. :)

      Spread the word folks, there are still a few people that haven't had a chance to use it.

    • John Coldwell
      29 Jan 2013

      Tim
      Thanks. That worked fine.
      This is a brilliant program and I think anyone with more than just a few names needs to use it!
      Regards
      John

    • Tim Forsythe
      29 Jan 2013

      John, thanks for the feedback. I'm glad to hear the program worked well for you.

      VGedX is a strict GEDCOM validator so requires a GEDCOM version in the file to know which dictionary to compare it against. Your GEDCOM file does not include the GEDCOM version in the header record, hence "empty". The file should have the following beneath the 0 HEAD line in your file.

      1 GEDC
      2 VERS 5.5

      You can manually do the insertion, and then VGedX should work for you.

      Bonkers, and the rest of the tools, are more relaxed in that they all use an expanded GEDCOM dictionary to compare against, so they ignore the GEDCOM version. This is the reason your file runs correctly in Bonkers.
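
If you would rather not edit the file by hand, the insertion described above could be scripted along these lines. This is only a sketch: `ensure_gedcom_version` is a hypothetical helper, not part of VGedX or Bonkers, and it assumes the file has already been read into a list of lines without line endings. A header that has a `1 GEDC` line but no `2 VERS` line beneath it is left untouched.

```python
from typing import List


def ensure_gedcom_version(lines: List[str], version: str = "5.5") -> List[str]:
    """Return a copy of lines with a GEDC/VERS block beneath the 0 HEAD line."""
    out = list(lines)
    head = None
    for i, line in enumerate(out):
        # Ignore a byte-order mark that some exporters put before 0 HEAD.
        parts = line.lstrip("\ufeff").split()
        if parts[:2] == ["0", "HEAD"]:
            head = i
        elif head is not None and parts[:1] == ["0"]:
            break  # reached the next level-0 record, so the header has ended
        elif head is not None and parts[:2] == ["1", "GEDC"]:
            return out  # the header already has a GEDC block
    if head is None:
        raise ValueError("no 0 HEAD record found")
    # Insert the two lines directly beneath 0 HEAD.
    out[head + 1:head + 1] = ["1 GEDC", f"2 VERS {version}"]
    return out
```

The original list is not modified, so the result can be compared with the input before anything is written back to disk.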

    • John Coldwell
      29 Jan 2013

      I ran my gedcom file (16,000 people) and Bonkers gave me four A4 pages of Errors. I went through the list and made the corrections in my family history program. An excellent tool!
      I have generated a new gedcom and want to rerun the file (Coldwell 4.ged) to check if I missed anything and I cannot get a result. VGedX gives the message "E028 198176 Unsupported GEDCOM version detected (empty)". I am puzzled by this because I am using identical programs and methods which worked correctly the first time. Also the gedcom file works correctly when loaded into my family history program.
      I would much appreciate help sorting this out.

      Regards
      John

    • Tim Forsythe
      10 Jan 2013

      Sue, the limit is currently set to 50 MB. I recently ran a file for someone with 220K people that was 130 MB and it took 4.5 hours to complete (successfully, I might add). I've only recently begun tracking times, so I have no data on the largest file run so far. The service times out in 10 minutes though. I generally recommend that users with large databases export a single ancestral line that they are interested in and run that.

    • cybersuezee
      10 Jan 2013

      Is there a limit to the size of Gedcom file I can run through Bonkers? I use Legacy with currently 205208 individuals?
      BTW I think you can have both "Persons Born After One of Their Parents Died" and "Persons Died Before Having Children". Some babies are born after their father died.

    • Tim Forsythe
      07 Jan 2013

      Colm, would it be possible for you to email me the error? I don't see any entry in my logs.

    • Colm Hughes
      07 Jan 2013

      I tried loading my gedcom file but got an sql error

    • Tim Forsythe
      06 Jan 2013

      Thanks Jean. I have some more free GEDCOM tools coming out soon.

    • Jean M. Simoneau
      06 Jan 2013

      Extraordinary piece of work. Thank you so much. I plan to confer with your web site quite often.

    • Tim Forsythe
      19 Dec 2012

      Jay, I don't, but thanks for thinking it worthy.

    • Jay
      19 Dec 2012

      Do you have a logo for BONKERS that I can use on my site to say to visitors that Bonkers has been used on my gedcom ?

    • Tim Forsythe
      18 Dec 2012

      Keith, I think you're probably right. That's pretty funny. I must have gotten carried away when I originally wrote these.

    • Keith Riggle
      18 Dec 2012

      I ran Bonkers on my 4000+ person file with most of the defaults, Assume Flourishing For All Vendor Specific Event Types, Show Misformatted Dates, and Hide Improbabilities (show impossibilities only), and it picked up 25 records to check, and even after having run the file through a different plausibility check, so good job! However, some of the checks seem a bit redundant. For example, isn't "Persons Born After One of Their Parents Died" the same as "Persons Died Before Having Children?" Anyway, now I'm going to unhide the improbabilities and see what it comes up with. Thanks for the great tool!

    • Tim Forsythe
      15 Dec 2012

      Tony, it also takes date qualifiers into consideration, so it must sometimes make assumptions when comparing dates, e.g. how do we compare "bef 1900" with "bef 1901", or "bet 1900 and 1910" with "bef 1905"? Is one earlier than the other? This can sometimes skew the results, or make them seem skewed, and produce false positives. If you have an example, I can take a closer look at it.

      I calculate the Julian Day Number for all dates, and my plan is to eventually use these so that exact dates can be compared to within a single day. However, there will always be assumptions made when qualifiers are used.
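
To make the two ideas in this reply concrete, here is a hedged Python sketch. `julian_day_number` uses the standard integer formula for Gregorian calendar dates (older records kept in the Julian calendar would need a different formula). `year_range` and `definitely_before` are hypothetical helpers showing one possible way to treat qualified dates as year ranges; they are an assumption, not a description of how Bonkers resolves them.

```python
from typing import Tuple


def julian_day_number(year: int, month: int, day: int) -> int:
    """Julian Day Number for a date in the (proleptic) Gregorian calendar."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March becomes month 0
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def year_range(text: str, lo: int = -9999, hi: int = 9999) -> Tuple[int, int]:
    """Turn 'bef 1900', 'aft 1900', 'bet 1900 and 1910' or '1900' into a range.

    The open ends lo and hi are arbitrary placeholder bounds.
    """
    words = text.lower().split()
    if words[0] == "bef":
        return (lo, int(words[1]) - 1)
    if words[0] == "aft":
        return (int(words[1]) + 1, hi)
    if words[0] == "bet":
        return (int(words[1]), int(words[3]))
    return (int(words[0]), int(words[0]))


def definitely_before(a: str, b: str) -> bool:
    """True only when every year allowed by a precedes every year allowed by b."""
    return year_range(a)[1] < year_range(b)[0]
```

Under this interpretation, neither "bef 1900" versus "bef 1901" nor "bet 1900 and 1910" versus "bef 1905" can be ordered with certainty, which is exactly where a checker has to fall back on an assumption.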

    • Tim Forsythe
      15 Dec 2012

      Tony, Yes, it does not currently resolve to months, or days.

    • tony harris
      15 Dec 2012

      Bonkers comes up with errors if critical dates are close together, i.e. in the same year. Does it only consider years and not months and days?

    • Tim Forsythe
      13 Dec 2012

      Andy, I agree. Until then I’ve made this tool available in my ongoing effort to provide free and useful tools to the genealogy community. Genealogists, researchers and family historians (whatever one wishes to call oneself) now will have an opportunity to easily identify some of the most obvious errors in their databases – and fix them once and for all. I have been using the tool myself for about 10 years and have found it to be remarkably accurate and able to pinpoint inconsistencies where I would have never expected them. It is also great for identifying areas that need to be researched further and cleaned up. Happy Bonking.

    • Tim Forsythe
      13 Dec 2012

      Gary, it is available online as a web service so that anyone with the internet can use it.

    • Andy Hatchett
      13 Dec 2012

      Too bad all online trees aren’t required to go thru this before being uploaded to a site.
      Think of the junkology that would never make it to the web!

    • Gary Roberts
      13 Dec 2012

      Does this work with Windows 8? Tks, Gary

    • Keith Riggle
      12 Dec 2012

      Outstanding! Looking forward to trying it out

    • Tim Forsythe
      13 Nov 2012

      Here is one more tip while I'm thinking about it.

      The GEDCOM parser that Bonkers uses is the same one that is used by VGedX (http://timforsythe.com/tools/vgedx), my GEDCOM validator. I periodically look through the VGedX reports to find new event types that are being used by different genealogy vendors and add these to the parser. By doing this, Bonkers is able to use these new events in its calculations, increasing the accuracy of its reports. Unfortunately, I don't scan the VGedX reports very often anymore since I am no longer supporting the Ancestors Now Tree Ring. I am more than happy to add new event types on request though, so if you want to improve your Bonkers results, you should run your GEDCOM file through VGedX and post here any new event types that are shown in the VGedX report. I'll add these at my earliest convenience. This will not only improve your results, but also the results of anyone else who uses the same genealogy application that you do. Win, win!

    • Tim Forsythe
      13 Nov 2012

      Here is a strategy that you can use to increase the accuracy of your Bonkers report.

      Bonkers categorizes all claims as being either Single Occurrence Events (SOEs) or otherwise. An SOE is an event that can only occur once per individual, such as their birth or death. Other types of events such as marriage and graduation can occur multiple times. When performing calculations that rely on SOEs, Bonkers uses the first record of that event type found in the GEDCOM file. So if your genealogy editor allows you to order your claims such that the record order is retained when you export your GEDCOM file, you can move your 'best' SOE first to improve Bonkers' accuracy. So, for instance, if you have multiple birth dates, move the one you have concluded is the most accurate to be first. This is actually a good general rule that can be useful for other types of genealogy applications as well.
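
The first-record rule can be sketched in a few lines of Python. Everything here is illustrative: the `SINGLE_OCCURRENCE` set and the `pick_events` function are hypothetical, and the tags chosen for the set are an assumption rather than Bonkers' real list.

```python
from typing import Dict, List, Tuple

# Assumed set of single-occurrence event tags, for illustration only.
SINGLE_OCCURRENCE = {"BIRT", "DEAT", "BURI"}


def pick_events(claims: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Keep the first date for SOE tags and every date for other tags.

    claims holds (tag, date) pairs in the order they appear in the file.
    """
    chosen: Dict[str, List[str]] = {}
    for tag, date in claims:
        dates = chosen.setdefault(tag, [])
        if tag in SINGLE_OCCURRENCE and dates:
            continue  # a later duplicate of an SOE is ignored
        dates.append(date)
    return chosen
```

With two BIRT claims in the input, only the one listed first survives, which is why reordering your claims before export changes what gets checked.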

    • Tim Forsythe
      13 Nov 2012

      Here is another tip for reducing clutter in your Bonkers output.

      If you are an evidence driven genealogist rather than a conclusion driven one, you probably have multiple, but similar events in your database. For instance, you might have several birth claims for an individual, each with a different date and source reference (this is common when multiple census records are entered). Some of these claims might be incorrect, or you might suspect them as so, but you don't want to remove them from your database, because they are still valuable references. You can tell Bonkers to ignore them by entering either of the strings "[Disproved]" or "[Not Applicable]" in the claim record's CAUSe field.
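
A filter of this kind might look like the following Python sketch. The claim dictionaries and the `usable_claims` function are hypothetical, and matching the marker anywhere in the CAUS value (case-sensitively) is an assumption; the exact matching rule Bonkers applies is not stated here.

```python
from typing import Dict, List

# The two marker strings named in the tip above.
IGNORE_MARKERS = ("[Disproved]", "[Not Applicable]")


def usable_claims(claims: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop claims whose CAUS value contains one of the ignore markers."""
    return [
        claim for claim in claims
        if not any(marker in claim.get("CAUS", "") for marker in IGNORE_MARKERS)
    ]
```

Claims without a CAUS value are always kept, so untagged records behave exactly as before.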

    • Tim Forsythe
      12 Nov 2012

      More Hints:

      Flagged items do not all have to be resolved. Sometimes they cannot be. For instance, I might have conflicting birth and baptism claims for the same individual, each from a different equally reliable source, or perhaps better said, equally unreliable sources. I cannot discount one or the other, because I don't know which of the two is incorrect. This isn't necessarily a bad thing. It gives me insight on where to concentrate my research. If you were to look closely at the Improbability List (http://timforsythe.com/tree/tjforsythe/improbs) on my personal tree you would see a lot of these types of issues - they are in limbo waiting to be resolved. The advantage of publishing an Improb List (or Bonkers List) along with your tree is that you cannot be faulted, or faulted as badly, for not informing the public (or so I wish) :).

    • Tim Forsythe
      12 Nov 2012

      Here's another hint for users.

      I initially set the parameters too narrow for my database and then, after addressing any flagged claims, I begin to widen them until the false positives increase substantially.

      An example of this is that I may start with a minimum child bearing age of say 12 or 13. Something that is just not possible. Anything flagged should be addressed. Then I'll start dialing it up. Somewhere around 15 or 16, we start to get within the realm of possibility. Anything flagged at 16 years or greater should be scrutinized closely, because it may be a false positive.
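
The dial-it-up approach can be pictured with a small Python sketch. The `young_mothers` and `sweep` functions and the tuple layout are hypothetical; in practice you would simply change the Minimum Child Bearing Age option and rerun the report.

```python
from typing import Dict, List, Tuple


def young_mothers(births: List[Tuple[str, int, int]], min_age: int) -> List[str]:
    """Names of mothers whose age at a child's birth is below min_age.

    Each entry is (mother_name, mother_birth_year, child_birth_year).
    """
    return [name for name, mother_born, child_born in births
            if child_born - mother_born < min_age]


def sweep(births: List[Tuple[str, int, int]], ages: range) -> Dict[int, List[str]]:
    """Run the check once per threshold so the growth in flags is visible."""
    return {age: young_mothers(births, age) for age in ages}
```

Entries flagged at the lowest threshold are the ones to fix first; names that only appear as the threshold rises are the ones most likely to be false positives.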

    • Tim Forsythe
      12 Nov 2012

      LK>

      Flourishing is the span of time when we expect to find non-vital records associated with a living person, such as graduation, military, etc. Built-in GEDCOM events are categorized as flourishing based on their type, so for instance, First Communion would not be. A vital record such as Marriage is, but Baptism is not. You get the point.

      When Deep Scanning is enabled, Bonkers will first attempt to estimate birth dates for any individuals missing them, before embarking on its recursive hunt. We give up speed and accuracy (in some cases), but can gain other insights into problems, especially when birth dates cannot be estimated. These types of problems are not always obvious when doing straight comparisons. Deep Scanning will flag individuals already flagged for other reasons, so you do get some duplicates. I recommend it for anyone patient enough to wait an extra minute or two for the results.

      I generally run several combinations of options, for instance I first run while ignoring improbs and address the impossible issues when I can. Then I'll turn on improbs w/o deep scanning, and after addressing those issues, I turn on deep scanning.

    • Louis Kessler
      12 Nov 2012

      Tim,

      What do you mean by "flourishing"? And what will "deep scanning" do differently when checked and when unchecked?


Terms of Service | Privacy Policy | Email | Google+ | Copyright © 1999-2013 Tim Forsythe. All Rights Reserved.