VGedX: The GEDCOM Validator

What is VGedX?

VGedX is a Win32 console (command-line) application that functions as a fully compliant GEDCOM 5.5, 5.5.1 and 5.6 (lineage-linked format only) parser. It opens a GEDCOM file, parses it, validates it according to the configured options, and prints the appropriate warning messages to the display. The validation results can also be saved to a log file. The log file is tab delimited so that automated tools can batch validate large numbers of user GEDCOM files and process the validation results easily.

VGedX uses The GEDCOM Parser (TGP) library as its backend. A web service is available from the Tools menu, as well as a Windows 32-bit version.

Where can I find the VGedX web service?

The latest version of VGedX can be tried out online at the following link (most options are available):

VGedX: The GEDCOM Validator

Where can I download VGedX?

The latest version of VGedX can be downloaded from the Downloads Page.

Download the attached file and unzip it into the folder of your choice. To run the program, you may need to install the Microsoft Visual C++ 2008 SP1 Redistributable Package (x86). 64-bit versions are also available from Microsoft through this link.

There are three options available for running the program.

The first of these builds a test database containing the minimum required occurrences, or at least one, of every defined GEDCOM 5.5, 5.5.1, or 5.6 record and field. This is useful for building a test file to validate how different genealogy applications import proper GEDCOM files. Not only can manufacturers use this to test their software, but application users can also use it to decide for themselves which applications import GEDCOM files to the level of compatibility that they require. Since the files are generated by traversing The GEDCOM Parser’s state machine recursively, they can also serve as a sanity check on that state machine, by manually comparing them to the proper GEDCOM specifications.

The syntax for executing VGedX in this mode is any of the following:

vgedx -b55 filename
vgedx -b551 filename
vgedx -b56 filename

The resulting output files are already available on the Downloads page.

The second option for running VGedX, the normal usage, is to validate a GEDCOM file.

The format for executing VGedX in this mode is:

vgedx -f filename [-l logfile] [-s savefile] [options]

The validation results will be written to the console as will any processing errors. The console output will also be written to a tab delimited logfile if specified.
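Because the log file is tab delimited, it can be read by any script that splits on tabs. The following is a minimal sketch in Python, not part of VGedX. The column layout of the log is not described here, so the column index passed to count_by_column is purely hypothetical and should be adjusted after inspecting a real log.

```python
import csv
from collections import Counter


def read_log(path):
    """Return the non-empty rows of a tab-delimited log as lists of strings."""
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        # QUOTE_NONE keeps any quote characters in the log text untouched.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader if row]


def count_by_column(rows, index):
    """Count rows by the value in one column (the index is an assumption)."""
    return Counter(row[index] for row in rows if len(row) > index)
```

For example, count_by_column(read_log("family.log"), 0) tallies the rows by whatever the first column of your log holds.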

When a savefile is specified, the GEDCOM file will be rewritten after removing any records that cause errors. Records that cause only warnings will not be removed, and no attempt is made to fix errors or warnings. Note that continued text on lines shorter than 255 characters will be concatenated, forcing long lines to adhere to the maximum line length.
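When validation is driven from a script, the command line above can be assembled programmatically. This is a minimal sketch in Python under a few assumptions: the helper names are hypothetical, vgedx is assumed to be on the PATH, and the meaning of the process exit code is not documented here, so it is returned without interpretation.

```python
import subprocess


def build_validate_command(gedcom, logfile=None, savefile=None, options=()):
    """Assemble 'vgedx -f filename [-l logfile] [-s savefile] [options]'."""
    command = ["vgedx", "-f", gedcom]
    if logfile:
        command += ["-l", logfile]
    if savefile:
        command += ["-s", savefile]
    command += list(options)  # for example ["-used", "-user"]
    return command


def run_validation(gedcom, **kwargs):
    """Run VGedX and return its exit code (its meaning is not assumed here)."""
    return subprocess.run(build_validate_command(gedcom, **kwargs)).returncode
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.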

The third option for running VGedX is to validate a batch of GEDCOM files.

The format for executing VGedX in this mode is:

vgedx -b filelistname [-l logfile] [options]

The file list is a standard text file with the complete path to each file to be validated on its own line. The validation results will be written to the console as will any processing errors. The console output will also be written to a tab delimited logfile if specified. In batch mode, the version, vendor and file information is written to each line so that it may be sorted in your favorite spreadsheet. The +info option will list only the version, vendor and file information for each file.
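The file list can be produced by hand or generated. The sketch below, in Python, writes one complete path per line for every file under a folder that matches a pattern. The function name and the default "*.ged" pattern are assumptions; on case-sensitive file systems the pattern will not match names ending in ".GED".

```python
from pathlib import Path


def write_file_list(folder, list_path, pattern="*.ged"):
    """Write the complete path of each matching file on its own line."""
    paths = sorted(
        p.resolve() for p in Path(folder).rglob(pattern) if p.is_file()
    )
    Path(list_path).write_text(
        "".join(f"{p}\n" for p in paths), encoding="utf-8"
    )
    return paths
```

The resulting file can then be passed as filelistname to the batch command shown above.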

What options does VGedX support?

The normal validation procedure will detect all errors, as well as some warnings, such as minimum and maximum occurrences of record fields, and maximum line and data lengths. VGedX does not validate data enumerations. The following options either ignore common violations of the GEDCOM standards and drafts, or detect other conditions that may be useful to know about but are not, in fact, violations.

Ignore Unused Records
-used | It is usually expected that all root records found in a GEDCOM file will be used within that file as determined by internal references to the record. This is a common warning and may be ignored.

Ignore Missing Records
-ref | It is usually expected that all records referenced in a GEDCOM file will be found in that same file.

Ignore Duplicate Records
-dups | It is usually expected that all root records found in a GEDCOM file will be unique.

Ignore Undefined Records
-unk | It is usually expected that all vendor specific record tags will begin with an underscore (_). When they do not, they are classified as undefined.

Ignore User Defined Records
-user | All vendor specific record tags beginning with an underscore (_) are classified as user defined. This is a common warning and may be ignored.

Ignore Minimum Data Limits
-nodata | All record fields have minimum data lengths defined.

Ignore Maximum Data Limits
-len | All record fields have maximum data lengths defined. This is a common warning and may be ignored.

Ignore Missing ID References
-noref | Some tags require ID references.

Ignore ID Reference Substitutions
-subs | Some vendors substitute data for tags where ID references are usually required.

Ignore Missing Continuation Tags
-cont | Some vendors allow note fields to include embedded carriage returns.

Ignore Trailing Spaces
-endl | GEDCOM lines do not usually end with spaces or tabs (delimiters). This is a common warning and may be ignored.

Ignore Trailing Data
-dang | GEDCOM lines do not usually have data following ID references.

Ignore Unpaired Ampersands
-amp | It is usually expected that at signs (@), which this option calls ampersands, are paired when embedded in text, e.g. tjforsythe@@gmail.com. This ensures that they will not be confused with ID references, e.g. @I1@. This is a common warning and may be ignored.

Ignore Level Number Gaps
-gaps | Each GEDCOM line begins with a level number. The level number indicates which field contains the line. A field with level number ‘n’ is considered to be contained by the previous field at level number ‘n-1’. If there is no level number ‘n-1’, then the field is a root record at level number 0. It is usually expected that level numbers will increment by 1. Whenever a level number increments by 2 or more, it is considered a level number gap.

Ignore Tag Occurrence Limits
-tags | All record fields have minimum and maximum occurrence limits defined.

Ignore Invalid Date Formats
-dates | All date fields have a defined allowed structure. Any date field not adhering to this structure is considered a free form date phrase. This is a common warning and may be ignored.

Ignore Tag Error Duplicates
-limit | Many tag errors are the result of vendor programming and as such will apply to all fields of a particular type in the file being validated. You may want to limit these to the first error only so that the log does not become bloated with similar errors.

Show File Information Only
-info | Normally, the output includes the version, vendor and file information along with a list of errors. This option allows you to hide the errors section. This is especially useful in batch mode, where the log file will include the information for each file on a separate line.

Skip Processing Failed Records
+skip | Normally, when a record fails validation due to a warning, it can continue to be processed. Errors generally cannot and are not affected by this option. This option allows the fields of these records to be skipped so that their errors do not fill the log.

Use Extended GEDCOM Validation
+tgp | The GEDCOM Parser (TGP) library is the backend that is used for VGedX. It supports validation of GEDCOM 5.5, 5.5.1, and 5.6 files. It also supports validation of an extended GEDCOM dictionary, which includes not only the GEDCOM 5.5, 5.5.1, and 5.6 dictionaries but also many vendor specific records. When this option is enabled, the GEDCOM version of the file is ignored, and the extended dictionary is used instead. This is only useful when validating a GEDCOM file against an application built using the extended TGP library, such as Adam. (VGedX also supports building an extended TGP GEDCOM dictionary test file using the -bTGP filename option.)
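To make a few of the conditions above concrete, here is a simplified sketch in Python of three of them: level number gaps (-gaps), unpaired at signs (-amp), and unused, missing and duplicate records (-used, -ref, -dups). It is an illustration only and does not reproduce how VGedX or the TGP library performs these checks. The function names and the line pattern are assumptions, and GEDCOM escape sequences and line-length rules are ignored.

```python
import re
from collections import Counter

# level, optional @ID@, tag, optional value (simplified line grammar)
LINE = re.compile(r"^\s*(\d+) +(?:(@[^@\s]+@) +)?(\S+)(?: (.*))?$")
POINTER = re.compile(r"^@[^@\s]+@$")


def parse(lines):
    """Yield (line_no, level, xref, tag, value) for each parsable line."""
    for number, raw in enumerate(lines, start=1):
        match = LINE.match(raw.rstrip("\r\n"))
        if match:
            level, xref, tag, value = match.groups()
            yield number, int(level), xref, tag, value or ""


def level_gaps(lines):
    """Line numbers whose level exceeds the previous level by 2 or more."""
    gaps, previous = [], -1
    for number, level, _xref, _tag, _value in parse(lines):
        if level > previous + 1:
            gaps.append(number)
        previous = level
    return gaps


def unpaired_at_signs(lines):
    """Line numbers whose text holds an @ that is not doubled as @@."""
    flagged = []
    for number, _level, _xref, _tag, value in parse(lines):
        if POINTER.match(value):
            continue  # the whole value is an ID reference such as @I1@
        if "@" in value.replace("@@", ""):
            flagged.append(number)
    return flagged


def audit_xrefs(lines):
    """Report unused, missing and duplicate root record IDs."""
    defined, referenced = Counter(), set()
    for _number, level, xref, _tag, value in parse(lines):
        if level == 0 and xref:
            defined[xref] += 1
        if POINTER.match(value):
            referenced.add(value)
    return {
        "unused": sorted(set(defined) - referenced),
        "missing": sorted(referenced - set(defined)),
        "duplicate": sorted(x for x, n in defined.items() if n > 1),
    }
```

Each function takes the lines of a GEDCOM file as a list of strings, for example from open(path).readlines(), and reports line numbers or record IDs rather than fixing anything, in keeping with the validate-only behaviour described above.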

Mentions

VGedX (Modern Software Experience, 1/15/2012)