Genealogy

2003-May+June:

While googling one day, I stumbled across some online cemetery records for Meade KS, where one of my great-grandfathers is buried.  This led to email correspondence with Carol Friesen of Tulsa OK, an avid genealogist who does a lot of genealogical data entry.  She asked if I could send any updates for "my" family ... 

Here is the Peter R Reimer family-tree, based on the book "Peter R Reimer 1845-1915 Family Book" by Abe R Reimer of Blumenort MB, and on updates and corrections to the Klaas P L Reimer descendants collected by my father Peter K Reimer, all of which has been entered by Carol Friesen of Tulsa OK.  That webpage resides on her website, and was mechanically converted to HTML by her.  In 2003-June, she submitted that data (in Gedcom form) to the GRANDMA project of calmenno.org (CMHS), for inclusion in their next release.  (The linked webpage becomes obsolete once the data is so included.) 

And here is the Klaas P L Reimer family-tree, a subset of the above, which resides on this website.  This copy contains further corrections, which still need to be submitted to the GRANDMA project, namely:
- gender of Jewel Lynn Faith Reimer, b.  Jul 10 1969
- gender of Jessie Reimer, b.  Jun 6 1957
- gender of Dawn Allyson Reimer, b.  May 12 1962. 

With a lot of help from my sister Iris, I also compiled a list of corrections needed in the family-tree of Peter R Penner, our maternal grandfather.  I have the list of corrections, and the before-corrections tree, but in this case did not end up with a corrected tree.  However, I understand Carol Friesen has submitted these also to Grandma.


2006-March:

I finally got around to ordering a copy of the Grandma database, now at version 4.23.  I briefly tried the Gramps program on this data, but gave up on it, as its performance is unbearably slow on a database of this size.  It wasn't really all that brief, as it took 3 days to try it - 48 hours just to convert the Gedcom file to its database format.  This database is 183MB in Gedcom form;  in the Gramps (grdb) form it becomes 2.2GB, which strikes me as roughly 2 gigabytes too much.  Gramps has several problems;  the most serious is that it takes 3.5 hours just to "open" the grdb-form of the database, and during those 3.5 hours my computer is pretty much unusable for doing anything else.  And, echoing Churchill, up with that I will not put! 

Next I tried the genealogy program called LifeLines.  With it, the conversion from Gedcom took even longer - 57 hours - but after that everything I've tried has been instantaneous.  Even restarting it.  This is more like it! 

For anyone needing to work with a large genealogical database, I can recommend LifeLines;  you'll need to put up with its old-fashioned non-GUI user-interface, but it performs very well indeed, its database form is perfectly reasonable in size, and it comes with a wealth of reporting tools.  (The people at calmenno.org, who distribute the Grandma data, recommend BrothersKeeper, but as it's not available for Linux, nor for Macs, who would want it?)  The LifeLines FAQ#40 recommends using the rbtrees option, for improved performance on large databases.  I am curious about how much time this will shave off such a 57-hour import, but have yet to try it. 

The Grandma database knows a good deal about my ancestors: 
here is an Ahnentafel report;
here is an End-of-Line report; and
here is a By-Generation Count report;
each produced by the LifeLines program, from the Grandma data.  Most of these lines encounter a dead-end in the 1700's in Prussia;  however, some have been traced back to the 1500's, in one case to 1480, to people with names like Harnasveger, Pieters, deVeer, Grauwerts, vanDijck (also as vanDyck, vonDyck), Conrad, in the Netherlands.  Incidentally, Ahnentafel is a German word for a numbered ancestor report, where the parents of person N are always persons 2N and 2N+1;  for example, the parents of person 7 are persons 14 and 15.
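
The Ahnentafel numbering rule is easy to check with a few lines of code.  The following is only a small sketch in Python (not part of LifeLines or of any report shown here);  the function names are made up for illustration, and it assumes the usual convention that the subject of the report is person 1, with fathers getting the even number and mothers the odd one.

```python
def parents(n):
    """Ahnentafel numbers of the father and mother of person n."""
    return 2 * n, 2 * n + 1

def child(n):
    """Ahnentafel number of the child of ancestor n (n must be 2 or more)."""
    return n // 2

def generation(n):
    """Generations back from the subject: 0 for person 1, 1 for the parents, ..."""
    return n.bit_length() - 1

def line_of_descent(n):
    """Ahnentafel numbers from ancestor n down to the subject, person 1."""
    line = [n]
    while n > 1:
        n = child(n)
        line.append(n)
    return line

print(parents(7))            # (14, 15)
print(generation(7))         # 2
print(line_of_descent(14))   # [14, 7, 3, 1]
```

So person 14 is a great-grandparent whose line runs down through persons 7 and 3 to the subject.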

In 2003, Carol sent a BrothersKeeper-report on how (her husband) Duane Friesen and I are related.  That BK report struck me as obviously incorrect, since the first relationship shown for the vonRiesen common ancestor is NOT the closest relationship through that ancestor!  AHA, I believe I've found an explanation.  For each possible "distance" (4th+1x, 5th, 5th+1x, 5th+2x, etc), BK shows EXACTLY ONE path of that length.  Since it has already shown a length-5th+1x path for the Brandt common ancestor, it omits the one I'm expecting for the vonRiesen ancestor, which is also of length 5th+1x.  Having found an explanation, I won't call it wrong, but it is incomplete in strange and bizarre ways;  and it may lead people to think that Duane and I are related in exactly 6 ways, and that would be incorrect. 
[To my mind, a more intuitive way to "prune" the output would be to show each common ancestor just once, together with the shortest path connecting the two subjects through that ancestor.  But I would still want this to be accompanied by shorter notes on the other paths.  Summarizing relatedness by a single number indicating the extent to which two people "share DNA" is another thing I'd like to see.  Such a "Coefficient of Consanguinity" is provided by the GeneWeb genealogical software, and by LifeLines as it turns out.]
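
To make the pruning idea concrete, here is a rough sketch in Python.  It is not the method used by BK, GeneWeb or LifeLines;  the function names and the tiny pedigree are hypothetical.  It assumes the pedigree is a table mapping each person to a (father, mother) pair, with unknown persons simply absent, and it uses one common definition of the single number, namely the sum of (1/2)^(n1+n2) over all connecting paths, with no correction for inbred ancestors.  Other definitions exist (some are half of this value), so treat the number as illustrative only.

```python
from collections import defaultdict

def paths_up(person, parents_of):
    """Yield every upward path from person as a tuple (person, ..., ancestor).

    The zero-length path (person,) is included, so a person counts as
    their own "ancestor" at distance 0.
    """
    stack = [(person,)]
    while stack:
        path = stack.pop()
        yield path
        for parent in parents_of.get(path[-1], ()):
            stack.append(path + (parent,))

def connections(a, b, parents_of):
    """Map each common ancestor to a list of (steps from a, steps from b).

    A pair of paths counts only if the two paths share nobody except
    the common ancestor at the top.
    """
    paths_from_a = defaultdict(list)
    for pa in paths_up(a, parents_of):
        paths_from_a[pa[-1]].append(pa)
    found = defaultdict(list)
    for pb in paths_up(b, parents_of):
        for pa in paths_from_a.get(pb[-1], ()):
            if set(pa[:-1]).isdisjoint(pb[:-1]):
                found[pb[-1]].append((len(pa) - 1, len(pb) - 1))
    return found

def summarize(a, b, parents_of):
    """One row per common ancestor: (ancestor, shortest path, count of other paths)."""
    rows = []
    for ancestor, pairs in connections(a, b, parents_of).items():
        shortest = min(pairs, key=sum)
        rows.append((ancestor, shortest, len(pairs) - 1))
    return sorted(rows, key=lambda row: (sum(row[1]), row[0]))

def relatedness(a, b, parents_of):
    """Sum of (1/2)**(n1 + n2) over all connecting paths (no inbreeding correction)."""
    return sum(0.5 ** (n1 + n2)
               for pairs in connections(a, b, parents_of).values()
               for n1, n2 in pairs)

# Hypothetical pedigree: A and B are first cousins (F1 and M2 are siblings).
parents_of = {
    "A": ("F1", "M1"),
    "B": ("F2", "M2"),
    "F1": ("GF", "GM"),
    "M2": ("GF", "GM"),
}
print(summarize("A", "B", parents_of))    # [('GF', (2, 2), 0), ('GM', (2, 2), 0)]
print(relatedness("A", "B", parents_of))  # 0.125
```

Enumerating every path like this is fine for a toy pedigree, but would need something smarter for a database the size of Grandma.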

LifeLines has 6 different "how related" reports.  Here's the output from 5 of them together with the BrothersKeeper output for comparison:
here is the BrothersKeeper-related-report for Duane Friesen and myself;
here is the LifeLines-cousins-report for Duane Friesen and myself;
here is the LifeLines-relation-report for Duane Friesen and myself;
here is the LifeLines-relate-report for Duane Friesen and myself;
here is the LifeLines-genetics-report for Duane Friesen and myself;  and
here is the LifeLines-cons-report for Duane Friesen and myself.
The cousins report is even less complete than the BK version.  One positive note: it's also less likely to mislead anyone into thinking it is complete.  The relation and relate reports are interesting, but their "sister-in-law of step-brother" notion of relatedness is not the kind of relatedness I'm interested in.  Not one of those 3 is what I want, and at 17min, 15min, and 25min respectively, they're not exactly instantaneous either.

Aha, the cons report is what I seek:  it shows all paths through all common ancestors;  it shows the consanguinity coefficient;  and it does all this in mere seconds!  Finding the output is not easy:  neither the program nor the documentation gives any hint, and I had to read the source to learn where the output went (it is written to /tmp/t1).  Checking the output for correctness was rather a lot of work;  our ancestors married relatives so often that we are related in 56 ways!  The description of the cons report has flaws:  it is misleading about the usefulness of a consanguinity coefficient, it fails to mention where the output goes, and it provides no hint that this report really is what the description of cousins led one to hope it would be.  All of these flaws led me to try 3 other reports before this one, and then to very nearly give up on this one.  But instead of needing to write my own "relatedness" program, now I only feel the need to make minor enhancements to this one.

The improvements I'm planning to make:
- fix infinite-loop & crashing for A and B unrelated, or same person;
- describe each path in English as Mth cousin Nx removed;
- highlight the common ancestor in a style much like that used by BK;
- improve the ordering of output so that shortest paths come first, and so that a husband and wife who are both common ancestors are adjacent (and possibly combined).

A few more examples:
LifeLines-cons-report for my parents -- an example of unfortunate ordering;
LifeLines-cons-report for my sister and myself -- with Parent-Relatedness for our father's parents, and unfortunate ordering.
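
The planned "Mth cousin Nx removed" wording and the shortest-paths-first ordering can be sketched as follows.  This is an illustration in Python, not code from cons.ll;  the function names are invented, and it assumes each path is given as a pair (n1, n2) of generation counts from the two persons up to their common ancestor.  Because it looks at one path through one ancestor at a time, it cannot tell full siblings from half siblings.

```python
def ordinal(n):
    """Return n with its English ordinal suffix: 1st, 2nd, 3rd, 4th, 11th, ..."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def describe(n1, n2):
    """Describe one path in English, given generations up from each person."""
    near, far = sorted((n1, n2))
    removed = far - near
    if near == 0:
        if far == 0:
            return "same person"
        return f"direct line, {far} generation(s) apart"
    if near == 1:
        if removed == 0:
            return "siblings"
        return "great-" * (removed - 1) + "aunt/uncle and niece/nephew"
    text = f"{ordinal(near - 1)} cousin"
    if removed:
        text += f" {removed}x removed"
    return text

# Hypothetical paths, listed out of order; shortest total length is printed first.
paths = [(6, 7), (5, 6), (6, 6)]
for n1, n2 in sorted(paths, key=sum):
    print(n1, n2, describe(n1, n2))
# 5 6 4th cousin 1x removed
# 6 6 5th cousin
# 6 7 5th cousin 1x removed
```

These three correspond to the distances BK abbreviates as 4th+1x, 5th, and 5th+1x.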

In browsing through my relatives, I observe that the corrections we sent in for the Peter R Penner family have made it into this version of Grandma;  my unmarried aunt Helen and cousin Norma Brandt, Mrs Sault, are no longer "merged" with different persons having similar names - both are now rid of those erroneous spouses.  Aunt Helen also got a year older in the process, and perhaps we'll manage to celebrate her next big birthday in the right year:-)  However, the larger set of additions and corrections on my paternal side, to the Peter R Reimer tree, has NOT made it into this version of Grandma.  They were sent in almost 3 years ago, but it seems they are awaiting Grandma-version-5.0.

[I may shorten this page, creating 2 sub-pages:  one on genealogy-software,  one on relatedness.]


2007-Jan:

A few more examples:
the LL-cons report showing the 10 ways Elmer (Al) Reimer and I are related;
the LL-cons report showing the 37 ways Donald S Reimer (Reimer Express) and I are related.

Origin of the name Reimer:  according to the Brothers Grimm, the name Reimer, also spelled as Rhymer, Reumer, Reume, Reime, Reumen, Reimen, is from an old Germanic word rîm for number, and thus a Reimer is a computer, reckoner, or calculator;  see germazope.uni-trier.de/Projects/WBB/woerterbuecher/dwb/wbgui?lemid=GR03660.  (Their quote from Goethe "keinen reimer wird man finden, der sich nicht den besten hielte" may be an observation about Reimers, or was he talking about poets?)  Elsewhere one can find speculation that the surnames Rymer, Rimmer, Raemer, Raimer, Ramer, Raymer, Ramiro, and several others are also variants of Reimer.


2010-Feb:

Thomas Wetmore, in a 2010-02-06 post on LINES-L@LISTSERV.NODAK.EDU, provides valuable information on LifeLines and large databases, including suggestions on how performance could be greatly improved, and the size-limits extended.  LifeLines presently handles databases of up to roughly 46 million records, which likely means about 25 million persons.  In a 2010-02-20 post he reports how a "lazy write" scheme in LifeLines provides a 200-fold performance improvement, or loading at 43,000 records per second, which means loading one million records takes about 30 seconds (on modern hardware).  A welcome improvement indeed! 

Although I purchased a copy of GRANDMA-5 when it came out, I have yet to "load" it into LifeLines (not every day am I willing to start something that takes more than a day to run), and I now intend to skip it entirely by going directly to GRANDMA-6 with the lazy-write-enhanced LifeLines.  Incidentally, the 4.23 version of GRANDMA has just under one million persons, and yet it has 6.3 million records.  That, however, is the number of "level 1" GEDCOM records, which may not be the relevant number for the LifeLines representation? 

related.ll: 
I finally got back to the improvements in the cons-report mentioned above.  I made some of them 4 years ago, but did not publish the result as it seemed unfinished.  While reading about the subject I had learned there were some tough choices about whose definitions to use, and some of the material was so confusing it made my head hurt.  Which may explain my taking a 4-year break:-)  There's probably no easy answer, no approach that will please everyone;  however, I have made my choices and gone with them.  I made some notes on the Literature and the Algorithm while writing it;  they were originally comments in the program but got to be too long for that.  My rationale for discarding the "inbreeding-correction-factor" is explained in two paragraphs, one under Literature, one under Algorithm.  After deciding to depart from the definitions used by Teschler (cons.ll), I've given the program an altogether different name, calling it related.ll, and am calling the relatedness-metric "Relatedness" rather than "Consanguinity-coefficient".  I contend that related.ll is a complete replacement for the LifeLines report-scripts cons.ll, cousins.ll, genetics.ll, genetics2.ll, but not for the similarly named relate.ll nor relation.ll. 

Here are the same examples that were shown above for cons.ll, but now as done by related.ll, along with a few new ones:
LL-related report for me and Duane Friesen
LL-related report for me and Donald S Reimer
LL-related report for me and Elmer (Al) Reimer
LL-related report for me and my sister Iris
LL-related report for me and Jakob Harnasveger (a distant ancestor)
LL-related report for me and my father
LL-related report for my parents (with options:GB)


Send your questions, suggestions, corrections, bug-reports to ereimer@shaw.ca.