
Parsi DNA

for the Zoroastrian Persian population from the Indian Subcontinent


Currently just a place to collect information for our benefit as we try to explain the special situation about this population. May evolve into a more formal study at some point.


We created this page to start documenting information found on the Parsi population as it relates to DNA studies, mostly from a genealogical perspective. Many Parsis have tested using the consumer-level genetic genealogy tests available, but they are confused by their results compared to what the testing companies promote. Testing company information is based on traditional, published literature for the European-descent community, which has very low endogamy. But much of the world is not so thoroughly exogamous. There are isolated populations where close-cousin marriages were practiced and even encouraged. The traditional royal families in Europe come to mind immediately, as do the Mennonites and Amish in the USA, Ashkenazi Jews, Acadians and similar. Oddly, when these populations are mentioned, Parsis never enter into the picture; so little has been studied or understood. Yet, based on our limited experience, they are likely the most strongly endogamous (on both near and far timelines) population group that exists. Hence, we are looking to collect any experience and guidelines available to assist Parsis in interpreting their genetic genealogy test results.

Note 1: An interesting find in 2017 with triangulated groups, specifically using the GEDMatch Tier 1 Triangulated Group tool. Europeans get tens to a hundred or so triangulated groups when running the tool with default parameters. Parsis get just a few. So it appears that although there is a 1-2% floor of matching DNA (see the Introduction), it does not carry through to common matching segments among 3 or more people. We have yet to (a) study the GEDMatch tool in more detail and (b) more fully study the statistics of population size and matches. But this looks promising as a method to weed out the match list for Parsis and find more likely relatives. It does require more than 2 testers to compare and match, though.
Tester | Company | # of Matches | First 20 Total Avg | First 20 #Segs Avg | First 20 Longest Avg | # Triang Groups
JD | 23andMe | | | | |
RM | 23andMe | 839 | (1.65 to 1.24%) | (11 to 13) | |

Notes: First 20 does not include known relatives. Tester results transferred in are marked with the source test company (A,M,T,H as used by GEDMatch)
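
To make the idea of a triangulated group in Note 1 concrete, here is a minimal sketch. It is not the GEDMatch algorithm, which we have not studied in detail. It assumes hypothetical input: for each pair of testers, a list of matching segments given as (chromosome, start, end) positions. A triple of testers "triangulates" when all three pairs share a segment covering a common region.

```python
from itertools import combinations

def common_overlap(segments):
    """Intersect (chrom, start, end) segments; return None if nothing is shared."""
    chroms = {chrom for chrom, _, _ in segments}
    if len(chroms) != 1:
        return None
    start = max(s for _, s, _ in segments)
    end = min(e for _, _, e in segments)
    return (chroms.pop(), start, end) if start < end else None

def triangulated_groups(pairwise):
    """pairwise maps frozenset({a, b}) -> list of (chrom, start, end) segments."""
    people = sorted({p for pair in pairwise for p in pair})
    groups = []
    for a, b, c in combinations(people, 3):
        pairs = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(p in pairwise for p in pairs):
            continue  # at least one pair of the three does not match at all
        for s1 in pairwise[pairs[0]]:
            for s2 in pairwise[pairs[1]]:
                for s3 in pairwise[pairs[2]]:
                    region = common_overlap([s1, s2, s3])
                    if region is not None:
                        groups.append(((a, b, c), region))
    return groups

# Made-up example data (positions in Mbp); D matches only A, so D never triangulates.
matches = {
    frozenset(("A", "B")): [("5", 10, 40)],
    frozenset(("A", "C")): [("5", 20, 50)],
    frozenset(("B", "C")): [("5", 25, 35)],
    frozenset(("A", "D")): [("2", 1, 10)],
}
groups = triangulated_groups(matches)
```

A real tool would also apply a minimum length (in cM) to the shared region; that is left out here because it needs a genetic map.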

Note 2: In a Facebook group post in 2019, Leah LaPerle Larkin mentions a metric she has developed for the endogamous Acadians that she studies. Specifically, endogamous false matches appear to have low average segment lengths. The average segment length is easily determined, even on Ancestry where individual segments are not reported: simply divide the total matching amount (in cM) by the number of matching segments reported. (Note: this is tougher to do on GEDMatch because they do not report the number of segments!) An average of 18 or higher is a match to look into; 15 or lower is likely to be set aside initially. It is not yet clear if this same technique will apply to the Parsi community, or how the test company and matching database source affect it. (For example, FTDNA and MyHeritage include segments as small as 1 cM in their total amount although they only claim to use segments as small as 5cM.)
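
The arithmetic in Note 2 can be sketched as follows. The function names and the "undecided" label for averages between the two thresholds are our own assumptions for illustration; the 18 and 15 thresholds come from the Acadian work and may not carry over to Parsi matches.

```python
def average_segment_cm(total_cm, num_segments):
    """Average segment length: total shared cM divided by the segment count."""
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    return total_cm / num_segments

def triage(total_cm, num_segments, look_into_at=18.0, set_aside_at=15.0):
    """Sort a match into a bucket using the thresholds quoted in Note 2."""
    avg = average_segment_cm(total_cm, num_segments)
    if avg >= look_into_at:
        return "look into"
    if avg <= set_aside_at:
        return "set aside"
    return "undecided"  # assumption: the note does not say what to do in between

# Made-up matches: (total cM, number of segments)
print(triage(105, 12))  # average 8.75 cM
print(triage(72, 3))    # average 24 cM
```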

Note 3: In Sep 2017, we discovered 43 kits uploaded to GEDMatch Genesis that are all labeled simply "Parsi" and appear, in their entirety, in other Parsis' match lists. We are trying to determine the source and whether a study is already in progress. More importantly, are these made-up, false, or real-person kits? These kits disappeared after a few months.

Introduction

Parsis have always been believed to be a highly endogamous population, historically and to the present day. atDNA testing is confirming that. For the Parsi DNA kits we manage, we have seen a noise floor of 1 to 1.5% in matching amount: pretty much any Parsi who tests shows that match strength to the other Parsis. Only 2nd cousins or closer will reliably exceed that noise floor, and confirmed relatives can often fall in the middle of or below this noise level. As a result, atDNA testing has not been very helpful for members of this community. Some sites that do not scrub their segments or limit their match lengths show 2-3%.

For non-endogamous Europeans, the noise floor is below 15cM, or roughly 0.2%. A key metric for segment length is 7cM: below that length, a matching segment is more likely to be a false positive than a real DNA match with a relative. This also assumes at least 500 SNPs are used to determine a segment of that length (the density of SNP testing is a crucial factor as well). We have yet to get a good metric, but it does appear Parsis have longer matching segments that are false, on the order of 20cM. So the minimum segment length needed to represent a reliable match may also have to be raised.
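
As a rough sketch of the thresholds just described, the filter below keeps only segments meeting a minimum length and SNP count. The `Segment` record and function name are hypothetical, not any testing company's format. The defaults are the general 7cM / 500 SNP figures; the 20cM value in the example is only our tentative guess for Parsi matches.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    chromosome: str
    length_cm: float
    snps: int

def reliable_segments(segments, min_cm=7.0, min_snps=500):
    """Keep segments that meet both the length and SNP-count minimums."""
    return [s for s in segments if s.length_cm >= min_cm and s.snps >= min_snps]

# Made-up segments for illustration
segments = [
    Segment("1", 8.2, 900),
    Segment("4", 22.5, 3100),
    Segment("9", 12.0, 350),   # long enough, but too few SNPs
    Segment("X", 5.1, 700),    # enough SNPs, but too short
]

general = reliable_segments(segments)                # 7cM floor
raised = reliable_segments(segments, min_cm=20.0)    # tentative Parsi floor
```

With the general floor, the chromosome 1 and 4 segments survive; with the raised floor, only the chromosome 4 segment does.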

How do we know the matches are all Parsis? While a few surnames overlap with the Indian diaspora, most Parsi surnames are quite unique to their population. See the section below on Parsi surnames. But even without the surname (we have worked on an adoption case), a Parsi tester and their Parsi matches are pretty evident when they appear.

Autosomal (and X) Segment Matching

We personally started testing on 23andMe exclusively, and immediately saw a difference there. Testers of European descent see a 0.1% or better DNA match, with just 1 or 2 segments larger than 7cM on 1 or 2 chromosomes, and can find they are distantly related. This is barely above the noise floor for the testing process. On our sister project, H600, we have been able to dig deep to find some matching with likely relatives that are 5th to 7th cousins, beyond the suspected limits. On the other hand, the Parsi community is showing a 1.5 to 2% total match strength, with tens of segments over 7cM, across 8 to 12 chromosomes (on average), for people that show no real relation that can be discovered in a genealogical time frame. The longest-segment noise floor also seems to be about 20cM (whereas the testing process and studies indicate a 7cM longest-segment floor in general). (As a guide, roughly 70cM in total matching segments is about 1%.)
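
For quick conversions between the two units, here is a minimal sketch using only the rule of thumb above (roughly 70cM per 1%). The constant is that approximation, not an exact figure from any testing company, and the function names are our own.

```python
CM_PER_PERCENT = 70.0  # rule of thumb from the text: ~70cM of matching is ~1%

def cm_to_percent(total_cm):
    """Approximate percent shared for a total matching amount in cM."""
    return total_cm / CM_PER_PERCENT

def percent_to_cm(percent):
    """Approximate total matching cM for a percent shared."""
    return percent * CM_PER_PERCENT

european_floor_pct = cm_to_percent(15)   # the 15cM European noise floor
parsi_low_cm = percent_to_cm(1.5)        # low end of the Parsi range
parsi_high_cm = percent_to_cm(2.0)       # high end of the Parsi range
```

So the 15cM European floor works out to about 0.2%, while the 1.5 to 2% seen among unrelated Parsis is roughly 105 to 140cM.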

So the question becomes: how do you interpret the Parsi community test results on Autosomal and X Chromosome SNP analysis? Is there a new noise floor to set for this community? Do we have to look ONLY for longer segments, more than 20cM in length, over multiple chromosomes? Or should we simply discount or throw out certain regions where larger matching segments tend to recur? Can any useful information be extracted from autosomal DNA testing? How has work with communities with similar historical endogamy (like Ashkenazi Jews) handled this? Studies of endogamous populations show traits such as lengthy Runs of Homozygosity (RoH), an indicator of recent and ancient mixing. This may be needed to distinguish segments shared between nearer-term relatives.

I should add that Pedigree Collapse (related parents) analysis shows minimal relatedness, at least when measured using Immanuel's y-str.org tool or GEDMatch's "Are your parents related" tool. Very few kits rise to 0.5% or higher. And there are no Full-Identical match regions, only Half-Identical. So the strong endogamy is not contributing there (which is odd in itself, as we would expect this to occur as well). But we surmise that 1-2% is a small amount, and the likelihood of this overlap occurring by chance is small.

We should likely work towards starting a yDNA and Autosomal project for the community as a whole. mtDNA is not as helpful, simply because there are not enough data points: Parsi men have always been able to marry a non-Parsi, and the family (children and spouse) are still considered Parsi. But this does not apply to Parsi women. If they marry a non-Parsi man, they are no longer accepted in the temple. A study performed on mtDNA and referenced below seems to confirm the practice that outside women are accepted in. But maybe such a study on just the Dastur surname line is possible, similar to how the Cohens were studied in the Jewish community before. Those with the Dastur surname are not allowed to marry outside the community IF they want to retain their Priestly status and thus surname.

Y Haplogroups (Patriline)

It is interesting to note that our single Parsi Priest (Dastur) tester is determined to be L-M22 from the 23andMe general test. This is the predominant Haplogroup for Indian Zoroastrian priests found in the study published in 2017 (see ref below). The general Zoroastrian population tested in India, with no priests included, fell in Haplogroup J, as did the Iranian Zoroastrians (both priests and lay). Haplogroup L is simply not found among Zoroastrians in Iran. Haplogroup L is more associated with the early human population development of the Indian Subcontinent.

Mitochondrial Haplogroups (Matriline)



Surnames

So what about Parsi surnames is so unique? As a first pass, some Parsi surnames can be explained as being locative. This is one of the three main sources of surnames indicated by the Guild of One-Name Studies (GoONS):

  • Locative : derives from the place where someone came from or lived
    • Toponymic : derived from a place name
  • Occupational or Metonymic : derives from the occupation of the bearer
  • Post Holder
As soon as you see the surnames Engineer, Doctor, Contractor, Captain, Driver and similar in a match list, you know a Parsi match is involved. Surnames were quickly adopted by Parsis when English rule started, hence the English forms. Dastur is likely an example of a Post Holder, as Dasturji means Priest.

A special note must be given for surnames ending in "wala". This is more common among the Parsis in Gujarat, the origin of their culture in India. "Wala" is simply a way of specifying the occupation of the surname holder. Unlike the English surname forms given earlier, these are in the native tongue of Gujarati or even Persian. Examples are Campwala, Khariwala, Daroowala, Todiwala, Limbuwala, Botliwala, Bottlewala, Unwala, Pitavala and so on. Mistry, Modi, and Mehta are special ones we should mention: they are cross-overs, common in the general Indian population as well.

Given names are often (ancient) Persian and have thus become used in some Islamic communities; especially those established early on in India. Feroz, Kersas, Farhad, Cyrus, Roshan, and Darius just to name a few.

References

For further information on DNA studies of the Parsi and related regions

Parsi direct


South Asian / Indian Subcontinent


Persia / Arabia



Resources


See Also