Genetic Similarity of Siblings

INTRODUCTION

Brothers and sisters tend to have similar features but are not usually identical. This is because each child gets exactly half of their father’s genes and half of their mother’s genes in a random and mostly unpredictable scrambling. It is thus possible, though extremely unlikely, that two siblings will have no inherited genes in common at all. That is, a second child may get exactly that half from each parent that the first child did not get. It is also possible they will inherit exactly the same genes. Both of these particular outcomes are extremely unlikely, and in general siblings share about half of their total genetic makeup. From fairly straightforward considerations, we can calculate the odds that two children of the same parents will have varying degrees of relatedness lying between these two extremes.

INHERITANCE

Living things have two distinct characteristics. They assimilate inanimate matter, i.e. they eat stuff to provide the energy to grow and to live. And they reproduce themselves. Children are much like their parents because their characteristics are transmitted mostly without error in a DNA (deoxyribonucleic acid) molecule found at the center of each cell. This molecule is effectively a long string of literally billions of similar molecular subunits, and it is able to make an identical copy of itself as a cell divides.

In humans, the DNA is divided into 46 separate pieces, each tightly coiled on itself. Each piece is called a chromosome and contains long strings of many different genes. These chromosomes are grouped into 23 pairs, where the two members of a pair have the same length (apart from the sex chromosomes in males, discussed under the caveats).

Human children get one chromosome from their father and one from their mother for each of their own 23 pairs. If your Dad has the pair (A_Dad, B_Dad) and your Mom has (C_Mom, D_Mom), there are roughly four possibilities for each of your own 23 chromosomal pairs, as follows:

1. A_Dad from your father and C_Mom from your mother

2. A_Dad from your father and D_Mom from your mother

3. B_Dad from your father and C_Mom from your mother

4. B_Dad from your father and D_Mom from your mother

Or we could say instead that there are two possibilities for each of your 46 chromosomes. Each of the 23 that comes from your father has two possibilities and the same for the 23 from your mother. It is slightly more complicated, however, because the chromosome donated to the child is a recombination of the genes from both of the chromosomes in the parent.
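
The four combinations listed above can be enumerated mechanically. The following is only a minimal sketch: it reuses the labels A_Dad, B_Dad, C_Mom and D_Mom from the list, treats each chromosome as passed on whole (so it ignores the recombination just mentioned), and the variable names are illustrative assumptions.

```python
from itertools import product

# Labels taken from the list above; one pair per parent.
dad_pair = ("A_Dad", "B_Dad")
mom_pair = ("C_Mom", "D_Mom")

# Every way to take one chromosome from the father and one from the mother.
combinations = list(product(dad_pair, mom_pair))
for number, (from_dad, from_mom) in enumerate(combinations, start=1):
    print(f"{number}. {from_dad} from your father and {from_mom} from your mother")

# Equivalent views of the same count over all 23 pairs (no recombination):
# four choices per pair, or two choices for each of the 46 chromosomes.
per_pair_view = len(combinations) ** 23
per_chromosome_view = 2 ** 46
print(per_pair_view == per_chromosome_view)
```

Both ways of counting give the same total, which is why the choice between "four possibilities per pair" and "two possibilities per chromosome" is only a matter of bookkeeping.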

ODDS OF DIFFERENT DEGREES OF RELATEDNESS

The inheritance described above means that there are at least 4²³ = 2⁴⁶ ≈ 70.37 trillion total possibilities for your individual genetic makeup. Thus each of us is truly a unique individual, probably never before seen and never to be repeated. In any event, if each of a sibling’s 46 chromosomes independently has an even chance of matching, the odds of two siblings sharing “n” chromosomes out of the total of 46 (in 23 pairs) are given by the binomial formula

P_n = C(46, n) / 2⁴⁶, where C(46, n) = 46! / (n! (46 − n)!)

So we can tabulate these odds, as follows

Probability of Siblings Sharing "n" Chromosomes (out of 46 total)
Relatedness	n	Probability, P_n	One Chance in	n	Relatedness
0.0%	0	0.0000%	70,368,744,177,664	46	100.0%
2.2%	1	0.0000%	1,529,755,308,210	45	97.8%
4.3%	2	0.0000%	67,989,124,809	44	95.7%
6.5%	3	0.0000%	4,635,622,146	43	93.5%
8.7%	4	0.0000%	431,220,665	42	91.3%
10.9%	5	0.0000%	51,335,793	41	89.1%
13.0%	6	0.0000%	7,512,555	40	87.0%
15.2%	7	0.0001%	1,314,697	39	84.8%
17.4%	8	0.0004%	269,681	38	82.6%
19.6%	9	0.002%	63,872	37	80.4%
21.7%	10	0.006%	17,263	36	78.3%
23.9%	11	0.019%	5,275	35	76.1%
26.1%	12	0.055%	1,808	34	73.9%
28.3%	13	0.145%	691.5	33	71.7%
30.4%	14	0.341%	293.4	32	69.6%
32.6%	15	0.727%	137.5	31	67.4%
34.8%	16	1.41%	70.97	30	65.2%
37.0%	17	2.49%	40.22	29	63.0%
39.1%	18	4.01%	24.96	28	60.9%
41.3%	19	5.90%	16.94	27	58.7%
43.5%	20	7.97%	12.55	26	56.5%
45.7%	21	9.87%	10.13	25	54.3%
47.8%	22	11.21%	8.918	24	52.2%
50.0%	23	11.70%	8.547	23	50.0%

Please recall that each parent contributes 23 chromosomes to their child and everyone has a total of 46. So the odds of two siblings sharing no chromosomes at all are the same as the odds of sharing all 46, which is why each row of the table serves for both n and 46 − n. And, as is required, the sum of all the probabilities is one.
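
The entries in the table can be reproduced with a few lines of Python. This is a sketch under one stated assumption: each of the 46 chromosomes independently has a one-half chance of matching between siblings, so that P_n = C(46, n) / 2⁴⁶. The function name share_probability is illustrative only.

```python
from math import comb

TOTAL_CHROMOSOMES = 46

def share_probability(n, total=TOTAL_CHROMOSOMES):
    """Probability that two siblings share exactly n of `total` chromosomes,
    assuming each chromosome matches independently with probability 1/2."""
    return comb(total, n) / 2 ** total

probabilities = [share_probability(n) for n in range(TOTAL_CHROMOSOMES + 1)]

# Print a few rows in the same layout as the table above.
for n in (0, 1, 22, 23):
    p = probabilities[n]
    print(f"n={n:2d}  relatedness={n / TOTAL_CHROMOSOMES:6.1%}  "
          f"P_n={p:.4%}  one chance in {1 / p:,.3f}")

# The distribution is symmetric and sums to one (up to floating-point rounding).
print(probabilities[0] == probabilities[46], sum(probabilities))
```

Changing the tuple in the loop to range(24) prints every row of the table.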

And we can graph the probability as follows

As expected, the binomial coefficients begin to resemble a Gaussian distribution. Siblings are most likely to share 23, or half, of their chromosomes. But what is interesting is the relatively low probability of any particular match. The probability of matching exactly half is only about 11.7%. But the probability of sharing between 21/46 and 25/46 is 53.9%, from 1/3 to 2/3 is about 97.4%, and from ¼ to ¾ is more than 99.9%, as seen from the following table.

Range of Shared Chromosomes	Minimum Relatedness	Maximum Relatedness	Probability of Match In Range
23	50.0%	50.0%	11.70%
23 ± 1	47.8%	52.2%	34.13%
23 ± 2	45.7%	54.3%	53.86%
23 ± 3	43.5%	56.5%	69.80%
23 ± 4	41.3%	58.7%	81.61%
23 ± 5	39.1%	60.9%	89.62%
23 ± 6	37.0%	63.0%	94.59%
23 ± 7	34.8%	65.2%	97.41%
23 ± 8	32.6%	67.4%	98.86%
23 ± 9	30.4%	69.6%	99.55%
23 ± 10	28.3%	71.7%	99.84%
23 ± 11	26.1%	73.9%	99.95%
23 ± 12	23.9%	76.1%	99.98%
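
The range probabilities above are simply sums of the individual binomial terms on either side of 23. A short sketch, again assuming each of the 46 chromosomes matches independently with probability one half (the name range_probability is hypothetical):

```python
from math import comb

def range_probability(k, total=46):
    """Probability that the number of shared chromosomes lies within
    k of the midpoint (23 when total is 46)."""
    center = total // 2
    favorable = sum(comb(total, n) for n in range(center - k, center + k + 1))
    return favorable / 2 ** total

for k in (0, 2, 7, 11):
    low, high = 23 - k, 23 + k
    print(f"23 ± {k:2d}  ({low / 46:.1%} to {high / 46:.1%}):  {range_probability(k):.2%}")
```

The sums are done with exact integers and divided only once at the end, which keeps rounding error out of the tails of the distribution.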

CHARACTERISTICS OF CHROMOSOMES

Every cell in any given individual contains identical DNA. What makes cells different, e.g. a brain cell rather than a fingernail cell, is which genes in that cell have been turned on to code for different proteins. The overall structure has several levels of hierarchy.

At the most basic level, the DNA molecule is made up of four letters (or molecular subunits), A, C, G, and T, corresponding to the nucleotide bases adenine, cytosine, guanine, and thymine. The letters are arranged in a long double helix in which each letter on one strand pairs with a fixed partner on the other, giving the four possible pairings A-T, T-A, C-G, and G-C. Groups of these letters are arranged into genes which, when activated, produce very different proteins. Genes, together with sections that do not code for proteins, are arranged into 46 total chromosomes grouped into 23 pairs of matching length.

What makes an individual gene is a unique sequence of three-letter words called codons. A start codon (usually ATG) begins the formation of a protein from some 20 (+2) amino acid subunits. Each subsequent three-letter word attaches a particular amino acid to the forming protein until one of three stop codons (TAA, TAG, or TGA) ends the process, which is called translation. The protein then folds into a unique shape forming highly specific chemical binding sites (or folds or keys) that enable different chemical processes within a cell.
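
The reading of codons described above can be sketched in a few lines. This is a toy illustration, not a full implementation: the table holds only five of the 64 codons of the standard genetic code, the DNA string is invented for the example, and the sequence is read directly from the DNA letters (with T where the cell's messenger copy would have U).

```python
# Deliberately tiny subset of the standard genetic code (assumption: enough
# for the made-up sequence below; a real table has 64 entries).
CODON_TABLE = {
    "ATG": "Met",  # methionine, also the usual start codon
    "GCC": "Ala",
    "AAA": "Lys",
    "TTT": "Phe",
    "GGC": "Gly",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def translate(dna):
    """Read three letters at a time from the first ATG until a stop codon."""
    start = dna.find("ATG")
    if start == -1:
        return []  # no start codon, so nothing is made
    protein = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        # "???" marks codons missing from the toy table above.
        protein.append(CODON_TABLE.get(codon, "???"))
    return protein

# Hypothetical sequence: two stray letters, a start, four codons, a stop.
print(translate("CCATGGCCAAATTTGGCTAAGG"))  # ['Met', 'Ala', 'Lys', 'Phe', 'Gly']
```

Shifting the starting point by one letter changes every codon that follows, which is why the position of the start codon matters so much.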

In any event, although our understanding is evolving rapidly, our chromosomes are currently thought to have the following characteristics [https://en.wikipedia.org/wiki/Human_genome]

Chromosomal Pair	Length (mm)	Number of Letters (4 possibilities)	Number of Genes
1	85	248,956,422	2058
2	83	242,193,529	1309
3	67	198,295,559	1078
4	65	190,214,555	752
5	62	181,538,259	876
6	58	170,805,979	1048
7	54	159,345,973	989
8	50	145,138,636	677
9	48	138,394,717	786
10	46	133,797,422	733
11	46	135,086,622	1298
12	45	133,275,309	1034
13	39	114,364,328	327
14	36	107,043,718	830
15	35	101,991,189	613
16	31	90,338,345	873
17	28	83,257,441	1197
18	27	80,373,285	270
19	20	58,617,616	1472
20	21	64,444,167	544
21	16	46,709,983	234
22	17	50,818,468	488
23-X	53	156,040,895	842
23-Y	20	57,227,415	71
	TOTALS	3,088,269,832	20,399

Note that most of the DNA consists of “nonsense” stretches without genes that nevertheless serve as a rich breeding ground for unfettered experimentation. It is in these regions where new protein-coding sections can spontaneously arise as random chance dictates. Most such mutations are harmful, some fatal, but occasionally such scrambling produces an entirely new advantage. It is also possible that the long “nonsense” stretches can loop around to catalyze or inhibit genes on nearby sections of the DNA strand and so play a more important role than would otherwise be the case.

CAVEATS

In the above discussion, there are several considerations that bear on these statistics but which have been neglected. These include the unequal numbers of genes on chromosomes, their relative contribution to one’s physical makeup, and the different inheritance of brothers as compared to sisters.

The number of genes on each chromosome varies widely, from 234 on chromosome 21 to 2,058 on chromosome 1 among the non-sex chromosomes, and is only roughly proportional to length as given in the table above. And some genes are important to your physical appearance and general chemistry and some are less so, as described in more detail in the reference pages.
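
How closely the number of genes tracks chromosome length can be checked directly from the table of chromosome characteristics. The sketch below copies four rows from that table (chromosomes 1, 18, 19 and 21) and computes genes per million letters; the choice of rows and the function name are illustrative assumptions.

```python
# (number of letters, number of genes) copied from the table for a few rows.
chromosomes = {
    "1": (248_956_422, 2058),
    "18": (80_373_285, 270),
    "19": (58_617_616, 1472),
    "21": (46_709_983, 234),
}

def genes_per_million_letters(letters, genes):
    """Gene density: how many genes sit on each million DNA letters."""
    return genes / (letters / 1_000_000)

density = {name: genes_per_million_letters(*row) for name, row in chromosomes.items()}
for name, value in density.items():
    print(f"chromosome {name}: {value:.1f} genes per million letters")
```

On these four rows the density differs several-fold between chromosomes 18 and 19, so weighting relatedness by gene count rather than by chromosome count would shift the individual odds somewhat.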

Also omitted so far is the 23rd chromosome pair, which determines sex. Females have two “X” chromosomes and males have one “X” and one “Y”. Thus daughters get one of the two “X” chromosomes from their mother and the single “X” from their father, while sons get one of the two “X” chromosomes from their mother and the single “Y” from their father. In this case, each sex has only two, not four, possibilities for the 23rd pair. But then there are also two possibilities determining one’s sex.

Fortunately these considerations tend to average out in their effect on the statistics.

Besides the DNA at the center of each cell, one’s genetic makeup also includes a small packet of DNA in separate structures called mitochondria, which produce energy for the cell. These powerhouses are already in the mother’s egg before fertilization, and their DNA consists of 16,569 letters arranged into only 37 genes. Fathers thus contribute nothing to these secondary structures.

CONCLUSIONS

An interesting consequence of inheritance in combinations of only 46 chromosomes is that there may be ancestors with whom you share no genetic connection whatever. This is somewhat shocking, as everyone in your family tree was the parent of a direct ancestor without whom you would not exist. But somehow, in the shuffling of chromosomes, some of them were edited out of your particular lineage.
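
A rough count shows how quickly this happens. The sketch below rests on simplifying assumptions that are not exact biology: chromosomes are passed on whole (no recombination), so each of your 46 chromosomes traces back to exactly one ancestor in any given generation, and no ancestor appears twice in the family tree.

```python
CHROMOSOMES = 46

def ancestors(generations_back):
    """Number of ancestors in a given generation: 2 parents, 4 grandparents, ..."""
    return 2 ** generations_back

# Find the first generation with more ancestors than you have chromosomes.
generation = 1
while ancestors(generation) <= CHROMOSOMES:
    generation += 1

# At most 46 of those ancestors can each have supplied a whole chromosome.
left_out = ancestors(generation) - CHROMOSOMES
print(generation, ancestors(generation), left_out)
```

Under these assumptions the loop stops at the sixth generation back, where there are 64 ancestors, so at least 18 of them contributed no whole chromosome.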

Also, the idea of a single “selfish gene” has been overstated because inheritance is not so much in a single gene as in long strings of them. Our many genes act in concert, some inhibiting others, some acting as catalysts, and some slightly adding to or subtracting from some characteristic in aggregate. Despite our yearning for simple explanations, the situation is generally more complex than such a single-minded principle allows. The selfishness of individual genes is not so much right or wrong as it is meaningless.