Genetic Similarity of Siblings

 

INTRODUCTION

 

Brothers and sisters tend to have similar features but are not usually identical. This is because each child gets exactly half of their father’s genes and half of their mother’s genes in a random and mostly unpredictable scrambling. It is thus just barely possible that two siblings will have no genes in common at all. That is, a second child may get exactly the half from each parent that the first child did not get. It is also possible that they will inherit exactly the same genes. Both of these particular outcomes are extremely unlikely, and in general siblings share about half of their total genetic makeup. From fairly straightforward considerations, we can calculate the odds that two children of the same parents will have any given degree of relatedness between these two extremes.

 

INHERITANCE

 

Living things have two distinct characteristics.  They assimilate inanimate matter, i.e. they eat stuff to provide the energy to grow and to live.   And they reproduce themselves.  Children are much like their parents because their characteristics are transmitted mostly without error in a DNA (deoxyribonucleic acid) molecule found at the center of each cell.  This molecule is effectively a long string of literally billions of similar atomic subunits which is able to make an identical copy of itself as a cell divides.

 

In humans, the DNA is divided into 46 distinct sections, each a separate molecule. Each section is called a chromosome and contains long strings of many different genes. These chromosomes form 23 pairs, and the two members of a pair have essentially the same length (the X and Y sex chromosomes of a male being the exception).

 

Human children get one chromosome from their father and one from their mother for each of their own 23 pairs. If your Dad has the pair (A_Dad, B_Dad) and your Mom has (C_Mom, D_Mom), there are four possibilities for each of your own 23 chromosomal pairs, as follows

 

1. A_Dad from your father and C_Mom from your mother

2. A_Dad from your father and D_Mom from your mother

3. B_Dad from your father and C_Mom from your mother

4. B_Dad from your father and D_Mom from your mother

 

Or we could say instead that there are two possibilities for each of your 46 chromosomes.  Each of the 23 that comes from your father has two possibilities and the same for the 23 from your mother.
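
The two ways of counting can be checked with a few lines of Python. This is only a sketch: the labels are the placeholder names used in the list above, and the variable names are invented for illustration.

```python
from itertools import product

# Labels follow the text; they are placeholders, not real chromosome names.
dad_pair = ("A_Dad", "B_Dad")
mom_pair = ("C_Mom", "D_Mom")

# One chromosome from each parent gives the combinations for a single pair.
combos_per_pair = list(product(dad_pair, mom_pair))
for number, (from_dad, from_mom) in enumerate(combos_per_pair, start=1):
    print(f"{number}. {from_dad} from your father and {from_mom} from your mother")

# Four choices for each of 23 pairs is the same count as
# two choices for each of 46 chromosomes.
total_by_pairs = len(combos_per_pair) ** 23
total_by_chromosomes = 2 ** 46
print(total_by_pairs == total_by_chromosomes)
print(f"{total_by_pairs:,}")
```

Counting by pairs and counting by single chromosomes give the same total, which is why either description can be used.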

 

ODDS OF DIFFERENT DEGREES OF RELATEDNESS

 

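The inheritance described above means that there are $4^{23} = 2^{46} \approx 70.37$ trillion total possibilities for your individual genetic makeup. Thus each of us is truly a unique individual, probably never before seen and never to be repeated. In any event, each of a sibling's 46 chromosomes independently has one chance in two of matching yours, so the odds of two siblings sharing “n” chromosomes out of the total of 46 (in 23 pairs) follow the binomial distribution and are given as follows

 

$$P_n = \binom{46}{n}\left(\frac{1}{2}\right)^{46} = \frac{46!}{n!\,(46-n)!\;2^{46}}$$

 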

So we can tabulate these odds, as follows

 

Probability of Siblings Sharing "n" Chromosomes (out of 46 total)

| Relatedness | n | Probability, Pn | One Chance in | n | Relatedness |
|---|---|---|---|---|---|
| 0.0% | 0 | 0.0000% | 70,368,744,177,664 | 46 | 100.0% |
| 2.2% | 1 | 0.0000% | 1,529,755,308,210 | 45 | 97.8% |
| 4.3% | 2 | 0.0000% | 67,989,124,809 | 44 | 95.7% |
| 6.5% | 3 | 0.0000% | 4,635,622,146 | 43 | 93.5% |
| 8.7% | 4 | 0.0000% | 431,220,665 | 42 | 91.3% |
| 10.9% | 5 | 0.0000% | 51,335,793 | 41 | 89.1% |
| 13.0% | 6 | 0.0000% | 7,512,555 | 40 | 87.0% |
| 15.2% | 7 | 0.0001% | 1,314,697 | 39 | 84.8% |
| 17.4% | 8 | 0.0004% | 269,681 | 38 | 82.6% |
| 19.6% | 9 | 0.002% | 63,872 | 37 | 80.4% |
| 21.7% | 10 | 0.006% | 17,263 | 36 | 78.3% |
| 23.9% | 11 | 0.019% | 5,275 | 35 | 76.1% |
| 26.1% | 12 | 0.055% | 1,808 | 34 | 73.9% |
| 28.3% | 13 | 0.145% | 691.5 | 33 | 71.7% |
| 30.4% | 14 | 0.341% | 293.4 | 32 | 69.6% |
| 32.6% | 15 | 0.727% | 137.5 | 31 | 67.4% |
| 34.8% | 16 | 1.41% | 70.97 | 30 | 65.2% |
| 37.0% | 17 | 2.49% | 40.22 | 29 | 63.0% |
| 39.1% | 18 | 4.01% | 24.96 | 28 | 60.9% |
| 41.3% | 19 | 5.90% | 16.94 | 27 | 58.7% |
| 43.5% | 20 | 7.97% | 12.55 | 26 | 56.5% |
| 45.7% | 21 | 9.87% | 10.13 | 25 | 54.3% |
| 47.8% | 22 | 11.21% | 8.918 | 24 | 52.2% |
| 50.0% | 23 | 11.70% | 8.547 | 23 | 50.0% |

 

 

Please recall that each parent contributes 23 chromosomes to their child and everyone has a total of 46. So the odds of two siblings sharing no chromosomes at all are the same as the odds of sharing all 46, and each row of the table applies equally to n and to 46 − n. And, as is required, the sum of all the probabilities is one.
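
The tabulated odds can be reproduced with a short Python sketch. It assumes, as the table does, that each of the 46 chromosomes matches independently with probability one half, so the probabilities are binomial coefficients divided by 2^46. The function and variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

TOTAL = 46  # chromosomes per person, as in the text


def share_probability(n: int) -> Fraction:
    """Exact probability that two siblings share exactly n of 46 chromosomes."""
    return Fraction(comb(TOTAL, n), 2 ** TOTAL)


probabilities = [share_probability(n) for n in range(TOTAL + 1)]

print(" n  relatedness  probability         one chance in")
for n in range(TOTAL // 2 + 1):
    p = probabilities[n]
    print(f"{n:2d}  {n / TOTAL:11.1%}  {float(p):11.4%}  {float(1 / p):24,.1f}")

# Sharing n chromosomes is exactly as likely as sharing 46 - n.
symmetric = all(probabilities[n] == probabilities[TOTAL - n]
                for n in range(TOTAL + 1))
print("symmetric:", symmetric)

# Exact fractions avoid rounding error when checking the total.
print("sum of all probabilities:", sum(probabilities))
```

Exact fractions are used so that the symmetry and the sum can be checked without floating-point rounding.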

 

 

And we can graph the probability against n, as follows

 

[Figure: probability of siblings sharing n chromosomes, for n from 0 to 46]
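
The shape of this distribution can also be checked by direct simulation. The sketch below assumes each child receives one of the two copies of every parental chromosome at random, represents a child as a random 46-bit number, and prints a rough text histogram. The trial count and the seed are arbitrary choices.

```python
import random
from collections import Counter

TOTAL = 46        # chromosomes per child, as in the text
TRIALS = 100_000  # number of simulated sibling pairs (arbitrary choice)


def shared_chromosomes(rng: random.Random) -> int:
    """Simulate one sibling pair and count the chromosomes they share."""
    # Bit i records which of a parent's two copies a child received
    # for chromosome i, so each child is a random 46-bit number.
    first_child = rng.getrandbits(TOTAL)
    second_child = rng.getrandbits(TOTAL)
    differing = bin(first_child ^ second_child).count("1")
    return TOTAL - differing


rng = random.Random(2024)  # fixed seed so repeated runs agree
counts = Counter(shared_chromosomes(rng) for _ in range(TRIALS))

# One row per value of n; the bar length is proportional to the frequency.
for n in range(TOTAL + 1):
    fraction = counts[n] / TRIALS
    bar = "#" * round(fraction * 400)
    print(f"{n:2d} {fraction:7.3%} {bar}")
```

The simulated frequencies should cluster around n = 23 and fall away quickly on either side.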

 

 

 

As expected, the binomial coefficients begin to resemble a Gaussian distribution. Siblings are most likely to share 23, or half, of their chromosomes. But what is interesting is the relatively low probability of any particular match. The probability of matching exactly half is only about 11.7%. But the probability of sharing between 21/46 and 25/46 is 53.9%, from 1/3 to 2/3 is about 97.4%, and from ¼ to ¾ is more than 99.9%, as seen from the following table.

 

| Range of Shared Chromosomes | Minimum Relatedness | Maximum Relatedness | Probability of Match In Range |
|---|---|---|---|
| 23 | 50.0% | 50.0% | 11.70% |
| 23 ± 1 | 47.8% | 52.2% | 34.13% |
| 23 ± 2 | 45.7% | 54.3% | 53.86% |
| 23 ± 3 | 43.5% | 56.5% | 69.80% |
| 23 ± 4 | 41.3% | 58.7% | 81.61% |
| 23 ± 5 | 39.1% | 60.9% | 89.62% |
| 23 ± 6 | 37.0% | 63.0% | 94.59% |
| 23 ± 7 | 34.8% | 65.2% | 97.41% |
| 23 ± 8 | 32.6% | 67.4% | 98.86% |
| 23 ± 9 | 30.4% | 69.6% | 99.55% |
| 23 ± 10 | 28.3% | 71.7% | 99.84% |
| 23 ± 11 | 26.1% | 73.9% | 99.95% |
| 23 ± 12 | 23.9% | 76.1% | 99.98% |
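
The range probabilities in this table are sums of the individual binomial probabilities around n = 23. A minimal sketch, with an illustrative function name, under the same assumption of independent one-in-two matches:

```python
from math import comb

TOTAL = 46
HALF = TOTAL // 2


def probability_within(k: int) -> float:
    """Probability that siblings share between 23 - k and 23 + k chromosomes."""
    favourable = sum(comb(TOTAL, n) for n in range(HALF - k, HALF + k + 1))
    return favourable / 2 ** TOTAL


for k in range(13):
    low, high = HALF - k, HALF + k
    print(f"23 ± {k:2d}   {low / TOTAL:6.1%} to {high / TOTAL:6.1%}   "
          f"{probability_within(k):7.2%}")
```

For example, `probability_within(7)` covers n from 16 to 30, which is the range from about 1/3 to 2/3 quoted in the text.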

 

 

 

CHARACTERISTICS OF CHROMOSOMES

 

Every cell in any given individual contains identical DNA, but what makes one cell different from another, e.g. a brain cell rather than a fingernail cell, is the set of genes in that cell which have been turned on to code for different proteins. The overall structure has several levels of hierarchy.

 

At the most basic level, the DNA molecule is made up of four letters (or atomic subunits), the nucleotide bases adenine (A), cytosine (C), guanine (G), and thymine (T), arranged in a long double-helix chain in which each letter pairs with its complement as A-T, C-G, T-A, or G-C. Groups of these letters are arranged into genes which, when activated, produce very different proteins. Genes and sections that do not code for proteins are arranged into 46 total chromosomes grouped into 23 pairs of matching length (apart from the X and Y).

 

What makes an individual gene is a unique combination of three-letter words called codons. Any one of several start codons begins the formation of a protein from some 20 (+2) amino acid subunits. Subsequent three-letter sections attach different amino acids to the forming protein until one of three stop codons ends translation. The protein then folds into a unique shape, forming highly specific chemical binding sites (or folds or keys) that enable different chemical processes within a cell.

 

In any event, although our understanding is evolving rapidly, our chromosomes are thought to have the following characteristics [https://en.wikipedia.org/wiki/Human_genome]

 

| Chromosomal Pair | Length (mm) | Number of Letters (4 possibilities) | Number of Genes |
|---|---|---|---|
| 1 | 85 | 248,956,422 | 2058 |
| 2 | 83 | 242,193,529 | 1309 |
| 3 | 67 | 198,295,559 | 1078 |
| 4 | 65 | 190,214,555 | 752 |
| 5 | 62 | 181,538,259 | 876 |
| 6 | 58 | 170,805,979 | 1048 |
| 7 | 54 | 159,345,973 | 989 |
| 8 | 50 | 145,138,636 | 677 |
| 9 | 48 | 138,394,717 | 786 |
| 10 | 46 | 133,797,422 | 733 |
| 11 | 46 | 135,086,622 | 1298 |
| 12 | 45 | 133,275,309 | 1034 |
| 13 | 39 | 114,364,328 | 327 |
| 14 | 36 | 107,043,718 | 830 |
| 15 | 35 | 101,991,189 | 613 |
| 16 | 31 | 90,338,345 | 873 |
| 17 | 28 | 83,257,441 | 1197 |
| 18 | 27 | 80,373,285 | 270 |
| 19 | 20 | 58,617,616 | 1472 |
| 20 | 21 | 64,444,167 | 544 |
| 21 | 16 | 46,709,983 | 234 |
| 22 | 17 | 50,818,468 | 488 |
| 23-X | 53 | 156,040,895 | 842 |
| 23-Y | 20 | 57,227,415 | 71 |
| TOTALS | | 3,088,269,832 | 20,399 |

 

Note that most of the DNA consists of “nonsense” stretches without genes, which nevertheless serve as a rich breeding ground for unfettered experimentation. It is in these regions that new protein-coding sections can spontaneously arise as random chance dictates. Most such mutations are harmful, some fatal, but occasionally such scrambling produces an entirely new advantage. It is also possible that the long “nonsense” stretches loop around to catalyze or inhibit genes on nearby sections of the DNA strand and so play a more important role than they otherwise would.

 

CAVEATS

 

In the above discussion, there are several considerations that bear on these statistics but which have been neglected.   These include the unequal numbers of genes on chromosomes, their relative contribution to one’s physical makeup, and the different inheritance of brothers as compared to sisters.

 

The number of genes on each chromosome varies widely, from a few hundred to about two thousand among the first 22 pairs, and is only roughly proportional to length, as given in the table above. And some genes are important to your physical appearance and general chemistry while some are less so, as described in more detail in the reference pages.
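
One way to see how closely gene count tracks length is to compute the number of genes per million letters. The sketch below copies a handful of rows from the table above; the choice of rows and the helper name are illustrative only.

```python
# (letters, genes) copied from the table above for a few chromosomes.
chromosomes = {
    "1": (248_956_422, 2058),
    "13": (114_364_328, 327),
    "18": (80_373_285, 270),
    "19": (58_617_616, 1472),
    "21": (46_709_983, 234),
    "X": (156_040_895, 842),
    "Y": (57_227_415, 71),
}


def genes_per_million_letters(letters: int, genes: int) -> float:
    """Gene density, expressed per million DNA letters."""
    return genes / (letters / 1_000_000)


densities = {name: genes_per_million_letters(*row)
             for name, row in chromosomes.items()}

# Print from the sparsest chromosome to the densest.
for name, density in sorted(densities.items(), key=lambda item: item[1]):
    print(f"chromosome {name:>2}: {density:5.1f} genes per million letters")
```

If gene count were strictly proportional to length, every chromosome would show the same density.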

 

Also omitted so far is the special behavior of the 23rd chromosome pair, which determines sex. Females have two “X” chromosomes and males have one “X” and one “Y”. Thus daughters get one of the two “X” chromosomes from their mother and the single “X” from their father. And sons get one of the two “X” chromosomes from their mother and the single “Y” from their father. In this case, each sex has only two, not four, possibilities for the 23rd pair. But then there are also two possibilities determining one’s sex.

 

Fortunately these considerations tend to average out in their effect on the statistics. 

 

Besides the DNA at the center of each cell, one’s genetic makeup also includes a small packet of DNA in separate structures called mitochondria, which produce energy for the cell. These powerhouses are present in the mother’s egg before fertilization, and their DNA consists of 16,569 letters arranged into only 37 genes. Thus fathers contribute nothing to these secondary structures.

 

CONCLUSIONS

 

An interesting consequence of inheritance in combinations of only 46 chromosomes is that there must be ancestors with whom you share no genetic connection whatever. For instance, if you go back six generations, you have 64 ancestors, but at most 46 of them could have contributed a whole chromosome to you. This is somewhat shocking, as everyone in your family tree is a direct ancestor without whom you would not exist. But somehow in the shuffling of chromosomes, some of them were edited out of your particular lineage.
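
The arithmetic behind this observation can be sketched in a few lines of Python. It assumes whole-chromosome inheritance as described in this article and that all ancestors in a generation are distinct people; the names used are illustrative.

```python
CHROMOSOMES = 46  # whole chromosomes inherited, as in the text


def ancestors(generations_back: int) -> int:
    """Number of ancestors at a given generation, assuming all are distinct."""
    return 2 ** generations_back


for g in range(1, 11):
    total = ancestors(g)
    # At most 46 distinct ancestors can each have supplied a whole chromosome.
    contributors = min(total, CHROMOSOMES)
    print(f"{g:2d} generations back: {total:5d} ancestors, "
          f"at most {contributors} contributors, "
          f"at least {total - contributors} with none")

# First generation in which ancestors outnumber chromosomes.
first_excess = next(g for g in range(1, 64) if ancestors(g) > CHROMOSOMES)
print("first generation with more ancestors than chromosomes:", first_excess)
```

Six generations back is the first point at which the ancestors outnumber the chromosomes, and the gap doubles with every further generation.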

 

Since chromosomes are nearly perfect copies in each new generation, barring mutations, you can track your ancestry quite a long way into the past, but only by 46 different paths.

 

Also, the idea of a single “selfish gene” has been overstated because inheritance is not so much in a single gene as in long strings of them. Our many genes act in concert, some inhibiting others, some acting as catalysts, and some slightly adding to or subtracting from some characteristic in aggregate. Despite our yearning for simple explanations, the situation is generally more complex than such a single-minded principle allows. The selfishness of individual genes is not so much right or wrong as it is meaningless.