How many rare diseases are there and why is that important?

  • For scientists to better understand diseases, it is important to categorise them appropriately and understand their diversity.
  • There are a vast number of rare diseases, but we lack appropriate language to describe them – making it challenging to accurately diagnose, understand, and communicate information about them.
  • Professor CI Edvard Smith and Dr Daniel W Hagey at Karolinska Institutet, Sweden, are establishing a new framework to understand rare diseases by introducing ideas about how to calculate and describe their rarity.
  • Their work sets a goal for scientists: to develop treatments tailored to each individual, an approach known as precision medicine.

Diseases are defined as rare when they affect only a small percentage of the population. However, since there are far more rare diseases than common ones, collectively they are a major cause of human illness, accounting for 10% of all disease. Previous estimates suggest that there are approximately 10,000 rare diseases, a number which Professor CI Edvard Smith and Dr Daniel Hagey at Karolinska Institutet, Sweden, believe to be a grave underestimate. The researchers reviewed the existing literature and devised a thought experiment that can be used to better understand disease, with the aim of supporting more accurate diagnoses and individualised treatments.

The idea of hyper-rare

In Europe, a disease is defined as rare if it affects fewer than one in 2,000 people. Researchers sometimes use the term ultra-rare disease to describe conditions that affect fewer than one in 50,000 people. These diseases are already extremely rare, but could there be others that appear so infrequently that at some points in time no one on the planet is affected by them? Although it might seem an insignificant possibility, addressing such unlikely diseases can improve precision or personalised medicine, a branch of medicine that involves tailoring treatments to patients based on their individual characteristics. For this reason, Smith and Hagey have introduced the term hyper-rare disease to describe disorders that affect fewer than one in 100 million individuals. This paradigm shift in thinking matters for precision medicine because it widens the scope for diagnosis and, ultimately, treatment.
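The three thresholds above can be read as a simple classification rule. The following is a minimal sketch in Python, not a tool from Smith and Hagey's paper: the function name `rarity_category` and the decision to treat a prevalence sitting exactly on a threshold as belonging to the less rare category are assumptions made for illustration.

```python
from fractions import Fraction

# Prevalence thresholds quoted in the article, as exact fractions of a population.
RARE_THRESHOLD = Fraction(1, 2_000)
ULTRA_RARE_THRESHOLD = Fraction(1, 50_000)
HYPER_RARE_THRESHOLD = Fraction(1, 100_000_000)


def rarity_category(cases: int, population: int) -> str:
    """Label a disease by prevalence (hypothetical helper, not from the paper)."""
    if population <= 0 or cases < 0 or cases > population:
        raise ValueError("need population > 0 and 0 <= cases <= population")
    prevalence = Fraction(cases, population)
    # Test the strictest threshold first. Each test is "fewer than", so a
    # prevalence exactly on a threshold falls into the less rare category.
    if prevalence < HYPER_RARE_THRESHOLD:
        return "hyper-rare"
    if prevalence < ULTRA_RARE_THRESHOLD:
        return "ultra-rare"
    if prevalence < RARE_THRESHOLD:
        return "rare"
    return "not rare"
```

Taking a world population of roughly 8 billion (an approximate figure that is not from the article), `rarity_category(79, 8_000_000_000)` returns `"hyper-rare"`: under this reading, a hyper-rare disease affects fewer than about 80 people worldwide at any one time.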

What is a disease after all?

For scientists to better understand diseases, it is important to categorise them appropriately. Since this is not a straightforward procedure, it is useful to employ the concept of phenotype, or characteristic. A phenotype describes the observable signs and symptoms of a disease that result from the combination of a person’s DNA and environmental factors. This sometimes means that different combinations of genes and environmental conditions can lead to the same presentation. Conversely, the same DNA variant might lead to different diseases, or sometimes no disease at all, depending on external (environmental) factors. It is therefore important for scientists to tell apart diseases that, although they have similar causes, might respond completely differently to treatment.

The quantity of toxins and germs

As well as identifying the environmental factors that cause disease, researchers often have to quantify them. The amount of a toxin a person is exposed to, or the number of bacteria that have entered their bloodstream, is critical to whether the disease appears. Other factors can also affect the end result, for example the patient’s susceptibility to a disease or a doctor’s ability to suspect an uncommon environmental factor as the cause.

Blame the genes

The total number of human genes exceeds 24,000, and all of these genes have multiple variants that can influence disease. The simplest genetic cause of disease is a single nucleotide variation, so we can only start to imagine the number of possible rare diseases caused by one gene (monogenic disease). On top of this, there are rare diseases that are caused by more than one variant gene (polygenic), which can make them very difficult to suspect and diagnose. A combination of two genes responsible for two diseases is also possible, although extremely rare. These situations are even more challenging to identify, since the resulting disease can present as the sum of the symptoms of the two distinct diseases, have an entirely different presentation, or not present at all.

What makes rare genetic diseases more complex is that they are caused not only by inherited genes, but also by genuinely new changes to a person’s DNA, called mutations. These variations happen at all stages of development: during the formation of the egg or sperm, during the proliferation of the first foetal cells, in stem cells (unspecialised cells from which all other cells are generated) situated in different parts of our bodies, or in more mature cells. This results in mosaicism, where different cells of a single organism have a different genetic makeup. Mosaicism does not usually result in disease; however, it can explain puzzling differences in phenotype between identical twins. Smith explains: ‘Our understanding of the number of possible diseases and their interactions has expanded rapidly. Importantly, it influences the way in which we should all think when confronted with the symptoms of disease.’

Calculating the numbers

To calculate an accurate estimate of the total number of diseases, Smith and Hagey suggest combining the number of disease-causing genes with all the environmental factors that could together result in distinguishable disorders. To correct the resulting number, the researchers introduced a mathematical tool called a correction coefficient. The coefficient limits the estimate to actual diseases by ruling out the combinations that have no effect on a person’s phenotype. The number of rare diseases calculated this way is still enormously greater than 10,000, a result that also supports their hypothesis of hyper-rare disease.
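The article does not give the formula itself, so the following Python sketch is only one plausible reading of the description above, under stated assumptions: genetic causes and environmental factors are combined by simple multiplication, and the correction coefficient is a fraction between 0 and 1 giving the share of combinations that change the phenotype. The function and parameter names are hypothetical.

```python
def estimate_disease_count(genetic_causes: int,
                           environmental_factors: int,
                           correction_coefficient: float) -> float:
    """Illustrative estimate of distinguishable diseases (assumed formula)."""
    if genetic_causes < 0 or environmental_factors < 0:
        raise ValueError("counts must be non-negative")
    if not 0.0 <= correction_coefficient <= 1.0:
        raise ValueError("correction_coefficient must lie between 0 and 1")
    # Assumption: every genetic cause can pair with every environmental factor.
    candidate_combinations = genetic_causes * environmental_factors
    # Keep only the share of combinations assumed to affect the phenotype.
    return candidate_combinations * correction_coefficient
```

As a usage example, `estimate_disease_count(24_000, 10, 0.5)` returns `120000.0`. Only the figure of 24,000 genes comes from the article; the 10 environmental factors and the coefficient of 0.5 are placeholders chosen to show the arithmetic, not estimates of real quantities.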

Advancing precision medicine

Smith and Hagey suggest using the term hyper-rare to describe diseases that appear in fewer than one in 100 million people, as an addition to the existing categories of rarity. The researchers also introduce a correction coefficient to help estimate the total number of potential diseases. They expect the number of known rare diseases to keep increasing over the next few years, since our ability to discover diseases goes hand in hand with our understanding of them and our ability to cure them.

By improving our knowledge of rare diseases and their diagnostics, the scientific community doesn’t just aim to cure them, but also expects that these new personalised treatment approaches will apply to many diseases for which the molecular mechanisms can be identified, including common diseases.

What inspired you to conduct research on rare diseases and their classification?

We observed that the estimate of ‘10,000 diseases’ was frequently used in the scientific literature, although there really was no source regarding how this figure was reached. We therefore decided to address this shortcoming with a view to a future where genetic diagnostics and personalised medicine have matured.

We felt it was important to introduce this paradigm shift at this time, since clinicians will have to embrace the individualised nature of disease to ultimately improve treatment. As more people have their genomes sequenced and diagnostics become increasingly multifactorial, treatments also need to be personalised to take advantage of this additional information. One of the first steps towards this goal is to dispel the idea that there is a defined number of diseases with set treatments suitable for everyone.

In an ideal world, how would you see your work affect clinical medicine?

We believe that the realisation that there is an almost infinite number of possible diseases will become a new insight for the scientific community, including clinical medicine. This means that there is not a limited number of disease entities that fits all. In turn, this means that clinicians can rethink disease classification, which will affect treatment strategies, for instance.

We hope that this hypothesis will encourage clinicians to grasp the potential benefits of genetic diagnostics and find ways to incorporate this information into clinical practice. Human disease is incredibly complex, and our understanding of it has been limited by the tools we can use to describe it. Rapid progress in DNA sequencing and editing technologies is shifting the resolution at which we can understand and treat disease, and we hope to accelerate the uptake of these developments into clinical practice.

Were any of your research findings completely different to what you expected while reviewing the literature on rare diseases?

We always suspected that the number of possible diseases was far higher than 10,000. However, we were surprised that no matter which way we made the calculations, the number was always colossal. This was clear even when considering just the number of disorders that could be caused by single genetic variations. This is because each of our 24,000 genes can not only lose its function, but also gain new functions, depending on the specific genetic variation. Even when removing the large number of variations that would not cause something one could reasonably define as disease, the number is still massive.

Another surprising finding when surveying the literature on different diseases is how our hypothesis is already playing out in various fields of medicine – particularly cancer. Increased DNA sequencing of tumours has already made clear that what was once defined as a single type of cancer is actually several types. As research drives forward and technology continues to increase the granularity of our understanding and strategies for treatment of disease, we believe this trend will continue across the field of medicine.

Further reading

Smith, CIE, et al (2022). Estimating the number of diseases – the concept of rare, ultra-rare, and hyper-rare. iScience [online], 25, 104698.

Haendel, M, et al (2020). How many rare diseases are there? Nature Reviews Drug Discovery, 19, 77–78.

CI Edvard Smith

CI Edvard Smith is professor in molecular genetics. His main research interest is cellular signalling, especially protein phosphorylation and synthetic oligonucleotide therapies for altering gene expression. In a collaborative effort, his research group cloned BTK, the disease gene for X-linked agammaglobulinemia, and this kinase and its inhibitors are a major interest.

Daniel W Hagey

Daniel W Hagey is an assistant professor in laboratory medicine. His main research centres on the development of sequencing-based diagnostics for early cancer diagnosis.

Contact Details

Centre of Excellence for Long-acting Therapeutics (CELT)

Daniel W Hagey & CI Edvard Smith
Department of Laboratory Medicine
Biomolecular and Cellular Medicine and Translational Research Center Karolinska (TRACK)
Karolinska Institutet, Stockholm, Sweden

C I Edvard Smith
Department of Infectious Diseases
Karolinska University Hospital Huddinge Stockholm, Sweden


This work was supported by CIMED, the Center for Innovative Medicine, Region Stockholm, Sweden, the Swedish Cancer Society, the Swedish Childhood Cancer Fund and the Stellenbosch Institute for Advanced Study, Wallenberg Research Centre, Stellenbosch University, Stellenbosch, South Africa.


Peter Bergman, Karolinska Institutet

Competing interest statement

CIE Smith is founder and board member of NextCell Pharma, a company developing stem cell therapies.

Cite this Article

Smith, E, Hagey, D W, (2023), How many rare diseases are there and why is that important? Research Features, 145. Available at: 10.26904/RF-145-3831074217  

Creative Commons Licence

(CC BY-NC-ND 4.0) This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
