Evaluating Evidence in Research Publications

Dr. Nancy Malik gives an overview of common research terminology that can help evaluate medical research, especially pertaining to homeopathy.


How to
1. make sense of and critically evaluate the evidence in a scientific article or research paper?
2. sort out the best available evidence?
3. make an evidence-based decision?

Self-limiting disease:

A self-limiting disease resolves on its own once it has run its course, even without treatment. Medical treatment can, however, shorten the time the disease takes to come to an end.

p-value and Confidence Interval

1. The p-value is the probability of obtaining a result at least as extreme as the one observed if chance alone were at work (that is, if the null hypothesis were true). Its value can vary from 0 to 1. e.g. p=0.04 means there is a 0.04*100 = 4% probability that a result this extreme could have occurred by chance alone.
2. The amount of evidence required to accept that an event is unlikely to have arisen by chance is known as the threshold/critical p-value/significance level (called alpha). This value is set before you do the experiment.
a. If the p-value computed from the experiment < alpha (normally 0.05), the difference is “statistically significant”. The smaller the p-value, the stronger the evidence against chance as the explanation.
b. If the p-value > alpha, the difference is “not statistically significant”.
3. The statistical significance of a result tells us how unlikely the result would be if only chance were at work. It can be considered the confidence one has in a given result being non-random. The precision of an estimate improves in proportion to the square root of the sample size.
4. The Confidence Interval (CI) is the range of values within which the true effect is likely to lie. A narrow interval indicates a more reliable (precise) result.

Confidence level: Generally set at 95%. If the confidence level is 95%, the critical p-value (alpha) is 100 - 95 = 5%, i.e. 0.05.
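As an illustration of comparing a p-value with alpha, here is a minimal sketch of a two-sided z-test using only the Python standard library. The function name `z_test` and the patient numbers are hypothetical, and the sketch assumes the standard deviation is known and the normal approximation holds.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, null_mean, sd, n, alpha=0.05):
    """Two-sided z-test of a sample mean against a null value (sd assumed known)."""
    se = sd / sqrt(n)                               # standard error shrinks with sqrt(n)
    z = (sample_mean - null_mean) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # chance of a result at least this extreme
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 when alpha = 0.05
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return p_value, ci, p_value < alpha

# Hypothetical data: mean improvement of 2.0 points, sd 5.0, 30 patients
p_value, ci, significant = z_test(2.0, 0.0, 5.0, 30)
print(f"p = {p_value:.3f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), significant: {significant}")
```

With these invented numbers the p-value falls below 0.05 and the 95% confidence interval excludes zero, so both views agree that the difference is statistically significant.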

Power Analysis

It reveals the minimum sample size required to detect an expected effect at a given significance level and power (the probability of detecting an effect that really exists).
A sample size smaller than this minimum raises the probability of failing to reject a false null hypothesis (type-2 error).

Sample size calculators:
Creative Research Systems (free)
nQuery Advisor + nTerim (paid, industry standard)
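For a rough feel of what such calculators do, the sketch below estimates the sample size per group for comparing two means. It is a simplified normal-approximation formula, not the method of either tool above; the function name `sample_size_per_group` and the default power of 80% are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    effect_size is the standardised difference (difference in means / sd).
    Normal approximation: n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)           # value matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Smaller expected effects need far larger samples
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {sample_size_per_group(d)} subjects per group")
```

Halving the expected effect size roughly quadruples the required sample, which is why small effects are so easily missed by small studies.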

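Odds ratio:
Odds are the ratio of the chance that something will happen to the chance that it will not happen. So mathematically, odds = chance / (1 - chance). An odds ratio compares two groups: it is the odds of the event in one group divided by the odds in the other.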
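The formula above can be sketched in a few lines of Python. The function names and the 75% and 50% improvement rates are hypothetical; the odds ratio is computed here as the odds in one group divided by the odds in the other.

```python
def odds(chance):
    """Convert a probability (0 <= chance < 1) into odds."""
    return chance / (1 - chance)

def odds_ratio(chance_a, chance_b):
    """Odds of the event in group A divided by the odds in group B."""
    return odds(chance_a) / odds(chance_b)

# Hypothetical example: 75% improve on treatment, 50% improve on control
print(f"odds on treatment: {odds(0.75):.2f}")
print(f"odds on control:   {odds(0.50):.2f}")
print(f"odds ratio:        {odds_ratio(0.75, 0.50):.2f}")
```

An odds ratio of 1 means the event is equally likely in both groups; values above 1 favour the first group.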

Likelihood ratio:
It is the ratio of the true positive rate to the false positive rate,
e.g. the prevalence of a symptom in the population cured by a certain medicine divided by the prevalence of that symptom in the rest of the treated population.
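A minimal sketch of the example above, reading the ratio as one prevalence (a rate) divided by another. The function name, argument names and patient counts are hypothetical.

```python
def likelihood_ratio(symptom_in_cured, n_cured, symptom_in_rest, n_rest):
    """Prevalence of a symptom among cured patients divided by its
    prevalence among the rest of the treated patients."""
    prevalence_cured = symptom_in_cured / n_cured   # true positive rate
    prevalence_rest = symptom_in_rest / n_rest      # false positive rate
    return prevalence_cured / prevalence_rest

# Hypothetical counts: the symptom was present in 40 of 50 cured patients
# and in 20 of 100 patients who were treated but not cured.
lr = likelihood_ratio(40, 50, 20, 100)
print(f"likelihood ratio: {lr:.1f}")
```

A ratio well above 1 suggests the symptom is more common among those who responded to the medicine; a ratio near 1 suggests it carries little information.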


Bias

1. Error is of two types: random error and systematic error.
2. Unlike random error, systematic error is not reduced as the sample size (n) increases.
3. Bias is a systematic error, i.e. a consistent difference between the study findings/outcomes and the truth.
4. Broadly, bias is of two types: selection bias and information/observation bias.
5. Selection bias is due to faulty selection of the sample (the subjects included for participation in the study).
6. Information bias is due to faulty methods of collecting information from the participants who met the inclusion criteria.
7. Publication bias: the tendency for studies with a positive result to be published more readily than those which are negative or inconclusive.
8. Confirmation bias is the tendency of a researcher to favour results that confirm their hypothesis.
9. A bias can switch the result of an RCT from positive to negative or vice versa.
10. It’s impossible to eliminate bias completely. We can only reduce it.
11. As the controls used to push bias towards zero become stricter, ‘generalisation’ suffers: you can’t generalise (read: extend) the inferences/results of your study (based on sample data) to the reference population, which renders the study less useful. So there’s a trade-off between the amount of bias accepted and the generalisation capability of your study.
12. BMJ 2003: “Conclusion: Systematic bias favours products which are made by the company funding the research”
13. Methods to reduce bias: blinding, randomisation, prospective studies instead of retrospective studies.
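The difference between random and systematic error can be seen in a small simulation. This is only a sketch with invented numbers: the true value, the noise level and the bias of 0.5 are all assumptions chosen for illustration.

```python
import random
from statistics import mean

def mean_error(n, bias, rng):
    """Error of the average of n measurements that carry unit random noise
    plus a fixed systematic bias."""
    true_value = 10.0                                  # hypothetical true value
    measurements = [true_value + bias + rng.gauss(0, 1) for _ in range(n)]
    return mean(measurements) - true_value

rng = random.Random(42)                                # fixed seed for repeatable output
for n in (10, 1000, 100000):
    unbiased = mean_error(n, bias=0.0, rng=rng)
    biased = mean_error(n, bias=0.5, rng=rng)
    print(f"n={n:>6}: random error only {unbiased:+.3f}, with systematic error {biased:+.3f}")
```

As n grows, the error from random noise shrinks towards zero, while the biased measurements keep settling around +0.5 no matter how large the sample becomes.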


Double blinding: Neither the subject nor the person administering the treatment knows which treatment any particular subject is receiving.

Efficacy of Treatment

Efficacy is the extent to which the medical treatment/intervention has a therapeutic effect under ideal test conditions.

1. Specific (pharmacological/physiological) effects: patients get better because of the therapeutic effects of the medical treatment/intervention.
2. Non-specific (psychological/placebo) effects: the body’s healing response is activated
a. By the patient’s anticipation of the cure, because they expect to get better after visiting the doctor.
b. By the reassurance (empathy) given by the doctor that the patient will be cured.
c. By a medicine that is actually a placebo, as long as the patient is not told so. Placebos can have surprisingly positive clinical effects on a patient’s medical condition, which have been linked to the secretion of dopamine.
3. Any treatment is effective because of both specific and non-specific effects. In fact, non-specific effects can be larger than specific effects. The drawback of an RCT is that it is designed to measure only specific effects. A placebo-controlled RCT measures whether the effect is due to placebo or not.
4. Randomisation: divide the patients (the sample) randomly into a treatment group that receives the treatment and a control group that does not.
5. The treatment is then judged effective only if the treatment group improves more than the control group by a statistically significant amount.
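The randomisation step can be sketched as a simple shuffle and split. This is a minimal illustration of simple randomisation only; real trials often use block or stratified schemes. The function name `randomise`, the patient IDs and the seed are hypothetical.

```python
import random

def randomise(participants, seed=None):
    """Randomly split participants into treatment and control groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)       # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

patients = [f"patient_{i:02d}" for i in range(1, 21)]   # hypothetical patient IDs
treatment, control = randomise(patients, seed=2024)
print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides who lands in which group, neither the researcher nor the patient can steer the allocation, which is what protects the comparison from selection bias.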


Effectiveness is the extent to which the medical treatment/intervention has a therapeutic effect in real-world settings.

Effectiveness studies: Controlled cohort studies, observational studies, outcomes studies, Health Technology Assessment

Which one is more important: efficacy or effectiveness?
And if you feel both are equally important, which one should be sought first: efficacy or effectiveness?

Levels of Evidence [2]
Level I: meta-analyses and/or systematic reviews
Level IIa: multiple RCTs
Level IIb: some RCTs
Level IIIa: multiple cohort studies
Level IIIb: some cohort studies
Level IV: opinion of experts

Placebo: An inactive dummy/inert pill (a substance such as lactose or saline which does not alter the disease condition) prescribed to the patient for the following purposes
1. To enhance the non-specific effects of the treatment. A placebo has therapeutic value if the patient is not told that they are being given a placebo.
2. To avoid disturbing the action of a previously prescribed medicine until it has run its course. Administering a placebo in the interval between two widely spaced doses of medicine serves both of these purposes.
3. To serve as the control treatment in placebo-controlled trials during medical research

Nocebo Effect: The adverse effects of placebo

Null Hypothesis

1. The null hypothesis assumes that any apparent effect or difference in a set of data is due to chance.
2. The null hypothesis is presumed to be true until statistical evidence nullifies it.
3. When the null hypothesis is rejected, the result is said to be statistically significant.
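One way to see the null hypothesis at work is an exact permutation test: if the treatment made no difference, the group labels are arbitrary, so every possible relabelling of the pooled scores is equally likely. The sketch below is an illustration under that assumption; the function name and the scores are invented.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(treatment, control):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(treatment) + list(control)
    observed = abs(mean(treatment) - mean(control))
    indices = range(len(pooled))
    extreme = total = 0
    # Try every way of relabelling the pooled scores as "treatment"
    for chosen in combinations(indices, len(treatment)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so floating-point ties count as "at least as extreme"
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical improvement scores
treated_scores = [7, 8, 9, 10, 11]
control_scores = [1, 2, 3, 4, 5]
p = permutation_p_value(treated_scores, control_scores)
print(f"p = {p:.4f}")
```

Here only 2 of the 252 possible relabellings give a difference as large as the observed one, so the null hypothesis is rejected at the 0.05 level.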

Regression towards the mean

Explained with the help of an example [1]

Do a test on a bunch of subjects. Rank the subjects by their score and select the bottom half of the bunch. Retest the bottom half. The average score of the bottom half will probably improve somewhat on retest. Similarly, the average score of the top half will probably drop somewhat on retest. These changes in performance are called regression towards the mean. The name refers to a tendency for subjects who score below average on a test to do better next time, and for those who score above average to do worse.

The group you select doesn’t have to be the bottom or top half, and the test doesn’t have to be the first one. Any group, or even any single subject, with an average score below or above the mean of all the subjects in a given test will probably move (regress) noticeably closer to the mean in another test. Put simply, ‘regression towards the mean’ means ‘averaging out’.
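The test and retest example can be simulated in a few lines. This is a sketch under invented assumptions: each subject has a stable underlying ability, and each test score adds independent random luck of the same size.

```python
import random
from statistics import mean

rng = random.Random(7)                                  # fixed seed for repeatable output
n = 10000
ability = [rng.gauss(50, 10) for _ in range(n)]         # stable underlying score
test1 = [a + rng.gauss(0, 10) for a in ability]         # ability plus luck on the day
test2 = [a + rng.gauss(0, 10) for a in ability]         # fresh luck on the retest

# Rank subjects by first-test score and split into bottom and top halves
ranked = sorted(range(n), key=lambda i: test1[i])
groups = {"bottom half": ranked[: n // 2], "top half": ranked[n // 2:]}

results = {}
for name, group in groups.items():
    first = mean([test1[i] for i in group])
    second = mean([test2[i] for i in group])
    results[name] = (first, second)
    print(f"{name}: first test {first:.1f}, retest {second:.1f}")
```

Nobody was treated between the two tests, yet the bottom half improves and the top half declines. Any study that selects patients because they scored badly can show this kind of improvement without the treatment doing anything.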

Biological Model:

A model of an organism used to understand a particular biological phenomenon. Biological models are generally in-vivo models. They are of two types: pharmacological and toxicological models.

In vitro

In-vitro studies are those conducted on components isolated from a living organism (instead of the whole organism), such as tissues, cells, components of cells such as ribosomes or mitochondria, extracts of cells such as reticulocytes, or cell molecules such as acids or proteins. ‘In vitro’ is Latin for ‘in glass’: in-vitro studies are conducted in test tubes and culture dishes.


Meta-analysis

It is a statistical method of combining the results of multiple randomised controlled trials into a weighted average (some studies carry more weight than others). It helps in reducing information overload, detecting publication bias, if any, and may explain heterogeneity between the results of individual studies.
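The weighted average described above can be sketched as follows. The weighting scheme is an assumption: this sketch uses inverse-variance (fixed-effect) weights, one common choice, so that more precise studies count for more. The function name and the three trial results are invented.

```python
from math import sqrt

def fixed_effect_pool(effects, standard_errors):
    """Inverse-variance weighted average of study effects (fixed-effect model)."""
    weights = [1 / se ** 2 for se in standard_errors]   # precise studies get more weight
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect sizes and standard errors from three trials
effects = [0.30, 0.10, 0.25]
standard_errors = [0.10, 0.20, 0.05]
pooled, pooled_se = fixed_effect_pool(effects, standard_errors)
print(f"pooled effect = {pooled:.3f} (standard error {pooled_se:.3f})")
```

The third trial has the smallest standard error, so it dominates the pooled estimate, and the pooled standard error is smaller than that of any single trial.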

Systematic Review

It is a synthesis of the results of several studies (including conflicting studies) so as to assess the strength of the evidence. A review is termed systematic if it is based on a clearly formulated, peer-reviewed protocol set out in advance (research question, clear inclusion/exclusion criteria, explicit search strategy, etc.) and employs the same level of scientific rigour as should be used to produce the research evidence in the first place. A systematic review can be independently replicated.

Outcome Research:

It focuses on
1. Whether or not the treatment improved quality of life
2. The safety profile of the medicines used in treatment

Repeatability and Reproducibility

1. During a test/experiment, let’s say you take a certain measurement. When the test is repeated (re-test/replicate) under the same conditions [same observer (the person who is conducting the test), same laboratory and same equipment], three cases arise
2. If you get exactly the same measurement under the same conditions, we say it’s 100% repeatable.
3. If the variation in the measurement is within the set tolerance (agreed difference), we say the measurement is repeatable; otherwise it is variable (variability).
4. If the re-test is conducted under the same test conditions but in different locations (multi-centre) by different people (independent) and you get the same results (within the set tolerance), we say the test is reproducible.
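The tolerance check in the points above can be sketched as a simple comparison of the spread of measurements against the agreed difference. The function name, the readings and the tolerance of 0.2 are hypothetical.

```python
def within_tolerance(measurements, tolerance):
    """True if the spread between the highest and lowest reading is within the agreed tolerance."""
    return max(measurements) - min(measurements) <= tolerance

# Hypothetical readings: same observer, same laboratory, same equipment
same_lab = [10.10, 10.20, 10.15]
# Hypothetical readings of the same quantity from independent centres
multi_centre = [10.10, 10.60, 10.15]

print("repeatable:  ", within_tolerance(same_lab, tolerance=0.2))
print("reproducible:", within_tolerance(multi_centre, tolerance=0.2))
```

In this invented case the measurement is repeatable within one laboratory but not reproducible across centres, which would call for a closer look at how the centres differ.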

Research Fundamentals
1. Aim/Objectives of research
2. Research design: research methods
3. Sampling: selection of participants
4. Ethical issues
5. Taking care of bias
6. Data Collection
7. Data Analysis
8. Research Findings
9. How valuable is the research?: Contribution of study

Ethical issues in Research
1. Objectives and methods of research explained to participants
2. Informed consent by participants (consent form)
3. Confidentiality of information (Patient Information Sheet)
4. What to do with the outcome of the study

Semmelweis reflex is a tendency to reject new evidence or knowledge because it contradicts established beliefs or paradigms.

Research Glossary



[2] Michel van Wassenhoven ECH Publication, LMHI 2008

About the author

Dr Nancy Malik

Medical Doctor of Homeopathy | Registered Medical Practitioner | Ex-Medical Officer
