A total of 60 studies were identified for inclusion (Fig 1 and Table 1) [19–69]. Thirteen studies were excluded because they used a qualitative assay, 8 were drug resistance testing studies, 2 used panels as primary sample types, 2 were review manuscripts lacking primary data, 2 used spiked blood samples, and 1 used an incorrect comparator. There was low to moderate heterogeneity in the analytical and clinical performance comparisons within technologies as well as in the viral load medians and distributions (Table 2).
Quality of studies
There was some risk of bias in patient selection but low risk of bias with the reference standard and index test (S2 Fig). In most studies, participants were not consecutively recruited or the recruitment process was not described; only 5% of studies reported the process of patient recruitment. Applicability was high for patient selection, index test, and reference standard; however, there were some concerns, as most studies (58%) were carried out in Africa, most studies (>90%) used venous blood prepared in the laboratory with a pipette, and most studies (>90%) used only a single dried blood spot filter paper (Whatman 903).
Systematic review analysis
Mean bias was the most commonly reported analytical measurement across all studies included in the systematic review (82%); therefore, forest plots of each study were developed by technology (S3 Fig). Half of the studies included patients on antiretroviral therapy [21–23,29,33,35–37,39,40,42,43,45–47,51–54,56–59,62,64,65,68,69], whereas the remaining studies either included patients not on antiretroviral therapy or did not indicate such information. The study characteristics such as sample size, viral load medians, and patient viral load distributions are summarized in Table 2.
A total of 40 studies provided 45 data sets across the 6 technologies, resulting in a total of 10,871 paired dried blood spot and plasma viral load results [22,24,26,28,30,31,33,34,36–45,48–53,56,58,59,62–66,68,69]. Studies from the systematic review that were not included in the meta-analysis were omitted because the primary authors were unable to share data. Of these pairs, 58% were analyzed with the Roche COBAS TaqMan technology [22,26,40,43,48,49,51,56,58,65,66,69], 25% with the Abbott RealTime HIV-1 technology [26,28,31,37–39,42,45,58,59,63,69], 10% with the bioMérieux NucliSENS EasyQ technology [24,30,31,33,34,36,40,42,53,62], 5% with the Biocentric Generic HIV Charge Virale technology [41,52,64], 1% with the Hologic Aptima technology, and 1% with the Siemens VERSANT HIV-1 RNA technology. Approximately 70% of the paired data points were from studies conducted in Africa [22,24,26,28,33,34,37,39,40,42,43,48,53,56,58,59,64,65,74], of which 36% were from the Southern African Development Community region [22,26,28,42,43,56,59,64] and 24% from the East African Community region [24,33,37,48,65,69].
The viral load distribution for the 10,871 plasma specimens tested was relatively equally distributed across all viral load ranges (Fig 2A). While approximately 41% of all plasma specimens were undetectable (below the technology’s limit of detection), 30% of all plasma study specimens were between detectable (at or greater than the technology’s limit of detection) and 10,000 copies/ml. Furthermore, when including only plasma specimens from patients known to be on ART, we observed that just over 40% of patients had undetectable levels of viral load (Fig 2B). Approximately 31% of plasma specimens from patients on ART were between detectable and 10,000 copies/ml.
The median dried blood spot viral loads were higher than the median plasma viral loads for all but 2 technologies. Overall, the median difference was 1.03 log copies/ml (Table 3). The Abbott RealTime HIV-1 two-spot, Abbott RealTime HIV-1 one-spot, Biocentric Generic HIV Charge Virale, bioMérieux NucliSENS EasyQ HIV-1, Hologic Aptima, Roche COBAS TaqMan FVE, Roche COBAS TaqMan SPEX, and Siemens VERSANT HIV-1 RNA technologies had a difference between the median dried blood spot and plasma specimen viral loads of 0.09, 0.04, 0.17, −0.30, 0.12, 0.33, 1.99, and −0.13 log copies/ml, respectively. The mean bias for each technology was calculated by pooling all primary data for that technology as though they came from a single study (Table 3). The mean biases between the dried blood spot and plasma viral load values varied significantly depending on the technology. The overall mean bias was 0.30 log copies/ml. The Abbott RealTime HIV-1 two-spot (−0.12 log copies/ml), Abbott RealTime HIV-1 one-spot (0.02 log copies/ml), and Roche COBAS TaqMan FVE (0.06 log copies/ml) assay biases were closest to zero, while the bioMérieux NucliSENS EasyQ HIV-1 (−0.41 log copies/ml) and Roche COBAS TaqMan SPEX (1.03 log copies/ml) assay biases were furthest from zero. The Abbott RealTime HIV-1 two-spot, bioMérieux NucliSENS EasyQ HIV-1, and Siemens VERSANT HIV-1 RNA technologies had negative mean biases, indicating under-quantification compared to the plasma viral load result, which is expected due to the lower input sample volume. The positive mean biases of Biocentric Generic HIV Charge Virale and Roche COBAS TaqMan SPEX reflect over-quantification compared to the plasma viral load result, likely due to processing and extraction chemistries resulting in amplification of total intracellular and extracellular nucleic acids.
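For illustration, the pooled mean bias described above can be sketched in a few lines of Python. This is a minimal sketch, not the analysis code used for the study: the function names are hypothetical, bias is assumed to be log10(dried blood spot) minus log10(plasma) for each pair, and only pairs that are quantifiable on both specimen types are assumed to be passed in.

```python
import math
from statistics import mean


def log10_bias(dbs_copies, plasma_copies):
    """Per-pair bias in log10 copies/ml: log10(DBS) - log10(plasma).

    Assumption: every value is a positive, quantifiable result.
    """
    return [
        math.log10(d) - math.log10(p)
        for d, p in zip(dbs_copies, plasma_copies)
    ]


def pooled_mean_bias(datasets):
    """Pool the paired results of all data sets for one technology,
    as though they came from a single study, then average the bias.

    `datasets` is a list of (dbs_copies, plasma_copies) tuples,
    one tuple per data set (hypothetical input layout).
    """
    pooled = []
    for dbs_copies, plasma_copies in datasets:
        pooled.extend(log10_bias(dbs_copies, plasma_copies))
    return mean(pooled)


# Toy data for one technology, two data sets (values are made up).
example = [
    ([1000, 10000], [1000, 1000]),  # biases: 0 and +1 log
    ([100], [1000]),                # bias: -1 log
]
print(round(pooled_mean_bias(example), 2))
```

Under this sign convention, a positive value corresponds to over-quantification on dried blood spots relative to plasma and a negative value to under-quantification.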
WHO and many national clinical guidelines in resource-limited settings recommend interpreting viral load testing as a binary result, above or below a specific threshold, to identify treatment failure. We, therefore, compared several treatment failure thresholds for dried blood spot specimens (1,000, 3,000, 5,000, 7,500, and 10,000 copies/ml) to the currently suggested 1,000 copies/ml threshold for plasma specimens for correctly classifying patients (Table 3 and Fig 3). Using a dried blood spot specimen threshold of 1,000 copies/ml, all 6 technologies had a sensitivity greater than 80% for detecting a viral load above 1,000 copies/ml. At the same threshold, the specificity for detecting a viral load below 1,000 copies/ml was over 80% for all technologies except the Biocentric Generic HIV Charge Virale (55.16%), Hologic Aptima (73.44%), and Roche COBAS TaqMan SPEX (43.86%). Using a higher treatment failure threshold, such as 5,000 copies/ml, for dried blood spot specimens reduced the sensitivity and increased the specificity of all technologies. Finally, hierarchical summary receiver operating characteristic (HSROC) curves were created for those technologies where more than 4 studies were included in the meta-analysis (S4 Fig).
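The threshold comparison above can be illustrated with a short Python sketch. It is a simplified example under stated assumptions, not the study's statistical method: the function name is hypothetical, a result strictly above the threshold is assumed to count as treatment failure, plasma at 1,000 copies/ml is taken as the reference, and the estimates are simple pooled proportions without confidence intervals or a hierarchical model.

```python
def sensitivity_specificity(dbs_copies, plasma_copies,
                            dbs_threshold, plasma_threshold=1000):
    """Classify each pair against the plasma reference and return
    (sensitivity, specificity) for the chosen DBS threshold.

    Assumption: a result strictly above the threshold is 'failure'.
    """
    tp = fp = tn = fn = 0
    for dbs, plasma in zip(dbs_copies, plasma_copies):
        reference_failure = plasma > plasma_threshold
        index_failure = dbs > dbs_threshold
        if reference_failure and index_failure:
            tp += 1
        elif reference_failure and not index_failure:
            fn += 1
        elif not reference_failure and index_failure:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy paired results in copies/ml (values are made up).
dbs = [500, 2000, 6000, 1500]
plasma = [200, 1500, 8000, 400]

for threshold in (1000, 5000):
    sens, spec = sensitivity_specificity(dbs, plasma, threshold)
    print(threshold, sens, spec)
```

In this toy data set, raising the dried blood spot threshold from 1,000 to 5,000 copies/ml lowers sensitivity and raises specificity, which is the same direction of trade-off reported above.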
Fig 3. Forest plots of sensitivity and specificity of all studies included in the meta-analysis for each viral load technology using a treatment failure threshold of 1,000 copies/ml.
Abbott RealTime HIV-1 two-spot (a), Abbott RealTime HIV-1 one-spot (b), Biocentric Generic HIV Charge Virale (c), bioMérieux NucliSENS EasyQ HIV-1 (d), Hologic Aptima (e), Roche COBAS TaqMan FVE (f), Roche COBAS TaqMan SPEX (g), Siemens VERSANT HIV-1 RNA (h). Red bars and lines indicate the overall metrics for each viral load technology.
Additionally, to better understand the performance of dried blood spot specimens at lower treatment failure thresholds (below 1,000 copies/ml), we compared 6 predefined treatment failure thresholds (detectable, 200, 400, 500, 600, and 800 copies/ml) between dried blood spot specimens and plasma for each technology and protocol (Table 4). The Biocentric Generic HIV Charge Virale and Roche COBAS TaqMan SPEX technologies had poor specificity (<40%) at all thresholds below 1,000 copies/ml. The Siemens VERSANT HIV-1 RNA had a sensitivity and specificity above 85% when using a threshold of 800 copies/ml; however, the specificity declined to below 80% at a threshold of 600 copies/ml and below 70% with all thresholds below 500 copies/ml. The Abbott RealTime HIV-1 two-spot and Roche COBAS TaqMan FVE protocols had high sensitivities and specificities at all lower thresholds except detectable. The new Abbott RealTime HIV-1 one-spot protocol, however, had high specificities at all lower thresholds, but sensitivity was below 85% at the 200 copies/ml and detectable thresholds. The Hologic Aptima had high sensitivities with all thresholds except detectable; however, specificity was lower than 85% at the thresholds of 800 copies/ml and 200 copies/ml. Finally, the bioMérieux NucliSENS EasyQ HIV-1 had sensitivities and specificities greater than 85% at all thresholds.
Quantitative polymerase chain reaction inherently introduces a level of variability in test results, generally +/−0.3 log copies/ml [68,69]. We, therefore, sought to understand whether the performance observed with each technology was within the inherent assay variability limits. For the Abbott RealTime HIV-1 two-spot, Abbott RealTime HIV-1 one-spot, Biocentric Generic HIV Charge Virale, bioMérieux NucliSENS EasyQ HIV-1, Hologic Aptima, Roche COBAS TaqMan FVE, Roche COBAS TaqMan SPEX, and Siemens VERSANT HIV-1 RNA technologies, 59.28%, 68.71%, 38.04%, 52.54%, 50.40%, 62.03%, 33.45%, and 47.22% of dried blood spot specimen test results, respectively, were within +/−0.3 log copies/ml of the paired plasma test result (Fig 4).
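The proportion of results falling within the assay variability limits can be computed as in the following minimal Python sketch. The function name and the input layout are hypothetical, and the sketch assumes that only pairs quantifiable on both specimen types are included; how undetectable results were handled in the study is not specified here.

```python
import math


def fraction_within_limit(dbs_copies, plasma_copies, limit=0.3):
    """Share of pairs whose DBS result lies within +/- `limit`
    log10 copies/ml of the paired plasma result.

    Assumption: all values are positive, quantifiable results.
    """
    pairs = list(zip(dbs_copies, plasma_copies))
    within = sum(
        1
        for dbs, plasma in pairs
        if abs(math.log10(dbs) - math.log10(plasma)) <= limit
    )
    return within / len(pairs)


# Toy paired results in copies/ml (values are made up).
dbs = [1000, 1500, 10000, 700]
plasma = [1000, 1000, 1000, 1000]
print(fraction_within_limit(dbs, plasma))
```

In the toy data, three of the four pairs differ by less than 0.3 log copies/ml, while the 10-fold discrepancy (1.0 log) falls outside the limit.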
Fig 4. A substantial proportion of dried blood spot results fall outside of the plasma result +/−0.3 log copies/ml for each technology.
Abbott RealTime HIV-1 two-spot (a), Abbott RealTime HIV-1 one-spot (b), Biocentric Generic HIV Charge Virale (c), bioMérieux NucliSENS EasyQ HIV-1 (d), Hologic Aptima (e), Roche COBAS TaqMan FVE (f), Roche COBAS TaqMan SPEX (g), Siemens VERSANT HIV-1 RNA (h). Blue bars represent +/−0.3 log copies/ml of the plasma result, while orange triangles represent the paired dried blood spot viral load result.