1. Introduction
Arracacha (Arracacia xanthorrhiza Bancroft) is a root crop originating in the Andean region, which is also its main production area. Arracacha roots have beneficial nutritional properties (such as high calcium, starch and vitamin A content) and are processed industrially into infant foods in Brazil. Commercial production takes place in Colombia, Brazil, Ecuador and Venezuela, where the total production area is estimated to exceed 30,000 hectares (Hermann, 1997). In Peru and Bolivia, arracacha is also widely cultivated for subsistence, and generally only some surpluses go to local markets (Hermann, 1997). Several arracacha viruses and their characteristics have been described, but it is not clear how they affect crop yield. To date, the known viruses that infect arracacha are Arracacha virus A, Arracacha virus B, Arracacha potyvirus 1 (AP1), Arracacha virus 3 (AV3), Potato black ringspot virus (PBRSV), Arracacha mottle virus (AMoV), Arracacha virus V and Arracacha virus I (Jones & Kenten, 1978; Kenten & Jones, 1979; Santa Cruz, 1994; Lizárraga, 1997; Lizárraga, Chuquillanqui & Jayasinghe, 1994; Orílio et al., 2009; Orílio et al., 2018; Oliveira et al., 2017). In addition, Brunt et al. (1996) described Arracacha potyvirus Y and Arracacha latent carlavirus, but it is not clear whether they are distinct from AP1 and AV3, respectively.
Because arracacha is propagated vegetatively, it can accumulate viruses over generations, leading to degeneration and reduced root production (Orílio et al., 2009). It can also be a reservoir of viruses for other crops, since different Andean roots and tubers have been reported to be infected by the same viruses, such as Potato yellowing virus (PYV) in yacon (Silvestre et al., 2020) or Andean potato latent virus (APLV), Potato leafroll virus (PLRV) and Potato virus T (PVT) in ulluco (Lizárraga, Santa Cruz & Jayasinghe, 1996a; Lizárraga, Santa Cruz & Salazar, 1996b; Lizárraga et al., 1999).
Thus, viruses that can be detected in arracacha could be a source of emerging viruses and cause problems not only in this crop but also in nearby crops such as potatoes (Jones & Kenten, 1978; Kenten & Jones, 1979), owing to global warming and the movement of vectors to places where they were previously not found. Some viruses are difficult to detect by inoculation to indicator host plants because their vectors are unknown, or because they are difficult to transmit mechanically and/or by grafting to other plants. For this reason, next-generation or high-throughput sequencing (HTS) technology has emerged as a powerful alternative to detect and identify viruses that have been difficult to identify using traditional methods (Wu et al., 2015). HTS technologies enable, in a relatively short period of time, the characterization of known plant viruses and the discovery of novel ones without prior knowledge (Fitzpatrick et al., 2021; Kutnjak et al., 2021). This technology has been used successfully with several crop species, and its use has increased dramatically in recent years. Because HTS is based on agnostic sequencing of nucleic acids, the detection capacity of this technology is in principle independent of environmental conditions, virus genetic variability, and host responses (Kavalappara et al., 2021). It is well documented that HTS-based detection is at least equivalent to conventional biological and molecular methods in sensitivity, specificity, repeatability and reproducibility (Bester et al., 2021; Soltani et al., 2021).
The objective of this study was to identify and characterize virus-related sequences in arracacha samples by HTS and to screen field samples of potato for the sequences associated with viruses initially detected in arracacha.
2. Materials and methods
2.1. Plant material
Leaves of four asymptomatic arracacha plants grown in the greenhouse at the CIP-Huancayo station (Junín, Peru) were collected and sent to CIP-Lima, where total RNA was extracted using Trizol reagent (Invitrogen, Carlsbad, CA) following the manufacturer’s instructions. The origin of the samples is shown in Figure 1.
2.2. Sequencing and in-silico assembly
Small RNAs were isolated as described previously (Fuentes et al., 2012) and sent to Fasteris Life Sciences SA (Plan-les-Ouates, Switzerland) for library preparation and sequencing on an Illumina HiSeq2000 (bulked together with 16 samples of potato and sweetpotato to save cost). The reads obtained were assembled de novo using Velvet v0.6.04 (Zerbino & Birney, 2008) as described previously (Fuentes et al., 2012). Contigs with similarity to viruses were identified by BLASTx against GenBank viral sequences. The gaps between contigs were amplified from the large RNA fraction of the arracacha RNA extracts using specifically designed primers, after which the amplified fragments were cloned into the pGEM-T Easy vector (Promega, WI, USA) and sequenced (Macrogen-Korea). The sequences obtained were then edited and annotated using the Vector NTI v.9 software package (Invitrogen). Amplification of the 5’ and 3’ untranslated regions (UTRs) was attempted using Mod-One and Ban-two linkers. Phylogenetic and molecular evolutionary analyses were conducted using MEGA7 (Kumar, Stecher, & Tamura, 2016). The designed primers were later used to detect the identified viruses in potato and arracacha samples.
2.3. Insect transmission experiments
The arracacha plants in which virus-associated sequences were found were placed separately in cages together with 400 vector insects in a glass greenhouse. Myzus persicae, Bemisia tabaci, Trialeurodes vaporariorum and Empoasca spp. were used for the transmission experiments. The insects were left on the plants for an acquisition period of 48 hours inside the cages, after which 20 insects were transferred to each individual potato seedling, also kept in cages, for an inoculation period of 48 h. After that time, the insects were removed manually and by applying insecticides. Twenty replicates were made in each of two potato varieties, Canchán and Perricholi (40 potato seedlings in total). One month later, the inoculated plants were tested by RT-PCR for the vitivirus and the crinivirus.
3. Results and discussion
3.1. Virus genome assembly
Small RNA library sequencing produced a total of 11,286,650 reads of 21-24 nucleotides (nt) (https://research.cip.cgiar.org/confluence/display/cpx/GAF13-14/GAF14_21-24.fastq.gz). In total, 22 and 21 contigs corresponded to RNA1 and RNA2 of a novel crinivirus, respectively, 8 contigs corresponded to a vitivirus, 20 to an enamovirus and 8 to ST9-like RNAs. The primers designed to amplify the regions between assembled contigs are shown in Table 1. Amplification of the 3’ and 5’ ends was attempted using various approaches but was successful only for the 3’ end of the vitivirus.
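The 21-24 nt size selection mentioned above can be illustrated with a short Python sketch. This is not the pipeline used in the study; the function name `filter_reads_by_length` and the in-memory handling of FASTQ records are assumptions made for illustration only, and a standard four-line FASTQ layout is assumed.

```python
def filter_reads_by_length(fastq_lines, min_len=21, max_len=24):
    """Yield (header, sequence, quality) for reads within [min_len, max_len] nt.

    Assumes standard four-line FASTQ records:
    @header / sequence / +[header] / quality string.
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    # Walk the input in steps of four lines; an incomplete trailing record is ignored.
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"Malformed FASTQ record starting at line {i + 1}")
        if min_len <= len(seq) <= max_len:
            yield header, seq, qual


if __name__ == "__main__":
    # Hypothetical toy reads of 20, 21, 24 and 25 nt.
    toy = []
    for name, length in [("r1", 20), ("r2", 21), ("r3", 24), ("r4", 25)]:
        toy += [f"@{name}", "A" * length, "+", "I" * length]
    for header, seq, _ in filter_reads_by_length(toy):
        print(header, len(seq))
```

For a real file, the lines could be read from a gzip-compressed FASTQ with Python's `gzip.open(path, "rt")`, although for millions of reads a streaming parser would be preferable to loading all lines at once.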
We propose to name the new crinivirus Arracacha latent virus C (ALVC), the vitivirus Arracacha latent virus V (ALVV), the enamovirus Arracacha latent virus E (ALVE) and the ST9-like RNA Arracacha latent virus E-associated RNA (ALVEaRNA). Oliveira et al. (2017) described the identification of another vitivirus from arracacha in Brazil, which they named Arracacha virus V (AVV). ALVV showed 74% nt identity to AVV over the complete genome, with 74% nt and 85% aa identity in the replicase, and 82% nt and 92% aa identity in the coat protein. Thus, AVV and ALVV represent distant isolates of the same species.
To verify the presence of these viruses, 20 arracacha accessions grown in the field at CIP-Lima were screened using primers designed for each virus (targeting the ORF1a, CP and replicase regions for ALVC, ALVV and ALVE, respectively). The results are provided in Table 2.
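As an illustration of how pairwise identity values such as those above are computed, the following Python sketch counts identical positions in two pre-aligned sequences. It is a minimal example, not the procedure used in this study: the function name `percent_identity` is hypothetical, and the convention chosen here (columns where both sequences have a gap are skipped, and a gap against a residue counts as a mismatch) is an assumption, since alignment programs differ in how they treat gaps.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two sequences taken from the same alignment.

    Assumed convention: columns that are gaps in both sequences are ignored;
    a gap opposite a residue is counted as a compared, non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("Aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("No comparable columns")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy nucleotide alignment: 3 of 4 columns identical.
    print(percent_identity("ACGT", "ACGA"))
```

The same function works for amino acid alignments, since it only compares characters column by column.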
Subsequently, Illumina barcoded small RNA libraries of positive accessions were prepared as described above at the CIP Virology Lab and sent to the Boyce Thompson Institute (Ithaca, NY, USA) for sequencing on an Illumina HiSeq 2000 machine (as a bulk sample of three plants infected with ALVV, ALVC or ALVE+ALVEaRNA, respectively). Sequencing produced 1,558,633 reads of 21-24 nt (https://research.cip.cgiar.org/confluence/display/cpx/GAF300/), of which 3210 reads aligned with ALVV (0.2% of the total siRNA) and 1483 reads aligned with ALVC [0.09% of the total siRNA: 678 reads aligned with RNA1 (0.04%) and 805 reads aligned with RNA2 (0.05%)]. In addition, 3930 reads aligned with ALVE (0.25%) and 14137 aligned with ALVEaRNA (0.9%). The near-complete sequences of ALVC and ALVV and the partial sequences of ALVE and ALVEaRNA were deposited in GenBank (accession nos. KY451034 and KY451035 for ALVC, KY451036 for ALVV, MF136435 for ALVE and MF136436 for ALVEaRNA).
Like other criniviruses, ALVC consisted of two single-stranded RNAs of positive polarity. A total of 7461 nt of RNA1 and 7305 nt of RNA2 were determined, containing most of the coding regions but lacking the 3’ and 5’ ends. An analysis of the coding capacity of RNA1 showed that it contains at least three ORFs (Figure 2), which corresponded to ORF1a, ORF1b, ORF2. The determined partial ORF1a contained 5613 bp (1870 aa, 21.3 kDa) and included conserved motifs associated with virus replication in the family Closteroviridae: a methyltransferase domain (MTR) and a helicase domain (HEL) (Martelli et al., 2012). ORF1b, which is thought to be translated by a +1 ribosomal frameshift from ORF1a, contained an RNA-dependent RNA polymerase (RdRp) domain. ORF3 (153 bp, 51 aa) encoded a predicted protein of 5.84 kDa (p6) but showed no significant similarity to corresponding proteins of other criniviruses.
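The siRNA proportions reported above for the second sequencing run follow directly from the read counts. The short Python sketch below recomputes them from the numbers given in the text; the variable and function names are illustrative only. The unrounded values agree with the reported percentages, which are given to one or two decimal places.

```python
# Read counts as reported in the text for the bulked sample.
TOTAL_READS = 1_558_633
ALIGNED_READS = {
    "ALVV": 3210,
    "ALVC RNA1": 678,
    "ALVC RNA2": 805,
    "ALVE": 3930,
    "ALVEaRNA": 14137,
}


def percent_of_total(aligned, total):
    """Return aligned reads as a percentage of all 21-24 nt reads."""
    if total <= 0:
        raise ValueError("Total read count must be positive")
    return 100.0 * aligned / total


# Per-target percentages (unrounded floats).
percentages = {
    name: percent_of_total(count, TOTAL_READS)
    for name, count in ALIGNED_READS.items()
}

# ALVC is bipartite, so its total is the sum of the RNA1 and RNA2 counts.
alvc_total = ALIGNED_READS["ALVC RNA1"] + ALIGNED_READS["ALVC RNA2"]
percentages["ALVC"] = percent_of_total(alvc_total, TOTAL_READS)

if __name__ == "__main__":
    for name, value in sorted(percentages.items()):
        print(f"{name}: {value:.3f}% of total siRNA reads")
```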
Analysis of the sequenced RNA2 of ALVC revealed that it encoded at least seven ORFs (Figure 1), four of which were homologous to HSP70h (61.27 kDa), the cell-to-cell movement protein (p59, 59.61 kDa), CP (28.18 kDa), and CPm (79.03 kDa). The other predicted ORFs were similar in size (6.24 kDa (p6.5), 10.20 kDa (p9), and 28.06 kDa (p26)) and genomic position to those found in other criniviruses but lacked significant aa similarity. ORF4 (p9) and ORF6 (CPm) begin with the codon GTG. Since we were unable to obtain sequences from the ends of the RNAs, it remains possible that additional ORFs are present 3’ or 5’ of the determined sequences, as found in some criniviruses. In the phylogenetic analysis based on the HSP70h and CP coding regions (Figure 2), ALVC clustered together with PYVV, BYVaV, SPaV and BPYV, which belong to group 1 of the criniviruses; members of this group, together with those of group 3, are transmitted by Trialeurodes vaporariorum (Martelli et al., 2012).
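Because two of the predicted ORFs begin with GTG rather than ATG, an ORF search restricted to ATG start codons would miss them. The Python sketch below shows one simple way to scan the three forward reading frames while allowing alternative start codons. It is an illustrative toy, not the annotation procedure used in this study (which relied on Vector NTI); the function name `find_orfs`, the `min_codons` threshold and the example sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, start_codons=("ATG", "GTG"), min_codons=3):
    """Find forward-strand ORFs that open at any of the given start codons.

    Returns a list of (start, end, frame) tuples with 0-based start,
    exclusive end (stop codon included) and frame in {0, 1, 2}.
    Only ORFs closed by an in-frame stop codon are reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    toy = "GTGGCTGCTGCTTAA"  # hypothetical ORF opening with GTG
    print(find_orfs(toy))                          # GTG allowed as start
    print(find_orfs(toy, start_codons=("ATG",)))   # ATG only
```

Partial ORFs that run off the end of an incomplete sequence, as is the case for several ORFs described here, are not reported by this sketch and would need separate handling.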
ALVV had a genome structure similar to that of AVV (Oliveira et al., 2017). The determined sequence was predicted to have at least five open reading frames (ORFs), which correspond to the replicase, a 20kDa-like protein (p20), the movement protein (MP), the coat protein (CP) and a nucleic acid binding protein (NABP), based on homology with other vitiviruses. The replicase ORF was sequenced only partially (4944 bp, 1648 aa, 18.74 kDa) but contained three conserved motifs found among proteins associated with virus replication (MTR, HEL and RdRp). The predicted 20kDa-like protein (371 bp, 123 aa, 14.09 kDa), also found in other vitiviruses, begins with a GTG codon and was similar to an ORF that is also present in AVV but was not reported by Oliveira et al. (2017). The predicted MP (885 bp, 295 aa, 32.80 kDa) and CP (576 bp, 192 aa, 21 kDa) were clearly related to their homologs in other vitiviruses, whereas the NABP (330 bp, 110 aa, 12.58 kDa) had weak homology to proteins with RNA-binding properties. The two phylogenetic trees generated (Figure 3) confirmed the relationship of the vitivirus from arracacha with another species in this genus.
The determined sequence of ALVE (4035 nt) was predicted to include four ORFs. The first two predicted ORFs were 5’ incomplete and corresponded to P1 and the P1-P2 fusion protein, which is translated through a -1 ribosomal frameshift from P1 (1185 aa, 152.7 kDa); both are involved in virus replication. P1 contained a serine-like protease domain, and the P2 region contained the core domains of the viral RdRp, as in other enamoviruses. The third ORF corresponded to P3 (189 aa, 21.3 kDa) of luteovirids, and the last identified ORF, which was 3’ incomplete, is translated by readthrough of the P3 ORF to form a P3-P5 fusion protein. P3 is the coat protein (CP), whereas the CP readthrough extension (P5) of P3-P5 is thought to be an aphid transmission subunit of the virus. It was not possible to obtain sequence for the region encoding ORF0, which is characteristic of members of the family Luteoviridae.
In samples that were positive for ALVE, additional sequences with similarity to Beet western yellows luteovirus ST9-associated RNA (BWYV ST9aRNA) and carrot red leaf virus-associated RNA (CRLVaRNA) were identified by PCR; we propose to name this RNA Arracacha latent virus E-associated RNA (ALVEaRNA). The predicted partial ORF1 translated into a hypothetical protein of unknown function, and the partial ORF2 encoded a putative RdRp that results from readthrough of a stop codon, similar to what is found in other ST9-like satellite RNAs. Additional ORFs can be found in related ST9-like satellite RNAs, but these lie beyond the region sequenced in this study.
To evaluate whether ALVC and ALVV were present in other plants, thirty-five additional arracacha plants maintained in the greenhouse at CIP-Huancayo were tested for these viruses by RT-PCR. In addition, because several potato viruses have been reported to infect Andean root crops and vice versa (Lizárraga et al., 1996a, 1996b; 1999), potato samples from Colombia (Chipaque, Cundinamarca, a region near Bogotá) and from Peru (Huánuco and Cajamarca) were tested by RT-PCR to determine whether these putative viruses were able to infect this crop. ALVC sequences were found in 25% of the tested arracachas (Figure 3), whereas none of them tested positive for ALVV. Of the Colombian potato samples, 20.61% and 5.15% gave positive amplification for ALVC and ALVV, respectively. Among the Peruvian potato samples, no sequence was detected in the 40 samples from Huánuco, while of the 41 samples from Cajamarca, 3 showed positive amplification for ALVC (7.31%) and 2 were positive for ALVV (4.87%). The PCR fragments were sequenced, aligned and subjected to phylogenetic analysis. The Colombian ALVC sequences formed a group distinct from the Peruvian ALVC sequences (Figure 4A). In contrast, all the ALVV sequences formed a single group (Figure 4B).
3.2. Virus transmission
Several attempts were made to transmit these viruses mechanically and by vectors. Mechanical inoculation of Nicotiana clevelandii x N. bigelovii, Gomphrena globosa, Chenopodium quinoa and Datura stramonium was unsuccessful, suggesting that the viruses may not be mechanically transmissible. This was not unexpected, as members of the genus Crinivirus are phloem restricted (Martelli et al., 2012) and vitiviruses are known to be difficult to transmit mechanically (Adams et al., 2012). Graft transmission to C. quinoa, C. amaranticolor, Datura stramonium and Physalis floridiana using petioles and leaves as scions (owing to the morphological incompatibility of arracacha stems with these indicator plants) was also attempted without success. Taking into consideration that whiteflies are generally the vectors of criniviruses from group 1 (Martelli et al., 2012), to which ALVC belongs, transmission by Bemisia tabaci biotype B and Trialeurodes vaporariorum was attempted twice with groups of 20 insects, according to Gamarra et al. (2010). For ALVV, the aphid Myzus persicae was chosen to attempt transmission, as the vectors of vitiviruses in herbaceous plants are generally aphids (Adams et al., 2012). Arracacha plants containing ALVC and ALVV sequences were placed in separate cages with 400 adult insects under glasshouse conditions (20 °C, 70-95% RH, with 18 h light and 6 h dark). For each of the three insect species, the insects were left on the source plants for a 48-hour acquisition access period before 20 insects were transferred to each of 20 individual healthy potato seedlings of each of two cultivars (Canchán and Perricholi, 40 seedlings in total) in separate cages for a 48-hour inoculation access period (IAP). After this time, the insects were removed and the plants were sprayed with the insecticides Applaud (Syngenta) and Confidor (Bayer) for whiteflies and aphids, respectively. Plants were tested by RT-PCR one month after the IAP to confirm transmission. ALVC and ALVV were not detected in any of the plants tested by PCR.
Although transmission of ALVE was not attempted with whiteflies or aphids, leafhoppers (Empoasca spp.) found colonizing an arracacha field at the CIP La Molina station were positive for ALVE by PCR. Subsequent attempts to transmit ALVE to indicator plants failed, however, because the leafhoppers were so aggressive that they rapidly killed the plants.
ALVC and ALVV belong to genera that are not typically known to have broad host ranges, and it was therefore perhaps surprising to find them in species as distantly related as potato and arracacha (Figure 5). No obvious symptoms were associated with the presence of these viruses in either species and, based on siRNA quantities, virus titers are suspected to be quite low, at least in arracacha, which may have affected the results of our vector transmission attempts. Although the identified viruses are apparently widespread in the Andean region, at least from Colombia to southern Peru, it is not necessarily the case that either arracacha or potato is their primary host, even if both viruses can be maintained through vegetative cycles, at least in arracacha. It would be interesting to screen weeds and wild plants found in association with crops in the Andes to identify plants that may serve as primary hosts and possible reservoirs from which these viruses enter cultivated crops.
4. Conclusions
In the present study, we identified sequences corresponding to a new crinivirus (ALVC), a new enamovirus (ALVE) and a new ST9-like RNA (ALVEaRNA) in arracacha. In addition, a vitivirus identified in this study (ALVV) was found to be closely related to AVV, which was identified from arracacha in Brazil and published while this manuscript was in preparation. Thus, this report expands the range of AVV to include Peru. Using the assembled sequences, primers were designed for all identified viruses and could be used to identify additional infected arracacha and potato plants. These primers represent useful tools for screening germplasm or planting material to ensure plant health.
The sequence variation found between the amplified PCR fragments of the samples suggests an actively evolving virus population in the Andes. Nevertheless, because the presence of these virus sequences was not associated with any obvious symptoms, virus titers appeared to be low, and we were unable to identify the means of transmission, many questions remain open. Future studies should focus on identifying alternate hosts, the likely means of transmission, and possible effects on yields in the different crops these viruses infect.