A total of 186 DNA transposon families, including one integrated large DNA virus, were identified in the genome of the Pacific white shrimp Penaeus vannamei and are deposited in Repbase (www.girinst.org). Of these, 127 were identified in a small-scale genomic sequence (479 Mb) obtained from the specific pathogen-free (SPF) P. vannamei Kona Line produced by the breeding program of the U.S. Marine Shrimp Farming Program (USMSFP). This set consists of 42 DNA, 7 DNAV, 1 EnSpm, 11 Harbinger, 13 hAT, 2 Kolobok, 10 Mariner, 12 Merlin, 1 MuDR, 1 P, 8 piggyBac, 3 Polinton, 5 Sat, 8 TE, 2 Transib, and 1 Zator families. An additional 59 DNA transposon families were identified in the genome assembly of the P. vannamei breed Kehai No.1 farmed in China (ASM378908v1), including 4 DNA, 1 Merlin, 1 hAT, 1 piggyBac, 33 Sat, 2 TE, and 17 uncharacterized REP families.
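The per-class counts above can be checked against the stated totals with a short tally. This is only a minimal sketch: the dictionary names (`kona_line`, `kehai_no1`) are illustrative labels chosen here, not identifiers from Repbase, and the numbers are copied from the text.

```python
from collections import Counter

# Family counts per class, copied from the text (Kona Line, 479 Mb sequence).
kona_line = {
    "DNA": 42, "DNAV": 7, "EnSpm": 1, "Harbinger": 11, "hAT": 13,
    "Kolobok": 2, "Mariner": 10, "Merlin": 12, "MuDR": 1, "P": 1,
    "piggyBac": 8, "Polinton": 3, "Sat": 5, "TE": 8, "Transib": 2,
    "Zator": 1,
}

# Family counts per class, copied from the text (Kehai No.1, ASM378908v1).
kehai_no1 = {
    "DNA": 4, "Merlin": 1, "hAT": 1, "piggyBac": 1,
    "Sat": 33, "TE": 2, "REP": 17,
}

kona_total = sum(kona_line.values())
kehai_total = sum(kehai_no1.values())
grand_total = kona_total + kehai_total

# Per-class totals across both sources (Counter addition sums shared keys).
combined = Counter(kona_line) + Counter(kehai_no1)

print(kona_total, kehai_total, grand_total)
for name, count in combined.most_common():
    print(f"{name}\t{count}")
```

The three printed totals should match the 127, 59, and 186 families reported in the text.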
Some Sat transposons show similarity to microsatellite sequences published from the SPF P. vannamei Kona Line, including the telomeric pentanucleotide repeat (TAACC/GGTTA)n. These repeats are also the insertion site of the integrated large DNA virus, a nimavirus (Nimav-1_LVa, BFCD01000001, 217,415 bp).
The three whole genome sequence (WGS) assemblies available for P. vannamei (ASM378908v1, 1.7 Gb; CIBNOR_Pvan_1v2, 2.1 Gb; ASM3358929v1, 1.9 Gb) confirm that the telomeric pentanucleotide repeat (TAACC/GGTTA)n of P. vannamei is highly abundant and widely distributed in intronic and intergenic regions of the genome. P. vannamei shares this telomeric pentanucleotide repeat, commonly written (TTAGG)n, with most insects. The wide interstitial distribution of telomeric repeats is intriguing and may have important implications for the shrimp genome, which is also rich in other simple sequence repeats. The evolutionary origin of telomeric repeats is not fully understood, but it has been suggested that the invasion of circular chromosomes by telomeric repeats may have given rise to the linear chromosomes of eukaryotes. The wide interstitial distribution of telomeric repeats in the shrimp genome may therefore represent extensive, recent, or active invasion by the pentanucleotide repeat. Because the estimated genome size of SPF P. vannamei from the United States is 2.89 Gb, larger than any of the three available assemblies, a new, contiguous, complete reference genome for P. vannamei is needed to fully characterize telomeric repeats. Information on the chromosome locations (HiC_scaffold numbers) of the 186 DNA transposon families in the ASM378908v1 assembly will be presented.
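The notations (TAACC)n, (GGTTA)n, and (TTAGG)n describe the same tandem repeat read on opposite strands or from a different starting position. The sketch below illustrates that equivalence; the helper names (`reverse_complement`, `rotations`, `canonical_motif`) are hypothetical and written for this example only, not taken from any annotation pipeline used in this work.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def rotations(seq: str) -> set:
    """Return all cyclic rotations of a repeat unit."""
    return {seq[i:] + seq[:i] for i in range(len(seq))}


def canonical_motif(seq: str) -> str:
    """Pick one representative for a repeat unit: the alphabetically
    smallest string among all rotations on both strands."""
    seq = seq.upper()
    return min(rotations(seq) | rotations(reverse_complement(seq)))


# TAACC and GGTTA are reverse complements of each other.
print(reverse_complement("TAACC") == "GGTTA")

# TTAGG is a cyclic rotation of GGTTA.
print("TTAGG" in rotations("GGTTA"))

# All three notations therefore reduce to the same canonical unit.
print(len({canonical_motif(m) for m in ("TAACC", "GGTTA", "TTAGG")}) == 1)
```

Reducing each repeat unit to a single representative in this way is one possible approach to counting a telomeric repeat consistently, whichever strand or reading frame an assembly reports.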