A search limited to “complete genome” yields more sequences than the same search limited to “Gag” only. Why?
This is an artifact of how we define “complete genome”. A search for “complete genome” returns all sequences longer than 7000 base pairs. These “complete” genomes are not always 100% complete; many have a small truncation at the 5′ end of Gag. A search for “Gag” returns only sequences that contain a full-length Gag gene, so sequences with even a small Gag truncation are omitted and fewer sequences are returned. To include Gag sequences with small 5′ truncations, search by exact genome coordinates instead, setting the 5′ coordinate at the position of the greatest truncation you are willing to accept.
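The difference between the three kinds of search can be illustrated with a small sketch. This is not the database's actual search code; it is a toy model under stated assumptions. The `Hit` class, the function names, and the four example records are hypothetical, and the Gag coordinates (790 to 2292) are the commonly cited HXB2 positions, which you should verify against the reference annotation before relying on them.

```python
from dataclasses import dataclass

# Assumed HXB2 coordinates of the Gag gene (verify before use).
GAG_START = 790
GAG_END = 2292


@dataclass
class Hit:
    """Hypothetical record: a sequence and the genome coordinates it spans."""
    name: str
    start: int
    end: int


def is_complete_genome(hit, min_length=7000):
    # "Complete genome" here means only "longer than 7000 bases".
    return hit.end - hit.start + 1 > min_length


def covers_gag(hit, max_5prime_truncation=0):
    # A hit qualifies if it starts no later than the accepted 5' cutoff
    # and extends through the 3' end of Gag.
    return hit.start <= GAG_START + max_5prime_truncation and hit.end >= GAG_END


# Made-up example records, not real database entries.
hits = [
    Hit("A", 1, 9719),     # full genome, full Gag
    Hit("B", 820, 9600),   # long genome, Gag truncated by 30 bases
    Hit("C", 790, 2292),   # Gag only
    Hit("D", 1000, 9000),  # long genome, Gag truncated by 210 bases
]

complete = [h.name for h in hits if is_complete_genome(h)]
full_gag = [h.name for h in hits if covers_gag(h)]
gag_tolerant = [h.name for h in hits if covers_gag(h, max_5prime_truncation=50)]

print(complete)      # ['A', 'B', 'D']
print(full_gag)      # ['A', 'C']
print(gag_tolerant)  # ['A', 'B', 'C']
```

In this toy data set, the “complete genome” filter returns three records while the strict Gag filter returns two, because records B and D are long enough to count as complete but lack the first bases of Gag. Moving the 5′ cutoff 50 bases downstream recovers B but still excludes D.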