RefSeq NM_123456 and GenBank AF123456 appear to be duplicates. Will one be removed?
No, both records will remain available. RefSeq and GenBank are separate databases, and both are available in the Entrez nucleotides data set. Provisional RefSeq records are usually quite similar to the source GenBank records from which they were drawn. However, when experts review a RefSeq record, they often add sequence data, biological annotations, and references. After review, the RefSeq entry and its original source GenBank record(s) can differ considerably: the RefSeq entry can combine information from various labs, which are credited in the Comments and/or References fields of the record. The RefSeq database is designed to reduce duplication by selecting one representative sequence for each human locus, whereas GenBank is a repository that might contain numerous records for any given gene. The only duplicates in the RefSeq database will represent naturally occurring paralogs and splice variants.
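
If you need to tell the two kinds of record apart in a list of accessions, the accession format is usually enough. The sketch below is a heuristic, not an official NCBI tool: it assumes RefSeq accessions start with two letters and an underscore (as in NM_123456) and that GenBank-style accessions are one or two letters followed directly by digits (as in AF123456). The function name `classify_accession` is hypothetical.

```python
import re

# Heuristic patterns (assumptions, not an official specification):
# RefSeq accessions begin with a two-letter prefix and an underscore.
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")
# GenBank-style accessions are 1-2 letters followed directly by digits.
GENBANK_PATTERN = re.compile(r"^[A-Z]{1,2}\d+(\.\d+)?$")


def classify_accession(accession: str) -> str:
    """Return 'RefSeq', 'GenBank-style', or 'unknown' for an accession string."""
    acc = accession.strip().upper()
    if REFSEQ_PATTERN.match(acc):
        return "RefSeq"
    if GENBANK_PATTERN.match(acc):
        return "GenBank-style"
    return "unknown"


if __name__ == "__main__":
    for acc in ["NM_123456", "AF123456"]:
        print(acc, "->", classify_accession(acc))
```

A check like this only sorts records by database of origin. It does not tell you which GenBank record a given RefSeq entry was derived from; that information is in the RefSeq record itself.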