TY - JOUR
T1 - Reevaluation of human cytomegalovirus coding potential
AU - Murphy, Eain
AU - Rigoutsos, Isidore
AU - Shibuya, Tetsuo
AU - Shenk, Thomas E.
PY - 2003/11/11
Y1 - 2003/11/11
AB - The Bio-Dictionary-based Gene Finder was used to reassess the coding potential of the AD169 laboratory strain of human cytomegalovirus and sequences in the Toledo strain that are missing in the laboratory strain of the virus. The gene-finder algorithm assesses the potential of an ORF to encode a protein based on matches to a database of amino acid patterns derived from a large collection of proteins. The algorithm was used to score all human cytomegalovirus ORFs with the potential to encode polypeptides ≥50 aa in length. As a further test for functionality, the genomes of the chimpanzee, rhesus, and murine cytomegaloviruses were searched for orthologues of the predicted human cytomegalovirus ORFs. The analysis indicates that 37 previously annotated ORFs ought to be discarded, and at least nine previously unrecognized ORFs with relatively strong coding potential should be added. Thus, the human cytomegalovirus genome appears to contain ≈192 unique ORFs with the potential to encode a protein. Support for several of the predictions of our in silico analysis was obtained by sequencing several domains within a clinical isolate of human cytomegalovirus.
UR - http://www.scopus.com/inward/record.url?scp=0345445893&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0345445893&partnerID=8YFLogxK
DO - 10.1073/pnas.1735466100
M3 - Article
C2 - 14593199
AN - SCOPUS:0345445893
SN - 0027-8424
VL - 100
SP - 13585
EP - 13590
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 23
ER -