INFORMS Journal on Computing
INFORMS JOURNAL ON COMPUTING
Vol. 16, No. 4, Fall 2004, pp. 331-340
DOI: 10.1287/ijoc.1040.0087

Palindromes in SARS and Other Coronaviruses

David S. H. Chew, Kwok Pui Choi, Hans Heidner, Ming-Ying Leung

Department of Mathematics, National University of Singapore, Singapore 117543, Singapore
Departments of Mathematics, and of Statistics and Applied Probability, National University of Singapore, Singapore 117543, Singapore
Department of Biology, University of Texas at San Antonio, San Antonio, Texas 78249, USA
Department of Mathematical Sciences, University of Texas at El Paso, El Paso, Texas 79968, USA

matchewd{at}nus.edu.sg
matckp{at}nus.edu.sg
hheidner{at}utsa.edu
mleung{at}utep.edu

With the identification of a novel coronavirus associated with severe acute respiratory syndrome (SARS), computational analysis of its RNA genome sequence is expected to give useful clues to the origin, evolution, and pathogenicity of the virus. In this paper, we study the collective counts of palindromes in the SARS genome and in the genomes of all the completely sequenced coronaviruses. Based on a Markov-chain model for the genome sequence, we derive the mean and standard deviation of the number of palindromes at or above a given length. These theoretical results are complemented by extensive simulations that provide empirical estimates. Using z-scores computed from these theoretical and empirical means and standard deviations, we observe that palindromes of length four are significantly underrepresented in all the coronaviruses in our data set. In contrast, length-six palindromes are significantly underrepresented only in the SARS coronavirus. Two other features are unique to the SARS sequence. First, there is a length-22 palindrome, TCTTTAACAAGCTTGTTAAAGA, spanning positions 25962–25983. Second, the length-12 palindrome TTATAATTATAA occurs twice, spanning positions 22712–22723 and 22796–22807. Further investigations into the possible biological implications of these palindrome features are proposed.
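
The counting and z-score idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: that "palindrome" means a sequence equal to its own reverse complement (the two sequences quoted above are consistent with this), that the Markov chain is first order with add-one smoothing, and that the sequence uses only the uppercase letters A, C, G, and T. All function names are hypothetical. Only the simulation-based estimate is shown; the paper's analytical mean and standard deviation are not reproduced here.

```python
import random
from statistics import mean, stdev

# Assumption: sequences use the DNA alphabet A, C, G, T, as in the abstract.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_rc_palindrome(seq):
    """Return True if seq equals its own reverse complement."""
    return seq == seq.translate(COMPLEMENT)[::-1]


def count_palindromes(seq, length):
    """Count windows of the given (even) length that are palindromes.

    A longer palindrome contains a palindromic window of this length at its
    centre, so this also counts centres of palindromes of at least `length`.
    """
    return sum(
        is_rc_palindrome(seq[i:i + length])
        for i in range(len(seq) - length + 1)
    )


def fit_markov(seq):
    """Estimate first-order transition counts (add-one smoothing is assumed)."""
    counts = {a: {b: 1 for b in "ACGT"} for a in "ACGT"}
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return counts


def simulate(counts, n, rng):
    """Draw a length-n sequence from the fitted chain (uniform first base)."""
    out = [rng.choice("ACGT")]
    for _ in range(n - 1):
        row = counts[out[-1]]
        out.append(rng.choices("ACGT", weights=[row[b] for b in "ACGT"])[0])
    return "".join(out)


def z_score(seq, length, n_sim=200, seed=0):
    """Compare the observed count with counts from simulated sequences."""
    rng = random.Random(seed)
    counts = fit_markov(seq)
    sims = [
        count_palindromes(simulate(counts, len(seq), rng), length)
        for _ in range(n_sim)
    ]
    # Raises ZeroDivisionError if every simulated count is identical.
    return (count_palindromes(seq, length) - mean(sims)) / stdev(sims)
```

Under these assumptions, `is_rc_palindrome("TCTTTAACAAGCTTGTTAAAGA")` and `is_rc_palindrome("TTATAATTATAA")` both return `True`, and a strongly negative value of `z_score(genome, 4)` would correspond to the underrepresentation of length-four palindromes that the abstract reports.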

Key words: Markov chain; palindrome counts; simulation; RNA viral genome; severe acute respiratory syndrome
History: received August 2003; accepted January 2004.







Copyright © 2004 by INFORMS.