National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
The theory of the discrete-time Markovian arrival process (DMAP) can be applied to some statistical problems encountered when searching for multiple words in a Markov sequence. Such word searches are often emphasized in studies of the human genome. There are several advantages to the DMAP approach we present. Most notably, its derivations are transparent, and they readily unify disparate results about the exact distributions of overlapping and nonoverlapping word counts. We also present several examples and applications of our theory, including a numerical study using a random DNA dataset from the human genome.
park{at}ncbi.nlm.nih.gov
spouge{at}ncbi.nlm.nih.gov
Key words: Markovian arrival processes; word occurrences; distance between occurrences; transition-probability matrix
History: received July 2003; revised February 2004; accepted April 2004.