Vlado Dancik, Michael S. Waterman,
Simple Maximum Likelihood Methods for the Optical Mapping Problem.
Proceedings of the Workshop on Genome Informatics (GIW '97), 1997, (to appear).

Abstract.

Recently a new method for obtaining restriction maps was developed by David Schwartz at NYU. Using this method, restriction maps are created from fluorescent images of individual molecules obtained using a microscope. For every observed molecule, image processing methods are used to generate a list of the approximate locations of the sites where the molecule is cut by the restriction enzyme. Our task is to find the locations of all restriction sites given the observed cut-sites. The task is further complicated by the fact that the orientation of each molecule is unknown, i.e. for a cut-site $x$ we do not know whether $x$ or $1-x$ corresponds to a restriction site in a unit-length molecule.

First we consider the case that the orientation of all molecules and the number $k$ of restriction sites are known. We suppose that for each restriction site location $\theta_j$ the corresponding measured cut-sites follow the normal distribution with the density function $g(x;\theta_j,\sigma_j)$ for some $\sigma_j$. (That is, the measurement is unbiased: its mean is the true location $\theta_j$.) The observed cut-site locations $x_1,\dots,x_n$ then follow the mixture distribution $f(x;{\bf p},\theta,{\bf\sigma})=\sum_{j=1}^k p_jg(x;\theta_j,\sigma_j)$, where $\sum_{j=1}^k p_j=1$. Using the likelihood principle we wish to find parameters ${\bf p},\theta,{\bf \sigma}$ that maximize the likelihood function $\prod_{i=1}^nf(x_i;{\bf p},\theta,{\bf \sigma})$. In our case it is natural to assume that $p_1=\cdots=p_k=1/k$ and $\sigma_1=\cdots=\sigma_k=\sigma$ for a constant $\sigma$.
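
The likelihood above can be evaluated directly. The following is a minimal sketch in Python, assuming equal weights $p_j=1/k$ and a shared $\sigma$ as stated; the function names and the sample cut-sites are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density g(x; mean, sigma) of the normal distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_log_likelihood(cuts, thetas, sigma):
    """Log of prod_i f(x_i), with p_j = 1/k and one shared sigma."""
    k = len(thetas)
    total = 0.0
    for x in cuts:
        f = sum(normal_pdf(x, t, sigma) for t in thetas) / k
        total += math.log(f)
    return total

# Hypothetical cut-sites clustered around 0.2 and 0.6 (illustration only).
cuts = [0.19, 0.21, 0.22, 0.58, 0.60, 0.61]
good = mixture_log_likelihood(cuts, [0.2, 0.6], 0.02)
bad = mixture_log_likelihood(cuts, [0.4, 0.8], 0.02)
```

Site locations close to the clusters of observed cuts should give a larger log-likelihood than locations far from them, which is what maximizing over $\theta$ exploits.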

In optical mapping there frequently appear ``false'' cuts, i.e. cuts corresponding to no restriction site. In our model we accommodate false cuts by using a uniform component in the mixture distribution. We use the EM algorithm and Bayes' theorem to compute the maximum likelihood estimate and compare our results for the different variants of our model.
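
One way such an EM iteration could look is sketched below in Python. It is an illustration under stated assumptions, not the procedure from the paper: the uniform component has density 1 on $[0,1]$, its weight (called `false_rate` here) is re-estimated in each step, the normal components share the remaining weight equally and use one shared $\sigma$. All names, starting values and data are hypothetical.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density g(x; mean, sigma) of the normal distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_with_false_cuts(cuts, thetas, sigma, false_rate=0.1, iterations=50):
    """EM for k equal-weight normals plus a uniform 'false cut' component."""
    k, n = len(thetas), len(cuts)
    thetas = list(thetas)
    for _ in range(iterations):
        # E-step: Bayes' theorem gives the posterior probability that
        # each cut came from site j or from the uniform component.
        resp, false_resp = [], []
        for x in cuts:
            weights = [(1.0 - false_rate) / k * normal_pdf(x, t, sigma)
                       for t in thetas]
            uniform = false_rate * 1.0  # uniform density on [0, 1] is 1
            total = uniform + sum(weights)
            resp.append([w / total for w in weights])
            false_resp.append(uniform / total)
        # M-step: weighted means, pooled weighted variance, false-cut weight.
        sq_err, mass = 0.0, 0.0
        for j in range(k):
            r_j = sum(r[j] for r in resp)
            if r_j > 0.0:
                thetas[j] = sum(r[j] * x for r, x in zip(resp, cuts)) / r_j
            sq_err += sum(r[j] * (x - thetas[j]) ** 2
                          for r, x in zip(resp, cuts))
            mass += r_j
        if mass > 0.0:
            sigma = max(math.sqrt(sq_err / mass), 1e-6)  # avoid collapse to 0
        # Keep the weight strictly inside (0, 1) so the E-step stays defined.
        false_rate = min(max(sum(false_resp) / n, 1e-9), 1.0 - 1e-9)
    return thetas, sigma, false_rate

# Hypothetical data: two true sites near 0.2 and 0.6, one false cut at 0.90.
cuts = [0.18, 0.19, 0.20, 0.21, 0.22, 0.58, 0.59, 0.60, 0.61, 0.62, 0.90]
thetas, sigma, false_rate = em_with_false_cuts(cuts, [0.25, 0.55], 0.05)
```

In this toy data the isolated cut at 0.90 is absorbed almost entirely by the uniform component, so it does not pull either site estimate toward it.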

We explore how changing the orientation of some molecules influences the maximum likelihood estimate and show that the orientation question can, in our case, be answered for each molecule separately. Finally we present a few ideas for specifying the orientation of molecules without investigating the positions of restriction sites.
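
A per-molecule orientation decision might be sketched as follows. This is one plausible reading, not necessarily the rule used in the paper: given current estimates of the site locations, each molecule is scored both as observed and flipped ($x \mapsto 1-x$), and the orientation with the higher likelihood is kept. The names and data below are hypothetical.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density g(x; mean, sigma) of the normal distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def molecule_log_likelihood(cuts, thetas, sigma):
    """Log-likelihood of one molecule's cuts under the equal-weight mixture."""
    k = len(thetas)
    return sum(math.log(sum(normal_pdf(x, t, sigma) for t in thetas) / k)
               for x in cuts)

def orient_molecule(cuts, thetas, sigma):
    """Return the cuts in whichever orientation scores higher."""
    flipped = [1.0 - x for x in cuts]
    as_is = molecule_log_likelihood(cuts, thetas, sigma)
    reversed_score = molecule_log_likelihood(flipped, thetas, sigma)
    return flipped if reversed_score > as_is else cuts

# Hypothetical site estimates and molecules; the second one is reversed.
thetas = [0.2, 0.6]
molecules = [[0.21, 0.59], [0.41, 0.79], [0.19, 0.61]]
oriented = [orient_molecule(m, thetas, 0.02) for m in molecules]
```

Because each molecule is scored independently of the others, the decision for one molecule does not require revisiting the rest.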


Bibtex entry:

@InProceedings{DaWa97,
  author =       "V. Dan{\v c}{\'\i}k and Michael S. Waterman",
  title =        "Simple Maximum Likelihood Methods for the Optical
                  Mapping Problem",
  booktitle =    "Proceedings of the Workshop on Genome Informatics (GIW '97)",
  year =         "1997",
  note =         "(to appear)"
}


http://www-hto.usc.edu/people/dancik/seq97.html, webmaster@hto.usc.edu, 10 Nov 1997