Simple Pattern Distribution Finder

I originally wrote a small C program to identify the distribution of certain oligonucleotide patterns among given sequences, which are usually genomic DNA and can be as large as 300 Mb, like human chromosome 1. I recently realized that a handy tool for calculating sequence properties such as GC ratio, CpG density, or restriction site distribution would be helpful to others too, so I wrote a Perl CGI wrapper for the program so that everyone can use it on the web. This is still a primitive version: it cannot match multiple patterns at once, and the output format needs further improvement. What I like about it is that it handles very large sequences efficiently and produces its report in HTML format, which can be opened directly in Microsoft Excel, where a variety of charts can be drawn.
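To make the idea concrete, here is a minimal sketch of windowed pattern counting. It is written in Python for brevity and is not the C program behind this page. The function names (read_fasta, reverse_complement, window_counts) and the choice to count overlapping matches are assumptions made for illustration, and the window size is given in bases rather than Kb.

```python
from typing import Iterator, List, Tuple

# Translation table for complementing DNA bases (assumes plain A/C/G/T input).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].strip()
            # Use the first word of the header as the sequence name.
            name = header.split()[0] if header else "unnamed"
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def reverse_complement(pattern: str) -> str:
    """Return the pattern as it would read on the opposite strand."""
    return pattern.translate(_COMPLEMENT)[::-1]


def count_in_window(seq: str, pattern: str, start: int, end: int) -> int:
    """Count matches of pattern that begin in seq[start:end], overlaps allowed."""
    # A match may begin at end - 1 and run past the window, so extend the bound.
    bound = end + len(pattern) - 1
    count = 0
    pos = seq.find(pattern, start, bound)
    while pos != -1:
        count += 1
        pos = seq.find(pattern, pos + 1, bound)
    return count


def window_counts(seq: str, pattern: str, window: int,
                  both_strands: bool = False) -> List[Tuple[int, int]]:
    """Return (window_start, match_count) for consecutive windows of `window` bases."""
    # A set keeps palindromic patterns (e.g. CG, GAATTC) from being counted twice.
    patterns = {pattern.upper()}
    if both_strands:
        patterns.add(reverse_complement(pattern.upper()))
    rows = []
    for start in range(0, len(seq), window):
        end = min(start + window, len(seq))
        total = sum(count_in_window(seq, p, start, end) for p in patterns)
        rows.append((start, total))
    return rows
```

For example, window_counts("ACGCGTTTCG", "CG", 5) returns [(0, 2), (5, 1)]: the CpG dinucleotide begins at positions 1 and 3 in the first window and at position 8 in the second. A GC ratio per window could be obtained the same way by adding the counts for "G" and "C" and dividing by the window length.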

Query sequence: (FASTA format only; multiple sequences are accepted.)

Load your file or paste your sequence below.

Pattern to Find: User-defined pattern:

Size of Sampling Window (Kb): Sampling Strand:


Note: This web-based utility can handle sequences up to 500 Mb in size. For any problems or suggestions, please contact me!

Last modified 02-10-2002. Copyright (C) 2002 Kun Zhang. All Rights Reserved.