A mine with data for several organisms from LIS and other sources
Description: These upstream flanking regions all contain much of the Arabidopsis CArG1 binding site motif. The Shared Motif Search shown below will find that motif, GTTTACATAAATGGAAAA, shared to varying degrees by these regions, with a high score indicative of the length of the motif, the number of hits, and the CG content. It will also find many other sequences common to the regions that are less interesting and have lower scores. Had these regions been chosen for other reasons, we might look into whether this high-scoring shared motif is a transcription factor binding site.
Date Created:
Shared Motif Search
27 motifs close to top scorer:
The Shared Motif Search executes one BLAST+ run for each feature against the remaining features (with 10 features, there are 10 runs of 1×9). The BLAST results are then collated to find shared motifs. Only motifs that contain a C or G are included (there are always hundreds of shared motifs containing only A and T), and only motifs up to length 27 are shown. The list is truncated at 100 motifs. The score is based on the number of features that contain the motif (shown in the Num column), the base content of the motif (C and G score higher than A and T), and the length of the motif (longer motifs score higher). The group of features that share a motif is linked so you can create a list from it. If the top-scoring motif has other motifs close to it (as defined by a Smith-Waterman pairwise alignment distance), a sequence logo summarizing that group of motifs is displayed at the top, and the motifs in the group are marked with an asterisk. Group members that score below the top 100 are appended to the end of the list.
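The filtering and ranking rules described above can be sketched roughly as follows. This is only an illustration, not the tool's actual implementation: the page does not give the scoring formula, so the per-base weights in `BASE_WEIGHT` and the way `motif_score` combines them are assumptions, and the function names are hypothetical. Only the length cap of 27, the C-or-G requirement, the cutoff at 100 motifs, and the one-against-the-rest run layout come from the description.

```python
from typing import Dict, List, Set, Tuple

MAX_MOTIF_LENGTH = 27  # from the description: only motifs up to length 27 are shown
MAX_LISTED = 100       # from the description: the list is truncated at 100 motifs

# Assumed weights: the description only says C and G score higher than A and T.
BASE_WEIGHT = {"A": 1.0, "T": 1.0, "C": 2.0, "G": 2.0}


def one_vs_rest_runs(features: List[str]) -> List[Tuple[str, List[str]]]:
    """Pair each feature (the query) with all remaining features (the subjects)."""
    return [(query, [f for f in features if f != query]) for query in features]


def motif_score(motif: str, num_features: int) -> float:
    """Toy score: summed base weights (rewards length and C/G content)
    multiplied by the number of features containing the motif."""
    return num_features * sum(BASE_WEIGHT.get(base, 0.0) for base in motif)


def rank_motifs(hits: Dict[str, Set[str]]) -> List[Tuple[str, int, float]]:
    """Filter and rank motifs; `hits` maps a motif to the features containing it."""
    rows = []
    for motif, features in hits.items():
        motif = motif.upper()
        if len(motif) > MAX_MOTIF_LENGTH:
            continue  # too long to be shown
        if "C" not in motif and "G" not in motif:
            continue  # A/T-only motifs are excluded
        rows.append((motif, len(features), motif_score(motif, len(features))))
    rows.sort(key=lambda row: row[2], reverse=True)  # highest score first
    return rows[:MAX_LISTED]
```

With three features, `one_vs_rest_runs` yields three runs of one query against two subjects each, and `rank_motifs` drops an A/T-only motif such as `ATATAT` regardless of how many features share it.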
To find out whether a particular motif is a known quantity, use a tool like GOMo in the MEME Suite, which scans it for associated GO terms for a chosen species. Or, for example, look it up in a transcription factor binding sites list, like AGRIS for Arabidopsis. If you're interested in a particular motif, just type it in the search box to filter the results.
1. BLAST: A greedy algorithm for aligning DNA sequences. Zheng Zhang, Scott Schwartz, Lukas Wagner and Webb Miller, J. Comput. Biol. 7, 203 (2000).
2. BLAST+: architecture and applications. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., & Madden T.L., BMC Bioinformatics 10, 421 (2009).
3. BioJava: an open-source framework for bioinformatics. Andreas Prlic, Andrew Yates, Spencer E. Bliven, Peter W. Rose, Julius Jacobsen, Peter V. Troshin, Mark Chapman, Jianjiong Gao, Chuan Hock Koh, Sylvain Foisy, Richard Holland, Gediminas Rimsa, Michael L. Heuer, H. Brandstatter-Muller, Philip E. Bourne, Scooter Willis, Bioinformatics 28, 2693 (2012).
4. WebLogo: A sequence logo generator. Crooks GE, Hon G, Chandonia JM, Brenner SE, Genome Research 14, 1188 (2004).