# What is Dotlet in Bioinformatics

Why is this knowledge important? Working intensively with dot plots is worthwhile because the data structure of the 2D matrix and the concept of evaluating diagonals are taken up again in other algorithms for sequence comparison (keyword: calculation of alignments). In addition, dot plots make it easy to compare the genome composition of closely related species.

Reference: These exercises supplement chapter 9, "Pairwise sequence comparison".

Learning objective

After completing the exercise, you should understand the principle of the dot matrix.

Some of the following examples are taken from the Dotlet package by M. Pagni and T. Junier.

Exercise Dotplot_1

The following sequences are given:

SEQ_A: G H R Q S G G
SEQ_B: S G G R G

Calculate a dot plot D with paper and pencil. The following condition applies to the filling of the matrix D:

Discuss the outcome. What is the longest common infix?
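To check a hand-drawn result, the matrix can also be filled programmatically. The following is a minimal sketch, assuming the usual identity condition (cell D[i][j] is marked when residue i of SEQ_A equals residue j of SEQ_B) and assuming the sequences read GHRQSGG and SGGRG. The function names `dot_matrix` and `longest_common_infix` are illustrative and not part of Dotlet.

```python
def dot_matrix(seq_a, seq_b):
    """Return D with D[i][j] = 1 where seq_a[i] == seq_b[j], else 0 (assumed condition)."""
    return [[1 if a == b else 0 for b in seq_b] for a in seq_a]


def longest_common_infix(seq_a, seq_b):
    """Return the longest run of marked cells along any diagonal of D."""
    d = dot_matrix(seq_a, seq_b)
    # run[i + 1][j + 1] = length of the diagonal run of 1s ending at D[i][j]
    run = [[0] * (len(seq_b) + 1) for _ in range(len(seq_a) + 1)]
    best_len, best_end = 0, 0
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            if d[i][j]:
                run[i + 1][j + 1] = run[i][j] + 1
                if run[i + 1][j + 1] > best_len:
                    best_len, best_end = run[i + 1][j + 1], i + 1
    return seq_a[best_end - best_len:best_end]


SEQ_A = "GHRQSGG"  # assumed reading of SEQ_A
SEQ_B = "SGGRG"    # assumed reading of SEQ_B

# Print the matrix: rows are SEQ_A, columns are SEQ_B, '*' marks a match.
print("  " + " ".join(SEQ_B))
for residue, row in zip(SEQ_A, dot_matrix(SEQ_A, SEQ_B)):
    print(residue, " ".join("*" if cell else "." for cell in row))
print("Longest common infix:", longest_common_infix(SEQ_A, SEQ_B))
```

The longest common infix corresponds to the longest unbroken diagonal of marked cells, which is why the second function only needs to track run lengths along diagonals.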
Solution

Here you will find the solution to the task.

Preparation for the following exercises

Clicking this link activates Dotlet, a program that interactively generates dot plots. Submit the following sequences to the applet via copy and paste. To do this, press the Input button each time and give the sequences the specified names. Read the Dotlet help file.

Exercise Dotplot_2

Compare the SLIT_DROME sequence with itself. How do you interpret the pattern?

Set the zoom factor to 1:5.

Adjust the controls so that identical regions are clearly visible.

Navigate with the mouse to determine at which positions within the sequence there are similar partial sequences.
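When a sequence is compared with itself, the main diagonal is trivially filled, and repeated regions appear as additional lines parallel to it. The following is a minimal sketch of how such off-diagonal lines could be located for exact matches only. The function `self_repeats` and the toy sequence are hypothetical and unrelated to SLIT_DROME; Dotlet itself scores windows with a substitution matrix rather than requiring identity.

```python
def self_repeats(seq, min_len=3):
    """Return (i, j, length) for identical runs starting at 0-based i < j.

    Each offset j - i corresponds to one diagonal above the main diagonal
    of the self dot plot.
    """
    n = len(seq)
    hits = []
    for offset in range(1, n):
        i = 0
        while i + offset < n:
            length = 0
            # Extend the run along the diagonal while residues are identical.
            while (i + offset + length < n
                   and seq[i + length] == seq[i + offset + length]):
                length += 1
            if length >= min_len:
                hits.append((i, i + offset, length))
            i += max(length, 1)
    return hits


# Toy sequence (made up): the block "ABCD" occurs twice.
print(self_repeats("ABCDXXABCD", min_len=4))
```

Each reported triple gives the start positions of the two copies and the length of the shared run, which mirrors reading the coordinates of an off-diagonal line with the mouse.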

You can find the database entry here. Check whether the regions you identified are listed and annotated in the Features section.

Exercise Dotplot_3

Both sequences contain a zinc protease domain.

 At which positions in the sequences are the domains located?

You can find the database entry here. Check whether the regions you identified are listed and annotated in the Features section.

Now use Identity as the matrix and note where the domains are located in the two proteins.

Confirm your analysis by checking the location of the domains with the SMART server. Submit one of the named sequences and start the analysis by pressing the Sequence SMART button.

Exercise Dotplot_4

Compare the SERA_PLAFG sequence with itself. Use the BLOSUM 30 matrix.

The sequence contains a region of low complexity, in this case a sequence of more than 30 serine residues.

At which positions do you find the region(s) of low complexity?
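A homopolymer stretch such as the serine run can also be located directly in the sequence string, which is a useful cross-check for the positions read off the dot plot. The following is a minimal sketch using a regular expression. The function `residue_runs` is a hypothetical helper, and the toy sequence is made up rather than taken from SERA_PLAFG.

```python
import re


def residue_runs(seq, residue="S", min_len=31):
    """Return (start, end) positions, 1-based and inclusive, of runs of `residue`.

    The default min_len=31 reflects "more than 30" residues.
    """
    pattern = re.compile(re.escape(residue) + "{%d,}" % min_len)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(seq)]


# Toy example (made up): a run of five serines starting at position 4.
print(residue_runs("AAASSSSSKK", residue="S", min_len=3))
```

In a self dot plot, such a run fills a square block rather than a thin diagonal, because every position in the run matches every other position in it.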

Exercise Dotplot_5, introns and exons

In the 1st window (horizontal), select the sequence of the calmodulin gene (EMECALM), and in the 2nd window that of the gene product CALM_EMENI.

Use the BLOSUM 100 matrix and a sliding window of length 7.
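The effect of a sliding window can be illustrated with a small sketch: instead of marking single matching residues, the scores of all residue pairs inside a window along the diagonal are summed, and a dot is set only when the sum reaches a threshold. The example below assumes a simple identity score (1 for a match, 0 otherwise) as a stand-in for BLOSUM 100, whose values are not reproduced here. The function `windowed_dots`, its parameters, and the threshold are hypothetical; Dotlet's own display maps window scores to grey levels instead of applying a fixed cutoff.

```python
def windowed_dots(seq_a, seq_b, window=7, threshold=5, score=None):
    """Return (i, j) start positions whose window score reaches the threshold."""
    if score is None:
        # Assumed stand-in for a substitution matrix: identity scoring.
        score = lambda a, b: 1 if a == b else 0
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            total = sum(score(seq_a[i + k], seq_b[j + k]) for k in range(window))
            if total >= threshold:
                dots.append((i, j))
    return dots


# Toy example (made up): two identical sequences of length 8, window of 7.
print(windowed_dots("ABCDEFGH", "ABCDEFGH", window=7, threshold=7))
```

A longer window suppresses chance matches between short stretches, at the cost of blurring the exact boundaries of matching regions. A different scoring scheme can be supplied through the `score` argument.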

Can you see the intron/exon structure? What conditions must exist for such an analysis to be carried out?

Before the comparison, Dotlet translates the DNA sequence into the protein sequence in all three reading frames.
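Translation in three reading frames means reading the DNA in codons starting at the first, second, and third base. The following is a minimal sketch, assuming the standard genetic code and forward-strand frames only. The names `translate` and `three_frames` are illustrative and do not describe how the program implements this step internally.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame):
    """Translate `dna` starting at 0-based offset `frame` (0, 1, or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing unknown characters are rendered as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(dna):
    """Return the translations for the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


# Toy example (made up): a six-base fragment.
print(three_frames("ATGGCC"))
```

Because introns interrupt the coding sequence, consecutive exons need not lie in the same frame of the genomic DNA, so a match to the protein can switch frames from one exon to the next.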

Note

The calculation and output can take some time.

If you are not familiar with the terms intron and exon, please look them up on the Internet.

What you should have understood by now

The 2D matrix is an important data structure for the pairwise analysis of sequences. Regions with identical partial sequences show up as diagonal elements, and insertions or deletions show up as gaps in one of the sequences.