Skip to content

Latest commit

 

History

History
125 lines (93 loc) · 3.67 KB

README.md

File metadata and controls

125 lines (93 loc) · 3.67 KB

UniFreiburg Banner

Lehrstuhl für Bioinformatik - Institut für Informatik - http://www.bioinf.uni-freiburg.de


Bioinformatics 2

SS 2023
Exercise sheet 10: Introduction to mapping

Exercise 4 - Programming Assignment

In this exercise, you will implement Burrows-Wheeler Transform (BWT) and its inverse.

a) To start building the Burrows-Wheeler matrix, you need to implement a helper function that returns all the rotations of the given string. Implement the function rotations which takes a string and returns a list of all its rotations.

Example: (Spoiler)
 >>> all_rotations = rotations("abaaba$")
 >>> print(all_rotations)
 ['abaaba$', 'baaba$a', 'aaba$ab', 'aba$aba', 'ba$abaa', 'a$abaab', '$abaaba']

b) Build the Burrows-Wheeler matrix. Implement the function bwm which takes a string and returns the Burrows-Wheeler matrix. The matrix should be a list of string rotations sorted alphabetically.

Example: (Spoiler)
 >>> matrix = bwm("abaaba$")
 >>> print(matrix)
 ['$abaaba', 'a$abaab', 'aaba$ab', 'aba$aba', 'abaaba$', 'ba$abaa', 'baaba$a']

с) Implement the transformation with the Burrows-Wheeler matrix. Implement the function bwt_with_bwm which takes a string and returns the Burrows-Wheeler transform as a string.

Example: (Spoiler)
 >>> t = bwt_with_bwm("abaaba$")
 >>> print(t)
 abba$aa

d) Check your understanding of the Burrows-Wheeler transform. Implement the function transformation_to_first_colum which takes the BW transform string t (the last column of the BW matrix) and returns the string corresponding to the matrix's first column. Just so you know, you do not need to build the Burrows-Wheeler matrix to do this.

Example: (Spoiler)
 >>> f = transformation_to_first_colum("abba$aa")
 >>> print(f)
 $aaaabb

e) Implement a helper function transformation_to_dict_occurrences. This function takes the string transformation (t) and returns the dictionary with the number of occurrences of each character in the transformation.

Example: (Spoiler)
 >>> occurrences = transformation_to_dict_occurrences("abba$aa")
 >>> print(occurrences)
 {'a': 4, 'b': 2, '$': 1}

f) Implement a helper function transformation_b_ranking. This function takes the string transformation (t) and returns the B-ranking of the characters in the transformation as a list.

Example: (Spoiler)
 >>> ranks = transformation_b_ranking("abba$aa")
 >>> print(ranks)
 >>> print(occurrences)
 [0, 0, 1, 1, 0, 2, 3]
 {'a': 4, 'b': 2, '$': 1}

g) Implement the helper function bwm_first_column_interval_form_from_transform. This function takes the string transformation (t) and returns the first column of the BW matrix in the interval form i.e. the dictionary where the keys are the characters and the values are tuples (start, end) of the character occurrences.

Example: (Spoiler)
 >>> f_column_intervals = bwm_first_column_interval_form_from_transform("abba$aa")
 >>> print(f_column_intervals)
 {'$': (0, 1), 'a': (1, 5), 'b': (5, 7)}

g) Implement the function reverse_bwt. This function takes the BW transformation string (t) and returns the original string.

Example: (Spoiler)
 >>> s = reverse_bwt("abba$aa")
 >>> print(s)
 abaaba$