Hello, reader ๐๐ฝ ! Welcome to day 32 of the series on Problem Solving. Through this series, I aim to pick up at least one question everyday and share my approach for solving it.
Today, I will be picking up LeetCode's daily challenge problem: 433. Minimum Genetic Mutation.
๐ค Problem Statement
- Given the two gene strings start and end and the gene bank bank, return the minimum number of mutations needed to mutate from start to end. If there is no such a mutation, return -1.
- A mutation is defined as a change in one single character in the gene string.
- A gene string can be represented by an 8-character long string, with choices from 'A', 'C', 'G', and 'T'.
- Each of the resultant gene mutation strings must be present in the given gene bank. Only then it can be used to mutate to the end string.
๐ฌ Thought Process - BFS
- To start from one string and reach to a destinations -> source to destination -> the problem in gist.
- In the get go, this does not look like a graph problem at all. But it is.
- From a node, we try to change each of the characters in that string among the given list of characters
['A', 'C', 'G', 'T']
. So all the resultant strings with single character change become neighbours of that parent node. - For e.g. the string
"AACCGGTT"
are some of the neighbours with mutated first letter."CACCGGTT"
"GACCGGTT"
"TACCGGTT"
- So this is what we'll do. Generate all possible mutations from the start string, and then iteratively check for mutations of those neighbours.
- But we will also add an additional check that a mutation is valid/ relevant only if it's also in the bank.
For e.g.: if our bank had the following strings:
["CACCGGTT"]
. Then"GACCGGTT"
and"TACCGGTT"
will not be valid mutations and hence not be considered in the further checks.So, we will run a normal BFS algorithm with a queue and a visited set. At each iteration, we try to generate the neighbours of the front node.
- If the neighbours are part of the bank, then we will continue with them.
- This process is repeated until:
- The queue is empty -> we have found no possible mutation
- Found the end string.
- We will also keep track of the number of steps needed for the mutations from start to end.
Why BFS?
- Since we need to find the minimum number of steps required and all the mutation costs 0, this basically means we need to find the shortest path that changes start to end.
- Hence we can use BFS which gives the shortest path in an undirected graph from start to destination.
Note: In the code, I've declared a new class Pair(String, int) that tells the current string and also the number of mutations it took to reach that string. Additionally an inner loop could have been used to process all the nodes at a single level as well.
๐ฉ๐ฝโ๐ป Solution - BFS
class Pair {
String node;
int mutations;
Pair(String node, int mutations) {
this.node = node;
this.mutations = mutations;
}
}
class Solution {
public int minMutation(String start, String end, String[] bank) {
Queue<Pair> queue = new LinkedList();
Set<String> set = new HashSet();
Pair startPair = new Pair(start, 0);
queue.add(startPair);
set.add(start);
char[] geneChars = new char[] {'A', 'C', 'G', 'T'};
while(!queue.isEmpty()) {
Pair p = queue.poll();
String s = p.node;
int mutations = p.mutations;
if(s.equals(end)) {
return mutations;
}
for(char ch: geneChars) {
for(int i = 0; i<s.length(); i++) {
String nextGene = s.substring(0,i) + ch + s.substring(i+1);
Pair next = new Pair(nextGene, mutations + 1);
if(!set.contains(nextGene) && isInBank(nextGene, bank)) {
set.add(nextGene);
queue.add(next);
}
}
}
}
return -1;
}
private boolean isInBank(String s, String[] bank) {
for(String b: bank) {
if(s.equals(b)) return true;
}
return false;
}
}
Time Complexity: O(n)
- n = length of bank array
Space Complexity: O(1)
- The time complexities can be considered constants as well, because we are given that the gene string sizes will be 8, and the bank size will not exceed 8 as well.
- Hence the mutations we have are also limited.
- But since the bank size can vary, we can say that the algorithm runs proportional to the bank size.
- You can find the code for this question in the GitHub repo here: 433. Minimum Genetic Mutation.
Conclusion
That's a wrap for today's problem. If you liked my explanation then please do drop a like/ comment. Also, please correct me if I've made any mistakes or if you want me to improve something!
Thank you for reading!