Bitmask DP ( Buildup to SoS DP - Pt. 2 ) ( tutorial to SPOJ - ASSIGN )
[ Pre-requisites - Basic DP, bitmasks ]
First up, if you don't know what a bitmask is or need a refresher, please look up some good tutorials on the topic. This should be a good tutorial, taken directly from Steven and Felix Halim's legendary CP book. I have a blog post about bitmasks here, but it's more of a set of "points to remember" than a complete tutorial. Once you are done, resume.
DP, i.e, "Dynamic Programming", the arch nemesis of many competitive programmers including me, is probably the hardest problem solving paradigm to adapt to and to apply effectively in a timed contest. If you have absolutely no idea about DP, please bookmark this page for later and go learn the basics. It would take a few hours/days/weeks but it's totally worth it.
As of late, I have been messing around with DP a bit more, and I can definitely see some "meta" categories into which most DP problems can be classified into, such as "basic"/"classic" DP, Digit DP, DP on trees, Bitmask DP and quite a few others (a more detailed list is available here ).
Today I'll be tackling Bitmask DP, as already evident by the blog title. I'll call it BMDP in short.
The thought process behind BMDP is quite similar to "classic" DP problems, and as such standalone BMDP problems are not too hard to figure out, about Codeforces div2 C level in my opinion.
Generally speaking, whenever you see a problem involving "all possible subsets of a given set", it can be one of the following :
1. Maths.
2+2=4-1=3quickmaths. But how to know whether it's a math problem ?
a. Look at the constraints. If size of superset >30, it is most probably not BMDP, may be maths. If size of set is absurdly large such as 10^6 or 10^9 etc., highly probable that it is a O(1) or polynomial time maths problem, as in order to generate all possible subsets you will need at least O(2^N)
b. Individual items of the set are not important. If you see that the solution doesn't depend on exactly which elements are chosen in the subset but it depends on some other property of the subset such as number of elements, number of repeated elements etc., it is probably a maths problem.
2. Bitmask DP
How to spot ?
a. Look at the constraints. If size of superset is <=25, BMDP is very much feasible. If it is between 25 to 30, it might be BMDP with pruning or some special optimization. As BMDP takes at least O(2^N) , you can generally rely on it to work in under 1sec with no special optimization for N<=27 if your implementation is efficient.
b. Individual items of the set generally have some importance. In general, some sort of value could be provided as input/computed for each element of the set, which should be essential in calculating the effect of each possible subset on the final answer. You will see this in a while.
3. Something else
It can be anything. The world of algorithms is too huge for me to predict exactly how a problem should be approached.
As we see, BMDP is used in a variety of questions related to Hamiltonian walks and cycles.
Now, lets take a look at SPOJ-ASSIGN
If you look at the problem in a certain way ( consider the input matrix as an adjacency matrix of a graph ), the problem is quite similar to counting hamiltonian cycles but in a contest scenario, you will most probably never think this way.
A better way to think about the problem is as follows.
For any given number of students, consider that students numbered from 0 to 'k' have been already assigned to a subject each, and the rest 'n-k' students remain to be assigned.
Now, as there is no defined weightage of different students on the same subject, we do not have to bother about which student was placed into which subject. We only need to know which subjects are booked and which students have been assigned.
In order to store which subjects have been taken up till now, we will use a bitmask 'B'.
Also, from the variable 'k', we find that all students from 0 to 'k' have been assigned.
DP(k,B) = Total no. of possible arrangements of students 0 to k choosing subset of subjects as per bitmask B.
(Keep in mind that every student CAN NOT be assigned into every subject, otherwise it would turn into a simple math problem)
Time for some optimization. The best way to optimize a DP solution is to reduce its "dimension", i.e, reduce the no. of state variables. In or case, we can see that k = No. of set bits in B. So, k can be easily calculated from B and hence not a part of our DP states anymore.
The new DP state is :
DP(B) = Total no. of possible arrangements of students 0 to k assigning subjects as per bitmask B.
where k = count set bits of B
For a state B , we can define the recurrence relation as follows :
where k = number of set bits in B,
DP(B) = for each i'th set bit in B, check whether k'th student likes i'th subject. if true, DP(B) += DP(unset i'th bit from B)
basically, for any arrangement, we remove one bit at a time and add up the resulting state if it satisfies the "liking" criteria. For example :
DP(1011) = DP(0011) + DP(1001) + DP(1010)
(assuming 3rd student likes 1st, 3rd and 4th subjects. If he didn't like any of those, then those disliked states wouldn't be added)
logically, this can be explained as follows :
for a DP state 1011, the 3rd student can be assigned to either 1st,3rd or 4th subject. Now, if the student was assigned to 1st subject, then the number of ways to assign the previous students is given by DP(0011). Similarly, if 3rd student gets 3rd subject, we add DP(1001), and for 4th subject we add DP(1010).
First up, if you don't know what a bitmask is or need a refresher, please look up some good tutorials on the topic. This should be a good tutorial, taken directly from Steven and Felix Halim's legendary CP book. I have a blog post about bitmasks here, but it's more of a set of "points to remember" than a complete tutorial. Once you are done, resume.
DP, i.e, "Dynamic Programming", the arch nemesis of many competitive programmers including me, is probably the hardest problem solving paradigm to adapt to and to apply effectively in a timed contest. If you have absolutely no idea about DP, please bookmark this page for later and go learn the basics. It would take a few hours/days/weeks but it's totally worth it.
As of late, I have been messing around with DP a bit more, and I can definitely see some "meta" categories into which most DP problems can be classified into, such as "basic"/"classic" DP, Digit DP, DP on trees, Bitmask DP and quite a few others (a more detailed list is available here ).
Today I'll be tackling Bitmask DP, as already evident by the blog title. I'll call it BMDP in short.
The thought process behind BMDP is quite similar to "classic" DP problems, and as such standalone BMDP problems are not too hard to figure out, about Codeforces div2 C level in my opinion.
The basic idea
The basic idea is to treat each subset as a state of the DP, and try to figure out a transition relation between different states so that we can generate different subsets from an existing subset.How to spot a BMDP problem?
Generally speaking, whenever you see a problem involving "all possible subsets of a given set", it can be one of the following :
1. Maths.
2+2=4-1=3quickmaths. But how to know whether it's a math problem ?
a. Look at the constraints. If size of superset >30, it is most probably not BMDP, may be maths. If size of set is absurdly large such as 10^6 or 10^9 etc., highly probable that it is a O(1) or polynomial time maths problem, as in order to generate all possible subsets you will need at least O(2^N)
b. Individual items of the set are not important. If you see that the solution doesn't depend on exactly which elements are chosen in the subset but it depends on some other property of the subset such as number of elements, number of repeated elements etc., it is probably a maths problem.
2. Bitmask DP
How to spot ?
a. Look at the constraints. If size of superset is <=25, BMDP is very much feasible. If it is between 25 to 30, it might be BMDP with pruning or some special optimization. As BMDP takes at least O(2^N) , you can generally rely on it to work in under 1sec with no special optimization for N<=27 if your implementation is efficient.
b. Individual items of the set generally have some importance. In general, some sort of value could be provided as input/computed for each element of the set, which should be essential in calculating the effect of each possible subset on the final answer. You will see this in a while.
3. Something else
It can be anything. The world of algorithms is too huge for me to predict exactly how a problem should be approached.
Classical problems for BMDP
In my opinion, when you set out to learn a new DS/algo, the best way to proceed is to study the classical problems related to it. A nice read up on BMDP classics can be found here.As we see, BMDP is used in a variety of questions related to Hamiltonian walks and cycles.
Solving SPOJ - ASSIGN
Now, lets take a look at SPOJ-ASSIGN
If you look at the problem in a certain way ( consider the input matrix as an adjacency matrix of a graph ), the problem is quite similar to counting hamiltonian cycles but in a contest scenario, you will most probably never think this way.
A better way to think about the problem is as follows.
For any given number of students, consider that students numbered from 0 to 'k' have been already assigned to a subject each, and the rest 'n-k' students remain to be assigned.
Now, as there is no defined weightage of different students on the same subject, we do not have to bother about which student was placed into which subject. We only need to know which subjects are booked and which students have been assigned.
In order to store which subjects have been taken up till now, we will use a bitmask 'B'.
Also, from the variable 'k', we find that all students from 0 to 'k' have been assigned.
Defining the DP state
So, we can define our DP state with two variables, 'k' and 'B' as :DP(k,B) = Total no. of possible arrangements of students 0 to k choosing subset of subjects as per bitmask B.
(Keep in mind that every student CAN NOT be assigned into every subject, otherwise it would turn into a simple math problem)
Time for some optimization. The best way to optimize a DP solution is to reduce its "dimension", i.e, reduce the no. of state variables. In or case, we can see that k = No. of set bits in B. So, k can be easily calculated from B and hence not a part of our DP states anymore.
The new DP state is :
DP(B) = Total no. of possible arrangements of students 0 to k assigning subjects as per bitmask B.
where k = count set bits of B
Finding the recurrence relation
The base case should be DP(0) = 1 , as there is exactly 1 way to arrange 0 students and 0 subjects.For a state B , we can define the recurrence relation as follows :
where k = number of set bits in B,
DP(B) = for each i'th set bit in B, check whether k'th student likes i'th subject. if true, DP(B) += DP(unset i'th bit from B)
basically, for any arrangement, we remove one bit at a time and add up the resulting state if it satisfies the "liking" criteria. For example :
DP(1011) = DP(0011) + DP(1001) + DP(1010)
(assuming 3rd student likes 1st, 3rd and 4th subjects. If he didn't like any of those, then those disliked states wouldn't be added)
logically, this can be explained as follows :
for a DP state 1011, the 3rd student can be assigned to either 1st,3rd or 4th subject. Now, if the student was assigned to 1st subject, then the number of ways to assign the previous students is given by DP(0011). Similarly, if 3rd student gets 3rd subject, we add DP(1001), and for 4th subject we add DP(1010).
Implementation
in practice, we create a single dimensional array of size 1<<20 (i.e. 2^20) . Let it be called DP[ ].
First, set DP[0] = 1 ;
then , run a loop from 1 to (1<<n)-1 (i.e. (2^n)-1) to generate all possible bitmasks
for each index of the loop, apply the recurrence relation as discussed above
finally, DP[(1<<n)-1] gives the answer.
find my implementation below : (sorry for the messed up indentation, github has been acting crazy)
I hope you learned something new today :)
This comment has been removed by the author.
ReplyDeletethanks really nice
ReplyDeleteThis comment has been removed by the author.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDelete