checksum

Problem

Grace and Edsger are constructing a N×N boolean matrix A. The element in i-th row and j-th column is represented by Ai,j. They decide to note down the checksum (defined as bitwise XOR of given list of elements) along each row and column. Checksum of i-th row is represented as Ri. Checksum of j-th column is represented as Cj.

For example, if N=2, A=[1011], then R=[10] and C=[01].

Once they finished the matrix, Edsger stores the matrix in his computer. However, due to a virus, some of the elements in matrix A are replaced with 1 in Edsger's computer. Luckily, Edsger still remembers the checksum values. He would like to restore the matrix, and reaches out to Grace for help. After some investigation, it will take Bi,j hours for Grace to recover the original value of Ai,j from the disk. Given the final matrix A, cost matrix B, and checksums along each row (R) and column (C), can you help Grace decide on the minimum total number of hours needed in order to restore the original matrix A?

Input

The first line of the input gives the number of test cases, T. T test cases follow.

The first line of each test case contains a single integer N.

The next N lines each contain N integers representing the matrix A. j-th element on the i-th line represents Ai,j.

The next N lines each contain N integers representing the matrix B. j-th element on the i-th line represents Bi,j.

The next line contains N integers representing the checksum of the rows. i-th element represents Ri.

The next line contains N integers representing the checksum of the columns. j-th element represents Cj.

Output

For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is the minimum number of hours to restore matrix A.

Limits

Memory limit: 1 GB.
1<=T<=100.
1<=Ai,j<=1, for all i,j.
1<=Bi,j<=1000, for i,j where Ai,j=1, otherwise Bi,j=0.
0<=Ri<=1, for all i.
0<=Cj<=1, for all j.
It is guaranteed that there exist at least one way to replace 1 in A with 0 or 1 such that R and C as satisfied.

Test Set 1

Time limit: 20 seconds.
1<=N<=4.

Test Set 2

Time limit: 35 seconds.
1<=N<=40.

Test Set 3

Time limit: 35 seconds.
1<=N<=500.

In Sample Case #1, A1,2 can be restored using the checksum of either 1-st row or 2-nd column. Hence, Grace can restore the matrix without spending any time to recover the data.

In Sample Case #2, Grace spends one hour to recover A1,1. After that, she can use checksums of 1-st row and 1-st column to restore A1,2 and A2,1 respectively. And then she can use checksum of 2-nd row to restore A2,2. Hence, Grace can restore the matrix by spending one hour.

In Sample Case #3, Grace can spend one hour to recover A1,1 and another hour to recover A2,2. After that, she can use checksum to restore the rest of the matrix. Hence, Grace can restore the matrix by spending two hours in total.

Sample Input & Output

3
3
1 -1 0
0 1 0
1 1 1
0 1 0
0 0 0
0 0 0
1 1 1
0 0 1
2
-1 -1
-1 -1
1 10
100 1000
1 0
0 1
3
-1 -1 -1
-1 -1 -1
0 0 0
1 1 3
5 1 4
0 0 0
0 0 0
0 0 0
Case #1: 0
Case #2: 1
Case #3: 2

Rust Solution

Analysis

Suppose we know a subset S of elements of the matrix A because they are either given in the input or revealed by Edsger. Can we restore the remaining elements? If there is only one unknown element in some row or column, we can recover its true value from the known elements in that row or column and the respective checksum. We can apply such an element recovery step until all elements are known or we end up with a matrix where each row and column has at least two unknown elements (aside from fully reconstructed rows and columns). It will be shown later that the whole matrix can be restored using this simple strategy or the subset S does not provide enough information to do so.

Test Set 1

For the small test set, we can examine every potential subset S of matrix elements and check if S is sufficient to restore the whole matrix. Among all sufficient subsets S, we pick the smallest in terms of hours as our answer.

Given a subset S of known elements, the above matrix recovery strategy can be implemented in O(N2) time. For example, we can use a BFS-like algorithm iterating over a queue of elements that are the only unknown elements in their rows or columns and calculating their true values one by one. For the sake of efficiency, we also need to maintain the XOR values of currently known elements for each row and column as well as the number of unknown elements per each row and column. As soon as the number of unknown elements for a row or column becomes 1, we add that element to the queue.

There are O(2N2) subsets of elements. Consequently, the overall time complexity of this brute-force algorithm is O(2N2×N2).

Test Set 2

Let us look at our problem from graphs perspective. Namely, we construct a weighted bipartite graph G, where the rows and columns of the matrix A are represented by nodes in G, and there is an edge of weight Bi,j between i-th row and j-th column if and only if Ai,j=1. Please see the below example to make the construction more clear.

An isolated node represents a row or column with all its elements known. We can safely disregard such nodes. A leaf node with exactly one incident edge represents a row or column with precisely one unknown element. Note that the process outlined in the first paragraph corresponds to repeated removal of leaves from the graph. If we end up with an empty graph in this way, it means that the original graph must have been a forest without any cycles and we can recover the full matrix without spending any time.

So what if the graph G does contain a cycle? Given any assignment of binary values to the elements of the matrix, we can flip the values of elements corresponding to the edges of a cycle, and this operation would not change the XOR checksum of any row or column. Consequently, we cannot tell the true value of elements on a cycle unless we reveal at least one of them, and effectivelly break the cycle by removing the edge from the graph and paying a delicious price. In other words, in order to be able to recover the whole matrix, it is necessary to break all cycles by revealing and removing some edges. It is also sufficient — once all cycles have been broken, what remains is a forest of edges, and the true value of all remaining edges can be determined unambiguously.

Thus we have reduced the original problem to finding a minimum weight subset of edges that breaks all cycles. An intuitive greedy approach involves iterating over the edges in a non-decreasing order of weights and removing the current edge from the graph if it is part of a cycle. An edge is part of a cycle if there is a simple path between its end-nodes other than the edge itself — a condition that can be tested by, say, running a depth-first search from one of the end-nodes.

In our example above, once we remove the cheapest edge of cost 1 (the red edge), what remains is a tree, so no other edges will be removed.

Since there can be up to N2 edges, the above steps are repeated O(N2) times. One run of the depth-first search costs O(N2) time as well, so the overall time complexity of this approach is O(N4).

But what about the correctness of this greedy approach? The proof is very similar to that of Kruskal's algorithm. Consider the first edge e that is removed by the algorithm, so it has the smallest weight among all edges on cycles. In particular, suppose that e is part of a cycle C. Now, consider any cycle breaking set of edges X that does not include e. Since the cycle C must be broken, X must contain an edge fe that is also part of the cycle C. The set of edges Y=Xf+e has a total weight no larger than the weight of X and it is cycle breaking as well. To prove the second claim, assume the contrary that the graph GY contains a cycle C, which necessarily includes the edge f and does not include the edge e. But then we can combine the paths P=Cf and P=Cf to form a cycle in GX, which contradicts the fact that X was cycle breaking. It follows that the edge e is part of some optimal solution and our greedy choice was valid.

Test Set 3

Of course, the problem of finding a minimum weight cycle breaking edge set is equivalent to the well known problem of finding a maximum weight spanning forest of G, except that we would build the complement set of edges to keep rather than the set of edges to remove. In the example above, the edges of the maximum weight spanning forest are rendered green. It would cost Grace one hour (the red edge) to reconstruct the whole matrix.

The graph may potentially have up to N2 edges, therefore, a simple implementation of Prim's algorithm without maintaining a priority queue data structure would achieve an O(N2) time complexity. Note that because of the high density of the graph, Prim's algorithm is a better choice than Kruskal's algorithm, as the later would need O(N2×logN) time for sorting the edges.

It is interesting to note that we never use the actual checksums in the graph construction nor the maximum spanning forest algorithm, therefore, the input values Ri and Ci can be safely ignored.

Having problems with this solution? Click here to submit an issue on github.