Siwakorn Srisakaokul (Ping): Greedy Algorithm

Monday, June 2, 2014

Codeforces Round #250 | Div. 1

Problem A

I will just give a hint for A. The idea is that disconnecting the graph requires cutting all the edges. Cutting an edge e(v1, v2) can cost either v1.value or v2.value. Of course, we want the cost to be the lower one, and we can always force it by some cutting sequence.

Problem B

This problem is very interesting for me. It took me a large amount of time in order to solve it. So suppose that x is the minimum number of animals among the routes areas from p to q. If we set the cost of each edge to be the minimum value between its ends, x will be equal to the minimum edges' cost along the path. f(p, q) is the maximum value of x's among all simple routes between p and q. It looks really hard if we consider all the routes. We have to calculate the sum value of f(p, q) for all pairs p, q (p ≠ q). But actually this sum is equal to c(e₁) + c(e₂) + ... + c(e_m) where c(e_i) is e_i.cost ✕ the number of all pairs p, q whose f(p, q) is e_i.cost. To find c(e_i), we need to start with the largest edge. If e(u, v) is the largest edge, f(u, v) is surely equal to e.cost. Then we put u, v into the same set. Then move to the second largest edge and so on. Notice that for each edge e(u, v), f(p, q) = e.cost where p ∈ the set of u and q ∈ the set of v, so c(e) = e.cost ✕ |u| ✕ |v|. Then we union u and v. Doing this to all the edges in descending order of their weights will give you the sum value of f(p, q) for all pairs p, q (p ≠ q). And the best data structure that can do this very efficiently is Disjoin Set.

Problem C

One can easily realize that this problem is a dp problem. The question is: what is the best way to do dp? Well let dp[i][j] = the number of ways to split the polygon p[i], p[i + 1], ... ,p[j] into non-degenerate triangles. Note that i can be greater than j and that will represent a polygon p[i], p[i + 1], ..., p[n], p[1], p[2], ..., p[j]. How can we calculate dp[i][j]? We can try splitting the polygon(i, j) by a non-degenerate triangle(i, k, j) where k is between i and j. We need to check that triangle(i, k, j) doesn't cross the entire polygon. So dp[i][j] will be the sum of dp[i][k] ✕ dp[k][j] for all valid triangles(i, k, j). By doing this, we don't have to encounter double-counting problem, because we split the polygon by different triangles(i, k, j).

Problem D

This problem is basically a classic problem that can be solved by Segment tree if we don't have the second type of operation. Let's think about the second type of operation (mod). Does it really increase the difficulty of this problem? If we handle the second operation by simply mod-ing down the tree, we can stop when the maximum value of all the values in a subtree is less than the mod value. If x < mod, x % mod = x. Another fact is that mod-ing significantly decrease a value. Even if we mod everything down the tree, their values will become less than some mod value pretty fast. By these two facts, we can just solve this problem straightforwardly by using the normal segment tree. Each tree node keeps two values: sum (the sum of this subtree) and max (the maximum over all the leaves of this subtree).

Problems: Codeforces Round #250 | Div. 1 | Problems

Lesson learnt:

Checking that a line connecting between two corners doesn't cross the entire polygon and is inside the polygon can be easily done by checking if the sum of the areas' parts separated by this line is still equal to the area of the entire polygon
Getting a polygon(start, end) can be handled by the following code (this is a nice way to deal with a circular array):
```
vector<Point> v;
for (int i = start; i != end; i = (i + 1) % N) {
    v.push_back(poly[i]);
}
v.push_back(end);
```
When you find that you're stuck in a thought process or you keep repeatedly thinking about the same idea, you should stop thinking, step back. You might want to re-read the problem statement again, think slowly about the big idea to solve the problem, and consider the simplest case. By doing this, you might end up with a new working solution. I noticed that sometimes I can fall into my own thought process and can be misled by something that won't ever direct me to the intended approach to solve a problem. Also, it's worth taking a short 10-second break.

Saturday, May 31, 2014

Google Code Jam 2014 | Round 2

Today GCJ 2014 Round 2 contains 4 problems. The top 500 contestants will advance to Round 3. In this round, at least 50 points with small penalty time are required to pass this round. And yeah! I got 413^rd ranking. This was my first time that I got a GCJ t-shirt and advanced to Round 3.

let's talk about the problems. The first problem (A: Data Packing): given N files with their capacities and disc capacity, what is the minimum number of discs are required to store all the files with 2 conditions:

A disc can contain only at most 2 files.
Each file can't be divided to stored separately in different discs.

This problem was the easiest one of this round. An O(nlogn) solution can get accepted. My greedy solution is to loop from the smallest file a and for each of them, find the biggest file that can be put into a disc along with this file a. If we can't find one, just put a into a disc. We repeat until no files left. A nice data structure for doing this is multiset (C++), because we can call lower_bound() and there might be duplicate file capacities.

The second problem (B: Up and Down): give a list of integers, we would like to rearrange the sequence to an up and down sequence (one where A₁ < A₂ < A₃ <...< A_m > A_m-1 >...> A_n for some index m). We can rearrange them by swapping two adjacent elements of the sequence. The problem asked you to find the minimum number of swaps needed to accomplish this. First my first attempt to solve this problem was to find where we should place A_m (it's obvious that A_m is the largest element of the sequence). There are n possible positions. We can try moving A_m to each position, then find the number of inversions of the left part and right part of the sequence. This idea was completely wrong. We can't prove anything that doing this will give us the minimum number of swaps. After thinking for a while, I came up with a solution which is starting from the smallest element first. We know that the smallest element need to be on either leftmost end or rightmost end. We can just pick the one end that is closest to the smallest element's position (Whether moving it to the left or right end doesn't affect the remaining sequence. So it's better to move it to the closest end). That's it! we do the same thing to the other elements.

After finishing the first two problems, I only had 1 hour left. So I changed my strategy to just solving the small inputs on the last two problems. I will give brief ideas of how to solve the small inputs on the last two problems. The ideas are quite straight-forward to me. For C: Don't Break The Nile , we want to find the maximum flow of water from the south side to the north side of the river. Each grid cell can carry only 1 unit of water to its adjacent cells. Thus, let's build a network flow graph.

For each cell, we create 2 nodes called lower node and higher node, and have an edge with capacity = 1 going from the lower to the higher node. If the cell contains a building, set the edge's capacity to 0.
Now create edges between the adjacent cells: connect the higher node of a cell to the lower node of each adjacent cell with an edge of capacity 1.
Make a new node called source and connect it to all the lower nodes of the south side cells.
Make a new node called sink and connect it to all the higher nodes of the north side cells.

That's all. Then the answer to the problem is the maximum flow on this graph. This method, of course, is too slow to pass the large input set that the height of the river can be up to 10e⁸.

Now, come to the last problem (D: Trie Sharding). Given M strings, we would like to divide it into N smaller groups and make a trie data structure for each group so that the total number of nodes used to make tries is maximized. So we want to find the maximum number of nodes of all possible group arrangements. For the small dataset, M = 8, N = 4, we can just bruteforce all the possible ways to make groups, make tries for each of them. We finally return the smallest number of nodes of all the possible ways of dividing up the strings into groups. This is all about implementation. For the large dataset M = 1000, N = 100, I still have no idea how to solve it.

This post is a bit long, but that's it for GCJ2014 Round 2. I really like the problems which I can categorize them to be in a hard level for me, but improving requires solving hard problems :). I'm planning to join Round 3 as well, but before that we should read the analysis of this round and try coding it up!

Problems: Code Jame 2014 | Round 2 | Problems
Scoreboard: Code Jame 2014 | Round 2 | Scoreboard

Now that, I'm starting to write blogs seriously in order to summarize what I learn from programming competition. I think it's better to rethink about problems and how well I do during each contest. This can help me prevent repetition of the same mistake. Writing can make ideas solid so that when I see the similar problem again, I can pick up the idea pretty fast. Furthermore, it might be able to guide others about the ideas to solve problems. Thanks to this blog which motivated me to start writing blogs again.

Menu bar