14 December 2023 | Challenge 247 |
Weighing and Counting Pairs
Task 1: Secret Santa
Submitted by: Andreas Voegele
Secret Santa is a Christmas tradition in which members of a group are randomly assigned a person to whom they give a gift.
You are given a list of names. Write a script that tries to team persons from different families.
Example 1
The givers are randomly chosen but don't share family names with the receivers.
Input: @names = ('Mr. Wall',
'Mrs. Wall',
'Mr. Anwar',
'Mrs. Anwar',
'Mr. Conway',
'Mr. Cross',
);
Output:
Mr. Conway -> Mr. Wall
Mr. Anwar -> Mrs. Wall
Mrs. Wall -> Mr. Anwar
Mr. Cross -> Mrs. Anwar
Mr. Wall -> Mr. Conway
Mrs. Anwar -> Mr. Cross
Example 2
One gift is given to a family member.
Input: @names = ('Mr. Wall',
'Mrs. Wall',
'Mr. Anwar',
);
Output:
Mr. Anwar -> Mr. Wall
Mr. Wall -> Mrs. Wall
Mrs. Wall -> Mr. Anwar
Solution
Frankly, I don’t understand the task. The formulation
…a script that tries to team persons…
leaves room for for too much interpretation and I see some inconsistency in the examples.
In the first example there are pairs where A hands a gift to B and vice versa.
In the second example there is an odd number of persons. Irrespective of any family memberships there are no pairings covering all persons. The solution looks completely different than the first. There are no longer pairs, but a cycle of three instead.
To get over these inconsistencies, I’d like to add some specifications:
- There shall be pairs of persons that hand a gift to each other.
- For an odd number of persons, there is no solution.
- With an even number, we try to pair persons from different families.
- Failing that, we finally may pair persons from the same family.
Sadly, this interpretation does not agree with example 2.
Further assumptions:
- Families may have one or more members.
- Persons belong to the same family if the last parts of their names are the same.
With these presuppositions the task may be regarded as maximum weight matching. Using a weight of one for pairs from different families and zero within the same family we come to a complete graph where (for an even number of vertices) a maximum cardinality matching always exists.
This problem is known to be solvable in polynomial time, though the known algorithms are rather complicated. Luckily there is Graph::Matching on CPAN that will get the job done.
There are just two steps:
- Create a weighted, undirected graph with the persons as vertices and an edge between all pairs of persons with a weight of zero if both belong to the same family and a weight of one otherwise.
- Find a maximum cardinality matching having maximum weight for the constructed graph.
As the inner-family matchings do not contribute to the weight of the matching, they will be chosen only for cardinality if there is no better solution.
These steps may be implemented as:
use Math::Prime::Util 'forcomb';
use Graph::Matching 'max_weight_matching';
sub same_family {
my ($x, $y) = map +(split)[-1], @_;
$x eq $y;
}
sub secret_santa (@names) {
return {} if @names % 2;
my @graph;
forcomb {push @graph, [@names[@_], 1 - same_family(@names[@_])]} @names, 2;
+{max_weight_matching(\@graph, 1)};
}
See the full solution.
Task 2: Most Frequent Letter Pair
Submitted by: Jörg Sommrey
You are given a string S of lower case letters 'a'..'z'
.
Write a script that finds the pair of consecutive letters in S that appears most frequently. If there is more than one such pair, chose the one that is the lexicographically first.
Example 1
Input: $s = 'abcdbca'
Output: 'bc'
'bc' appears twice in $s
Example 2
Input: $s = 'cdeabeabfcdfabgcd'
Output: 'ab'
'ab' and 'cd' both appear three times in $s and 'ab' is lexicographically smaller than 'cd'.
Solution
There are a few steps to the solution:
- Find all consecutive letter pairs.
- Lexicographically sort the pairs.
- Count the number of repetitions for each pair.
- Find the index of the maximum count.
- Pick the pair at the found index.
Here PDL::Char comes handy.
We may solve example 1 step by step in the PDL shell.
First load PDL::Char
and toggle ‘print-by-default’:
pdl> use PDL::Char
pdl> do_print
1
Construct a PDL::Char
object:
pdl> $s = PDL::Char->new('abcdbca')
'abcdbca'
Then use lags to build the character pairs:
pdl> $p = $s->lags(0, 1, $s->dim(0) - 1)
[ 'ca' 'bc' 'db' 'cd' 'bc' 'ab' ]
Perform a vector sort:
pdl> $sv = $p->qsortvec
[ 'ab' 'bc' 'bc' 'ca' 'cd' 'db' ]
Runlength-encode the sorted pairs.
pdl> ($count, $pairs) = $sv->rlevec
$PDL_Char1 = '' ;
$PDL_Char2 = [ 'ab' 'bc' 'ca' 'cd' 'db' '' ]
;
As the $count
ndarray inherits the class PDL::Char
from $sv
, it needs to be converted to a numerical type:
pdl> $count->long
[1 2 1 1 1 0]
Find the index of the most frequent pair.
pdl> $maxind = $count->long->maximum_ind
1
Pick the result:
pdl> $pairs->atstr($maxind)
bc
Everything put together in a sub:
use PDL;
use PDL::Char;
sub most_frequent_pair {
my $s = PDL::Char->new(shift);
my ($count, $pairs) = rlevec($s->lags(0, 1, $s->dim(0) - 1)->qsortvec);
$pairs->atstr($count->long->maximum_ind);
}
See the full solution.
If you have a question about this post or if you like to comment on it, feel free to open an issue in my github repository.