Sunday, 13 July 2014

SELinux, sendmail, and delivering mail to programs

The thing that I most dislike about SELinux is that it rarely announces itself as the cause of failure.  This leads to a good deal of headscratching moments when things which "should" work do not, and it can take some time to even think that SELinux might be the culprit.  The following is a recent case in point, described in the hopes that it will save someone else some of the time that it cost me.

I recently had to get sendmail working on a machine running a (mostly up to date) version of CentOS with SELinux enabled.  After the usual wade through the morass of sendmail configuration options I had something which appeared to be mostly functional.  However, it was soon discovered that delivery to programs (i.e., the "|/path/to/mail_handler" kind of delivery) was failing with the unhelpful error message "unknown mailer error 126".

Searching suggested that this error message was supposed to mean that sendmail could not execute the program in question, but most suggestions for the reason involved smrsh, which was not a factor in this installation.  Basic system tools could be delivered to (e.g., "|cat >> mymail"), but not shell scripts that included those lines.

It took some time to discover that SELinux was a factor, and that disabling it enabled the delivery to successfully invoke the desired program.  Of course, the goal was presumably to keep SELinux enabled but also allow the mail delivery to work, so further investigation was needed.

To make a long story short, the issue turned out to be that sendmail did not have permission to execute user-created files due to them having the wrong security context.  In this case the home directories were NFS-mounted and so the handling scripts were all in the nfs_t domain, which sendmail lacked permissions for.  The workaround required was to:
  • Ensure that the programs being delivered to were all on a local filesystem, so that they did not have their domains overridden to be nfs_t; and
  • Explicitly give them a domain that sendmail could execute; in this case I found that bin_t worked.

Sunday, 8 June 2014

Somewhat Less Bruteforcing The Devil

Recently Félix Saparelli posted ("Bruteforcing The Devil") about using brute force to find a preimage for a simple hash function (associated Hacker News thread here).  In pseudocode, the hash function computes a 32-bit hash as follows:
function hash(str)
   h = 0
   for c in str
      h = (h << 5) - h + c
   return h
The goal was to find an acceptable input string which would produce a hash value of exactly 666 (the "devil" of the post's title).  The string had to have at most 30 characters, and consist only of ASCII values in a certain set (letters, numbers, and some punctuation characters; there was some slight corruption in his post due to a caret ('^') character, but a later code fragment made the proper set clear).  For the rest of this post I limit myself to considering strings of lowercase characters only, but the adjustments to larger character ranges are not overly difficult.

One approach is to simply generate lots of strings and hope to find a matching hash.  This is actually more feasible than it may sound: The number of possible hash values is a little over four billion, so we would expect to find a match every four billion hashes or so.  At a handwaved guess, the amortised cost of generating and hashing a ten character string in C should be well under a hundred instructions, so one would hope to find matches sooner than every couple of minutes even without being very clever.  (Indeed, the first ten matches from the simple program (Program 1, below) that I wrote to test this claim occurred at 143, 166, 235, 303, 348, 417, 440, 554, 577, and 645 seconds, so the average was close to a minute.)

This was the approach that Félix used, although he wrote in Rust instead of C and understandably took a performance hit from doing so.  (Quite a severe one, based on his figures, but I suspect a large part of that was the name generation since he aimed to generate readable strings.)  He ended up computing around 35 billion hashes and finding four matches, so he seemed to be somewhat unlucky.

Once can do much better than full brute force by exploiting the structure of the specific hash function.  We could instead write the iteration step as h = 31*h + c, which makes that structure clearer.  Indeed, a little thought shows that the hash function simply treats the string as a polynomial (with coefficients equal to the ASCII value of the characters in the string) and evaluates this polynomial at 31.  So our problem becomes to find polynomials which have the required evaluation, under the appropriate restriction on the coefficients.

Note that if we split our string S into two parts S1 and S2 then hash(S) = hash(S1)*31^length(S2) + hash(S2).  (Here '^' represents exponentiation, not the C xor operator.)  We can thus adopt the standard splitting technique: Suppose that we precompute hash(S2) for a large set of strings S2 of length L and store these in some structure easily indexed by hash(S2).  Then for a given S1 we can search for an S2 such that hash(S1)*31^L + hash(S2) = 666, which is to say hash(S2) = 666 - hash(S1)*31^L.  This is just a lookup on the precomputed hash value, and if successful results in a matching string S = S1 S2.

Using Program 1, searching for all matching strings of eight lowercase letters took around 2600 seconds.  With the splitting approach (Program 2), computing hashes for all four-character strings, the same search took just 0.04 seconds — around 65000 times faster.  Applying the same process to length 10 (hashing five-character strings of lowercase letters) took 1.2 seconds to find all 32755 such strings.

There is another option, though.  We want to find polynomials S (with coefficients in the required ranges) so that S(31) ≡ 666 mod 2^32.  This is a parametrised family of equations S(31) = 666 + k*2^32 for non-negative integers k.  If we fix a particular value of k, we can write the right hand side in base 31 and more-or-less read off the coefficients of S.  There is a slight wrinkle in that we want coefficients larger than 31, but it is not too hard to compensate for this from the base-31 representation.

Note that if we are after values of a specific length then we can computer appropriate bounds on k ahead of time, and the search space is bounded.  For instance, for strings of 8 lowercase letters k must lie in the range [643 .. 807], so we only need consider 165 possible values.  My simple version written in Magma (Program 3) takes less than a tenth of the time that the C version above did.  For the strings of ten lowercase letters, the same approach found all suitable values in under a second (0.94 seconds).  Obviously there is slowdown here due to Magma being interpreted; I would expect a C version of the same algorithm to be rather faster again.

The search space is, of course, still exponential, so there's still a good deal of brute force required.  Nonetheless, we can see from the above that we can avoid a great deal of that brute force with a bit of analysis.


Program 1 (C, searches all strings of seven lowercase letters, naïve brute force)
#include <stdio.h>
#include <stdint.h>

uint32_t hash(char *str)
{
    uint32_t    h;

    h = 0;
    for (; *str; str++)
        h = (h << 5) - h + (uint32_t)(unsigned char)(*str);
    return h;
}

/* searches through all lower-case options only */
void search(int len, uint32_t target)
{
    char        str[31];        /* large enough for now */
    int         i;

    for (i = 0; i < len; i++)
        str[i] = 'a';
    str[len] = '\0';

    do
    {
        if (hash(str) == target)
            printf("%s\n", str);

        for (i = len - 1; i >= 0; i--)
        {
            if (str[i] != 'z')
            {
                str[i]++;
                break;
            }
            str[i] = 'a';
        }
    } while (i >= 0);
}

int main()
{
    search(7, 666);
    return 0;
}

Program 2 (C, searches all strings of ten lowercase letters, meet-in-the-middle approach)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

typedef struct
{
    uint32_t    hash;
    uint32_t    ident;
} entry;

int entry_cmp(const void *ap, const void *bp)
{
    const entry *a, *b;

    a = ap;
    b = bp;
    if (a->hash < b->hash)
        return -1;
    if (a->hash > b->hash)
        return 1;
    return 0;
}

void print_ident(uint32_t ident, int len)
{
    char        buf[10];        /* big kludge, but will be large enough */
    int         i;

    for (i = 0; i < len; i++)
    {
        buf[len - 1 - i] = 'a' + (ident % 26);
        ident /= 26;
    }
    buf[len] = '\0';
    printf("%s", buf);
}

int binsearch(entry *entries, uint32_t n, uint32_t target, uint32_t *first, uint32_t *last)
{
    int low, high, mid, midhash;

    low = 0;
    high = n - 1;
    while (low <= high)
    {
        mid = low + ((high - low)/2);
        midhash = entries[mid].hash;
        if (midhash < target)
            low = mid + 1;
        else if (midhash > target)
            high = mid - 1;
        else
        {
            for (low = mid - 1; low >= 0 && entries[low].hash == target; low--)
                ;
            *first = low + 1;
            for (high = mid + 1; high < n && entries[high].hash == target; high++)
                ;
            *last = high - 1;
            return 1;
        }
    }
    return 0;
}

uint32_t hash(char *str)
{
    uint32_t    h;

    h = 0;
    for (; *str; str++)
        h = (h << 5) - h + (uint32_t)(unsigned char)(*str);
    return h;
}

/* searches through all lower-case options only */
void search(int len, uint32_t target)
{
    char        str[31];
    int         i, halflen;
    entry       *entries;
    uint32_t    nentries, pos, pow31;

    /* For now, assume an even length */
    halflen = len / 2;
    nentries = 1;
    for (i = 0; i < halflen; i++)
        nentries *= 26;
    entries = malloc(nentries * sizeof(entry));

    pow31 = 1;
    for (i = 0; i < halflen; i++)
        pow31 *= 31;

    for (i = 0; i < halflen; i++)
        str[i] = 'a';
    str[halflen] = '\0';

    str[halflen-1]--;   /* simplify loop iteration */
    for (pos = 0; pos < nentries; pos++)
    {
        for (i = halflen - 1; str[i] == 'z'; i--)
            str[i] = 'a';
        str[i]++;

        entries[pos].hash = hash(str);
        entries[pos].ident = pos;
    }

    qsort(entries, nentries, sizeof(entry), entry_cmp);

    for (pos = 0; pos < nentries; pos++)
    {
        uint32_t        htarget, first, last, j;

        htarget = target - pow31*entries[pos].hash;
        if (binsearch(entries, nentries, htarget, &first, &last))
        {
            for (j = first; j <= last; j++)
            {
                print_ident(entries[pos].ident, halflen);
                print_ident(entries[j].ident, halflen);
                printf("\n");
            }
        }
    }

    free(entries);
}

int main()
{
    search(10, 666);
    return 0;
}

Program 3 (Magma, searches all strings of ten lowercase letters, search by k)
tostr := func<S | &cat [ CodeToString(x) : x in Reverse(S) ] >;

search := function(n, target)
    allones := (31^n - 1) div (31 - 1);
    minv := 97*allones - target;
    maxv := 122*allones - target;
    mink := Ceiling(minv / 2^32);
    maxk := Floor(maxv / 2^32);

    vals := [];
    for k in [mink..maxk] do
        S := Intseq(2^32*k - minv, 31, n);
        if &or [ x gt 25 : x in S ] then continue; end if;
        Append(~vals, tostr([ x + 97 : x in S ]));
    end for;

    return vals;
end function;

search(10, 666);

Tuesday, 22 October 2013

Goldfish Draft 2: Final Score

I worked through one sequence of play for the previous goldfish draft, giving a lower bound to the possible score.  I'm not fully satisfied with it, but it has become too cumbersome to work through the possibilities.

The basic idea is to get the [mtgcard Glint-Eye Nephilim] down, give it double strike with the [mtgcard Silverblade Paladin], use [mtgcard Gisela, Blade of Goldnight] and [mtgcard Anthem of Rakdos] to increase the damage, use the card draw from the first attack to pump the [mtgcard Glint-Eye Nephilim|Nephilim] for the second attack and [mtgcard Niv-Mizzet, the Firemind] for damage.  Throughout, [mtgcard Elbrus, the Binding Blade|Withengar] keeps growing, and most the extra cards get turned into land to fuel the card dump, and also to allow [mtgcard Ghave, Guru of Spores] to shuffle counters around, allowing the [mtgcard Goblin Sharpshooter] to do damage.  Finish off with [mtgcard Soul's Fire] and [mtgcard Sword of the Ages] for further damage.

The resulting score was 4,139,209,534 (details are available in this spreadsheet); this looked like it would be the best of the group for some time until one of the others found an alternative line of play which bumped his score up to 27 billion, a target which seems sufficiently far ahead that I'm not motivated to find a better line of play.  There are some obvious areas for potential improvement, however; starting a bit slower might allow [mtgcard Elbrus, the Binding Blade|Withengar] to grow earlier.  It should also let [mtgcard Chandra Ablaze] use her ability in the last round to re-cast [mtgcard Soul's Fire] and [mtgcard Titanic Ultimatum], with the latter also opening up the possibility of cycling a lot of damage through the [mtgcard Crypt Rats].  Certainly options worth considering, and I suspect that they would lead to an overall increase of score... but I'm rather doubtful that it could reach the required 27 billion mark.

Thursday, 12 September 2013

Goldfish Draft 2

I participated in a Goldfish Draft this week.  This is actually the second that I was part of, but the other one had nothing much of interest as far as my results went.  One of the other participants, though, managed a comfortable win thanks to a serious combination that ended up in him gaining over 10193 extra turns!

(I know I haven't made the post about the previous Goldfish Draft that I said I would.  I got bogged down in too many options, and frankly it looks unlikely that I will ever get back to it.  Oh, well.)

The draft went passably well, but the results never quite managed to hit high gear.  There's some nice combination potential but the key cards to make it all come together (a [mtgcard Stolen Identity] would have done very nicely) never quite materialised.  In retrospect there were a couple of poor drafting decisions also, although not that many.

Here is what I ended up with; first the A deck:

[mtgcard_image Fastbond] [mtgcard_image Glint-Eye Nephilim] [mtgcard_image Gisela, Blade of Goldnight]
[mtgcard_image Anthem of Rakdos] [mtgcard_image Chandra Ablaze] [mtgcard_image Elbrus, the Binding Blade]
[mtgcard_image Liege of the Tangle] [mtgcard_image Niv-Mizzet, the Firemind] [mtgcard_image Timberwatch Elf]
[mtgcard_image Liliana of the Dark Realms] [mtgcard_image Borborygmos] [mtgcard_image Titanic Ultimatum]
[mtgcard_image Soul's Fire] [mtgcard_image Norin the Wary] [mtgcard_image Goblin Chieftain]

Quite a lot of gold in that selection!  And this was the B deck:

[mtgcard_image Wheel of Fortune] [mtgcard_image Silverblade Paladin] [mtgcard_image Black Lotus]
[mtgcard_image Crypt Ghast] [mtgcard_image Sword of the Ages] [mtgcard_image Szadek, Lord of Secrets]
[mtgcard_image Extraplanar Lens] [mtgcard_image Ghave, Guru of Spores] [mtgcard_image Goblin Sharpshooter]
[mtgcard_image Underground Sea] [mtgcard_image Bayou] [mtgcard_image Goblin Bombardment]
[mtgcard_image Sarkhan the Mad] [mtgcard_image Searing Meditation] [mtgcard_image Crypt Rats]

I'm still working out the play sequence; I believe I'll exceed a million, and am hopeful of getting past the billion mark.  I've no idea how much higher, if at all, I can go.

Saturday, 7 September 2013

Risk

I played my first game of Risk today.  Well, it was actually some computerised knockoff, but as far as I can tell the rules were the same and the map almost identical (although the names differed).  It's never been a game I really looked at before; I'm mostly uninterested in multiplayer (i.e., more than two player) wargames as a result of early experiences with Diplomacy, and one even worse one with Machiavelli that would have been the last time I played games of that kind.  I'm not temperamentally well-suited to games where breaking alliances is a key part of the play.

(I also have issues with games where much of the play lies in convincing players to adopt the courses of action that you wish them to.  I'm all for setting up a situation through gameplay that shapes the desirability of their options, but it ceases to be fun for me when such facets are overshadowed by the persuasiveness of others (or myself).  I enjoy tactics and strategy, but not negotiation and persuasion.)

This was a three-person game, and fortunately (from my point of view) there was no real attempt to do much negotiation.  We had a random start, each claimed a continent (more or less), then fought out the rest.  I was fortunate enough to win as a result of a couple of things, in particular being able to claim two continents early on with a very small combined border (this was where the map differed from the official one, and I think it is a flaw in the version I played).  The other part was an overly cautious player adjacent to one side of that, enabling me to get away with glass cannon tactics far longer than I should have.

I don't want to comment on the quality of the game so much -- although obviously it plays fairly well -- but rather on the difference that I feel the computerised version made.  None of us had played it before (whether physically or on computer) so we were all on even status there.  We each got caught out by some interface issues, not understanding the attack rules and troop transfer mechanism at first.  If we have been playing a boardgame as such we could have simply worked out what was going on and resolved it, but since here the computer was in control we had to live with the consequences of these errors.  This may well have been ultimately significant, as that player I described as being overly cautious was, I believe, caught out by this several times, finishing their move early as a result of selecting troop movement.

But the difference that struck me afterwards -- and maybe I'm wrong about this, but it feels right -- is how little real investment there was in the results of battles.  It was pretty much just "click-click-click", with win or loss following.  Whereas if we were clustered around the board, dice clutched tight as we tried to infuse them with winning energy, making each individual roll... I think it would have been a lot more engaging.  By streamlining that process away the game was quicker and easier to play... but I think it also lost a good deal of player involvement as a result.  There was still the overall board position to consider and play for, but the narrative of the battles was lost.

I'm not sure that anything could be done about this, mind you.  But it feels like something to keep in mind when producing a computerised version of a board game: What aspects of the experience are being lost, and if they are important, is their some way to provide a similar experience?

Monday, 24 June 2013

Goldfish Draft: Tinkering

In my previous post I gave a possible score that I might have had if presented with the same choices as Andrew in his Golfish Draft (as detailed in his posts starting here).  This post follows up on some issues related to that; namely: What was I alluding to with the [mtgcard Vesuvan Doppelganger] shenanigans, and how would things have gone if I had drafted the key card in the deck (the [mtgcard Yew Spirit]) after all?

There's a further question of what would be the best options with the benefit of hindsight (or prescience, if one prefers).  I certainly can't pretend to have a definitive answer to that, but I will give one possible result in the next post.  As a teaser, I invite you to consider the opening play of [mtgcard Black Lotus], [mtgcard Show and Tell], [mtgcard Omniscience], [mtgcard Serra Avatar], and [mtgcard Garruk, Primal Hunter].  Sure it's a five-card combination, but it pretty much allows you to play 23 of your other 25 cards at once.  That may or may not be optimal, but it's certainly a powerful way to start.

As for the other matters... not choosing the [mtgcard Yew Spirit] was an extremely poor decision on my part (one could argue that there were several others, but that was the worst of them).  It was so broken if it could be made to work that it was not sensible to ignore it that second time, particularly given that I did not expect to make effective use of the [mtgcard Stream of Life] that I chose instead.  Anyway, with that in hand the play would essentially go as before except for the final turn; this time [mtgcard Ajani, Caller of the Pride] would summon the lions between the two combat damage stages (if you want to be technical about it, after the first damage resolution and associated affects and before the second combat damage phase begins) and then the [mtgcard Yew Spirit] would be pumped as much as possible.  I make the end result somewhere in the vicinity of 10^10^30.8, or just 30.8 in Andrew's revised scoring metric.

Now for the [mtgcard Vesuvan Doppelganger|Vesuvan Doppelgangers]: What I had been thinking earlier was that rather than having just copies of [mtgcard Roiling Horror] around, it might be feasible to use [mtgcard Trostani, Selesnya's Voice] during upkeep, then switch a [mtgcard Vesuvan Doppelganger|Doppelganger] to copying [mtgcard Trostani, Selesnya's Voice|Trostani] instead.  That new copy could then also create a token creature, and so it would go; each time the life total more than doubles, at the expense of being able to attack with one less creature that turn.  That turns out to be definitely worth it provided there are always at least three [mtgcard Roiling Horror]s that will attack, and there are more than that in this case.

That first plan ran into the stumbling block that [mtgcard Trostani, Selesnya's Voice|Trostani] is legendary.  But even then I thought it was still worthwhile acting under the new "Legend Rule" -- we could choose to get rid of the original and so still use the new one, and the doubling machine runs happily along.  The end creature count is the same so the difference is simply between front-loading the doubles or attacking with more creatures.  The doubling is always better, and the limiting factor in how well this works is the availability of white mana for the token creation.  It's a lovely generation engine, really, as the [mtgcard Stolen Identity] keeps us supplied with both tokens and mana (thanks to [mtgcard Gaea's Cradle]) so that we can keep pace with the other factors.  In this case I think it ended up being around 30 extra doubles, which would probably push the end result over 10^10^36.

However, that new Legend Rule only comes into play starting July 13, 2013.  This draft, and indeed this post, happened before then so this cannot work as envisaged.  It is possible that a variation of it still can -- if two [mtgcard Vesuvan Doppelganger|Doppelgangers] both copy [mtgcard Trostani, Selesnya's Voice|Trostani] then the first to resolve will trigger the legend auto-destruct on both.  What I don't know is whether the second resolution then fizzles as the creature to be copied is no longer present, or whether the creature data is "already set".  I suspect that it ends up fizzling based on similar rulings, but I'm not sure.  I also haven't worked out whether this is worth it; it's a process that loses creatures, unlike the other version, so I imagine it is not.

Anyway, that's what I was thinking, and come mid-July that engine will work quite nicely.  But not yet.

Update: An email exchange with one of the other players in the draft drew my attention to the fact that the [mtgcard Stolen Identity] is cast when the ciphered copy's ability triggers.  That means that it can interact profitably with the [mtgcard Ink-Treader Nephilim], causing exponential growth in the number of copied creatures rather than linear.  We lose the initial boost from the [mtgcard Rite of Replication] (which I had chosen instead of the [mtgcard Ink-Treader Nephilim]), but it is plausible that this is quickly overcome.

... Well, not so quickly, but inevitably.  The slowdown is actually the loss of [mtgcard Trostani, Selesnya's Voice|Trostani] -- since it is legendary, the first time the massive copying happens it disappears.  That removes some copies, but far more importantly it also removes the life gain each time a creature appears.  Still, exponential growth must win out given enough turns; in this case I make the resulting score at least 1.6*10^45, or with the [mtgcard Yew Spirit] somewhere over 10^10^40 (I've not thought about this carefully, but that feels plausible as a lower bound).