Solving the Circle City Con 5.0 regex crossword challenge (Part 1)

Circle City Con runs a CTF. This year, I was the first to solve the Regex Crossword.

Post challenge, I was encouraged to describe how I solved it.

But first, the challenge.

If you’ve never seen one before, this is an entirely opaque problem. It is a regex crossword. Many examples of this exist at https://regexcrossword.com. Of the categories listed on the site, this would be a double cross problem.

Solving a regex crossword is usually best done by working from the outsides in. The matches for a given space are going to be most constrained by the character classes and grouping near the outside of each regex. When hunting matches, it’s best to start with positive character classes. The top line, right regex,

(.CTF|C.TF|CT.F|CTF.).?[^flag]?(\(|\[|\{)[AEIOUY]

had a very strong match sequence, showing me that in the first line I would be finding the characters ‘C’, ‘T’, and ‘F’ in that order. Furthermore, there would be one of the set ([{ followed by one of the set AEIOUY. The CTF sequence would have to come before it, and there would be one other character somewhere in or next to that sequence.

A reminder: ? means zero or one of the preceding element. This means .? matches any single character, or nothing at all.

A less commonly known operation is character set inversion. The match [^flag] means any single character that is not f, l, a, or g. The following ? means zero or one copies of it.

Combine the above sequence with my knowledge that the crossword puzzle is only 6 characters wide, and I have a greatly reduced search space. The mandatory pieces already account for all 6 characters (four for the CTF group, one bracket, one vowel), so .? and [^flag]? both drop out to zero instances and I’m left with matching

(.CTF|C.TF|CT.F|CTF.)(\(|\[|\{)[AEIOUY]
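A quick way to convince yourself of that reduction is to test both forms against a handful of 6-character rows. This is only a sketch: the sample strings and the helper name fits_row are made up for illustration, and it assumes the regex has to consume the whole row.

```python
import re

# Row-one, right-hand regex as given, and the reduced form without the optional parts.
FULL = r"(.CTF|C.TF|CT.F|CTF.).?[^flag]?(\(|\[|\{)[AEIOUY]"
REDUCED = r"(.CTF|C.TF|CT.F|CTF.)(\(|\[|\{)[AEIOUY]"

ROW_WIDTH = 6  # the puzzle is 6 characters wide


def fits_row(pattern, text):
    """True if the pattern consumes an entire 6-character row.

    Whole-row anchoring is an assumption, not something the puzzle states.
    """
    return len(text) == ROW_WIDTH and re.fullmatch(pattern, text) is not None


# Hypothetical sample rows, chosen only to exercise each alternative.
samples = ["C4TF{Y", "XCTF(A", "CTFx[E", "CTF{AE", "CT_F{U", "ABCDEF"]
for s in samples:
    print(s, fits_row(FULL, s), fits_row(REDUCED, s))
```

For every 6-character sample the two patterns agree. The optional pieces only start to matter on longer strings, which the grid does not allow.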

From that, one needs to start eliminating options, so start working through the column expressions in the same way.

I started from the left most column solves, looking for something that would constrain the CTF sequence.

.?(Lt2P|B0T|NuL|m0H)(P|o|p)+[^a-m,n-y,z](\\\|v|//)

[IN|CCC]?[^A-Z]+\w[p4SsW0RdHa5h]*.?

Well, that’s not helpful. Both of these regexes start with ? blocks that may or may not exist, and both could match C, or a wide variety of anything else. Next!

[FL4IR]*(M|0|R|3)*([abcde])(1|3|5|7|9)\2

(L|4|V|A)*[FluBb3R]+\w[^aeiou]

Okay, here’s a more interesting case, but still difficult to deal with. The leading match classes in both cases have the * (zero or more) operator applied to them. The only fixed sequence in the first regex is ([abcde])(1|3|5|7|9)\2. That is an interesting sequence, however, as it uses a back reference. \2 refers to the second capture group in the whole regex, which is ([abcde]), so I know that the first and third characters of that sequence will be identical.
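To see the back reference in action, here is a small sketch using Python’s re module. The test strings are invented for illustration, and whole-string matching is assumed.

```python
import re

# First regex for column 2. Group 1 is (M|0|R|3), group 2 is ([abcde]),
# group 3 is (1|3|5|7|9), so \2 must repeat whatever [abcde] matched.
COLUMN_2 = r"[FL4IR]*(M|0|R|3)*([abcde])(1|3|5|7|9)\2"


def matches(text):
    """True if the whole string matches the column regex."""
    return re.fullmatch(COLUMN_2, text) is not None


print(matches("4a1a"))    # letter repeats after the digit
print(matches("4a1b"))    # letter changes, so \2 fails
print(matches("FM3c5c"))  # both * groups used, letter repeats
```

The first and third calls match, while the second does not, because the character after the digit has to equal the letter captured before it.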

Alright, on to the third column.

\w+(SG|Sg|sG|sg)[jobs][WITH][hammers]

.?[vegas]+[G0LD]+[overnumerousness]+[\\\W]\h*[garbage]?

Again, not much to work with.  I could tell that I was going to lead with a ‘word’ character (\w+) one or more times.  .? is again something of ‘might exist’.  This is getting frustrating.  On to column 4.

[fail|FAIL]?[variableS^*]*\d(2|5|Z|S)+[^cone].+

[A-F,I-L](e|d|b)[^0-2,3-5,6-8][A-Z]+[encode|decode][\\\|\(|\)|\-|\=]

Well, finally, something that’s vaguely constrained. The second regex has 6 character classes, one of which is marked as one or more (+ operator). The crossword has 7 spaces there.

At this point, I choose to make an assumption. I choose to assume that these regexes are designed to consume the entire string, which is not a requirement generally. I make that choice because otherwise, I’m going to end up with almost no constraints given the number of ?, +, and * operators that are in this problem.

With that assumption in place, I can see that the first character is going to be an A or an F, from the overlap of the sets [fail|FAIL] and [A-F,I-L]. Looking back to the row one reduced regex,

(.CTF|C.TF|CT.F|CTF.)(\(|\[|\{)[AEIOUY]

I can see F would be valid in position 4 no matter what.  I declare that to be what I’m aiming at, and write it down.

I can also take a pretty serious stab at the second character of the column, as the set of (e|d|b) overlaps with [variableS^*] in one case, b.

Continuing on, I have column 5’s regexes

(\(|\[|\{)[GOODLUCK]*\b?[52SZ]?(l|m|n|o)\D+

(.)+[^!LABEL]?[p-q,a-l,m-z]([SiGnS]*)(\1|.)

Well, great, I kinda knew that already, that it was going to be one of the set of ([{, and of course (.) matches anything. The back reference operator \1 at the end is amusing, but it is paired by an | operator with a . , thus anything could match there.

Assumption: most of the flags in this game were in the form CCC{something}. The problem indicated that the flag was in a non-standard format, but you would know it when you saw it. I just started assuming that this character was a { and moved forward.

Column 6 regexes are:

[AEIOUYaeiouy0][Version2.0]*[Out|In]?[Blashpemy]+[V|^|v|/\]+\D?]

(M|0|N|E|Y)[VERIFY]+[^a-z0-9,.][birdlaw]+(\)|\})

Okay there, here’s something I can work with. The first character match is the intersection of [AEIOUYaeiouy0] and (expressed as a character set) [M0NEY]. The intersection of those two sets is [EY0]. My first row matching final set is [AEIOUY], which reduces the set to [EY]. Now the second regex for the first row has the restriction in its final group of (Y|O|U). One overlap, which is the character Y. At least one character isn’t ambiguous.
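The same narrowing can be written out as plain set intersections. This is a sketch of the reasoning, not a solver; the variable names are mine, and each set is just the characters accepted at that position.

```python
# Characters each regex allows in the top-right cell (row 1, column 6).
column_6_first = set("AEIOUYaeiouy0")   # [AEIOUYaeiouy0]
column_6_second = set("M0NEY")          # (M|0|N|E|Y)
row_1_right_last = set("AEIOUY")        # [AEIOUY]
row_1_left_last = set("YOU")            # (Y|O|U)

# Apply the constraints one at a time, as in the text.
step_1 = column_6_first & column_6_second
step_2 = step_1 & row_1_right_last
step_3 = step_2 & row_1_left_last

print(sorted(step_1))
print(sorted(step_2))
print(sorted(step_3))
```

Each step shrinks the candidate set until only Y is left.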

The solve at this point looks like this

???F{Y
???b??
??????
??????
??????
??????
??????

Well, that’s not much to work with. Time to reprocess the lines for columns 1-3 with my expanded assumptions.

.?(Lt2P|B0T|NuL|m0H)(P|o|p)+[^a-m,n-y,z](\\\|v|//)

[IN|CCC]?[^A-Z]+\w[p4SsW0RdHa5h]*.?

C could match either one of these as the leading character.  Sadly, so could anything that’s a capital letter.  So, this either matches on C.TF or .CTF for the row regex.  Can I eliminate C from the second column?

[FL4IR]*(M|0|R|3)*([abcde])(1|3|5|7|9)\2

(L|4|V|A)*[FluBb3R]+\w[^aeiou]

Again, given the assumption that the match will consume the entire line, I can tell that the first character matched in the second column will be in the set [FL4IRM0R3]. That is the full matching set of the first two clauses of the first regex: each is followed by a * (zero or more matches), so either could supply the leading character. What’s not in that list? C. C can’t be in the second position, so it must be in the first. Thus we are matching on either C.TF or CT.F as the opening string.

At this point, I also learn that T can’t be in that location, which means I’m matching the string C.TF as the prefix, and I can fill in more of the solve matrix.
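That elimination can be sketched as a filter over the four alternatives in the row regex. It assumes, as above, that one of the two * groups supplies the first character of column 2; the helper name compatible is made up for illustration.

```python
# The four alternatives from the row-one group (.CTF|C.TF|CT.F|CTF.).
ALTERNATIVES = [".CTF", "C.TF", "CT.F", "CTF."]

# Characters column 2 allows in its first cell, assuming one of the
# leading * groups [FL4IR]* or (M|0|R|3)* consumes that cell.
allowed_second = set("FL4IR") | set("M0R3")


def compatible(alternative):
    """True if the alternative's second character can sit in column 2."""
    ch = alternative[1]
    return ch == "." or ch in allowed_second


survivors = [alt for alt in ALTERNATIVES if compatible(alt)]
print(survivors)
```

Only C.TF keeps a wildcard in the second position, so it is the only alternative that survives.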

C?TF{Y
???b??
??????
??????
??????
??????
??????

Great, now what is the content of the second column?

[FL4IR]*(M|0|R|3)*([abcde])(1|3|5|7|9)\2

(L|4|V|A)*[FluBb3R]+\w[^aeiou]

If I assume that the * groups will consume at least one character on the match, the intersection of the two results in [L4].

The regex for row one demands a digit in the second location

[^foxtrot]\d[^qwertyUIOP](F|4|1|L)\W(Y|O|U)

which reduces the set to 4.
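Written out as a sketch, under the same assumption that the leading * groups consume the first cell, the narrowing of row 1, column 2 looks like this (variable names are mine):

```python
import re

# Leading characters each column-2 regex can place in the first cell.
first_regex_leading = set("FL4IR") | set("M0R3")   # [FL4IR]* or (M|0|R|3)*
second_regex_leading = set("L4VA")                 # (L|4|V|A)*

candidates = first_regex_leading & second_regex_leading

# The left-hand row-one regex requires \d in the second position.
digits = sorted(c for c in candidates if re.fullmatch(r"\d", c))

print(sorted(candidates))
print(digits)
```

The column regexes leave L and 4, and the \d requirement from the row leaves only 4.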

The solve matrix now looks like this

C4TF{Y
???b??
??????
??????
??????
??????
??????

This string matches the left and right row one regexes, and provisionally matches all of the column regexes.
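The row check is easy to confirm mechanically. A minimal sketch, again assuming each regex must consume the whole row:

```python
import re

# Both row-one regexes, copied from above.
ROW_1_LEFT = r"[^foxtrot]\d[^qwertyUIOP](F|4|1|L)\W(Y|O|U)"
ROW_1_RIGHT = r"(.CTF|C.TF|CT.F|CTF.).?[^flag]?(\(|\[|\{)[AEIOUY]"

row = "C4TF{Y"

# Whole-row matching is an assumption about how the puzzle is anchored.
results = {
    "left": re.fullmatch(ROW_1_LEFT, row) is not None,
    "right": re.fullmatch(ROW_1_RIGHT, row) is not None,
}
print(results)
```

Both entries come back True for C4TF{Y. Changing the second character to a letter such as L makes the left regex fail, because of its \d.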

Difficulties I ran into while solving this problem:

  • The font selected to write the regexes in made it miserably hard to differentiate uppercase L from uppercase I from lowercase l. The difference between lowercase l and uppercase I was about 2 pixels.
  • Not knowing if the regexes are anchored to the edge of the puzzle makes this much, much harder.  See the assumptions I’ve needed to add.
  • Extensive use of * and ? radically increases the search space.

Thus ends part 1, where I push out the partial work done as an introduction. Part 2 completes the walkthrough of the solve.

Packing lighter for travel

or “I regret every ounce I am carrying. I think this is working correctly.”

On my most recent trip, I decided to carry hand luggage only. Specifically, bags that have no wheels. And to limit it even more, when I left, I was only carrying one bag.

Within a day, I regretted packing all this stuff. I’m choosing to use this as a learning experience.

On this trip, I packed the following ‘gadgetry’:

  • Laptop (usb-c)
  • Tablet with keyboard (usb-c)
  • Kindle (micro-usb)
  • Cell Phone (usb-c)
  • Backup Cell Phone (usb-c)
  • Laptop Charger (usb-c)
  • Two port phone charger (usb-c)
  • Cabling to connect all of the devices to both chargers and other devices

What I should have packed:

  • Laptop (usb-c)
  • Cell Phone (usb-c)
  • Backup Cell Phone (usb-c)
  • 2 Laptop chargers (usb-c)
  • Cables to connect laptop to phones (usb A to C)

The tablet and kindle have been entirely unneeded. The kindle has been light enough, but the tablet is heavy. Ultrabook laptop heavy. In addition, the cabling to both charge and connect these devices is heavy. Finally, the two port phone charger is entirely redundant, and does not charge the laptop while it’s being used, only while the laptop is sleeping or off.

The laptop charger is capable of charging the phones and the laptop. At night, the single charger can charge the laptop and the phone through the laptop.

The only time the tablet and keyboard for the tablet become viable is if you are using it as a complete replacement to the laptop. I expect I can make this a reality eventually, but not right now.

Contingencies:

An astute reader may have noted that I have a backup cell phone listed above. The core element of packing light that I need to get my head around is that your backup is money. The vast majority of problems you encounter can be solved with money. This is, to say the least, uncomfortable for me. I carry a backup cell phone because I can’t walk into any store and get a replacement Google Fi device. I can, however, carry a spare cell phone with me that I can bring online with Google Fi rapidly.

Having a large number of devices that charge on USB-C makes the backup charging problem easier. If I just carried two laptop chargers, everything could charge off of them, and I wouldn’t have to worry about one of them breaking at an unexpected time. I’ve seen good reviews regarding Dart but those appear to be unobtainable right now.

On the contingency front, I need less clothing than I think. I have at least 2 too many shirts with me, and too many pairs of boots. You need one pair of good boots. That’s it. Also, learn to dress and operate in layers, rather than alternate clothing. Two t-shirts are easier and more versatile than a t-shirt and a flannel shirt.

Just one notebook is enough. I like notebooks that can also double as file folders to protect papers, something hard backed, that way it’s one less item to carry.

I went uncomfortably light packing this trip, and I think I can make it even more uncomfortably light, and enjoy the results, with just a little more work. Better planning next trip, I suppose.

HAProxy, Chrome, tcp-preconnect and error 408

This is my guide to making haproxy work reasonably with the deployment model I’m constructing.

In broad strokes, I’m looking to have haproxy running on a host which is acting as an http and https front end for a large number of web servers living on a private network, each hosting an individual web site.

But first I had to make haproxy work with the modern web browser Chrome.

Starting with default settings led to a series of weird failures, where I would either get flashes of grey blank pages while loading, or error pages loaded when I clicked on content that I knew existed. Of course, this wasn’t generating logs on the webserver, and the logs from haproxy were equally weird, with nothing showing up that was temporally aligned with the action that created the result.

I knew this couldn’t be a just me problem yet. That generally takes me a few more hours of digging into a corner nobody else has ever gone into.

So off to Google I go and type “haproxy chrome” and it suggests “haproxy chrome 408”. Well, that’s an ominous sign.

The first hit is https://www.haproxy.com/blog/haproxy-and-http-errors-408-in-chrome/ which starts to explain what’s going on.

Related is this Chromium issue: https://bugs.chromium.org/p/chromium/issues/detail?id=377581

A brief aside. Standing around with provisional sockets open burns resources on the server and the client, all to save a few round trips of latency in client reaction. I can totally understand reducing client latency by 1.5 round trips, but getting broken pages back when I click links is a terrible result. Additionally, responding with an error when you’ve never seen a request is a wonderfully grey area of the HTTP spec, as noted in the Chromium issue.

Why would Google think this isn’t a big deal? My guess is based on @dakami’s Defcon presentation from 5ish years ago. He was scanning the internet and realized that only one side of a connection actually has to keep state, so his high volume scanner was built to be the side of a TCP link that wasn’t keeping state. This worked great until he scanned Google, where he discovered that Google wasn’t actually tracking state for TCP either. TCP doesn’t work right when neither party is tracking state. That said, there’s the reason provisional connections aren’t considered a big deal: if your TCP stack doesn’t keep state for clients on the internet, there’s no cost to having them open provisional sockets. It all hangs together, but I don’t know that it’s the official reasoning.

This comment is enlightening as well: https://bugs.chromium.org/p/chromium/issues/detail?id=377581#c47
The implication I see in this is that you’ve added a new race condition, the question of what’s a ‘fresh socket’. That’s likely a timeout of some sort, or an event of some sort, but that’s not clear.

Right then, what to do about it.

As of December 2017, when I’m writing this, HAProxy 1.7.9 has the following configuration option:
https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#option%20http-ignore-probes
which fixes most things.

Is this perfect? No, the hiding of timeouts in the logs can cover up all sorts of errors and hides actual activity. Additionally, I experience flashes of the grey error page when Chrome has to re-connect to the server because it used an already closed socket.

It seems Chrome will only sit on a provisional socket for 5 minutes, so if you are willing to carry the load of idle client sockets for 5 minutes, I’d suggest also setting your timeout client to more than 5 minutes. Current Chrome will open up to 6 sockets to a server, so scale the number of sockets you are willing to let sit idle accordingly.

That said, that still can create failures, because if your server timeout isn’t as long, the live client connection might be attached to an already timed out server socket, thus requiring the whole error 408 thing again. Thus you need to set your server timeouts to be just as long, and that means possibly holding open sockets on the servers for the same length of time. I suspect this is one case of what is being covered in the manual by “it is highly recommended that the client timeout remains equal to the server timeout in order to avoid complex situations to debug.” – https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#4.2-timeout%20server

Summary:
If you are running small websites (one server or less of load, and fewer than 100 simultaneous clients), I suggest the following settings.

timeout server 302s
timeout client 302s
timeout connect 10s
option http-ignore-probes

Other things that I learned along the way:
The first workaround was to make the error 408 page /dev/null, resulting in no data being forwarded to Chrome and triggering a reload. That still works to some extent.
You can get haproxy to serve single static small pages by declaring them to be a backend at a particular url on error 200. This is a terrible plan.