Steve’s Blog

Just another compsci weblog

Abstract factories: creating objects from strings

Occassionally I find myself in a situation where I’d like to instantiate objects whose types and parameters are given in a text file. Usually it’s the case that all the objects derive from a common base class, and so it’s not difficult to write a function that parses a set of parameters and passes them to the appropriate constructor, assuming that all constructors take the same set of parameters. This assumption, however, is not usually true in practice. So then the simplest solution becomes that each constructor should take as input a string to do its own parsing. This scheme is often wasteful because an integer parsed in one constructor is typically identically parsed in every other constructor (and similarly for other types). So it would be convenient if we had one function that parsed an arbitrary set of parameters and passed them to the appropriate constructor, automatically. This would eliminate code reuse which would reduce the risk of creating bugs. Finally, with variadic templates, this is possible in C++.

The key to this approach is constructing the function, which we’ll call applystream, that reads variables off an input stream and calls an arbitrary function with those variables. The trick here is, as is often the case with variadic template functions, to handle one argument and recursively call the function with the remaining arguments. Each time we will create a lambda function so that we may curry the call our arbitrary function (which will be a function calling a constructor, but the code is general so it could be anything). Note that I’ve assumed that we want any bare pointers wrapped in shared_ptr for

// base case
FuncRet applystream(std::function f, std::istream & in) {
        return f();

FuncRet applystream(std::function f, std::istream & in) {
	// Pull argument from the stream.
	// You must have the function istream& operator>>(istream&, Arg1) defined
	// for all types in Args... (this is always the case with primitives).
        Arg1 arg1;
        in >> arg1;

        if(in) {
		// If the stream is still valid, construct our lambda function.
		// The lambda function g takes n-1 arguments. It calls the
		// function passed in, f, with the curried argument arg1.
                std::function g =
                        [f, arg1](Args... args) -> FuncRet
                                { return f(arg1, args...); };
		// We pass the reduced function, g, to applystream recursively.
                return applystream(g, in);
        } else {
		// otherwise signal an error
                throw std::runtime_error("stream error");

All we need to do is call applystream with the appropriate version of make_shared to construct the object. I’ve written a small example file that demonstrates a completely generic factory class that instantiates objects from an input stream using applystream. The factory is simply an abstract base class (to enable storage in a map or other stl container) plus a templated child class. By instantiating the child class, the compiler has everything it needs to know about how to create the class — you can simply use the function call operator to create a new instance based on an input stream. The link to the code is here.

No comments

Making Boost.Tokenizer More Useful

This is a continuation of my last post. In it I discussed how I’d like to use the TokenizerFunctions I developed with input iterators. As it works now, however, this is impossible because InputIterators do not guarantee that if iter1 == iter2 then ++iter1 == ++iter2. To solve this, I need to write some sort of iterator wrapper that will buffer the input for me. I mentioned that boost::spirit has an iterator called multi_pass which does this, but I couldn’t get it to work. So I wrote one myself. The difference between my buffered_iterator and boost’s multi_pass is that my iterator keeps track of all references to the original iterator (assuming they’re made through copies of alternate constructor) and can reduce the size of the buffer automatically when there are no iterators pointing to the beginning. The effect of this is that when you’re tokenizing a large sequence using an ifstream, the entire buffer is never loaded into memory at one time (of course with today’s computers this optimization isn’t necessary most of the time but whatever).

To facilitate this functionality, the buffered_iterator class has two constructors: An alternate constructor and a copy constructor. The alternate constructor accepts the Input Iterator as an argument and allocates a structure to be shared across all copies of the iterator. The code looks like this:

template< class IterT, class IterDerefT >
class buffered_iterator : public std::iterator< std::forward_iterator_tag, IterDerefT, ptrdiff_t, IterDerefT*, IterDerefT& > {
	struct refcounter {
		refcounter( IterT iter ) : iter_( iter ), buffer_( std::list< IterDerefT >( 1 ) ), refs_( std::list< int >( 1, 0 ) ) { }
		IterT iter_;
		std::list< IterDerefT > buffer_;
		std::list< int > refs_;

	explicit buffered_iterator( IterT iter ) 
		: refcounter_( new refcounter( iter ) ), bufferiter_( refcounter_->buffer_.begin( ) ), refiter_( refcounter_->refs_.begin( ) ) { alloc( ); ++*this; }
	buffered_iterator( const buffered_iterator< IterT, IterDerefT > & x ) 
		: refcounter_( x.refcounter_ ), bufferiter_( x.bufferiter_ ), refiter_( x.refiter_ ) { alloc( ); }
	~buffered_iterator( ) { dealloc( ); }

	refcounter * refcounter_;
	typename std::list< IterDerefT >::iterator bufferiter_;
	std::list< int >::iterator refiter_;

The refcounter structure will hold the original input iterator, as well as the data buffer and reference buffer. The reference buffer stores the amount of iterators pointing to the corresponding data buffer locations. Therefore, the data buffer and reference buffer are always the same size. Say that we have a data buffer equal to { ‘A’, ‘B’, ‘C’ } (assuming IterDerefT = char) and we have two iterators pointing to ‘C’ in the data buffer. That means that the reference buffer would look like { 0, 0, 2 }. Since we’re only allowing forward iteration on this iterator, we know that ‘A’ and ‘B’ will never be referenced again, so we can get rid of them. And that’s what the minimize method does:

void minimize( ) {
	// remove any portions of the beginning of the buffer that aren't referenced
	for( std::list< int >::iterator iter = refcounter_->refs_.begin( ); iter != refcounter_->refs_.end( ) && *iter == 0; ) {
		refcounter_->buffer_.pop_front( );
		refcounter_->refs_.pop_front( );

The minimize method gets called whenever the iterator is modified in some way (iterator is destructed or operator++ is called). All that’s left now is to fill in the required methods (operator++, operator==, operator=, etc.,). I want to make one final note that we can’t use the boost::tokenizer class any more because it saves the starting iterator you pass it (and therefore the buffer would never be minimized). Instead, we have to use boost::token_iterator_generator. I’ll let you check it out in the code listing.



Making Boost.Tokenizer Useful

Recently I began writing a scripting language and interpreter. To this end, I needed to develop a scanner. I had a brief look at Boost.Tokenizer but the build in TokenizerFunction objects were unhelpful. I ended up implementing my own basic scanner but recently I came back and took a second look at Boost.Tokenizer. Inside is actually a useful library waiting to get out; you just have to implement your own TokenizerFunction objects. Let’s start with a basic one:

class find_str_tfunc {
	find_str_tfunc( const std::string & str ) : str_( str ) { }
	bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const {
		std::string::const_iterator iter = str_.begin( );
		for( ; *iter == *next && iter != str_.end( ); ++iter, ++next );
		if( iter == str_.end( ) ) {
			tok = str_;
			return true;
		} else {
			return false;
	void reset( ) { }
	std::string str_;

This basic TokenizerFunction attempts to match a string with the beginning of the sequence. Pretty boring and useless so far. Let’s see another:

class find_ident_tfunc {
	bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const {
		std::string::const_iterator begin = next;
		if( *next >= 'A' && *next <= 'Z' || *next >= 'a' && *next <= 'z' ) {
			for( ; ( *next >= 'A' && *next <= 'Z' || *next >= 'a' && *next <= 'z' || *next >= '0' && *next <= '9' ) && next != end; ++next );
			tok = std::string( begin, next );
			return true;
		} else {
			return false;
	void reset( ) { }

Slightly more interesting. This time we are trying to match an identifier with the beginning of the sequence (in our case, an identifier is defined by the regex [A-Za-z][A-Za-z0-9]*). Let's do one more:

class find_integer_tfunc {
	bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const {
		std::string::const_iterator begin = next;
		if( *next >= '0' && *next <= '9' ) {
			for( ; ( *next >= '0' && *next <= '9' ) && next != end; ++next );
			tok = std::string( begin, next );
			return true;
		} else {
			return false;
	void reset( ) { }

This should be self explanatory if you could follow the previous examples. So where do these get us? Nowhere. We need a TokenizerFunction which will combine these somehow. This brings us to multi_tfunc:

template< class TFunc >
class multi_tfunc {
	template< class X >
	multi_tfunc( X begin, X end ) : tfuncs_( begin, end ) { }
	bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const {
		for( ; *next == ' '; ++next ); // skip whitespace

		std::string longest;
		for( typename std::list< TFunc >::const_iterator iter = tfuncs_.begin( ); iter != tfuncs_.end( ); ++iter ) {
			std::string::const_iterator nextcopy = next;
			std::string result;
			if( (*iter)( nextcopy, end, result ) && result.size( ) > longest.size( ) ) {
				longest = result;

		if( longest.empty( ) ) {
			return false;
		} else {
			std::advance( next, longest.size( ) );
			tok = longest;
			return true;
	void reset( ) { }
	std::list< TFunc > tfuncs_;

So how does this work? First we skip whitespace. Then, for each TokenizerFunction in our list, we create a copy of the next iterator and execute that TokenizerFunction. But why do we want the longest token from all of the TokenizerFunctions? Consider the following two tokens in the C language: "while" and "while123". In this case, "while" is a token for a while loop, whereas "while123" is an identifier. By always choosing the longest token, we will avoid this problem.

There is one final problem: std::list< TFunc > is a homogeneous container. That is, we cannot add different types of TFuncs. To get around this, we need to create some helper classes:

class tfunc_polym {
	virtual bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const = 0;
	virtual void reset( ) = 0;

template< class TFunc >
class tfunc_polym_derived : public tfunc_polym {
	tfunc_polym_derived( TFunc tfunc ) : tfunc_( tfunc ) { }
	bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const {
		return tfunc_( next, end, tok );
	void reset( ) {
		tfunc_.reset( );
	TFunc tfunc_;

class tfunc_polym_adapter {
	tfunc_polym_adapter( tfunc_polym * tfunc ) : tfunc_( tfunc ) { }
	bool operator( )( std::string::const_iterator & next, std::string::const_iterator end, std::string & tok ) const {
		return (*tfunc_)( next, end, tok );
	void reset( ) {
		tfunc_->reset( );
	tfunc_polym * tfunc_;

Now we can use multi_tfunc with TFunc = tfunc_polym_adapter and we will be able to add any of the TokenizerFunctions we have just created. Let's put it all together:

int main( ) {
	std::list< tfunc_polym_adapter > tfuncs;
	tfuncs.push_back( new tfunc_polym_derived< find_str_tfunc >( find_str_tfunc( "while" ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_str_tfunc >( find_str_tfunc( "(" ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_str_tfunc >( find_str_tfunc( ")" ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_str_tfunc >( find_str_tfunc( "{" ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_str_tfunc >( find_str_tfunc( "}" ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_str_tfunc >( find_str_tfunc( ">" ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_ident_tfunc >( find_ident_tfunc( ) ) );
	tfuncs.push_back( new tfunc_polym_derived< find_integer_tfunc >( find_integer_tfunc( ) ) );

	std::string s = "while( identifier123 > 5 ) { }";
	boost::tokenizer< multi_tfunc< tfunc_polym_adapter > > tok( s, multi_tfunc< tfunc_polym_adapter >( tfuncs.begin( ), tfuncs.end( ) ) );
	std::copy( tok.begin( ), tok.end( ), std::ostream_iterator< std::string >( std::cout, "\n" ) );

	return 0;



This is a pretty simple example but it should show you how to expand on this. There are a few improvements I'd like to make on this. The first improvement I'd like to make is to make the basic TokenizerFunction types i've created more general. That is, they should be able to accept any type of iterator. Specifically, I'd like to be able to use an istream_iterator to read data directly from a file. However, istream_iterator is only an input iterator and so the copy that I perform in multi_tfunc would not work properly. To get around this I would need to create some sort of buffered istream_iterator. Second, I'd like to cut down on the ugliness of constructing the boost::tokenizer object. Although with the new auto keyword in C++0x we might not have to worry about that.

The complete code listing is here: tfunc.cpp

Update: It looks like Boost has beaten me to the punch when it comes to one of the problems: The multi_pass iterator, which converts an input_iterator into a forward_iterator, can be used to solve the iterator copy problem I described above.

No comments

Ray Tracer: Part I

It has been quite a while since I posted on here, but I haven’t had a great deal of free time lately. Since I’m graduating in May, I’m trying to get applications in for grad school which means I have to figure out a) what I want to do and b) where I want to go. And that’s aside from taking the general GRE and studying for the CS GRE.

But anyway, I’m taking a computer graphics course this semester, the final project of which is a ray tracer. And I’ve always wanted to write one so I decided to get a head start on the project. I’ll be writing here as I refine pieces of the code that are post-worthy.

The obvious place to begin is with ray-geometry intersection code. I’ll skip ray-sphere intersections since you can find information on it everywhere. More interesting, in my opinion, is a ray-convex polyhedron intersection, mostly because many programs (Q3Radiant, Hammer) save level data in that format (which are then compiled into BSPs – more on spatial partitioning soon).

So how do we test collisions against convex polyhedra? Simple! Err, sorta. Actually, the method I use is one that I read from an article, Quake 3 Collision Detection, which I used long before I understood how it works. So I’m going to try to explain it as best I can, and provide a much cleaner version of the code.

Convex Polyhedra

First of all, what is a convex polyhedra? Lets define it simply as a set of planes in 3-space which form a closed volume. Lets look at a 2d example:

We begin with one plane, represented by the black line. The flag coming off of the plane is the normal and the area behind it is the volume, shaded in gray. Then we add a second plane, however the intersection of the two volumes is still infinite. Finally a third is added. The intersection of the volumes behind the three planes gives us a closed, finite volume, which will be the volume of the polyhedron (just imagine it in 3d). Note that we are not storing any vertex, edge, or face information; only a list of planes.

Ray-Polyhedron intersection

The basic idea is that we test each plane to see if the ray intersects with it. There are three cases that we need to consider: 1) the ray starts outside of the polyhedron and intersects with its volume, 2) the ray starts outside the polyhedron and doesn’t intersect with its volume, and 3) the ray starts inside the polyhedron.

How can we differentiate between these three cases while we’re intersecting each plane? First we determine which side of the plane the ray starts on. If the ray starts in front of the plane (the side in which the plane normal points), then we’ll call it a front intersection, otherwise we’ll call it a back intersection. Here are illustrations of the most frequent scenarios:

In the first picture, we can see two of the front intersections that the ray makes, colored in blue. The second picture shows two back intersections, colored in green. Finally, the third picture shows a ray that does not intersect with a volume. Note that while the ray does not intersect the volume, it does intersect the planes.

Notice how in the first picture, the front intersection that lies furthest from the origin of the ray also lies on the surface of the volume. In the second, the back intersection which is closest to the origin of the ray lies on the surface of the volume. When we loop through each plane, we want to remember the furthest front intersection and the closest back intersection. After all the planes have been iterated over, we compare the furthest front intersection with the closest back intersection. If the furthest front intersection is farther than the closest back intersection, the we have the scenario in image C, and there is no intersection. If there are no front intersections at all, the ray starts inside the polyhedron. Otherwise, we can simply return the furthest front intersection as point of intersection.


I believe this method has a few advantages over storing a list of vertices and faces, including space requirements. A polyhedron needs only to store a list of planes, and a plane can be stored as a normal, and the distance of the plane to the origin (4 floating point number total). On the other hand, storing a list of faces means storing three or more vertices per face as well as normal information. I also think this method is faster than doing intersections against a face and then testing to see whether or not that point lies within the face, but I could be wrong.


Doesn’t support concave polyhedra.


You’ll notice that in the code, the class for a convex polyhedron is simply called a brush. This name is taken from some of the editors I mentioned above, where they got the name I have no idea but its a lot shorter than polyhedron. The Intersect method returns a struct that contains a variable called raydistance, which is set to infinity if there is no intersection. It has a different meaning than when the method is called with a line segment: it returns a number 0 < x < 1 such that x * ( line.start – line.end ) + line.start is the point of intersection. It may seem strange to include a method to intersect with a line segment in a ray tracer, but it will later replace the ray method when I post about kd-trees.

Also, this is the biggest bottleneck of the ray tracer. If anyone can see any possible optimizations, let me know.

Point.h | Plane.h | Plane.cpp | Brush.h | Brush.cpp

No comments

Blackjack Optimal Strategy Calculator

So during my trip to Las Vegas for Black Hat 2008, I played my first game of blackjack at a casino. I had turned on the TV in my room earlier that day and happened to see a program that was explaining basic strategy in blackjack. The basic strategy is this: since ten is the most common card value that will come up, assume that every card you don’t see has a value of ten.

This is pretty simple and I was getting the hang of playing – even getting help from the dealer when I made an incorrect move. However, one situation that kept coming up was when I was delt sixteen and the dealer was showing a ten. All week I was wondering what the correct move is for this scenario, so I decided to develop a program to tell me. I didn’t get around to writing it until the plane ride home, and even then I had to rewrite it twice before it started producing the correct output, but I’m pretty sure it’s correct as it is almost identical to the optimal playing strategy found on Wikipedia and other websites.

The method behind it is actually very simple – there are six main functions: player-hit, player-stand, player-turn, dealer-hit, dealer-stand, dealer-turn.

The player-hit function returns the probability of the player winning if the player hits on the current hand. It does this by calling player-turn for each possible card that could be delt to the player and averaging the results.

The player-turn function returns the probability of the player winning in his current situation. First it evaluates whether the player is over 21 – if so, it returns zero indicating that the player loses. Otherwise, it finds the probabilities of winning if the player should hit and if the player should stand, and returns whichever is higher.

The player-stand function returns the probability of the player winning if the player accepts no more cards for this game. All it really does is call dealer-turn.

The dealer-hit function works exactly like the player-hit function.

The dealer-turn function works similarly to the player-turn function, except that the dealer has strict rules as to whether they can hit or stand – if the dealer is 16 or under, or has 17 with an ace, the dealer must hit. Otherwise, the dealer stands.

The dealer-stand function checks to make sure the dealer isn’t over 21, then it tests to see who has a higher hand value. The function returns 1.0 if the player has a higher hand, 0.5 for a draw, and 0.0 if the dealer has a higher hand.

The value of each hand is stored in two numbers: one number is the sum of all the cards in the hand (with aces holding a value of 11), and the other is the number of aces in the hand. This is done so that some aces can be counted as having a value of one if the hand goes over 21.

And that’s basically it. You can use the probability-table function to produce a table of every starting blackjack scenario which will tell you the optimal move – hit or stand. Note that the table function takes a long time to run. Also, the program assumes an infinite deck, which obviously won’t be the case for any real situation.

Blackjack Optimal Strategy Calculator


Black Hat 2008 & DEFCON 16

Back story

By some unexpected fortune last week I found myself in Las Vegas attending Black Hat & Defcon. Apparently there was some attempt by the heads of my office to get clearance from E&Y so that I could go, but that didn’t work out with me being an intern. I was notified about a week before Black Hat that someone was suddenly unable to go and now there was an extra ticket so I would need to make travel plans.

Cut to Monday morning; I roll out of bed at about 4:30am. After showering and throwing clothes on, I’m ready to be picked up at 5:00 to get to Newark airport for my 7:50 flight. I get a call around 5:05: the driver of my car has two flat tires, but they’re going to try to find another car in the area. The minutes crawl by until around 6:00 I get another call from the driver saying that he’s on his way. Since it takes about an hour to get to Newark airport, I narrowly made it through security in time. If the plane wasn’t slightly delayed, this trip would have started off very poorly.

We spend Monday in Houston for our ‘all-hands’ meeting with the Houston team, where many people on the team give presentations on various security-related topics. Because the hurricane was coming we decided to all catch a late flight Monday night to Las Vegas. By this time I was beat, so I checked into the hotel and went to sleep.

Spent Tuesday hanging out – later that night I gambled at a casino for the first time playing blackjack (more on that later).

Black Hat: Day One

Wednesday was the first day of Blackhat which started off with a keynote speech by Ian O. Angell about how complexity in information systems leads to increased risk. He made numerous mentions of how combining human systems with technology systems is disasertous but while eloquent in his speech, I largely disagreed with his opinion. He mentioned in one portion of his speech that we have a desire to categorize things and seemed to insinuate that this is a bad thing. This is, however, how the human brain operates which leads me to believe that it is an efficient way of interpreting reality. Yes, it may not be proper in all possible scenarios, but what is? Overall, he seemed to resent technology which made me resent his keynote.

Later I went to Dan Kaminsky’s talk on the new DNS cache poisoning attack. I had already heard about the details of the attack, but at Black Hat he went over the extent of the damage that could be caused by this attack: just about everything. Originally one of my co-workers mentioned that for anything important, such as your bank’s website, you’ll see a signed SSL certificate ensuring that the website is legit. However, Kaminsky pointed out that Certificate Authorities validate certificate purchases by way of email. So if I control the DNS entry for, I’ll get the mail to When I get an email from the CA for my newly-signed SSL cert – bam! You think you’re at, you see a signed SSL certificate, but its all controlled by evil Eve. Pretty cool attack.

A cool but low-profile talk I saw was entitled ‘Return-Oriented Programming: Exploits Without Code Injection’ by Hovav Shacham. Intrigued by the title, I went to the talk and discovered how the author came up with a system for injecting not code, but instead injecting pointers to areas of memory which contained libc functions that when executed contiguously would exhibit malicious behavior. While no evil code was injected, evil behavior would be executed. He even showed how he could run a sort in a vulnerable program by injecting pointers to libc which was pretty cool.

Finally I went to a talk on how implantable medical devices have virtually no security due to battery constraints.

Black Hat: Day Two

On day two, one of the guys from the Houston office was giving a talk on ‘extreme client side exploitation’. Basically, the talk was about how the Same Origin Policy is flawed and how to exploit it using GIFARS. GIFAR is a term which comes from a file which combines a GIF and a JAR together such that the resulting file is a valid GIF and JAR (this is possible due to the way each of these files are parsed; GIFs from the top-down, and JARs from the bottom-up). On sites which you can upload your own images (facebook, myspace, flickr, etc.), you can upload a GIFAR which will live on the website’s server. By getting a user to view a webpage which has an <applet> tag that points to the GIFAR, the attacker can gain access to the private information of the user on that website.

There were some other cool talks but I think my favorite was the Quantum Spookshow. During the conference they had a room with two quantum key distribution setups inside. One setup had a web cam at one end which was sending the encrypted stream via a network connection. At the same time the cryptographic key is transmitted via photons. The receiver is able to determine if the photon transmission has been intercepted and retransmitted or not by the principals of quantum mechanics. The other setup uses quantum entanglement to transmit the key, but has high bandwidth constraints. However, it has the advantage of having a truly random key.

Defcon: Day One

Defcon was a much more relaxed conference and filled with all kinds of cool stuff. The badges for the conference were actually a device that you could connect an SD card to and transfer files to other badges via infrared. They also had a lot of physical security talks which Black Hat didn’t have. My favorite talk from Defcon was entitled, “Advanced Physical Attacks” by Eric Schmiedl and covered a wide range of topics. First the talk was about spies and how the CIA and other agencies recruit them. One of my favorite parts of the talk was how you can listen to the sound of someone typing, and determine what keys they were pressing.

I ended the day at Defcon by buying some cool merch and playing about three hours of Guitar Hero with some other cool geeks. Unfortunately I only got to stay for the first day of Defcon, but the whole trip was outrageously fun.

No comments

Minesweeper Solver in Lisp

Everyday I wake up for work at 6:00am and get home at around 8:00pm. Since this only leaves me with two hours of spare time I try to get something constructive done during both 70 minute train rides. As much as I try, however, a little Minesweeper always manages to get played. While playing the other day, I was examining a small subset of the grid when I noticed that when I calculated all the possible mine positions for that isolated subset, it became apparent that there was a spot that couldn’t have a mine (normally when I play, I only examine a single cell to determine where mines may or may not be located). It was during this game that I decided to write a Minesweeper solver.
And it works pretty simply. When you examine a particular cell containing a number in a game of Minesweeper, it tells you how many adjacent spaces contain mines. For example, if you were to come across this configuration:

  A B C
1|? 0 0
2|0 1 ?
3|0 ? 0

You would know that there is one bomb which lies in either location A1, B3, or C2. Later it will become useful to also note which spaces do not contain mines. This will break each possibility into two groups: one group contains the spaces which have mines, the other contains spaces which do not. So our example above would be stored as:

mine-group: (A1), not-mine-group: (B3 C2)
mine-group: (B3), not-mine-group: (A1 C2)
mine-group: (C2), not-mine-group: (A1 B3)

As it turned out, writing code to divide these three unknown spaces into two groups was the hardest part. The function which does it is called ‘GenCombo’, and the function works by being passed in a list of elements to be divided – (a b c) for example – and a list of the group sizes, which in the above case would be size one for the ‘mine’ group, and size two for the ‘not mine’ group. The function then recursively pops an element off the first list and appends it to a particular group.

Now that the program can determine the possible configurations of a particular cell, there must be a way to combine them in order to reduce the total number of mine configurations. The particular configuration that made me realize how to do this looked like this:

  A B C D
1|? ? ? 0
2|0 1 1 0
3|0 0 0 0

The possible configurations for the number at position B2 are as follows:

mine-group: (A1), not-mine-group: (B1 C1)
mine-group: (B1), not-mine-group: (A1 C1)
mine-group: (C1), not-mine-group: (A1 B1)

Lets call these combinations C(B2).

The possible configurations for the number at C1 are:

mine-group: (B1), not-mine-group: (C1)
mine-group: (C1), not-mine-group: (B1)

This is C(C1).

To combine these, we take each possible pairing for each possibility and check for contradictions. To demonstrate, lets combine the first two possible configurations. To do this, we simply combine both ‘mine’ groups together and combine both ‘not mine’ groups together.

  (mine-group: (A1), not-mine-group: (B1 C1))
+ (mine-group: (B1), not-mine-group: (C1))
= (mine-group: (A1 B1), not-mine-group: (B1 C1 C1))

As we can see, there is a contradiction in this example in that there is an element that appears both in the ‘mine’ group AND in the ‘not mine’ group. What about two other pairs?

  (mine-group: (B1), not-mine-group: (A1 C1))
+ (mine-group: (B1), not-mine-group: (C1))
= (mine-group: (B1 B1), not-mine-group: (A1 C1 C1))

Aha, there are no contradictions, making this a viable combination. In total, there is one other viable combination for this example:

mine-group: (C1 C1), not-mine-group: (A1 B1)

If one were to plug this input into the code, the output returned would be in this form:

	(		; possible configuration 1
		(B1)	; mine group
		(A1 C1)	; not mine group
	(		; possible configuration 2
		(C1)	; mine group
		(A1 B1)	; not mine group

As you can see, it’s an ugly parenthetical mess. However its actually very simple: The function returns a list of possible configurations. Within each of these configurations, there are two groups. The first is the ‘mine’ group, and the second is the ‘not mine’ group. Within each of the groups, there is a number of ordered pairs which points to a place on the game board.

And that’s basically it. The game board is stored as a list, so the above example would be stored as: ‘(? ? ? 0 0 1 1 0 0 0 0 0). A function goes through each element in the list and generates the possible configurations for each individual cell, which would look like this:

'(nil nil nil nil 
nil ( ((A1) (B1 C1)) ((B1) (A1 C1)) ((C1) (A1 B1)) ) ( ((B1) (C1)) ((C1) (B1)) ) nil 
nil nil nil nil)

Phew. Next, a function takes this result, removes the nils, and reduces the list using
the Combine2Possibilities function, which would produce the output listed above.

Despite the output of this function being very ugly, it’s pretty versatile. Using some additional printing code, you can display the output in a number of different ways. In the example code, the original game state is printed first. Following the calculations, a list of possible mine location sets is printed out. But this output is hardly useful, so after that is a list of definite mine positions – positions that appear in every possible mine location set, as well as a list of definite not-mine positions. At the bottom is the most readable of all: the original game state with all definite mines and not-mines added.

Minesweeper Solver
(Note that in the code locations are represented as an ordered pair, not A1, B1, etc,.)

At the bottom is an example of it’s output using the example data provided in the code. ‘n’ means ‘not a mine’ – it’s possible to calculate this value for certain positions (those with no surrounding ‘?’s) but I’ll leave that as an exercise for the reader :)

Board before solution:
        1 ? ? ? ?
        1 ? ? 3 ?
    1 1 2 ? ? ? ?
    1 ? ? ? ? 1 ?
    1 2 ? 3 ? ? ?
      1 ? ? ? ? ?
    1 2 2 1 1 1 1
    1 ? 1
    1 ? 1
Possible mine location sets:
((5 0) (8 0) (7 0) (6 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 0) (7 0) (6 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (6 1) (7 0) (6 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (6 1) (7 0) (6 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (7 0) (6 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (7 0) (6 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (6 1) (8 0) (6 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (6 1) (8 0) (6 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (8 0) (6 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (8 0) (6 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (6 1) (6 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (6 1) (6 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (6 1) (8 0) (7 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (6 1) (8 0) (7 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (8 0) (7 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (8 0) (7 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (6 1) (7 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (6 1) (7 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (6 1) (8 0) (3 3) (6 3) (5 3) (4 5) (7 5) (3 7))
((5 0) (8 1) (6 1) (8 0) (3 3) (6 4) (5 3) (4 5) (7 5) (3 7))

Definite mine locations:
((3 7) (7 5) (4 5) (5 3) (3 3) (5 0))

Definite not mine locations:
((3 8) (8 5) (6 5) (5 5) (4 4) (4 3) (6 2) (7 2) (8 2) (8 3) (7 4)
 (8 4) (5 1) (5 2))

Board after solution:
        1 X ? ? ?
        1 n ? 3 ?
    1 1 2 n n n n
    1 X n X ? 1 n
    1 2 n 3 ? n n
      1 X n n X n
    1 2 2 1 1 1 1
    1 X 1
    1 n 1


Recently, a SecurID came under my acquisition. For those unfamiliar, SecurID is a small device that fits on any key chain and displays a six digit number that changes once per minute. The six digit number it displays is based on a seed number that is created when the device is deployed and known only by the device (hopefully) and the party deploying the technology. The deploying party then keeps a big database of each person’s user name, SecurID serial number, SecurID seed number, and SecurID pin, which can be modified via an 800 number. When you want to authenticate to certain services, you provide your pin number, followed by the currently displayed six digit number on the SecurID. The authenticating party then uses the seed to calculate the number that should be on the SecurID at that time and compares it to the PIN and SecurID number you submit.

By doing this you are (theoretically) proving to the authenticating party that you are who you claim to be with two factors of authentication: something you know (your PIN) and something you have (your SecurID). This makes it more secure than only a single factor of authentication, such as a password. Since the SecurID is physical, a hacker would need to steal or tamper with the device to beat the additional authentication method.

In this case, however, the line between something you have and something you know is, in my opinion, a bit blurry. Because the device holds some information that makes it unique – it’s random seed – one could extract this information from the device where it can be emulated in software. Fortunately for SecurID, it’s never connected to a computer, ensuring that an attacker must have physical access to bypass this security measure. (There is however, the issue of the seed being stored in the authenticating server, which is a potential target of attack)

But once an attacker does get physical access, there is little stopping him, in my opinion, from fraudulently authenticating himself. Let’s say, for example, that an attacker acquires another user’s SecurID and username. If the deploying party requires a four digit PIN then there are 10,000 possible codes that can be entered. Lets assume that the deploying party has only allowed one incorrect login per generation cycle. This means that to brute force the PIN number, the maximum amount of time it would take is 10,000 minutes. 10,000 minutes * 1 hour / 60 minutes = 166.7 hours. 166.7 hours * 1 day / 24 hours = 6.94 days. Which means that at most you would need to wait for a week to successfully brute force a SecurID login, assuming you have the device and username with a one minute code generation interval on the device.

But what if I have to enter a six digit PIN? That’s 1,000,000 possible combinations. 1,000,000 minutes * 1 hour / 60 minutes = 16,666.7 hours. 16,666.7 * 1 day / 24 hours = 694.4 days. Hmm, that’s much better. However, the longer you make the required PIN, the harder time the average user is going to have trying to remember it. It’s also been shown that people remember words better than remember a string of numbers, which is why I think SecurID should use passwords rather than a PIN. However, the more characters that need to be entered during authentication, the more time an average user needs to input this information, so there’s a balance that needs to be achieved. Four digit PINs are good because it’s the same size as a bank PIN, and similar brute force deterrent mechanisms can be used, such as disabling login after n failed attempts.

It’s not that I think SecurID is a bad technology; quite the opposite. Instead, I think that ‘something you have’ is almost invariably ‘something you know’ (whether you know it, or the device does, it’s somewhere). This also goes for the third factor of authentication, which is ‘something you are’. What is a thumb print but information? (And something you are is immutable – if someone makes a copy of your thumb print and masquerades as you, what are you going to do? Get a new thumb print?)

That said, I think SecurID is a great technology with a slim chance of being exploited. RSA did a good job of making it unable to be connected to a computer, which eliminates the possibility of being able to remotely discover the device’s seed number.



So I recently finished a two week long program at Dartmouth called SISMAT (Secure Information Systems Mentoring and Training) where I learned about a plethora of computer security related topics. Overall I had a really great time and met some really cool people. Each day was structured with two lectures in the morning and one or two labs in the afternoon. The topics presented included web vulnerabilities, software vulnerabilities, sniffing/network analysis, ethics in computer security, and public key infrastructure. Going in I wasn’t expecting my Linux skills to improve, but they did with all the labs and the RTFM mentality of the instructors (and I mean that in a good way).

Starting tomorrow I begin part two of SISMAT; that is, I start my first day at my internship at Ernst & Young in New York City. I’m both excited and nervous to be working at the Advanced Security Center, although I think most of the nervousness is due to my unfamiliarity with New York city. I am, however, looking forward to learning during the month I will be interning since I have had plenty of experience developing web apps for The College.

No comments


Today is my birthday! Woohoo!

Currently I am at the SISMAT program at Dartmouth College learning about computer security, so I will hopefully be making a post about the program as well as some general musings on topic. More to come, stay tuned.

1 comment

Next Page »