Clojure Transducers

A co-worker pointed me to an article about Clojure and transducers:

http://conscientiousprogrammer.com/blog/2014/08/07/understanding-cloure-transducers-through-types/

He wasn’t sure what the purpose was, as the explanations have all been quite technical. Here’s my attempt at an explanation.

Disclaimer: I don’t know clojure, but I know haskell fairly well and read those articles on transducers and think I understand them. To start, you need to understand common higher order functions like map or filter. Map takes a transform function as an argument and a list as an argument, applies the transform function to each item in the list, and returns the results accumulated into a list. Filter takes a predicate function and a list, and returns a new list consisting only of items for which the predicate function returned true. There are many other common high level functions; learning haskell will expose you to them, and you will eventually realize that in most cases the “for” loops you write in procedural code are just verbosely reimplementing these higher order functions over and over again. Note that I just defined map and filter on a list container, but you could implement them (and other common higher order functions) for other types of containers as well: for example a tree structure. Going further, you could generalize the concept of map and filter even more so that these functions could be applied to any type of container, which is done by making up a sort of interface function that you’d need to implement for each type, and then the generic map/filter would work on it. There are various approaches to this, and he’s calling his “transducers”.

The concepts he’s laid out start with a class of function he’s calling a reducer, which is just a function that takes a container plus an item to add to the container, and returns the new container. You can imagine writing a simple reducer that just takes a new item and an existing list and adds the item to the end of the list. You could write other basic reducers for other containers. This is his interface function that allows generalization of map/filter/etc across multiple types of containers. The class of functions he’s calling transducers are the generalized map/filter/etc and take a reducer function as input (plus usually other inputs) and return a new reducer function. So in place of a list or other container in a normal map/filter/etc argument list is a reducer function. His generic map function is a “transducer” that takes a transform function as an argument and a reducer as an argument. His generic map outputs a reducer function that applies the transform function before passing piping the result to the input reducer function. That is, the resulting output reducer function will automatically transform input items before adding to its container. His generic filter function is a transducer that takes a predicate function as an argument and a reducer as an argument. It outputs a reducer function that tests an input against the predicate and, if it passes, pipes the input through the reducer (thus adding the item to the container), otherwise is a no-op reducer (leaves the container unmodified).

So instead of operating directly on containers like other generalizations of map/filter/etc, the transducer concept operates on functions that modify the containers. It’s another step removed, essentially acting as a way to defer the actions on the containers while still encapsulating the operation you’re performing. In practice, you also need to define a function to actually execute all the reducers. For a simple list, this probably means starting with an empty list and then applying each reducer in order. However, it could be whatever you want. This makes the transducer approach useful versus traditional generalizations because traditional map/filter/etc still imply some ordering of operations on the container. He’s interested in asynchronous operations, which may require operations on the container in some other ordering. Traditional generalizations abstract away the high level function from the container. With transducers, you also abstract the application of operations from the high level function.

Fun with Haskell

Haskell is a great programming language and I have a lot of fun programming in it. I’ve been working fiddling around with a simple prototype for a board-game style game (more details on that later) and was figuring out how to draw a line connecting some positions on the game board. Specifically, I needed to figure out how to go from a calculated path to a list of line segments to draw. So, turning a list like this, where each element is a coordinate pair:
[(1,1), (2,3), (4,4)]
Into a list like this, where each element is a line start/end pair:
[((1,1),(2,3)), ((2,3),(4,4))]
This is a super easy problem to solve in any imperative language, just iterate through the elements making a connecting list, but Haskell is fun because you get to come up with awesome one-line solutions that don’t have any iteration:

zip x (tail x)

where x is your list. Done. Hooray!

Gran Turismo 5

GT5 is great fun. I keep checking the “Used Car” dealership looking for gems. The challenges spice up the game. The car physics are finally in a passable state. My one, huge complaint: 5 seconds of load time to move between any two menus in the game. If you’re in the main menu and you bring up the race submenu: 5 seconds. If you then go back to the main menu: 5 seconds. If you go to the options menu: 5 seconds. Argh.

Functional Refactoring in C++

Consider the following C++ code snippet:

bool UNIVERSE::Initialize(const std::string & assetpath,
	const std::string & shippath, 
	const std::string & behaviorpath,
	MODELMANAGER & modelmanager,
	std::ostream & info_output, std::ostream & error_output)
{
	if (LoadAssets(assetpath, 
		modelmanager, 
		error_output))
	{
		info_output << "Loaded universe assets successfully" << std::endl;
	}
	else
	{
		error_output << "Error loading universe assets" << std::endl;
		return false;
	}
	
	if (behaviors.Load(behaviorpath,
		error_output))
	{
		info_output << "Loaded AI behaviors successfully" << std::endl;
	}
	else
	{
		error_output << "Error loading AI behaviors" << std::endl;
		return false;
	}
	
	... etc ...
	
	return true;
}

We have a function that performs some initialization and returns true on success or false if it fails. Internally the function calls various loading functions. There's some obvious code duplication here; for each load function we call we do almost exactly the same thing to handle success or failure. In the future we might want to change what we do on success or failure, and it'd be nice to only have to change the code in one place instead of after every load call. How can we remove this duplication?

We could make OnLoadSuccess and OnLoadFailure functions and use them like so:

if (behaviors.Load(behaviorpath,
	error_output))
{
	OnLoadSuccess(info_output, "AI behaviors");
}
else
{
	OnLoadFailure(error_output, "AI behaviors");
	return false;
}

This is better, but there's still some duplication and it's not as generic as it could be; for example, if we wanted to display a "retry" message box to the user and allow them to retry a particular load step, we couldn't do that without changing all of the places where we called OnLoad*. We can go further.

We could make some variadic LOADFUNCTION macro, but that could get pretty ugly.

We also could refactor all of our loading functions to conform to some standard interface so we could call them in a loop. That's not a bad way to go, but it turns out that we can get similar genericity without having to do that.

In functional programming, it's very common to pass around functions, and it's also easy to do partial function application (currying), which means taking a function object with multiple arguments and supplying some of those arguments, yielding a new function (called a closure) that takes only the arguments you didn't supply. We can do something similar with the TR1 functional extensions. The std::tr1::function class lets us use function objects, and the std::tr1::bind function lets us create something like a closure. We can bind all of the necessary arguments to our load functions and then pass them to a GenericLoad function as a zero-argument function object that returns a boolean. Here's what this looks like:

bool GenericLoad(std::ostream & info_output,
	std::ostream & error_output, 
	const std::string & name, 
	std::tr1::function  f)
{
	if (f())
	{
		info_output << "Loaded " << name << std::endl;
		return true;
	}
	else
	{
		error_output << "Error loading " << name << std::endl;
		return false;
	}
}

bool UNIVERSE::Initialize(const std::string & assetpath,
	const std::string & shippath, 
	const std::string & behaviorpath,
	MODELMANAGER & modelmanager,
	std::ostream & info_output, std::ostream & error_output)
{
	if (!GenericLoad(info_output, error_output, "universe assets",
		std::tr1::bind(&UNIVERSE::LoadAssets, this, 
			assetpath, 
			modelmanager, 
			std::tr1::reference_wrapper(error_output))))
		return false;
	
	if (!GenericLoad(info_output, error_output, "AI behaviors",
		std::tr1::bind(&behaviortree::TREES::Load, &behaviors, 
			behaviorpath,
			std::tr1::reference_wrapper(error_output))))
		return false;
	
	... etc ...
	
	return true;
}

Is this a win? We've removed a bunch of redundant code and made our loading much more generic. The syntax, though, is absolutely horrible. In languages with first-class functions and a lot of syntactic sugar (like Haskell), the syntax is often neater for the case where we use the generic loader. In C++, an argument could be made that the generic loader is actually harder to maintain in some cases due solely to how confusing and muddled the syntax is.

Screenspace Ambient Occlusion

I finally got on the SSAO bandwagon and have my own implementation up and running. It’s mostly the implementation here, but with some tweaks from this (similar) implementation. It’s pretty slow on my 7900GT, but any recent card shouldn’t have too much trouble with it. Here are screenshots without AO and with AO, for comparison. There are some artifacts around the headlights, but that’s due to the odd geometry tessellation in that spot.

Haskell

I’ve tried to learn Haskell about 3 separate times, and it’s just now, after reading this tutorial, that it seems to really be taking. It was apparent to me immediately that Haskell was extremely powerful, but frustrating because ultimately the only reason I bother to learn any programming language or technique is because I want to use it to do something. The other Haskell documentation/books/tutorials I’ve read all just seem to delve into syntax and more reasons WHY haskell is powerful, but totally miss the point of “here’s how you build a program in Haskell.” Which, for me anyway, is the whole point. The Haskell community is a little frustrating too, as many times they seem content to write research papers on how Haskell could be applied to a problem instead of actually building a program with Haskell.

I also found this reference helpful, but despite its name, I only found it useful as a reference for syntax to complement the other tutorial.

What’s so cool about Haskell? Contrary to a lot of tutorials, the reason it’s cool has nothing to do with this:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
Sure, that’s a compact way to write a function that generates the Fibonacci numbers, but compact, confusing one-liners are not impressive to me (sorry perl).

This is why Haskell is cool:
calc :: String -> [Float]
calc = foldl f [] . words
where
f (x:y:zs) "+" = (y + x):zs
f (x:y:zs) "-" = (y - x):zs
f (x:y:zs) "*" = (y * x):zs
f (x:y:zs) "/" = (y / x):zs
f xs y = read y : xs

This is the entire and complete definition of a Reverse Polish Notation calculator. Haskell has an awesome strong static type system, first-class functions, and all kinds of generic polymorphism. All those add up to being able to do things that are really hard or complicated in other languages in a generalized, compact, and clear way. In addition, it compiles to native binary code and is therefore quite fast.

Baking Ambient Occlusion Maps on the GPU

Despite the trend toward screen-space ambient occlusion, static ambient occlusion maps are handy for improving the realism of non-deformable objects like cars or spaceships for free. However, baking them in Blender is a chore because raytracing a proper solution using the CPU takes ages. I’ve been toying with the idea of using the GPU to speed this up for a while now, but only recently figured out how to put it all together and make it work. Here’s how it works at a high level.

I iterate over the following steps many times from different camera positions along a hemisphere (to emulate the sky). First pass, I render the object from the camera position, writing depth values to a depth texture. This will be used to cast shadows in the next step. Second pass, I render the object again but this time I have my vertex shader set up to transform the verts into the UV map’s layout. However in the fragment shader, I check the expected (non-uv-map-layout) vertex position against my depth texture to determine occlusion. I write the value of the z component of the screenspace normal (or zero if I determined the pixel was occluded using my depth texture) to a color texture. Third pass, I accumulate the value of the color texture (times 1 / number of iterations) into an encoded RGB color texture (I could have used a float texture instead here).

The result is a nice looking 1024×1024 ambient map that takes 3.9 seconds to generate using my aging nvidia 7900GT. I performed 784 iterations, which seems to be the “sweet spot” for this method. Larger outputs are easy, up to whatever size your graphics card supports. I’ve attached an example. Notice how the left and right doors have darker areas around where the side mirrors attach. Neat!

GPU generated ambient occlusion map

Flash player video performance under Linux/Ubuntu

I’ve been annoyed at the slow performance of flash player video in Linux ever since I switched from using an old gentoo setup to a new Ubuntu distribution. Even with a 4-core phenom II, videos gets choppy if I fullscreen the hulu video player to 1280×1024. I’ve been googling around for answers, and found a number of suggestions, but here’s the one that finally worked. Add this to your /etc/X11/xorg.conf:

Section "DRI"
        Mode 0666
EndSection

Yay! Now I can fullscreen hulu on my 1920×1080 widescreen with no issues.

New Job

Yesterday was my first day at my new job as a game developer. It was a blast. I like the people and it’s a very different work environment from my last job. Of course, I’m already missing the people I used to work with at the old job, but one thing I’m not missing is the old 1.8 GHz laptop that was already years old when I got it years ago — at the new place I’ve got a brand spankin’ new quad core intel-based PC with admin rights and zero spyware. Also, for some reason, the desk at my work is exactly the same as the desk I just got back home. Weird coincidence. I guess they have good taste in desks?

Project Deimos

I’ve started getting somewhere on a new space-themed computer game that I hope to eventually release commercially. I haven’t chosen a proper name yet but for now I’ve codenamed it “deimos” because it sounds like a symphony of evil and I like that. It’s in the early stages, but today made some excellent progress and now have a networked client/server proof-of-concept up and running. You can’t do much besides turn your ship on and off, but the important part is that you can do it over a network connection and it should be completely tolerant of lag spikes, dropped packets, etc. I don’t have a lot of experience writing networking code so this is a big achievement for me and one of the two risk areas I knew I’d have to overcome when starting the project (the other being AI). Now that I have the networking system in place I should be able to proceed with adding other functionality and have it “just work” over the network.

Dev Journal and Project Hosting