
Wednesday, February 29, 2012

How to split quads of a height-map terrain?

Recently I needed to import some basic terrain height maps for a prototype project. It quickly became obvious that in order to get "smooth" geometry (whether you want it smooth for rendering, collision detection, or both) it's important to pay attention to how the terrain quads are triangulated. The problem is that even on a smooth heightmap most quads will not be planar, so two different ways to split the quad into two triangles will result in different geometry. Most of the time one of these is much worse than the other (sometimes neither is ideal, but at least you pick the "less bad" one). I'd also like to note that adding a vertex in the middle of the quad to get four triangles instead of two actually tends to make things worse on top of doubling the triangle count.


After searching the web for an algorithm and finding out that some big-name engines (eg the UE3 docs on the issue, just to name one example) seem to simply let you set the split direction manually, I ended up rolling my own algorithm. I'm not going to claim this is the best possible algorithm, nor am I going to claim it's novel, but it's very simple, it works almost perfectly as long as the source data isn't awfully noisy (in which case any split tends to be bad), and since I haven't found any other algorithms published, I thought I'd share it here:

So, first apply any non-uniform scaling to the heightmap data so that you have a 2D array of 3D terrain vertex positions. For each vertex, calculate two diagonal "tangents" in 3D using a two-sided finite difference (eg a [-1, 0, 1] kernel), one for each diagonal direction; on-axis tangents don't work nearly as well, because the problems occur in the diagonal directions. Then take the cross product of the two tangents and normalize the result to get a "normal". These normals are not that great for shading (you want something more sophisticated for the final shading normals), but for each quad, take the dot product between the normals at diagonally opposite corners, compare the results from the two diagonal pairs, and split the quad by adding an edge between the pair that gives the larger dot product (ie the more "similar" normals). The result is an algorithm that picks the better split with surprisingly high probability, even when neither split is ideal.
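For concreteness, here is a minimal C++ sketch of the idea. This is not the original code; the row-major grid layout, the border clamping and the small vector helpers are my own assumptions:

// Minimal sketch of the quad-split heuristic (not the original code; the grid
// layout, border clamping and Vec3 helpers below are assumptions).
#include <vector>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3  cross(Vec3 a, Vec3 b) {
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

static int clampi(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }

// Fetch a vertex from a row-major (w x h) grid of already-scaled 3D positions,
// clamping indices at the borders.
static Vec3 vertexAt(const std::vector<Vec3>& verts, int w, int h, int x, int y)
{
    return verts[clampi(y, 0, h - 1) * w + clampi(x, 0, w - 1)];
}

// Diagonal finite-difference "normal" at grid point (x, y).
static Vec3 diagonalNormal(const std::vector<Vec3>& verts, int w, int h, int x, int y)
{
    Vec3 t0 = sub(vertexAt(verts, w, h, x + 1, y + 1), vertexAt(verts, w, h, x - 1, y - 1));
    Vec3 t1 = sub(vertexAt(verts, w, h, x - 1, y + 1), vertexAt(verts, w, h, x + 1, y - 1));
    return normalize(cross(t0, t1));
}

// True: split the quad (x, y)..(x+1, y+1) along the (x, y)-(x+1, y+1) diagonal.
// False: split along the (x+1, y)-(x, y+1) diagonal.
bool splitAlongMainDiagonal(const std::vector<Vec3>& verts, int w, int h, int x, int y)
{
    Vec3 n00 = diagonalNormal(verts, w, h, x,     y);
    Vec3 n11 = diagonalNormal(verts, w, h, x + 1, y + 1);
    Vec3 n10 = diagonalNormal(verts, w, h, x + 1, y);
    Vec3 n01 = diagonalNormal(verts, w, h, x,     y + 1);

    // Pick the diagonal whose corner normals are the most similar.
    return dot(n00, n11) >= dot(n10, n01);
}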

That's not very complicated, nor did it take very long to come up with. I'm sure more intelligent people could come up with much better algorithms, but if you just need "something" (so you don't have to rely on manual split selection), the above should provide you with a decent starting point.

Saturday, January 14, 2012

Z-buffer precision, camera pivot and optimal zNear

Everyone who has done any non-trivial 3D programming knows that sooner or later you run into problems with z-buffer precision. Especially in a first-person situation you can have objects very near to the camera which should not clip against the near-plane, but you also want to draw objects far away (say, hills all the way to the horizon). So you need a small zNear and a large zFar, which runs into precision problems because of the non-linear screen-space z.

There are well-known solutions from depth partitioning to distorting the z values, but sometimes you just need a "bit more precision" and the above solutions just cause more trouble than they solve. If only you could move the zNear a bit further away.

Let's look at a first-person situation. The necessary (and sufficient) condition for avoiding any near-plane clipping (with any colliding geometry at least) is to place the near-plane completely inside the collision sphere (for rotational symmetry) of the camera. This can be a collision sphere of a player, or it can be the largest sphere that will fit inside the AABB or whatever else we collide with.

The question is, what is the largest possible value of the zNear (for a given field-of-view) that will keep the near-plane completely inside the collision sphere? Well, it depends on where you put the "eye" point.


As should be obvious from the image, the near-plane (and hence the zNear distance) is larger in the right image. The largest possible near-plane that will fit inside the sphere will always go through the sphere origin. It is important to realize that it doesn't matter (at all) whether the "eye" is inside the collision sphere: anything behind the near-plane gets clipped anyway, and the eye is purely virtual. Both placements avoid any near-plane clipping as long as no geometry gets inside the sphere.

In practice first-person cameras (and other cameras that you turn directly) usually use the example on the left (it's quite obvious when you zoom: if you don't see any "parallax" movement when turning, then it's using the "left" version). This is what any text-book view/projection matrices will give you (eg D3DX helpers and whatever) unless you explicitly move the camera somewhere else.

Now, I would like to argue that the placement on the right is better even if you don't have precision issues, because it ends up feeling much more natural (on top of giving you a larger optimal zNear, which is also easier to calculate). To understand why, think of the near-plane as a lens. For a narrow field of view (eg high zoom) the "virtual" eye should end up behind the "real" eye, and the pivot point should be much closer to the lens than to the virtual eye position.


When I actually tried this in practice, I immediately realized why zooming in most games has always felt "wrong" as far as turning goes: the pivot is in a totally wrong place with regard to the view. By placing the pivot at the near-plane I reduced the feeling of "tunnel vision" significantly, and high zoom no longer felt "shaky" at all. Turning actually gives you a sense of depth too (because of the slight parallax movement). You also see slightly more to your sides, which reduces the pressure to use a high field of view just to observe your surroundings (and you no longer need a high field of view to combat the weird pivot either). All that on top of getting a 4-8 times larger zNear (it's FoV dependent; with more "zoom" you get a larger zNear and hence more precision out in the distance) without any near-plane clipping. I can't name a single negative, really (go ahead and post a comment if you disagree).

So given a collision sphere radius and field of view, how do we calculate the optimal zNear when the near-plane goes through the camera origin? It turns out to be quite simple. Something like the following:

// fovY is vertical field of view in radians
// aspect is screenWidth / screenHeight
// solve half near-plane diagonal for nominal zNear = 1
float diag = tan(fovY / 2.f) * sqrt(1.f + aspect*aspect);

// solve zNear for diagonal to equal collision radius
float zNear = collisionRadius / diag;

Now you can calculate the view matrix as usual, except move the camera backwards by zNear, because the "camera" is really the "virtual eye" and we want the near-plane to go through the origin of the "real" camera.
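In other words, something along these lines (a sketch with made-up names, not tied to any particular math library):

// Sketch: pull the "virtual eye" back along the (unit-length) view direction
// so that the near-plane ends up passing through the real camera position.
struct Vec3 { float x, y, z; };

Vec3 virtualEyeForNearPlaneAtCamera(Vec3 cameraPos, Vec3 forward, float zNear)
{
    return { cameraPos.x - forward.x * zNear,
             cameraPos.y - forward.y * zNear,
             cameraPos.z - forward.z * zNear };
}
// Build the view matrix from this point (with the usual forward/up vectors) and
// use the same zNear in the projection matrix.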

Sunday, December 18, 2011

Why is math (written) so complex?

It's been ages since I wrote to this particular blog, but rather than start a new blog I'm going to use the existing one to mention some things that are on my mind these days.

We're going to start our series of rants with quaternion rotations. If you check the link, you'll find some annoyingly messy (long, error-prone to type) formulas for converting a rotation quaternion into a 3x3 rotation matrix. As far as I'm concerned this is yet another perfect example of how people tend to make math unnecessarily cryptic.

Let's assume that we can multiply quaternions and use this to rotate vectors (see the link, it's simple enough). Now if we rotate axial unit vectors (1,0,0), (0,1,0) and (0,0,1) using the quaternion, we end up with a rotated basis that we can turn into a matrix simply by using those rotated vectors as the columns of the new matrix. It so happens that the resulting matrix is the rotation matrix that we want.
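As a quick sketch (my own helper types and naming, not anything from the article), the whole conversion then looks like this:

// Sketch of the "rotate the basis vectors" approach. The quaternion type and
// the q * (0,v) * conj(q) rotation are written out by hand here.
struct Quat { float w, x, y, z; };          // w + xi + yj + zk, assumed unit length
struct Vec3 { float x, y, z; };

static Quat mul(Quat a, Quat b) {
    return { a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z,
             a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y,
             a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x,
             a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w };
}

static Vec3 rotate(Quat q, Vec3 v) {
    Quat p  = { 0, v.x, v.y, v.z };
    Quat qc = { q.w, -q.x, -q.y, -q.z };    // conjugate = inverse for a unit quaternion
    Quat r  = mul(mul(q, p), qc);
    return { r.x, r.y, r.z };
}

// The columns of the rotation matrix are the rotated axial unit vectors.
void quatToMatrix3x3(Quat q, float m[3][3]) {
    Vec3 cols[3] = { rotate(q, {1,0,0}), rotate(q, {0,1,0}), rotate(q, {0,0,1}) };
    for (int c = 0; c < 3; ++c) {
        m[0][c] = cols[c].x;
        m[1][c] = cols[c].y;
        m[2][c] = cols[c].z;
    }
}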

Now for efficiency purposes (in code) one might want to write out the formulas (if you don't trust your compiler's optimizer), so that all the products with zero can be dropped, all the products with one skipped, and the remaining terms simplified a bit... but why one would do such premature optimization when writing math for humans to read (as in Wikipedia) is totally beyond me.

PS. I figured I could take the opportunity to also complain about the general hand-waving about adding scalars and vectors in the above-mentioned Wikipedia article. The whole issue could easily be eliminated by defining a vector of the imaginary units, i = (i, j, k), and taking a dot product of it with the vector part of the quaternion. Then you'd end up with a + v·i, which is type-safe, everything being a scalar now.

Sunday, July 6, 2008

What if...

What if there were no lambda in Lisp, but defun returned the function it defines? Wouldn't it be possible to define a macro such as:


(defmacro lambda (args . body)
`(defun ,(gensym) ,args . ,body))


Please excuse the half-Scheme notation.

Saturday, January 19, 2008

Valoa kansalle

I had to throw away my old synth, for various reasons. Well, I didn't really throw it away, but it would take ages to make any sort of a product out of it, 'cos certain things weren't really done the way you would want to do them in a reliable software synthesizer.

The good news though is that for the past half a year or so, I've been working on a new codebase and a somewhat different concept. The new synth thingie I call 'Valo', and while it's nowhere near finished, it's got to the point where most things work (well, there's no PWM yet, modwheel/aftertouch/velocity support is missing, the MIDI side lacks quite a few things and...) and it makes sound.

At present it has completely redesigned oscillators and a new filter design (well, if you can call a ladder low-pass new anymore these days), and it does all kinds of fancy stuff like mm.. sawtooth and square waves (well, actually triangle as well). It can do stuff like oscillator sync (properly), and the signal path is so non-linear that it's a pain to tune everything, 'cos when the signal levels change in one place, everything behind it has to be retuned with proper levels again.. which fortunately is not that many things.

I can't say it sounds clean. With all the non-linearities causing intermodulation distortions and noise here and there making everything slightly unstable, it most certainly does not sound very clean. On the bright side, it can sound approximately as clean as your average analog synth.



[no-flash: mp3 here]
Yup, there's a sample. Thought you'd like one. There are some external effects (Kjaerhus Classic-series Chorus/Delay/Reverb) with pretty simple settings to make it sound less dry, as the synth itself doesn't have any add-on effects (I'm not a huge fan of those) and currently only ever outputs a mono signal (that might change, I think).

The patch itself is a "little of everything" thing with almost everything in the synth in use, including oscillators in synced mode, pitch envelope driving the slave, some (per-voice) distortion to trash it after filter, and finally both LFOs messing with the signal. No manual knob tweaking done. I'm sure you can tell I'm a very good sound designer.

Friday, January 5, 2007

A little teaser..

So here's a small screenshot of a filter. It's still lacking text labels and I'm not quite sure about the color scheme, but it's using the vector graphics library thingie.

The library itself needs a bit of tweaking before anything using it goes live. Maybe I'll release the library itself as well at some point, I don't know yet.



I was a bit worried about the performance, since it seemed to slow stuff down when redrawing a lot. That could happen for example when another window was moved on top of the thing. What was a bit strange, though, was that this would typically only happen after a while...

I was already going to write some bitmap caching logic to avoid doing the actual drawing unless something really changed in the window. Then my eyes caught something missing: I was leaking resources in my painting routine. A single line of code fixed that bug. Now it runs pretty nicely, just redrawing everything on the fly.

Sunday, December 31, 2006

Subdivision made visible

I'm currently working on stroking, and trying to come up with a decent algorithm for that stuff. The big problem here isn't really how to code stuff, but rather what things should look like.

Anyway, in order to visualize some problems for myself, I drew a stroke around a stroke. Then I got the idea that if I modified it a little bit, the individual line segments would become visible, and I kinda liked what the result looked like, as it nicely illustrates the bezier subdivision being done.

So in the picture below, there is a cubic bezier, which is subdivided into line segments. Then each of these line segments is made 15 pixels wide, and filled with white. Finally those are stroked with a black outline, which makes each individual segment clearly visible.

Saturday, December 23, 2006

Two dimensions for a change

Haven't got much of anything interesting added to the synthesizer thingie during the last few days. Instead, I've actually played around a bit in two dimensions. For a long time I've wanted a simple vector graphics library, and a few days ago I started playing with the math. The result is a rather boring system based on a couple of simple principles:

  1. There is canvas, masks, and paint; paint is applied to canvas through a mask.
  2. Geometric primitives (= line paths) work with masks.
  3. Colors, gradients, bitmaps, whatever, work with paint.
  4. To draw a white triangle, cut a triangle shaped hole in a mask, apply white paint.
The result looks like this (smaller one is screencap, bigger is the same thing magnified):


Now, at this point there isn't even a datatype for a path, so the individual lines in the mask must be drawn separately, but fixing that isn't a huge amount of work (actually probably less than writing this blog post). And I'd expect stuff like beziergons to work before christmas.

Now, what is nice about the design is that, as you can see, it antialiases rather beautifully. It actually draws what you'd get by doing 16x16 naive supersampling and averaging down. Of course all the supersampling is internal to the line drawing function. I'll probably make it 256x256 when I have the extra minute to tweak the algorithm a bit more.

The whole thing is a very basic scanline rasterizer, so the trick really is in the scanlines. A scanline rasterizer works by incrementing a counter every time it crosses a line going up, and decrementing the counter every time it crosses a line going down. Then you typically fill where the counter is either non-zero or odd, depending a bit on what you want.

The antialiasing then is actually quite easy: it's just a question of putting fractional values in the scanline buffer and drawing an anti-aliased line. The only important thing is that the line drawing function makes sure that when you start from the left side of the line and sum all the pixel values until you have crossed the whole line, the result is exactly one. In a sense, what is being drawn is the finite difference of the edge of an antialiased half-plane. I actually got the idea originally from anti-aliasing audio with BLEPs. Don't ask.

And the best thing? Since integration (and differentiation) are linear operations, one can draw each line independently, and just integrate the scanlines at the end. There aren't really any special cases whatsoever.
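A rough sketch of that final integration pass, assuming the per-edge rasterization has already accumulated the signed fractional deltas into a scanline buffer (the names and the non-zero fill rule here are my own choices):

// Sketch of the "integrate the scanlines at the end" idea, not the actual
// library code: each edge has already added signed fractional deltas into
// `delta` (one row per scanline); a running sum turns them into coverage.
#include <vector>
#include <cmath>
#include <cstdint>

void resolveCoverage(const std::vector<float>& delta, int width, int height,
                     std::vector<uint8_t>& coverageOut)
{
    coverageOut.assign(delta.size(), 0);
    for (int y = 0; y < height; ++y) {
        float acc = 0.f;                       // accumulated winding / coverage
        for (int x = 0; x < width; ++x) {
            acc += delta[y * width + x];       // integrate the finite differences
            float c = std::fabs(acc);          // non-zero fill rule
            if (c > 1.f) c = 1.f;
            coverageOut[y * width + x] = (uint8_t)(c * 255.f + 0.5f);
        }
    }
}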

Ps. I'm not sure if the above makes any sense. I might consider a better explanation some other day, but I now have to add those beziergons. :)

Saturday, December 16, 2006

Fast pow2 for pitch to frequency mapping

Ok, so I posted a thread about this on KVR, and obviously I'm not the only one to have thought about it (which was to be expected, of course), but I'm still posting this here for future reference, since I don't have anything else to post about today.

In synthesizers, being able to convert from pitch to frequency is necessary. In an equal tempered scale this involves the relation f = b*2^(p/12), where f is the frequency, b is the tuning base, and p is the pitch in semitones relative to the base tuning. This is trivially solved in C++ with b*pow(2.0, p/12.0). Trouble is, pow() takes an awful lot of time, so it's not really realistic if you want to do it on a per-sample, per-voice basis.

One would expect to find some nice code that solves this problem reasonably accurately for the purposes of synthesizers, yet is still reasonably fast. I've tried to find such code a few times, but always failed. By reasonably accurate I mean something that gives a resolution of about 1/100 of a semitone over the range of hearing (around 10 octaves). Most solutions I've seen are either

  1. wildly inaccurate or only accurate near 0
  2. only marginally faster than (and often slower than) what gcc gives you for pow
So is all hope lost? Not really, if one is willing to pay for a few lookup tables.

Now, looking at the facts, the accuracy requirements are measured in fractions of semitones. This suggests rounding before exponentiation should not cause problems, if the exponential itself can be calculated accurately. So how much accuracy does one need?

First of all, one only needs to consider one side, since 2^(-x) = 1/2^x. Let's say we want to go up and down 10 octaves; then we need at least 4 bits for the integer part. As for the fractional part, 1 octave = 12 semitones and 1 semitone = 100 cents, so there are 1200 cents to an octave. With 12 bits one gets 4096 tunings per octave. That's a total of 16 bits, which is more than enough.

A full lookup table for 16 bits would of course take 65536 entries, which at 32-bit floats is 256 kB. Not huge these days, but exponentials have the nice property that 2^(x+y) = 2^x * 2^y. This of course means one can use a separate table for the upper 8 bits and another for the lower 8 bits. That's 256 entries per table, both tables together taking 2 kB of memory.

So the algorithm for pow2(x) with two 8-bit tables then goes:
  1. if x < 0, compute 1/pow2(-x) instead
  2. multiply x by 2^12
  3. round/floor/whatever x into an integer (in C++ a cast works fine)
  4. do table lookups with the lowest 8 bits and the next 8 bits, and multiply the two values together
  5. return the result
Of course one can use more than two tables for higher accuracy or lower memory requirements. How to populate the lookup tables is left as an exercise for the reader.
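For reference, a minimal C++ sketch of the whole scheme anyway; the naming, the clamping and the table contents 2^(i/16) and 2^(j/4096) are my own choices for the layout described above:

// Sketch of the two-table pow2 described above (variable names are mine).
// 12 fractional bits => ~1/4096 of an octave resolution, well under a cent.
#include <cmath>

static float g_hiTable[256];   // 2^(i/16),   i = upper 8 bits of the fixed-point exponent
static float g_loTable[256];   // 2^(j/4096), j = lower 8 bits

void initPow2Tables()
{
    for (int i = 0; i < 256; ++i) {
        g_hiTable[i] = (float)std::pow(2.0, i / 16.0);
        g_loTable[i] = (float)std::pow(2.0, i / 4096.0);
    }
}

float fastPow2(float x)
{
    if (x < 0.f) return 1.f / fastPow2(-x);      // step 1: one-sided tables
    int fixedPoint = (int)(x * 4096.f);          // steps 2-3: 4.12 fixed point
    if (fixedPoint > 0xFFFF) fixedPoint = 0xFFFF; // clamp to the table range (~16 octaves)
    return g_hiTable[fixedPoint >> 8] * g_loTable[fixedPoint & 0xFF];   // step 4: combine
}

// Pitch (semitones relative to the tuning base) to frequency:
//   float freq = tuningBase * fastPow2(pitch / 12.f);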