Wednesday, October 20, 2021

Vertex normals the easy way

Everywhere you look, people suggest that in order to compute vertex normals for smooth shading, you should first compute face normals and average those. If you're doing this on the fly on the GPU, that's an extra pass and just incredibly silly. Here's an easier way: find the ring of vertices around every vertex, ordered in some consistent winding order (or just the vertices connected by edges, unless the vertex has valence 2, in which case keep the face verts too; this saves half the work on quad meshes), and store these lists in a flat array (e.g. a buffer texture, whatever) with offset+size in the actual render-vertex. Then loop over the ring, subtract the center vertex's position from each neighbor, take the cross-product of each sequential pair of edge vectors, accumulate the results and normalize. For closed meshes, that's it. For borders and corners, if two sequential ring vertices are not connected by a face, add the center vertex itself as a "dummy edge" so the cross-product comes out zero and you save a branch. If you don't mess with the lengths of the cross-products when accumulating, the weighting rule ends up roughly sin(angle)*area, which is typically more or less what you'd want anyway. There you go. No need to have face normals at all.
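
For concreteness, here's a minimal C++ sketch of the accumulation (the names and layout are mine, just for illustration; on the GPU you'd do the same thing in a shader, with the ring lists in a buffer texture):

#include <cmath>
#include <vector>

struct vec3 { float x, y, z; };
static vec3 operator-(vec3 a, vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static vec3 operator+(vec3 a, vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static vec3 cross(vec3 a, vec3 b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
static vec3 normalize(vec3 v) {
    float r = 1.f / std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    return {v.x*r, v.y*r, v.z*r};
}

// pos: all vertex positions; ring: the flat array of per-vertex neighbor lists;
// off/count: this vertex's slice of the ring (stored with the render-vertex).
// Border rings are assumed to already contain the dummy "center vertex"
// entries described above, so no branching is needed here.
vec3 vertexNormal(const std::vector<vec3>& pos, const std::vector<int>& ring,
                  int off, int count, int center)
{
    vec3 p = pos[center], n = {0, 0, 0};
    for (int i = 0; i < count; ++i) {
        vec3 e0 = pos[ring[off + i]] - p;
        vec3 e1 = pos[ring[off + (i + 1) % count]] - p;
        n = n + cross(e0, e1);  // lengths kept, so weighting ~ sin(angle)*area
    }
    return normalize(n);
}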

Friday, February 5, 2021

Bunny-JIT

Because the world needs a simple optimizing JIT that doesn't need 20GB of disk space to build...

github.com/signaldust/bunny-jit

Wednesday, December 20, 2017

Closed hashing probes and clustering..

Closed hashing with linear probing has a problem: once we hit a filled slot, there's a good chance we'll be searching O(n) through a cluster of collisions. This is well known and called primary clustering.

So, we'll use quadratic probing? Using (hash + ((i+i*i)>>1)) % N with N = 2^n tables we will still probe all the slots ((i+i*i)>>1 is the i-th triangular number), but probe sequences starting from different slots will diverge. Unfortunately they will still match if we start from the same slot (we're only using n bits of the hash), which is known as secondary clustering and is also a well-known problem.

The usually suggested remedy for secondary clustering is double hashing. We take another hash (or even just hash the original full hash again; this is enough to avoid the "only using n low-order bits" problem as long as the second hash is good at mixing the bits.. even just bit-reversing the first hash could work), let's call it hash2, and use it to control the probe sequence. The standard approach is (hash + (hash2|1)*i) % N (where the bitwise-or with 1 forces an odd step, which guarantees a full-period sequence for 2^n tables), and this is now perfect?
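
As a concrete sketch (for a power-of-two table with mask = N-1; the function name is mine), that probe is just:

#include <cstddef>

static size_t doubleHashProbe(size_t hash, size_t hash2, size_t i, size_t mask) {
    return (hash + (hash2 | 1) * i) & mask;  // odd step => full period mod 2^n
}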

What about the case where we are really unlucky and a bunch of entries all get the same low bits in hash2 and then collide? Now they could theoretically cluster just like in linear probing, since they end up following the same linear sequence. This obviously shouldn't happen if hash2 is good, just like we shouldn't get a collision in the first place if hash is good.. so maybe we should think about it anyway?

So why not fix that too? It doesn't even seem complicated. See, we can combine the double hash with the quadratic probing and end up with (hash + (hash2|1)*((i+i*i)>>1)) % N. That's still guaranteed to hit every slot of a power-of-two table (in a "randomly" permuted order)! If two entries with the same low bits of hash2 collide, then the quadratic term forces them to diverge as long as their home slots differ. If two entries have the same home slot (same low bits of hash), then hash2 forces a different permutation.
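
In code, the combined probe is tiny; here's a sketch for a power-of-two table (the function name is mine, just for illustration):

#include <cstddef>

// double hashing applied to a quadratic (triangular-number) probe sequence,
// for a table of size N = 2^n, mask = N-1
static size_t combinedProbe(size_t hash, size_t hash2, size_t i, size_t mask) {
    size_t tri = (i + i * i) >> 1;             // 0, 1, 3, 6, ... visits every
                                               // slot mod 2^n on its own
    return (hash + (hash2 | 1) * tri) & mask;  // odd multiplier keeps the
                                               // sequence a permutation of the slots
}

The lookup loop is then the usual one: probe with i = 0, 1, 2, ... until you hit an empty slot or find the key.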

Now, it seems like the above would give a rather "fool-proof" collision resolution, with little cost over regular double hashing, and it's even somewhat resistant to the second hash being poor (you still get the quadratic probe). The only way you can then cluster is by hitting both primary AND secondary clustering at the same time (i.e. colliding in both hashes at once).

So .. why have I not seen this mentioned anywhere? Am I missing some non-obvious problem?

Wednesday, December 10, 2014

Lore - simple regex library for C++

I finally got tired of not having an incremental (character at a time), linear-time (I wanted predictable time and space complexity) regular expression library in C++ that I could use in various situations (from text search to input validation to lexing tokens to whatever else).

Maybe there is already a better library for this, but it didn't seem like any of the "usual suspects" could really fulfill my requirements (either they are exponential-time, need the whole input at once, or quite commonly both), so I decided to port my older regex implementation (written in a custom Lisp) to C++ for use in native projects.

I also tried to make the API as simple and flexible as possible. While the implementation isn't anything special, the code is available on GitHub in case someone else needs something similar.

Thursday, December 19, 2013

Silly pattern-matching macro

One of the things I somehow always end up doing with Lisp code is transforming one type of list-structure into some other type of list-structure, where the rules of transformation are simple pattern matches..

That's probably not very surprising, considering some "other" languages are built completely on pattern matching, but one of the cool things about Lisp is that you can add all the missing syntax quite easily. So today I figured I'd hack together a simple pattern matcher in (my own "toy" dialect) µL.

The syntax of the language should be quite obvious to anyone who's ever written any Scheme (it's mostly just shorter), so I won't bother translating it, but the idea is to add a "case-match" form (that should probably get a shorter name) so you can do things like this:

.uL.>
(def (let-format form)
  (case-match form
    (((fn ,args . ,body) . ,binds)
     `(let ,(map list args binds) . ,body))
    (,other other)))

 .r = #!<fn let-format>

.uL.> (let-format '((fn (a b) (+ a b)) 1 2))

 .r = (let
        ((a 1)
         (b 2))
        (+ a b))

See, we match some arbitrary list-structure against each of the case-patterns, and pick the first one that applies. In a pattern, ,symbol will match anything and capture that symbol as a binding to the matched form in the code for that case, essentially like a destructuring bind, except that without the unquote, symbols match literally.

Nothing too fancy, not worth writing a blog post about.. but the cool thing was how simple the implementation turned out to be. I actually wonder why I've not done this earlier (veteran Lisp programmers, please stop laughing). The generated code only walks the input once per pattern and aborts early, so it's reasonably efficient too. It could be extended further, but the point is: it's not hard at all to add this functionality to Lisp-like languages. It's here:

;;; main dispatch, with the actual macro
;;; simply binds the "what" part to a symbol
;;; so it can be any arbitrary expression
(macro (case-match what . cases)
  (let ((wsym (gensym)))
    `(let ((,wsym ,what))
       ,(case-match-build (gensym) wsym cases))))

;;; This builds the main (if .. (if .. (if ..))) structure.
;;; The code built by case-match-case returns either closure
;;; or nil (no match), so we check and call closure if any.
(def (case-match-build tmp wsym cases)
  `(let ((,tmp ,(case-match-case wsym (car cases))))
     (if ,tmp 
       (,tmp) 
       ,(if (cdr cases) 
          (case-match-build tmp wsym (cdr cases))))))

;;; Now, what case-match-case does is start a CPS-style
;;; macro-expansion, because we want the final code to only
;;; recurse the match once (for efficiency). So we pass around
;;; the code that we want to insert in the "tail-position" in
;;; pretty much the same way we'd pass around continuations.
(def (case-match-case wsym 1case)
  (let ((pat (car 1case))
        (body (cdr 1case)))
    (case-match-worker wsym pat
      `(fn () . ,body))))

;;; So here we go: build match for a sub-pattern.
;;; The first cond-branch will do the unquote-match,
;;; and introduce bindings. The second one expands the
;;; CPS-style recursion and the last-one will match
;;; a literal, and either keep going, or abort.
(def (case-match-worker wsym pat k)
  (cond
    ((and (cons? pat) 
          (eq (car pat) 'unquote)
          (and (cons? (cdr pat))
               (symbol? (cadr pat))
               (not (cddr pat))))
     `(let ((,(cadr pat) ,wsym))
        ,k))
    ((cons? pat)
     (let ((pat1 (car pat))
           (patr (cdr pat))
           (ws1 (gensym))
           (wsr (gensym)))
      `(if (cons? ,wsym)
        (let ((,ws1 (car ,wsym))
              (,wsr (cdr ,wsym)))
          ,(case-match-worker ws1 pat1
             (case-match-worker wsr patr k)))
        ())))
    (t `(if (eq ',pat ,wsym) ,k ()))))


That's not too bad; no more manual destructuring for pattern matches. The main thing I don't like is the gensym-spam with a lot of temporary bindings.. which wouldn't be that bad if it weren't for the lack of "safe-for-space" closures in my current implementation of the language itself. But this thing should help with building a better compiler to fix that.

Saturday, April 13, 2013

regexes in interpreted language.. wut?

I realized that in order to do any sort of sensible string-processing, I'd need at least basic regular expressions in µL. So I wrote a regular expression implementation, and I wrote it in µL itself.. well.. because I wanted to see how practical it would be in terms of performance.. and well.. because it was fun.. and, well, we'll get to that.

So far it's pretty basic (grouping, alternatives, greedy/non-greedy.. it still needs {n,m} counted repetition, character classes and a bunch of escapes, but those are simple parser extensions) and doesn't have nice "gimme the match string" or "replace stuff" front-end functions.. but it does track sub-expressions (so replace is just a question of front-end), and runs online (meaning possibly infinite text) in constant space and linear time (well, obviously O(n*r) time and O(r) space for input length n and regex length r).

It's not very special; in fact it's a pretty stock "Thompson NFA" .. but since it is essentially an interpreter for a non-deterministic finite-state machine, written in an interpreted (sort of) language, it's obviously "not very fast". Still, I expected it to perform much worse when there are a lot of parallel states to maintain. Turns out it's not nearly so horrible, and the predictable nature of O(n) execution means you can optimize a regex for size instead of trying to work around matching problems. I really wonder why linear time isn't more popular.
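
To make the linear-time claim concrete, here's roughly what the core of a Thompson-style simulation looks like (a hedged C++ sketch, not the actual µL code): keep the set of live NFA states and advance all of them in lockstep for each input character, so every step costs at most O(r) regardless of the input.

#include <algorithm>
#include <string>
#include <vector>

struct State { int ch, out1, out2; };  // ch == -1: epsilon split, ch == -2: match

// add a state to the set, following epsilon splits; 'in' dedupes per step
static void addState(const std::vector<State>& nfa, int s,
                     std::vector<char>& in, std::vector<int>& set) {
    if (s < 0 || in[s]) return;
    in[s] = 1;
    if (nfa[s].ch == -1) {                     // split: follow both arms
        addState(nfa, nfa[s].out1, in, set);
        addState(nfa, nfa[s].out2, in, set);
    } else set.push_back(s);
}

static bool matches(const std::vector<State>& nfa, int start, const std::string& text) {
    std::vector<int> cur, next;
    std::vector<char> in(nfa.size(), 0);
    addState(nfa, start, in, cur);
    for (char c : text) {                      // O(r) work per input character
        next.clear();
        std::fill(in.begin(), in.end(), 0);
        for (int s : cur)
            if (nfa[s].ch == c) addState(nfa, nfa[s].out1, in, next);
        std::swap(cur, next);
    }
    for (int s : cur) if (nfa[s].ch == -2) return true;
    return false;
}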

I guess I've gotten too used to the now-common "back-tracking" (or "depth-first") implementations myself as well. Those get exponentially slower when your expressions get more complex or your input data more degenerate (so they need to try more paths with longer look-aheads), so the performance can vary quite unpredictably. With a linear-time engine, even with all the constant factors being about 2^10 times larger than necessary (as in my case), it still feels quite usable (sort of like Tcl on a 486). Obviously no back-references this way.. but I've never managed to use them for anything anyway (it's easier to cascade a regex-lexer with a simple ad-hoc parser), so personally I won't care.

Anyway, one cool thing (and the reason I resisted the temptation to write the thing as "primitives" in C) is that nothing (well, almost nothing) in it relies on the input being a string (or even a sequence of characters). You could just as well feed it a list, and then use fancy transition predicates that test the individual items against whatever. The regex parser won't build such machines (not yet anyway), but the evaluator can already process them. Since the predicates can be anything, you could match recursively against other patterns (or just the same one).. which would still run in O(n) time (but then certainly requires worst-case O(n) space) assuming the tree really is a tree (and not a DAG).

Curiously, with a slightly more complex algorithm it should be possible to do similar matching for cyclic graphs too, in O(n*r) time and space, using the rule that if a recursive rule matches up to a cycle back to the same node with the same rule (i.e. visit each [inputNode, matcherState] pair only once), then it's a match. Could be fun to build an engine that can do that, but.. I guess it's not possible to get worst-case O(n*r) for cyclic input without modifying the input (at least temporarily); hash-maps would give a "typical" bound, obviously.

Monday, April 8, 2013

µL hacks

Today I randomly got the idea "wouldn't it be cool to be able to quasiquote strings" and tried implementing such a feature in my poor little Lisp-variant. Here's how it works:

When you write {string} it gets parsed exactly as if you wrote "string", except the normal string escapes are disabled, so {foo"bar\} results in a string that you would normally have to type as "foo\"bar\\", which on its own is quite handy. However, you can also use ~ (maybe I should change that to ^ though): when the reader sees it, it goes back to reading a normal Lisp expression. Once that finishes, it keeps reading the string again (and reads another expression the next time it sees ~, and so on), and when it finally sees } it turns the whole thing into a strcat call (which is string concatenation):

{1 + 1 = ~(int->string (+ 1 1))!}
will read as
(strcat "1 + 1 = " (int->string (+ 1 1)) "!")
and evaluate as
"1 + 1 = 2!"
Naturally, since ~ just calls the reader recursively, you can nest such constructs just like any other code... so with a grand total of 11 short lines of code, I just added a rather natural "template" feature to my little language. :)
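
For the curious, here's a rough C++ sketch of the expansion (this is not the actual reader code, which builds list structure directly and recurses into itself for ~; this version just produces the textual form, and assumes the input is well-formed with each ~-expression fully parenthesized):

#include <string>

std::string expandTemplate(const std::string& s) {  // s is the {...} body, braces stripped
    std::string out = "(strcat \"";
    for (size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '~') {
            out += "\" ";                           // close the current literal piece
            int depth = 0;
            do {                                    // copy one balanced (...) form verbatim
                if (s[i + 1] == '(') ++depth;
                if (s[i + 1] == ')') --depth;
                out += s[++i];
            } while (depth > 0);
            out += " \"";                           // and open the next literal piece
        } else {
            if (s[i] == '"' || s[i] == '\\') out += '\\';  // re-escape for printed form
            out += s[i];
        }
    }
    return out + "\")";
}

So expandTemplate("1 + 1 = ~(int->string (+ 1 1))!") produces exactly the strcat form above.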

PS. Yeah, it'd be nice if such hacks could be done "on the fly" without changing the reader (like you can in some "other Lisps"), but currently the dispatch loop where { needs to be intercepted isn't table-driven yet (several other things, like string escapes, #\newline style character constants and special values like #!eof, actually are table-driven, mainly because that way the pretty-printer will always match what the reader expects.. I should fix the dispatch too, I guess).

Another small (though somewhat larger) µL hack was to add support for "alien" types: these are basically opaque typed pointers from the Lisp heap to somewhere else. Now it's simple to add new types for external resources (like files, or C-heap types, whatever), while still handling garbage collection sensibly (both collecting such external objects when desired, and allowing the external objects to keep references to Lisp heap objects that might move arbitrarily). Fortunately it turned out to be ridiculously simple to add finalization and a bunch of visitor callbacks to the garbage collector (which isn't that fancy.. just a simple Cheney with a rudimentary nursery to deal with the worst of the "garbage on arrival" that the interpreter creates fast enough that I should probably add proper generational logic).
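
To make that a bit more concrete, here's a hedged C++ sketch of the shape of the thing (the names and the exact hooks are mine, not µL's actual internals):

#include <cstdio>

struct Obj;                                    // a (movable) Lisp heap object
typedef void (*VisitFn)(Obj **slot);           // GC rewrites *slot when the object moves

struct AlienType {
    const char *name;                          // for printing/debugging
    void (*finalize)(void *payload);           // called when the cell is collected
    void (*visit)(void *payload, VisitFn fn);  // report any Obj* slots the payload holds
};

struct Alien {                                 // the heap cell itself
    const AlienType *type;
    void *payload;                             // opaque pointer outside the Lisp heap
};

// e.g. a file type that just closes its handle and (in this sketch, signalled
// by the null visit hook) holds no references into the Lisp heap:
static void fileFinalize(void *p) { std::fclose((std::FILE *) p); }
static const AlienType fileType = { "file", fileFinalize, nullptr };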

PS. I should probably upload the whole thing somewhere... but I'd like to fix a few more broken/missing features first, so that it might be even theoretically possible to find some practical utility in it, other than messing around in the REPL.