Hermite Spline Tangent estimation for curve fitting.

Hi all

Dataset fitting? it must be trough the Least squares method! I keep banging in my head for few days refusing to dive into it because for anything but the trivial function

*f(x) = A + Bx*

it can easily overflow the brain cache and is a heavy calculation for realtime purposes… So I attempted several heuristics that failed for arbitrary datasets (user strokes in this case) and that was a non negotiable requirement for me.

So what’s the problem about? basically you have a stroke between 2 end points, actually just a set of points given because if I knew the stroke equation there would not be problem at all! And I needed to approximate that curve with a 3D Hermite spline of the form:

Where P0 and P1 are the known endpoints. So I need to find the tangents at each of those points M0 and M1 such the resulting spline was the closest possible to that curve.

Finally I decided to get my hands dirty with pen and paper (you thougth programming happens only at a keyboard and in front of a computer? that’s typing or implementing, programming is actually thinking and it happens even when you load the brain with a complex problem and go to sleep!)

I’m posting this mainly for me, so when I find a similar problem in the future and I have flushed my current memory of this, I will remember, I don’t want to struggle again with Google and Stackoverflow to find nothing related there. (or I did a wrong search 😉 ).

How the least squares method works?

You have some datasets {P0, P1 ,… Pi} you want to fit with a mathematical function **F(…)**. That means that the sum of the approximation errors of that function should be as little as possible, ideally 0 so would be a perfect fit. And the error can be calculated as the sum of all the tiny errors for each evaluated point of the dataset squared, because we don’t want errors canceling each other on both sides of the original curve.

Easy no? well, in order to minimize that error we should take the partials derivatives of that function and set it to 0 for every unknown we want to estimate. In this case we have 2 unknown tangents.

And here’s where the heavy stuff start showing.

I needed to substitute the equations with the original forms but this brings another problem: it has 3D vectors! and the solution is not as easy as solving the least square for every component (trust me, I even attempted that!) I had to flatten it to a scalar problem.

A coordinate system transformation could have done the trick or a different parametrization:

But equations where already large and complex enough once substituted and it doesn’t account the case where in the new parametric form 2 or more values could correspond to the same parameter (remember user strokes can be like Brownian motion :P) and perhaps I can explore it once I find a more canonical solution.

And then I realized I could minimize the error **LENGTH** (scalar) instead of the **vector** difference. And went back to the beginning rewriting the least square functions:

so in order to minimize the error:

Wow, three times longer than before, take into account that each of those terms are a full equation on their own! look at this sample of my pen work 😉

Is it correct? the reasoning was correct… so I blindly continue developing the solution, calculating the derivatives (Thanks Wolfram Alpa!) and then organize it into Matrix form (**AX = B**) to finally calculate the solution vector with the tangents magnitude… and crossed my fingers….

Wow, just wow, it worked flawlessly and finally integrate it into my branch of 3DCoat QuadStrips. Is mathematically heavy for realtime but it only runs once at the creation of the spline so performance here is not an issue.

The harder the problem the greater the self satisfaction I get 🙂

Cheers!