Background - Pitfalls of Curve Fitting

Background

The background for this talk goes back to 1994, when I started working for Jandel Scientific Software. During my first two months I was a temporary tester for what was going to be SigmaPlot 1.0 for Windows and SigmaStat 1.0 for Windows. They were new products that were being updated from their DOS versions.

I started getting involved with statistics and became the in-house expert in the Technical Support and Quality Assurance departments. One of our software programs, called TableCurve 2D, was specially designed for curve fitting data. During my time at Jandel, I was constantly fielding phone calls, e-mail messages and other inquiries about curve fitting. Usually they were from people who noticed very strange results when attempting to fit a data set with a given equation.

Since I ran into so many of these problems, I wanted to write a paper on them for future reference. Well, it turned out that there was a short tutorial on this in the TableCurve manual. It was the starting point for this paper, and it included a lot of examples like the ones I was running into. The only items I added were the actual data sets and equations that were used by customers. In a way, you could say that this paper is in fact a collage of information from a variety of sources, including the data sets used for the TableCurve 2D manual, customer data sets and other in-house research I was doing during that time.

Another problem I ran into was the lack of papers on this issue. I was able to find papers on curve fitting techniques, but not very many that also dealt with these pitfalls. This is what makes this paper unique. It combines the best of both worlds.

Although this paper goes into a fair amount of detail, I don't consider myself to be an expert. After all, it's been a few years since I was up to my eyeballs in this stuff. I would say that I'm very familiar with the statistical processes behind curve fitting and how they affect the result. The professional and research statisticians who work with this every day are the real experts.

My sincere thanks go to Ron Brown, who was the developer for TableCurve 2D. He was very helpful in giving me a lot of insight into the statistical and artistic sides of curve fitting. A lot of the information in his manuals was used for this paper. Without his help and mentoring, this paper wouldn't have been written.

I eventually started piecing together various problems from customers, and decided to put the best ones in the paper. It was a lot of work not only to simplify each problem, but also to present it in an informative and educational way. This took a very long time, roughly 17 months from start to finish.

The issue with curve fitting programs is that using them is almost considered a form of "black magic": you plug in equations and data, click on the "Go" button, and your results appear. When things go wrong, it usually takes either a statistician or another professional who's familiar with curve fitting techniques to figure out the problem. The really hard part is trying to explain the solution in plain English to someone who has little or no background in curve fitting itself, let alone the statistical aspects of it. This is why curve fitting is both an art and a science rolled into one.

I decided to give a presentation at my alma mater, Sonoma State University, in Rohnert Park, CA. In March of 1998, I spoke during the M*A*T*H Colloquium Lecture Series, which is a semester-long seminar on various topics in mathematics and related disciplines.

This was a "look what I'm doing in the real world" type of talk for the current students. During the presentation, things didn't go too well for me. Basically, I bombed, or at least it seemed that way to me. My sister said I did fine, though. Perhaps it was the nervousness of giving a major presentation in front of many of my former instructors. Either way, it could've gone much smoother.