Fitting a Model to Data

  • Fitting a Model to Data

    Hi!

    The problem I have looks simple, but I could not find an efficient method to solve it.
    I have a set of data points that needs to be fitted to a model (function) of the form Y = aX^n / (1 + aX^n), where "a" and "n" are the unknown constants to be found for the best fit. The number of data points available is between 40 and 100. I know that the best "a" and "n" should minimize the sum of the squared errors between the Y values from the model and the actual Y data values for the given X values. However, I find it a bit too involved to translate this into a minimization algorithm. I wonder if anyone has coded a solution to a problem like this.
    Thanks,
    Sampath

  • #2
    > I wonder if there is anyone who has coded a solution to a problem like this
    Where have you looked, and how (i.e., which search terms did you use)?
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    http://www.talsystems.com


    • #3
      I have of course searched the forums with keywords like "curve fitting", "regression", "Simplex method", etc., but could not find a good solution. The closest I have come on the internet is a research paper, https://www.researchgate.net/publication/246199710_Fitting_curves_to_data_The_simplex_algorithm_is_the_answer , which does have code, but it is in Pascal and it's not all that clear.
      Sampath


      • #4
        Sampath

        Hmm interesting

        How do you define 'best fit'?????

        How do you know there is a formula that fits?

        Maybe there is always a formula, but in 'bad' cases the fit is not good and in 'good' cases the fit is good. (define 'good'!)

        Can you decide to leave out outliers? if so how many?

        I am not sure we have the right question yet!! But I am interested in the answer.

        Kerry
        I made a coding error once - but fortunately I fixed it before anyone noticed
        Kerry Farmer


        • #5
          Kerry,

          The available data points always fit the given formula reasonably well because the data comes from a defined process. As I mentioned in my first post, the "best fit" is the fit that has the minimum error or variance between the value of the function and the value of the data. The following is the representation of the error or the variance.

          [Image: fml.jpg]


          Sampath
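The attached image is not available in this text, so the following is only a reconstruction from the wording of post #1 (minimize the sum of squared differences between model Y and data Y). The symbols N (number of data points) and the subscript i are assumptions introduced here for illustration:

```latex
S(a, n) \;=\; \sum_{i=1}^{N} \left( Y_i - \frac{a X_i^{\,n}}{1 + a X_i^{\,n}} \right)^{2}
```

The "best fit" in this sense would be the pair (a, n) that makes S(a, n) as small as possible.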


          • #6
            Sampath - what you are describing sounds a lot like a LEAST SQUARES fit. I searched the forum for least squares and found a post with code from John Reinking that might be a jumping off point for you.

            See: https://forum.powerbasic.com/forum/u...t-fit-with-mat
            https://www.BcxBasicCoders.com
            BCX BASIC to C/C++ for Windows


            • #7
              Originally posted by Mr. Kevin Diggins:
              Sampath - what you are describing sounds a lot like a LEAST SQUARES fit. I searched the forum for least squares and found a post with code from John Reinking that might be a jumping off point for you.

              See: https://forum.powerbasic.com/forum/u...t-fit-with-mat
              The problem is that Sampath's formula is not a polynomial, so that method doesn't work.
              With a simple exponential function, you can work with the logarithms of the two sides, but having the variable in the divisor as well compounds the problem.

              Think about what the curve of that equation looks like. Can aX^n be negative? If so, what happens when it approaches -1?
              Now think about the square of any deviation close to there.

              Alternatively, think about what happens as aX^n increases (hint: Y converges on 1). Think about the "differences squared" as that happens, and their relative weights.


              • #8
                Stuart - You sound better equipped to help Sampath with his problem.

                Sampath stated:

                "I know that the best "a" and "n" should minimize the total of square of errors
                between Y values from the model and actual Y data values for the given X values."


                That sounds like the goal of a LS fit ... to minimize the sum of the squares of the residuals.

                I hope Sampath finds the help he needs.
                https://www.BcxBasicCoders.com
                BCX BASIC to C/C++ for Windows


                • #9
                  Thanks for the suggestions. However, polynomial curve fitting won't work, and a linear least squares fit won't work either because, as Stuart suggested, the model cannot be linearized by taking logarithms. It is, however, the goal of an LS fit, as Kevin says. The problem is how to code it, other than by substituting values for "a" and "n" in a nested loop and checking for the least error. That is not very efficient, since it's never an exact fit.
                  I forgot to mention that all data values are positive: Y always varies from 0 to 1, and X runs from 0 onwards. Therefore aX^n never becomes -1 but varies from 0 to a large value, because (aX^n + 1) in the end has to approach aX^n.
                  Sampath


                  • #10
                    Sounds like a tough nut to crack. All of my LS experience used a non-linear (iterative) software approach.
                    Typically, if a solution couldn't pass a chi-square goodness-of-fit test at a desired level (typically 5%), our data was re-measured.
                    https://www.BcxBasicCoders.com
                    BCX BASIC to C/C++ for Windows
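One way to read the "non-linear (iterative) approach" is a damped Gauss-Newton iteration (in the spirit of Levenberg-Marquardt). The Python sketch below is only an illustration and is not code from anyone in this thread; the names `model`, `sse` and `fit_lm` are made up here. It assumes X > 0 for the points it uses (X = 0 gives Y = 0 for any positive a and n, so those points are skipped), and it fits ln(a) instead of a so that both parameters have a similar scale.

```python
import math

def model(x, log_a, n):
    """Y = a*x^n / (1 + a*x^n) with a = exp(log_a), written as a logistic in ln(x)."""
    z = log_a + n * math.log(x)
    z = max(-700.0, min(700.0, z))          # keep exp() from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def sse(points, log_a, n):
    """Sum of squared errors between data Y and model Y."""
    return sum((y - model(x, log_a, n)) ** 2 for x, y in points)

def fit_lm(points, log_a=0.0, n=1.0, max_iter=200, tol=1e-12):
    """Damped Gauss-Newton fit; returns (a, n, sum_of_squared_errors)."""
    pts = [(x, y) for x, y in points if x > 0]   # assumption: skip X = 0
    lam = 1e-3                                   # damping factor
    best = sse(pts, log_a, n)
    for _ in range(max_iter):
        # Accumulate the 2x2 normal equations (J^T J) * step = J^T r.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in pts:
            f = model(x, log_a, n)
            r = y - f
            d_log_a = f * (1.0 - f)              # dY/d(ln a)
            d_n = d_log_a * math.log(x)          # dY/dn
            a11 += d_log_a * d_log_a
            a12 += d_log_a * d_n
            a22 += d_n * d_n
            g1 += d_log_a * r
            g2 += d_n * r
        d11 = a11 * (1.0 + lam)                  # damp the diagonal
        d22 = a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        if det == 0.0:
            break
        step_log_a = (g1 * d22 - g2 * a12) / det
        step_n = (g2 * d11 - g1 * a12) / det
        trial = sse(pts, log_a + step_log_a, n + step_n)
        if trial < best:                         # accept the step, relax damping
            converged = (best - trial) < tol
            log_a += step_log_a
            n += step_n
            best = trial
            lam /= 10.0
            if converged:
                break
        else:                                    # reject the step, damp harder
            lam *= 10.0
            if lam > 1e12:
                break
    return math.exp(log_a), n, best
```

With the sample data as a list of (X, Y) tuples, a call would look like `a, n, err = fit_lm(data)`. Like any iterative method it needs a sensible starting guess, and it can settle on a local minimum if the start is poor.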


                    • #11
                      It seems some real ordered sample data for X, Y might be in order, just so... well, some of us might think it fun to putter with this.
                      Rod
                      In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.


                      • #12
                        Here is a sample of data (X and Y)
                        0 0.001172
                        15 0.002089
                        30 0.004358
                        45 0.00917
                        60 0.014302
                        75 0.018572
                        90 0.022369
                        105 0.028322
                        120 0.034635
                        135 0.03951
                        150 0.043753
                        165 0.049041
                        180 0.056217
                        195 0.064367
                        210 0.0725
                        225 0.079812
                        240 0.087602
                        255 0.09606
                        270 0.105185
                        285 0.116695
                        300 0.130202
                        315 0.14511
                        330 0.160726
                        345 0.176354
                        360 0.192689
                        375 0.209502
                        390 0.224502
                        405 0.238829
                        420 0.254549
                        435 0.272126
                        450 0.290947
                        465 0.310931
                        480 0.331463
                        495 0.35174
                        510 0.373755
                        525 0.397203
                        540 0.421397
                        555 0.446109
                        570 0.471301
                        585 0.495443
                        600 0.517423
                        615 0.538116
                        630 0.559019
                        645 0.579448
                        660 0.598441
                        675 0.616573
                        690 0.633307
                        705 0.650669
                        720 0.668808
                        735 0.684894
                        750 0.699822
                        765 0.714087
                        780 0.728176
                        795 0.74201
                        810 0.755438
                        825 0.769306
                        840 0.782233
                        855 0.794418
                        870 0.806666
                        885 0.817972
                        900 0.828448
                        915 0.83833
                        930 0.846934
                        945 0.855213
                        960 0.86416
                        975 0.872702
                        990 0.880502
                        1005 0.888126
                        1020 0.895494
                        1035 0.903608
                        1050 0.910941
                        1065 0.917334
                        1080 0.924702
                        1095 0.932091
                        1110 0.938193
                        1125 0.943087
                        1140 0.947844
                        1155 0.954032
                        1170 0.959667
                        1185 0.963032
                        1200 0.96626
                        1215 0.969689
                        1230 0.972793
                        1245 0.975423
                        1260 0.978571
                        1275 0.982654
                        1290 0.986333
                        1305 0.990144
                        1320 0.993897
                        1335 0.997166
                        1350 0.999416
                        1365 1
                        Sampath


                        • #13
                          Thank you!
                          Rod
                          In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.


                          • #14
                            Does it have to be of the form Y = aX^n / (1 + aX^n)?
                            A fifth or sixth order polynomial fits that data almost perfectly, and deriving that in PB is trivial.

                            [Image: 5thorder.jpg]


                            • #15
                              Stuart, yes it has to be that since 'a' and 'n' are needed for further processing.
                              Sampath


                              • #16
                                Originally posted by Sampath Weragoda:
                                Stuart, yes it has to be that since 'a' and 'n' are needed for further processing.
                                OK, more thought needed


                                • #17
                                  My ignorance is probably showing here, but how can Y = 1 when the denominator and the numerator can never be equal?

                                  Edit: OK, it is rounded to so many significant digits.

                                  Anyway, if you convert the y values to yt = y/(1-y), then yt = a*x^n (where y <> 1).
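Taking this transformation one step further: if y/(1-y) = a*x^n, then ln(y/(1-y)) = ln(a) + n*ln(x), which is a straight line in ln(x), so ordinary linear least squares applies. The Python sketch below is an illustration only (the function name `fit_loglog` is made up here). It has to drop any point with x = 0, y = 0 or y = 1, because the logarithms do not exist there. It also minimizes the error in the transformed values rather than in Y itself, so points near Y = 0 and Y = 1 carry different weight than in the original problem.

```python
import math

def fit_loglog(points):
    """Estimate a and n from the line ln(y/(1-y)) = ln(a) + n*ln(x)."""
    # Keep only points where both logarithms exist: x > 0 and 0 < y < 1.
    pairs = [(math.log(x), math.log(y / (1.0 - y)))
             for x, y in points
             if x > 0 and 0.0 < y < 1.0]
    count = len(pairs)
    if count < 2:
        raise ValueError("need at least two usable points")
    mean_t = sum(t for t, _ in pairs) / count
    mean_v = sum(v for _, v in pairs) / count
    # Ordinary least-squares slope and intercept of v against t.
    s_tt = sum((t - mean_t) ** 2 for t, _ in pairs)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in pairs)
    if s_tt == 0.0:
        raise ValueError("all usable x values are identical")
    n = s_tv / s_tt                       # slope is n
    a = math.exp(mean_v - n * mean_t)     # intercept is ln(a)
    return a, n
```

Called as `a, n = fit_loglog(data)` with the sample data as (X, Y) tuples, the result could also serve as a starting guess for an iterative fit of the untransformed model.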


                                  • #18
                                    Mark,
                                    Thanks for the suggestion. It does look very promising. I could take the maximum Y as 0.9999-something, which for all practical purposes is sufficient. This model is actually for a chemical conversion process, and Y gives the degree of completion. So "1" means the process is 100% complete, and 99.999% would not make a big difference for my purpose.

                                    Though your suggestion can be used to solve this, it would be useful to have a generalized algorithm to solve this kind of problem in case the model (equation) needs to be modified.
                                    Appreciate your help.



                                    Sampath
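For a generalized approach that survives a change of model, a derivative-free minimizer such as the Nelder-Mead simplex method (the "Simplex method" mentioned in post #3) only needs a function that returns the sum of squared errors. The Python sketch below is a simplified variant written for illustration (reflection, expansion, inside contraction and shrink only); it is not the Pascal code from the paper, and the names `nelder_mead`, `make_sse` and `hill` are made up here. The example model fits ln(a) rather than a and treats X = 0 as Y = 0, which assumes n > 0.

```python
import math

def nelder_mead(f, start, step=0.5, max_iter=2000, tol=1e-14):
    """Minimize f(params) from a starting list; returns (best_params, best_value)."""
    dim = len(start)
    simplex = [list(start)]
    for i in range(dim):                       # initial simplex: start + offsets
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(dim + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]  # best first, worst last
        values = [values[i] for i in order]
        if values[-1] - values[0] < tol:
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / dim for j in range(dim)]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_reflected = f(reflected)
        if f_reflected < values[0]:
            expanded = along(2.0)
            f_expanded = f(expanded)
            if f_expanded < f_reflected:
                simplex[-1], values[-1] = expanded, f_expanded
            else:
                simplex[-1], values[-1] = reflected, f_reflected
        elif f_reflected < values[-2]:
            simplex[-1], values[-1] = reflected, f_reflected
        else:
            contracted = along(-0.5)
            f_contracted = f(contracted)
            if f_contracted < values[-1]:
                simplex[-1], values[-1] = contracted, f_contracted
            else:                              # shrink everything toward the best
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (v - b) for b, v in zip(best, p)]
                                    for p in simplex[1:]]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    best_index = min(range(dim + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]

def make_sse(model, points):
    """Build the sum-of-squared-errors function for any model(x, params)."""
    def sse(params):
        return sum((y - model(x, params)) ** 2 for x, y in points)
    return sse

def hill(x, params):
    """Y = a*x^n / (1 + a*x^n) with params = [ln(a), n]; assumes n > 0."""
    log_a, n = params
    if x <= 0:
        return 0.0
    z = max(-700.0, min(700.0, log_a + n * math.log(x)))
    return 1.0 / (1.0 + math.exp(-z))
```

Usage would be along the lines of `params, err = nelder_mead(make_sse(hill, data), [0.0, 1.0])`, followed by `a = math.exp(params[0])` and `n = params[1]`. Swapping in a different model only means writing another function with the same `(x, params)` shape.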


                                    • #19
                                      FWIW, using the formula from post #1 with a = the average of all given Xs and n = the average of all given Ys gives a value of 0 for the first pair, while all subsequent pairs range from 0.999668 to 0.999972.
                                      You may want to use the variance instead of averages or some other derivative or a combination thereof. You might try those values for 'a' and 'n' with Mark's suggestion as well.
                                      Of course, I could be out to lunch as well.
                                      Rod
                                      In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.


                                      • #20
                                        Houston, we have a problem. You can't fit your data to your formula!

                                        Your data is clearly some sort of "S" curve (logistic?).
                                        But your formula must, by its nature, generate a logarithmic curve for any values of a and n.

                                        Typically something like the red line:
                                        [Image: SversusLog.jpg]
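A quick check of the formula's shape may be worth doing before ruling it out. Differentiating Y = aX^n / (1 + aX^n) twice shows the curvature changes sign where aX^n = (n-1)/(n+1), which only exists for n > 1. So for n <= 1 the curve is concave throughout (like a logarithmic-looking curve), while for n > 1 it is S-shaped. The Python sketch below checks this numerically; the function names and the trial values of a and n are arbitrary choices made for illustration, not fitted values.

```python
def hill(x, a, n):
    """The model from post #1: Y = a*x^n / (1 + a*x^n)."""
    t = a * x ** n
    return t / (1.0 + t)

def has_inflection(a, n, xs):
    """True if the curve bends upward and then downward on evenly spaced xs.

    Uses finite second differences, so xs must be equally spaced.
    """
    ys = [hill(x, a, n) for x in xs]
    second = [ys[i - 1] - 2.0 * ys[i] + ys[i + 1] for i in range(1, len(ys) - 1)]
    return any(p > 0 and q < 0 for p, q in zip(second, second[1:]))

# Same X grid as the sample data, leaving out X = 0.
xs = list(range(15, 1366, 15))
print(has_inflection(1e-8, 3.0, xs))   # n > 1: expect an S-shaped curve
print(has_inflection(0.01, 0.5, xs))   # n <= 1: expect no inflection
```

Whether an S-shaped member of this family matches the sample data closely enough is a separate question that only an actual fit can answer.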
