Curve Smoothing

  • Curve Smoothing

    I have orbital latitude/longitude data for the International Space Station, but it has only 0.1 degree of resolution. At the equator that's about 8 statute miles, and I'd like to improve the accuracy -- or at least the average accuracy -- of the data. The altitude data is accurate to 1 mile and the time to 1 second.

    As background, the orbital inclination (the angle at which the circular orbit crosses the equator) is 51.6 degrees, so the orbit always trends northeast or southeast, switching when the ISS reaches +/- 51.6 degrees.

    [Image: orbit.png]

    Five minutes of typical data at 15-second intervals looks like this. This segment starts just below the equator (-1.7 degrees latitude)...

    [Image: rawdata.png]

    Obviously the ISS does not really jump from grid-point to grid-point; it orbits in a virtually perfect (albeit inclined) circle.

    How would I go about "smoothing" that data? To simplify things I am going to concentrate on just the latitude (north/south) values. My first idea was to create a new set of "averaged" data, where each value is the half-way point between the adjacent records. For example, the blue cell is the average of the two gray cells...

    [Image: averaged.png]
    There were only 5 values that were not adjusted (green), and the adjustments were all +/- 0.05 degree.

    Repeating the process, again blue is the average of the two grays (which are averages)...

    [Image: repeat.png]

    What seems counter-intuitive to me is that more final values were left unadjusted (compared to the raw data) and all of the adjustments were smaller, yet theoretically the results are more accurate. It seems like there must be a basic flaw in my "averaging" logic.
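
    In code form, the averaging rule looks something like this minimal Python sketch (the latitude readings are made up for illustration):

    ```python
    # Replace each interior reading with the midpoint of its two neighbors.
    raw = [-1.7, -0.9, -0.1, 0.7, 1.4, 2.2, 3.0]   # hypothetical 0.1-degree readings

    def average_pass(values):
        """One smoothing pass: interior points become neighbor midpoints."""
        out = list(values)
        for i in range(1, len(values) - 1):
            out[i] = (values[i - 1] + values[i + 1]) / 2
        return out

    once = average_pass(raw)      # first pass (the "averaged" sheet)
    twice = average_pass(once)    # repeated pass (the "repeat" sheet)
    print(once)
    print(twice)
    ```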
    "Not my circus, not my monkeys."

  • #2
    Data as text rather than images would make it a lot easier to experiment with.

    Have you thought about averaging the differences between sets of rows? For example (see the sketch below):
    The first five increments total 3.9°, so over that block you can interpolate with a uniform increment of 0.78° (or 0.052° per second).
    Similarly, the last five increments total 3.8°, so you can interpolate with an increment of 0.76° (or 0.051° per second).
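
    A minimal sketch of that block interpolation in Python (the readings are hypothetical stand-ins for the real rows):

    ```python
    # Hypothetical 15-second latitude readings at 0.1-degree resolution.
    lat = [-1.7, -0.9, -0.1, 0.7, 1.4, 2.2]      # six rows, five increments

    total = lat[-1] - lat[0]                     # 3.9 degrees over the block
    per_step = total / (len(lat) - 1)            # 0.78 degrees per 15-second step
    per_second = per_step / 15                   # 0.052 degrees per second

    # Rebuild the block with the uniform increment in place of the raw steps.
    smoothed = [lat[0] + i * per_step for i in range(len(lat))]
    print(per_second, smoothed)
    ```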

    • #3
      Howdy, Eric!

      A while back I posted a game called gbFlyZapper where I had a fly move around on the screen. I seem to recall plotting the expected path of the fly based on its previous location/direction. The actual movement was random, but plotting the expected path required calculating and smoothing future points based on past points -- something like that.

      Perhaps an adaptation of that approach would be helpful?

      • #4
        Is there any reason not to use more precise data in the first place?
        This will tell you where the ISS is now:
        https://api.wheretheiss.at/v1/satellites/25544

        If you need its position at some other time then add the UNIX timestamp of the time you're interested in:
        https://api.wheretheiss.at/v1/satell...mps=1617368503

        The returned data looks like this:
        [{"name":"iss","id":25544,"latitude":31.399128468897,"longitude":-0.96143502345036,"altitude":422.68365304452,"velocity":27580.682036677,"visibility":"daylight","footprint":4521.0915600549,"timestamp":1617368503,"daynum":2459307.0428588,"solar_lat":5.139592072705,"solar_lon":345.44774274129,"units":"kilometers"}]

        • #5
          Thanks ALL!

          Hi Stuart,

          > Data text rather than images

          Yeah, I was trying to describe the concept visually, to get the ball rolling. This seems like a "classic problem" that some expert in statistics (or analytics or something) will recognize, and I don't know the right terms to use. But I guess I could upload a big text file if somebody just wanted to hack at it.

          > Have you thought about averaging the differences between sets of rows?

          I... think that's what I'm doing in my little spreadsheets. Do you mean something different? Edit: I see, you mean a 5-photo window, not 3.

          Gary,

          Hmm, I think that the non-randomness of the NASA data provides a big advantage, but I'll ponder your "look ahead" idea.

          Paul,

          That's cool stuff alright, but...

          > Is there any reason not to use more precise data in the first place?

          The "periodic" data in my example is just a simple example of a larger overall puzzle. I'm actually dealing with 2-3 million specific but mostly random time/locations going back to ISS001. Calling an online API that many times -- and processing that many individual result files -- would be extremely time consuming compared to doing some floating-point math in PowerBASIC. Also, I'd like to be able to apply the same type of data-smoothing to the Shuttle era when the project matures. NASA's own algorithm-based APIs don't work for the early shuttle missions.

          However the API you posted will provide me with some great sample data with which to compare my results! Thanks!

          "Not my circus, not my monkeys."

          • #6
            A running average of latitude-longitude coordinates will give a distorted result compared with the best possible.

            For the best result, use the original, coarse, L-L coordinates indicating a point on the sphere and convert them to vectors (I’ll describe that exactly in a moment). Then average the vectors. Then convert the result back to L-L.

            The vector corresponding to a point on a sphere is just the arrow going from the center of the sphere to the point on the sphere’s surface. The point can be indicated by L-L or by its Cartesian coordinates in three dimensions. So you are converting from two numbers (L, L) to three: (x, y, z). The norm, or length, of the vectors will be constant -- the radius of the sphere -- so there are still only two degrees of freedom.

            In other words, convert from spherical coordinates (longitude-latitude) to Cartesian coordinates. See
            duckduckgo.com/?q=spherical+Cartesian+coordinates

            To make things simpler, divide the (x, y, z) coordinates by the norm, so that you are dealing with unit vectors.

            To average (x1, y1, z1) and (x2, y2, z2) first average the corresponding coordinates then divide the resulting vector by its norm to get a unit vector. (The first step won’t in general result in exactly a unit vector.) Then, convert that average unit vector to L-L.

            The L-L to Cartesian conversion:
            First convert latitude and longitude from degrees to radians, then
            x = cos(latitude) * cos(longitude)
            y = cos(latitude) * sin(longitude)
            z = sin(latitude)

            Note that the norm = sqr(x^2 + y^2 + z^2) = sqr(1) = 1.

            From Cartesian to L-L:
            latitude = arcsin(z)
            longitude =? arctan(y / x)
            except that the second needs to be modified so you get the correct range and sign.
            longitude = atan2(y, x). See
            duckduckgo.com/?q=atan2
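
            A minimal round-trip sketch in Python (math.atan2 handles the range-and-sign issue noted above; the sample points are made up):

            ```python
            import math

            def latlon_to_unit(lat_deg, lon_deg):
                """Latitude/longitude in degrees -> unit vector (x, y, z)."""
                lat, lon = math.radians(lat_deg), math.radians(lon_deg)
                return (math.cos(lat) * math.cos(lon),
                        math.cos(lat) * math.sin(lon),
                        math.sin(lat))

            def unit_to_latlon(x, y, z):
                """Unit vector -> latitude/longitude in degrees."""
                return math.degrees(math.asin(z)), math.degrees(math.atan2(y, x))

            def average_latlon(p1, p2):
                """Average two lat/lon points via their unit vectors."""
                v1, v2 = latlon_to_unit(*p1), latlon_to_unit(*p2)
                ax, ay, az = [(a + b) / 2 for a, b in zip(v1, v2)]
                norm = math.sqrt(ax * ax + ay * ay + az * az)  # re-project onto the sphere
                return unit_to_latlon(ax / norm, ay / norm, az / norm)

            print(average_latlon((-1.7, 10.0), (-0.9, 11.2)))  # made-up sample points
            ```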

            • #7
              Originally posted by Eric Pearson View Post
              Yeah, I was trying to describe the concept visually, to get the ball rolling. This seems like a "classic problem" that some expert in statistics (or analytics or something) will recognize, and I don't know the right terms to use.
              I think you will find the term you are looking for is "kriging".
              https://en.wikipedia.org/wiki/Kriging

              In statistics, originally in geostatistics, kriging or Gaussian process regression is a method of interpolation for which the interpolated values are modeled by a Gaussian process governed by prior covariances. Under suitable assumptions on the priors, kriging gives the best linear unbiased prediction of the intermediate values.

              My method of taking the mean of five values was an over-simplification of the concept. (See the section on Linear estimation in the above link)
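
              For a taste of the linear-estimation machinery, here is a bare-bones ordinary-kriging sketch in Python/numpy. The Gaussian covariance model and its range parameter are assumptions for illustration, not fitted values; real use would fit the covariance to the data:

              ```python
              import numpy as np

              def gaussian_cov(h, sill=1.0, rng=60.0):
                  """Assumed Gaussian covariance model C(h) = sill * exp(-(h/rng)^2)."""
                  return sill * np.exp(-(h / rng) ** 2)

              def ordinary_kriging(t, z, t_star, cov=gaussian_cov):
                  """Best linear unbiased estimate of z at t_star from samples (t, z)."""
                  n = len(t)
                  K = np.ones((n + 1, n + 1))               # bordered covariance matrix
                  K[:n, :n] = cov(np.abs(t[:, None] - t[None, :]))
                  K[:n, :n] += 1e-8 * np.eye(n)             # tiny nugget for stability
                  K[n, n] = 0.0                             # unbiasedness constraint
                  k = np.ones(n + 1)
                  k[:n] = cov(np.abs(t - t_star))
                  w = np.linalg.solve(K, k)
                  return w[:n] @ z                          # weights sum to 1

              t = np.arange(0, 300, 15, dtype=float)                   # 15-second samples
              z = np.round(51.6 * np.sin(2 * np.pi * t / 5565.0), 1)   # 0.1-degree data
              print(ordinary_kriging(t, z, 157.5))          # estimate between two samples
              ```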



              • #8
                Yes, there are more sophisticated methods than interpolating linearly. My point above is that whatever method you use, it is better to use it on uniform Cartesian coordinates rather than distorted L-L coordinates.

                • #9
                  Thanks Mark, that's an interesting point. I am doing Great Circle calculations elsewhere in the program, so I understand how it would be better -- more linear -- than averaging latlons.

                  And BINGO Stuart, Kriging is exactly the term I wanted to find. Never heard the word before. Wikipedia even lists Astronomy as one of the fields where it is used, and an orbit is an orbit. Some of the NASA latlon data is highly periodic (as above) but in other places it is more or less random sampling, which is what Kriging is about. Unfortunately... it requires integral calculus. I did well in Differential but hit a wall at Integral; I repeated the class and got the same mediocre grade, at which point I changed my major from pure physics to engineering. Integral is not algorithm-based, so I wouldn't know where to begin coding it.

                  Thanks all!




                  "Not my circus, not my monkeys."

                  • #10
                    Eric, there are some very straightforward interpolation methods in the Numerical Recipes books, based on polynomials or splines. I have found them easy to use in the past for interpolating within essentially smooth data series (the fit passes through the data exactly, with smooth transitions between data points), based on adapting code from the Pascal version of NR -- unfortunately I don't have that to hand right now. NR in Fortran is available as a PDF online, I think...
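
                    For example, scipy's cubic spline gives that pass-through-the-points behaviour in a couple of lines (a sketch with made-up samples, not NR code):

                    ```python
                    import numpy as np
                    from scipy.interpolate import CubicSpline

                    t = np.arange(0, 120, 15, dtype=float)   # 15-second samples
                    lat = np.array([-1.7, -0.9, -0.1, 0.7, 1.4, 2.2, 3.0, 3.8])  # made-up data

                    spline = CubicSpline(t, lat)  # passes through every point, smooth between
                    print(spline(37.5))           # interpolated latitude at t = 37.5 s
                    ```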

                    • #11
                      There's a good chance a sine interpolation might work, such as Lat = LatAmpl * SIN(Time / Period + FI), where LatAmpl, Period, and FI may be found with a least-squares sine fit.
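
                      A sketch of that fit with scipy's nonlinear least squares. Note the model here adds an explicit 2*pi factor so Period comes out in seconds, and the initial guesses are seeded with the known orbit values, since a nonlinear fit needs decent starting points:

                      ```python
                      import numpy as np
                      from scipy.optimize import curve_fit

                      def model(t, lat_ampl, period, fi):
                          """Lat = LatAmpl * sin(2*pi*Time/Period + FI)."""
                          return lat_ampl * np.sin(2.0 * np.pi * t / period + fi)

                      t = np.arange(0, 600, 15, dtype=float)                     # 15-second samples
                      lat = np.round(51.6 * np.sin(2.0 * np.pi * t / 5565.0), 1) # 0.1-degree data

                      (lat_ampl, period, fi), _ = curve_fit(model, t, lat, p0=[51.6, 5565.0, 0.0])
                      print(lat_ampl, period, fi)
                      ```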

                      • #12
                        Eric, thank you for posting a very interesting problem. I learned something today by working on it.

                        My initial instinct was that it is impossible to regain accuracy from data that has had the accuracy removed. Latitude data is good to about 6 decimal places using GPS, so I am sure the ISS location is known to 6 decimal places as well. The data you showed is only 1 decimal place; the extra accuracy is no longer available. My instinct was both right and wrong: it seems to depend on how the accuracy was removed from the data whether or not you can "regain" some of it.

                        What I found was that if the original data is rounded to 1 decimal place, I was able to fit a quadratic to the data and get up to a 4.3x more accurate estimate of the data points. However, if the data was truncated to 1 decimal place, fitting the curve gave maybe a 10% improvement, but I would hesitate to say it was actually better. Here's how the data I tested looks:
                        numpoints to fit: 31
                        Inclination: 51.6 deg
                        orbital period: 5565 sec
                        time increment: 20 sec
                        latitude = 51.6*sin(2*pi*t/5565)
                        | time t (sec) | latitude L (deg) | round(L, 1) (deg) | round(L, 1) - L (deg) | QUAD_FIT(round(L, 1)) (deg) | QUAD_FIT(round(L, 1)) - L (deg) |
                        |---|---|---|---|---|---|
                        | 300 | 48.668191 | 48.7 | 0.031809 | 48.665022 | -0.003169 |
                        | 320 | 48.268653 | 48.3 | 0.031347 | 48.263475 | -0.005179 |
                        | 340 | 47.844504 | 47.8 | -0.044504 | 47.840216 | -0.004288 |
                        | 360 | 47.395960 | 47.4 | 0.004040 | 47.390080 | -0.005880 |
                        | 380 | 46.923249 | 46.9 | -0.023249 | 46.918384 | -0.004865 |
                        | 400 | 46.426613 | 46.4 | -0.026613 | 46.421478 | -0.005134 |
                        | 420 | 45.906305 | 45.9 | -0.006305 | 45.900526 | -0.005779 |
                        | 440 | 45.362589 | 45.4 | 0.037411 | 45.355830 | -0.006760 |
                        | 460 | 44.795745 | 44.8 | 0.004255 | 44.792507 | -0.003238 |
                        | 480 | 44.206059 | 44.2 | -0.006059 | 44.202599 | -0.003460 |
                        | 500 | 43.593834 | 43.6 | 0.006166 | 43.591071 | -0.002763 |
                        | 520 | 42.959381 | 43 | 0.040619 | 42.957923 | -0.001458 |
                        | 540 | 42.303023 | 42.3 | -0.003023 | 42.299757 | -0.003266 |
                        | 560 | 41.625096 | 41.6 | -0.025096 | 41.621994 | -0.003102 |
                        | 580 | 40.925945 | 40.9 | -0.025945 | 40.920983 | -0.004963 |
                        | 600 | 40.205927 | 40.2 | -0.005927 | 40.197785 | -0.008142 |


                        I created a sine with a period of 5565 seconds and an amplitude of +/- 51.6 degrees, sampled at 20-second intervals. The “true” data is the latitude (L). The rounded data was produced with the PowerBASIC ROUND function. The procedure was to take an equal number of points on either side of a particular reading and fit a quadratic to them, then use the best-fit equation to calculate the “improved” result for the center point, in this case t = 400 sec. For instance, when fitting 5 data points to get the estimate for t = 400 sec, I used the points at 360, 380, 400, 420, and 440.
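
                        A sketch of that sliding quadratic fit in Python/numpy for illustration (np.polyfit supplies the least-squares quadratic; this is not the original PowerBASIC code):

                        ```python
                        import numpy as np

                        def quad_window_fit(t, z, half=15):
                            """Refit each interior point from a quadratic over 2*half+1 neighbors."""
                            out = np.array(z, dtype=float)
                            for i in range(half, len(t) - half):
                                window = slice(i - half, i + half + 1)
                                coeffs = np.polyfit(t[window], z[window], 2)  # least-squares quadratic
                                out[i] = np.polyval(coeffs, t[i])
                            return out

                        t = np.arange(0, 1200, 20, dtype=float)            # 20-second samples
                        true_lat = 51.6 * np.sin(2 * np.pi * t / 5565.0)
                        rounded = np.round(true_lat, 1)                    # 0.1-degree data

                        smoothed = quad_window_fit(t, rounded, half=15)    # 31-point window
                        interior = slice(15, -15)                          # points that were refit
                        print(np.std((rounded - true_lat)[interior]),
                              np.std((smoothed - true_lat)[interior]))
                        ```
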
                        Here’s the results for some different number of fitted points:
                        | std dev of ROUNDED data error (deg) | std dev of fit error (deg) | improvement factor | fitting points used |
                        |---|---|---|---|
                        | 0.028 | 0.018 | 1.6 | 5 |
                        | 0.028 | 0.012 | 2.3 | 9 |
                        | 0.028 | 0.0088 | 3.2 | 15 |
                        | 0.028 | 0.0065 | 4.3 | 31 |
                        Going too high in the number of fitted points is no good because the quadratic becomes a poor fit for the sine eventually.
                        Here’s the results for trying to fit the truncated data:
                        | std dev of TRUNCATED data error (deg) | std dev of fit error (deg) | improvement factor | fitting points used |
                        |---|---|---|---|
                        | 0.057 | 0.053 | 1.08 | 5 |
                        | 0.057 | 0.051 | 1.12 | 9 |
                        | 0.057 | 0.05 | 1.14 | 15 |
                        | 0.057 | 0.05 | 1.14 | 31 |
                        Somehow the truncated data loses the ability to recover the curve fit, but the rounded data does not. I definitely did not expect that result.
                        My code to calculate the coefficients for the best-fit quadratic is a bit too ugly for me to post, so if you are interested in trying this approach, send me a PM and I will send it over to you.

                        Larry

                        • #13
                          Somehow the truncated data loses the ability to recover the curve fit, but the rounded data does not. I definitely did not expect that result.
                          Nor did I. That's an interesting point about truncated vs. rounded; I'll have to meditate on that one.

                          Data that is truncated to one digit is wrong (by up to 0.1) only 50% more often than data rounded to one digit, so I can understand losing 50% of the gain, but your results are more like 75%. Let's say the ISS is north of the equator and east of Greenwich, and trending northeast. It seems like truncating the data would shift 50% of the points to the south and 50% of the points to the west, but it doesn't seem like it should change the general shape of the interpolated curve. But maybe that's it: truncating the latitude makes the result 50% less accurate, and truncating the longitude takes another 50% of what's left, for a total loss of 75%.

                          Hmm, do your results mean that I can determine whether the real-world data has been truncated or rounded?
                          "Not my circus, not my monkeys."

                          • #14
                            I realized the issue with truncation: the fitted curve is ALWAYS below the real curve, whereas for rounded data the fitted curve can pass through the real data point. Take this example of rounding vs. truncating to 0 decimal places:

                            |  | value | rounded | truncated |
                            |---|---|---|---|
                            |  | 1.1 | 1 | 1 |
                            |  | 1.9 | 2 | 1 |
                            |  | 1.8 | 2 | 1 |
                            |  | 1.7 | 2 | 1 |
                            |  | 1.2 | 1 | 1 |
                            |  | 1.5 | 2 | 1 |
                            |  | 1.4 | 1 | 1 |
                            |  | 1.3 | 1 | 1 |
                            | average | 1.49 | 1.50 | 1.00 |

                            Because the truncated value is always <= the true value, the fitted curve will always be <= the true value as well.

                            On a little more looking, my results for fitting the truncated data show no improvement over the truncated data itself. For a 31-point fit, the truncated data has an average error of 3.36 miles (~0.05 degrees), and the curve fitted to the truncated data has an average error of 3.35 miles. Compare that to the rounded data, which has an average error of 1.67 miles (~0.025 degrees), while the curve fitted to the rounded data has an average error of 0.37 miles (0.005 degrees).
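
                            That one-sided error is easy to see numerically. A tiny sketch (numpy; np.trunc chops toward zero, which matches truncation for these positive latitudes):

                            ```python
                            import numpy as np

                            t = np.arange(0, 1200, 20, dtype=float)
                            true_lat = 51.6 * np.sin(2 * np.pi * t / 5565.0)

                            rounded = np.round(true_lat, 1)            # error symmetric about zero
                            truncated = np.trunc(true_lat * 10) / 10   # error always <= 0

                            print(np.mean(rounded - true_lat))    # close to zero: no bias to fit out
                            print(np.mean(truncated - true_lat))  # about -0.05: a systematic offset
                            ```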

                            • #15
                              Originally posted by Mark Hunter View Post
                              A running average of Latitude-Longitude will give a distorted result from what is the best possible.

                              For the best result, use the original, coarse, L-L coordinates indicating a point on the sphere and convert them to vectors (I’ll describe that exactly in a moment). Then average the vectors. Then convert the result back to L-L.
                              snip!
                              This is the method I'd use, except that the given Lat & Long are ellipsoidal rather than spherical. (And I don't think I'd normalise to a unit sphere, even if a spherical approximation were acceptable, because the satellites' heights may be involved down the line.) But converting to Earth-centred Cartesian coordinates and then interpolating and/or smoothing is definitely the way to go.
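
                              A sketch of that ellipsoidal (geodetic) to Earth-centred Cartesian conversion in Python, using the standard WGS84 constants; treating the published Lat/Long as WGS84 is an assumption:

                              ```python
                              import math

                              # WGS84 ellipsoid constants.
                              A = 6378137.0                # semi-major axis, metres
                              F = 1.0 / 298.257223563      # flattening
                              E2 = F * (2.0 - F)           # first eccentricity squared

                              def geodetic_to_ecef(lat_deg, lon_deg, h_m):
                                  """Geodetic lat/lon/height -> ECEF x, y, z in metres."""
                                  lat, lon = math.radians(lat_deg), math.radians(lon_deg)
                                  n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)  # prime vertical radius
                                  x = (n + h_m) * math.cos(lat) * math.cos(lon)
                                  y = (n + h_m) * math.cos(lat) * math.sin(lon)
                                  z = (n * (1.0 - E2) + h_m) * math.sin(lat)
                                  return x, y, z

                              print(geodetic_to_ecef(-1.7, 10.0, 420000.0))  # made-up ISS-like point
                              ```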

                              Dan
