
Principal Curve Analysis

Under development: I will expand this post over time as opportunity allows.

This is an interesting one. Years ago I followed a strongly related discussion (and spent some time thinking about it).

Some clarification would be much appreciated. I suspect your raw data are not linestrings but GPS locations with timestamps. Please share some of the data or extend the description. Also, please explain how important the actual routes (if present) are for your project. Moreover, it is crucial to know your overall goal (further analysis etc.).

Including streets/routes from map databases, if available, gives valuable information about what each GPS track tries to approximate. Tracks can be treated as smooth functions of time, so timestamps are important. If they are available, we should try to model autocorrelation, since the errors at point locations are most likely not independent.

It might be incorrect to sample points along the linestring objects if actual GPS data (point locations) are available.
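If timestamped point locations are available, they could be read into {sf} directly instead of being sampled from linestrings. The following is only a minimal sketch: the data frame `gps`, its column names and all values are made up for illustration, and WGS 84 (EPSG:4326) is an assumption about the input CRS.

```r
library(sf)

# Hypothetical raw GPS log; column names and values are made up.
gps = data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L),
  time = as.POSIXct('2025-01-01 12:00:00', tz = 'UTC') + c(20, 0, 10, 0, 10),
  lon  = c(13.402, 13.400, 13.401, 13.400, 13.401),
  lat  = c(52.522, 52.520, 52.521, 52.520, 52.521)
)

# Keep the fixes in time order within each track, then build point geometries.
gps = gps[order(gps$id, gps$time), ]
pts = st_as_sf(gps, coords = c('lon', 'lat'), crs = 4326)
```

The result keeps `id` and `time` as attribute columns, so the timestamps stay available for any later modelling step.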

TODO: Further discussion and references

# list of sf-data.frames (linestrings) 
# to sf-data.frame with source-column id
sampled_routes = list(road_gps_1, road_gps_2, road_gps_3, road_gps_4)
d = lapply(sampled_routes, sf::st_cast, to='POINT') 
d = Map(`[<-`, d, 'id', value=seq_along(d)) |> 
  do.call(what='rbind')

TODO: add disadvantages and limitations of the transformation CRS -> 2d-Cartesian -> CRS, and ways to overcome them.
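One possible way to handle the round trip, sketched under assumptions: if the tracks are stored in a geographic CRS, a degree of longitude and a degree of latitude do not cover the same distance, so it may help to project to a metric CRS before extracting the xy-coordinates. The points below are made up, and EPSG:32633 (UTM zone 33N) is an assumption that only fits this made-up area; a different projected CRS will be appropriate for other regions.

```r
library(sf)

# Hypothetical GPS points in WGS 84 (EPSG:4326); coordinates are made up.
pts = st_as_sf(
  data.frame(lon = c(13.40, 13.41, 13.42), lat = c(52.52, 52.53, 52.52)),
  coords = c('lon', 'lat'), crs = 4326
)

# Project to a metric CRS before extracting xy-coordinates.
# EPSG:32633 is an assumption, see the text above.
pts_utm = st_transform(pts, 32633)
xy = st_coordinates(pts_utm)

# Back-transform and measure the round-trip difference in degrees.
pts_back = st_transform(pts_utm, 4326)
round_trip_error = max(abs(st_coordinates(pts_back) - st_coordinates(pts)))
```

Checking `round_trip_error` gives a rough idea of how much the projection step itself contributes, separately from the curve fitting.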

# sf-data.frame to plain xy-coords matrix 
M = 
  d |>
  sf::st_geometry() |> # _$geometry
  unlist(recursive=FALSE, use.names=FALSE) |>
  matrix(ncol=2L, byrow=TRUE, dimnames=list(d$id, c('x', 'y')))

No fine (parameter) tuning yet.

# principal curve analysis 
f1 = 
  M |>
  princurve::principal_curve()
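
# A possible starting point for tuning, shown on simulated data rather than on
# the routes above. Assumption: `df` is handed on through `...` to the
# smoothing-spline smoother, as I read the {princurve} documentation.

```r
library(princurve)

# Simulated noisy arc, not real GPS data.
set.seed(1)
t = seq(0, pi, length.out = 200)
X = cbind(x = cos(t), y = sin(t)) + matrix(rnorm(400, sd = 0.05), ncol = 2)

# Fewer degrees of freedom should give a stiffer curve, more a wigglier one.
f_stiff = principal_curve(X, smoother = 'smooth_spline', df = 4, maxit = 20)
f_flex  = principal_curve(X, smoother = 'smooth_spline', df = 12, maxit = 20)

# Sum of squared distances from the points to each fitted curve.
c(stiff = f_stiff$dist, flexible = f_flex$dist)
```

# Comparing `dist` alone favours flexible curves, so it is probably better
# combined with a visual check of the fitted track.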


# colour palette 
col = 
  d$id |>
  unique() |>
  length() |>
  palette.colors(palette='Set 1')


# xy-plot with averaged track (black)
M = cbind.data.frame(M, id = d$id)  
plot(M$x, M$y, col=col[d$id], pch=20, xlab='x', ylab='y', main='Cartesian')
lapply(unique(d$id), 
       \(i) with(subset(M, id==i), lines(x, y, col=col[i], lty='dashed')))
lines(f1$s[f1$ord, ], lwd=2)
legend("topleft", legend=unique(d$id), col=col, pch=20, lty=2)
# whiskers: connect each point to its projection on the curve (xy-columns only)
princurve::whiskers(as.matrix(M[c('x', 'y')]), f1$s, col = "gray")

What are the reasons motivating a comparison to the actual route/road?

TODO: Back-transformation (and plot)
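Until this part is written up, here is a rough sketch of how the back-transformation could look. It uses simulated stand-ins (`d_demo`, `f_demo`) for `d` and `f1`, and EPSG:32633 is an assumed projected CRS; none of this is taken from the actual data.

```r
library(sf)
library(princurve)

# Simulated stand-ins for `d` and `f1`; EPSG:32633 is an assumed projected CRS.
set.seed(1)
t = seq(0, 1000, length.out = 100)
xy = cbind(x = 500000 + t, y = 5800000 + 50 * sin(t / 150) + rnorm(100, sd = 5))
d_demo = st_as_sf(as.data.frame(xy), coords = c('x', 'y'), crs = 32633)
f_demo = principal_curve(xy)

# Ordered curve points -> one LINESTRING carrying the CRS of the input.
avg_track = st_sfc(st_linestring(unname(f_demo$s[f_demo$ord, ])), crs = st_crs(d_demo))

# Optional: geographic coordinates (WGS 84), e.g. for web maps.
avg_track_wgs84 = st_transform(avg_track, 4326)

# Points (gray) with the averaged track on top.
plot(st_geometry(d_demo), pch = 20, col = 'gray')
plot(avg_track, add = TRUE, lwd = 2)
```

With the real objects, `f1$s[f1$ord, ]` and `sf::st_crs(d)` would take the place of the simulated ones, provided the matrix `M` was extracted in the same CRS as `d`.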

Summary

TODO: ...

Notes

We use {sf} (already in use) and {princurve}.
