step_geodist {recipes} | R Documentation |
step_geodist creates a specification of a recipe step that will calculate the distance from points on a map to a reference location.
step_geodist(
  recipe,
  lat = NULL,
  lon = NULL,
  role = "predictor",
  trained = FALSE,
  ref_lat = NULL,
  ref_lon = NULL,
  is_lat_lon = TRUE,
  log = FALSE,
  name = "geo_dist",
  columns = NULL,
  skip = FALSE,
  id = rand_id("geodist")
)
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
lon, lat |
Selector functions to choose which variables are
used by the step. See |
role |
For model terms created by this step, what analysis role should they be assigned? By default, the new columns created by this step from the original variables will be used as predictors in a model. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
ref_lon, ref_lat |
Single numeric values for the location of the reference point. |
is_lat_lon |
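A logical: Are coordinates in latitude and longitude? If TRUE, the Haversine formula is used and the result is in meters. If FALSE, the Pythagorean theorem is used to calculate Euclidean distances. |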
log |
A logical: should the distance be transformed by the natural log function? |
name |
A single character value to use for the new predictor column. If a column exists with this name, an error is issued. |
columns |
A character string of variable names that will
be populated (eventually) by the |
skip |
A logical. Should the step be skipped when the recipe is baked by bake()? |
id |
A character string that is unique to this step to identify it. |
step_geodist uses the Pythagorean theorem to calculate Euclidean distances if is_lat_lon is FALSE. If is_lat_lon is TRUE, the Haversine formula is used to calculate the great-circle distance in meters.
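The two distance calculations described above can be sketched in base R. This is an illustration only, not the code recipes runs internally: the helper names haversine_m and euclidean_dist are hypothetical, and the Earth radius of 6,371,000 m is an assumed mean value that may differ from the constant the package uses.

```r
# Hypothetical helper (not part of recipes): great-circle distance in meters
# via the Haversine formula. Assumes a spherical Earth; `radius` is an
# assumed mean radius, not necessarily the constant recipes uses.
haversine_m <- function(lat, lon, ref_lat, ref_lon, radius = 6371000) {
  to_rad <- pi / 180
  phi1 <- lat * to_rad
  phi2 <- ref_lat * to_rad
  dphi <- (ref_lat - lat) * to_rad
  dlambda <- (ref_lon - lon) * to_rad
  a <- sin(dphi / 2)^2 + cos(phi1) * cos(phi2) * sin(dlambda / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: planar Euclidean distance (Pythagorean theorem),
# in whatever units the input coordinates use.
euclidean_dist <- function(x, y, ref_x, ref_y) {
  sqrt((x - ref_x)^2 + (y - ref_y)^2)
}

# One degree of latitude along a meridian, in meters
one_degree <- haversine_m(0, 0, 1, 0)

# A 3-4-5 right triangle in planar coordinates
planar <- euclidean_dist(3, 4, 0, 0)
```

Both helpers are vectorized over lat and lon, so they can be applied to whole columns against a single reference point, which mirrors how the step compares each row to ref_lat and ref_lon.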
An updated version of recipe with the new step added to the sequence of any existing operations.
When you tidy() this step, a tibble with columns echoing the values of lat, lon, ref_lat, ref_lon, is_lat_lon, name, and id is returned.
The underlying operation does not allow for case weights.
https://en.wikipedia.org/wiki/Haversine_formula
Other multivariate transformation steps: step_classdist(), step_depth(), step_ica(), step_isomap(), step_kpca_poly(), step_kpca_rbf(), step_kpca(), step_mutate_at(), step_nnmf_sparse(), step_nnmf(), step_pca(), step_pls(), step_ratio(), step_spatialsign()
library(recipes)
library(dplyr) # for arrange()

data(Smithsonian, package = "modeldata")

# How close are the museums to Union Station?
near_station <- recipe(~., data = Smithsonian) %>%
  update_role(name, new_role = "location") %>%
  step_geodist(
    lat = latitude, lon = longitude, log = FALSE,
    ref_lat = 38.8986312, ref_lon = -77.0062457,
    is_lat_lon = TRUE
  ) %>%
  prep(training = Smithsonian)

# Museums sorted from nearest to farthest
bake(near_station, new_data = NULL) %>%
  arrange(geo_dist)

tidy(near_station, number = 1)