[Feature Request] weighted KNN imputation #318

@bvenn

Description

FSharp.Stats already supports KNN imputation via FSharp.Stats.ML.Impute.kNearestImpute. The current implementation takes the k nearest neighbors of an incomplete data point and computes their average at the index of interest. This average then replaces the missing value. I suggest making the following changes/additions:

  • rename the module to Imputation to be consistent within the library
  • add the possibility to define how a missing value is encoded (e.g., 0.0 or nan)
  • add an optional converter function that processes the distance measure. Pearson's correlation coefficient, for example, measures similarity rather than distance, so you have to take the reciprocal before ranking neighbors.
  • add a weighted version in which the averaging can be weighted according to the distance of the nearest neighbors
  • add proper documentation
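
To make the proposal concrete, here is a minimal sketch of how the configurable missing-value encoding, the converter, and the distance weighting could fit together. It is an illustration, not the existing FSharp.Stats API: `weightedKnnImpute`, `euclideanObserved`, and all parameter names are hypothetical, and it uses only the F# core library.

```fsharp
// Hypothetical sketch: none of these names exist in FSharp.Stats.

/// Euclidean distance over the positions that are observed in both vectors.
let euclideanObserved (isMissing: float -> bool) (a: float[]) (b: float[]) =
    Array.zip a b
    |> Array.filter (fun (x, y) -> not (isMissing x) && not (isMissing y))
    |> Array.sumBy (fun (x, y) -> (x - y) ** 2.0)
    |> sqrt

/// Imputes the missing entries of `incomplete` from its k nearest rows in `data`.
/// isMissing : how a missing value is encoded (e.g. nan or 0.0)
/// measure   : raw distance or similarity between two vectors
/// converter : turns the raw measure into a distance (id for true distances)
/// weighting : turns a distance into a non-negative weight
let weightedKnnImpute
    (isMissing: float -> bool)
    (measure: float[] -> float[] -> float)
    (converter: float -> float)
    (weighting: float -> float)
    (k: int)
    (data: float[][])
    (incomplete: float[]) : float[] =
    incomplete
    |> Array.mapi (fun j v ->
        if not (isMissing v) then v
        else
            // Only rows that have a value at index j can contribute.
            let neighbors =
                data
                |> Array.filter (fun row -> not (isMissing row.[j]))
                |> Array.map (fun row -> converter (measure incomplete row), row.[j])
                |> Array.sortBy fst
                |> Array.truncate k
            let weights = neighbors |> Array.map (fst >> weighting)
            let weightSum = Array.sum weights
            if weightSum = 0.0 then v // no usable neighbor: leave the value missing
            else
                let weightedSum =
                    Array.map2 (fun w (_, x) -> w * x) weights neighbors |> Array.sum
                weightedSum / weightSum)

// Usage: missing values encoded as nan, inverse-distance weighting, k = 2.
let isMissing (x: float) = System.Double.IsNaN x

let data =
    [| [| 1.0; 2.0; 3.0 |]
       [| 1.0; 2.0; 5.0 |]
       [| 9.0; 9.0; 9.0 |] |]

let imputed =
    weightedKnnImpute
        isMissing
        (euclideanObserved isMissing)
        id
        (fun d -> 1.0 / (d + 1e-9)) // small offset avoids division by zero
        2
        data
        [| 1.0; 2.0; nan |]
```

In this example the first two rows are equally close to the incomplete point, so they get equal weight and the imputed value is approximately 4.0, the mean of 3.0 and 5.0. Passing `fun _ -> 1.0` as `weighting` gives the plain average described above, and for a similarity measure the `converter` could be the reciprocal mentioned in the list.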

Keywords

  • Local Least Squares
