Sort Spikes
This analysis groups spikes into clusters based on the similarity of their shapes.
The clustering algorithm used in this analysis is similar to the one used in the clustering step of the SpyKING CIRCUS spike sorting toolbox.
Parameters
Parameter | Description |
---|---|
Percent of Waveforms in Neighborhood | Percent of waveforms to use when calculating the density of data points around each waveform in the PCA projection space. See Algorithm below. |
Maximum Initial Number of Clusters | Maximum initial number of clusters. Similar clusters will be merged later. |
Cluster Merge Threshold | Threshold to be used when merging clusters. See Algorithm below. |
Confidence Level for Outliers | Confidence level (in percent) for detecting outliers. See Algorithm below. |
Create Neurons | An option to create neuron variables for each cluster (waveform variables for each cluster are always created). |
Summary of Numerical Results
The following information is available in the Summary of Numerical Results:
Column | Description |
---|---|
Variable | Variable name. |
XMin | X Axis minimum in the PCA projections space. |
XMax | X Axis maximum in the PCA projections space. |
YMin | Y Axis minimum in the PCA projections space. |
YMax | Y Axis maximum in the PCA projections space. |
Algorithm
The program selects the waveforms that fall within the specified time range and pass the interval filter.
Principal components are calculated using the N selected waveforms of the given waveform variable.
First, the matrix of covariances between waveform points, c[t, s], is calculated:
c[t, s] = covariance between vectors waveform_value[t, *] and waveform_value[s, *], where s, t = 1, ..., number_of_points_in_each_waveform.
Then, the eigenvalues and eigenvectors are calculated for the matrix c[t, s].
The eigenvectors (principal components) are sorted according to their eigenvalues.
The first principal component has the largest eigenvalue.
The analysis graph shows a scatter plot in which x and y are the projections of the selected waveforms onto the first two principal components (a projection is the sum over t of the products waveform_value[t] * principal_component_value[t]).
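The PCA steps above can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the program's own code: the function name `pca_projections` and the N x T array layout are hypothetical, and the waveforms are projected without subtracting the mean, following the sum-of-products description above.

```python
import numpy as np

def pca_projections(waveforms, n_components=2):
    """Project waveforms (N waveforms x T points) onto the leading principal components."""
    waveforms = np.asarray(waveforms, dtype=float)
    # c[t, s]: covariance between waveform points t and s, taken across waveforms.
    # rowvar=False tells NumPy that each column (waveform point) is a variable.
    c = np.cov(waveforms, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(c)
    # Sort so that the first principal component has the largest eigenvalue.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order[:n_components]]  # shape (T, n_components)
    # Projection = sum over t of waveform_value[t] * principal_component_value[t].
    projections = waveforms @ components
    return projections, components, eigenvalues
```

With the default `n_components=2`, the first column of `projections` would be the x coordinate of the scatter plot and the second column the y coordinate.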
The points in the PCA projections space are then used for cluster analysis.
For each point, the mean distance R to the nearest S points is calculated, where
S = Number_of_waveforms * Percent_of_Waveforms_in_Neighborhood / 100
Then, for each data point, the distance D to the nearest point with a lower R (that is, a higher density) is calculated.
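A minimal sketch of the R and D calculations, assuming the PCA projections are available as an N x 2 NumPy array. The function name, the rounding of S, and the handling of the densest point (which has no denser neighbor, so D is set to its largest distance to any other point) are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def density_and_separation(points, percent_in_neighborhood):
    """Return R (mean distance to the S nearest points) and D for each point."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # S = Number_of_waveforms * Percent_of_Waveforms_in_Neighborhood / 100,
    # rounded and clamped to [1, n - 1] (the rounding rule is an assumption).
    s = max(1, min(n - 1, int(round(n * percent_in_neighborhood / 100))))
    # Full pairwise distance matrix: simple, but O(N^2) in memory.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # After sorting each row, column 0 is the zero distance of a point to itself.
    r = np.sort(dist, axis=1)[:, 1:s + 1].mean(axis=1)
    d = np.empty(n)
    for i in range(n):
        denser = r < r[i]  # points with lower R, i.e. higher density
        # Assumption: the densest point gets its largest distance to any point.
        d[i] = dist[i, denser].min() if denser.any() else dist[i].max()
    return r, d
```

For large numbers of waveforms, a spatial index would avoid building the full distance matrix; the brute-force version is used here only to keep the sketch short.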
The intuition of the algorithm is that the cluster centroids should be points with a high density (i.e. low R) that are far apart from other points with higher density (high D).
The M points (where M = Maximum_Initial_Number_of_Clusters) with the highest ratios D/R are considered as initial cluster centroids.
Each point is then assigned to the same cluster as the closest point with a higher density (lower R).
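The centroid selection and assignment steps can be sketched as below. The function name `assign_clusters` is hypothetical, R is assumed to be strictly positive, and two details are assumptions not stated in the text: ties in R are broken by sort order, and if the densest point is not itself a centroid it is attached to the nearest centroid.

```python
import numpy as np

def assign_clusters(points, r, d, max_clusters):
    """Pick the M points with the highest D/R as centroids and label all points."""
    points = np.asarray(points, dtype=float)
    r = np.asarray(r, dtype=float)
    d = np.asarray(d, dtype=float)
    # Initial centroids: the M points with the highest ratios D/R.
    centroids = np.argsort(d / r)[::-1][:max_clusters]
    labels = np.full(len(points), -1)
    labels[centroids] = np.arange(len(centroids))
    # Visit points from densest (lowest R) to sparsest, so that every denser
    # point already carries a label when a sparser point looks it up.
    order = np.argsort(r, kind="stable")
    for rank, i in enumerate(order):
        if labels[i] >= 0:
            continue  # centroids keep their own label
        # Candidates: all denser points, or the centroids for the densest point.
        candidates = order[:rank] if rank > 0 else centroids
        gaps = np.linalg.norm(points[candidates] - points[i], axis=1)
        labels[i] = labels[candidates[np.argmin(gaps)]]
    return labels, centroids
```

Processing points in order of increasing R means each label is resolved in a single pass, because the closest denser point has always been labeled before it is needed.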
Normalized distances Gamma between clusters are calculated according to equation (2) of the publication describing the details of the SpyKING CIRCUS algorithm (see Reference).
Pairs of clusters with Gamma less than Cluster_Merge_Threshold are merged.
For each cluster, the Confidence_Level_for_Outliers percentile P for the R values of all the points in the cluster is calculated using bootstrap.
Data points with R values exceeding P are marked as outliers, and the waveforms corresponding to outliers are assigned as unsorted.
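The outlier step for a single cluster might look like the sketch below. The text does not say how the bootstrap estimate is formed, so the choices here are assumptions: the R values are resampled with replacement, the requested percentile is computed for each resample, and P is taken as the mean of those estimates. The function name, `n_resamples`, and `seed` are hypothetical.

```python
import numpy as np

def mark_outliers(r_values, confidence_level, n_resamples=1000, seed=0):
    """Return a boolean outlier mask and the bootstrap percentile P for one cluster."""
    r_values = np.asarray(r_values, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row is one bootstrap sample drawn with replacement from the R values.
    samples = rng.choice(r_values, size=(n_resamples, r_values.size), replace=True)
    # Assumption: P is the mean of the per-resample percentile estimates.
    p = np.percentile(samples, confidence_level, axis=1).mean()
    # Points whose R exceeds P are the outliers (their waveforms become unsorted).
    return r_values > p, p
```

The function would be called once per cluster, with `confidence_level` set to the Confidence_Level_for_Outliers parameter in percent.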
Reference
Pierre Yger et al. A spike sorting toolbox for up to thousands of electrodes validated with ground truth recordings in vitro and in vivo. eLife. 2018 Mar 20;7:e34518.