A documentation suggestion for the Inputs section to improve ease of use (#37).

**TL;DR:** Add a hint or reminder to the documentation and the GUI: "Set `st_bin` to `ST_BIN_MS * SAMPLE_RATE` to load files whose spike times are counted in samples instead of seconds, e.g. `kilosort` outputs."

--------------------
Hi, I have a small suggestion about loading `.npy` files into `rastermap`.

In the Inputs section, it says:
> If you have a `spike_times.npy` and `spike_clusters.npy`, create your time-binned data matrix with, where the bin size `st_bin` is in milliseconds (assuming your spike times are in seconds)

In my usage, the two `.npy` files are generated by `kilosort` or `mountainsort`. The spike times in those files are not always in seconds. Instead, they are often counted in samples, i.e. the time in seconds multiplied by the sampling rate. In this case, `st_bin` is no longer in milliseconds as intended.

To see how to correct for this, we can look at the function `io.load_spike_times`:
```python
def load_spike_times(fname, fname_cluid, st_bin=100):
    st = np.load(fname).squeeze()
    clu = np.load(fname_cluid).squeeze()
    spks = csr_array((np.ones(len(st), "uint8"), 
                    (clu, np.floor(st / st_bin * 1000).astype("int"))))
    spks = spks.todense().astype("float32")
    return spks
```
Here a sparse array is created, where the row index is the `cluster_id`, the column index is the spike time **floored to a bin index**, and every stored value is `1`.

For example, assume the spikes before **rounding** are:

cluster_id | spike_time (in seconds) | value
--- | --- | ---
57 | 38.051 | 1
57 | 38.127 | 1

After rounding with `st_bin=1000` (1-second bins), they become:

cluster_id | bin index (1000 ms bins) | value
--- | --- | ---
57 | 38 | 1
57 | 38 | 1
 
**Duplicate entries are summed, so after `todense()` there is a `2` at (57, 38).**
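The binning and summing described above can be sketched in plain Python. This is only an illustration, not the `rastermap` implementation: the helper `bin_counts` is hypothetical, and a `Counter` stands in for the sparse array so that repeated `(cluster_id, bin_index)` pairs accumulate.

```python
import math
from collections import Counter

def bin_counts(spike_times_s, cluster_ids, st_bin_ms=100):
    """Count spikes per (cluster_id, bin_index).

    Hypothetical helper: applies floor(st / st_bin * 1000) to each spike,
    assuming spike times are given in seconds.
    """
    counts = Counter()
    for t, clu in zip(spike_times_s, cluster_ids):
        bin_index = math.floor(t / st_bin_ms * 1000)
        counts[(clu, bin_index)] += 1
    return counts

# The two spikes from the tables above, binned at 1000 ms
counts = bin_counts([38.051, 38.127], [57, 57], st_bin_ms=1000)
print(counts[(57, 38)])  # both spikes fall into bin 38 of cluster 57
```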

Now look at the **key rounding step**:
```python 
np.floor(st / st_bin * 1000)
```
Here `st` is the spike time (**assumed to be in seconds**), `st_bin` is the desired bin size in milliseconds, and the factor `1000` converts the spike time from seconds to milliseconds.

When `st` is no longer in seconds but in samples, `st = ST_IN_S * SAMPLE_RATE`, and the conversion breaks: `st / st_bin * 1000 = ST_IN_S * SAMPLE_RATE / st_bin * 1000`. An extra factor of `SAMPLE_RATE` enters the calculation, so the effective bin size becomes `SAMPLE_RATE` times smaller than expected.
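To make the size of the error concrete, here is a small sketch that applies the same arithmetic to one spike expressed both ways. The 30 kHz sampling rate and the spike at 59.5 s are assumed values chosen for illustration, and `bin_index` is a hypothetical single-spike stand-in for the NumPy expression.

```python
import math

SAMPLE_RATE = 30000  # Hz; assumed value for illustration
ST_BIN_MS = 1000     # desired bin size in milliseconds

def bin_index(st, st_bin):
    # Same arithmetic as np.floor(st / st_bin * 1000), for a single spike
    return math.floor(st / st_bin * 1000)

t_seconds = 59.5
t_samples = int(t_seconds * SAMPLE_RATE)  # the same spike, counted in samples

idx_seconds = bin_index(t_seconds, ST_BIN_MS)  # intended bin index
idx_samples = bin_index(t_samples, ST_BIN_MS)  # index when samples are passed unchanged

print(idx_seconds, idx_samples)
```

With sample-based times, the column index grows by roughly a factor of `SAMPLE_RATE`, so the dense matrix becomes far wider than intended.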

The fix is to multiply `st_bin` by `SAMPLE_RATE` as well, i.e. `st_bin = ST_BIN_MS * SAMPLE_RATE`.
`SAMPLE_RATE` then cancels out, leaving `ST_IN_S / ST_BIN_MS * 1000`, which is what the code originally intended.
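As a quick check of the cancellation, the sketch below bins the same spikes once in seconds with `ST_BIN_MS` and once in samples with the scaled `st_bin`. The 30 kHz rate and the spike times are assumed values, and `bin_index` is again a hypothetical stand-in for the NumPy expression.

```python
import math

SAMPLE_RATE = 30000  # Hz; assumed value for illustration
ST_BIN_MS = 1000     # desired bin size in milliseconds

def bin_index(st, st_bin):
    # Same arithmetic as np.floor(st / st_bin * 1000), for a single spike
    return math.floor(st / st_bin * 1000)

# Scale st_bin by the sampling rate when spike times are in samples
st_bin_corrected = ST_BIN_MS * SAMPLE_RATE

spikes_s = [38.051, 38.127, 59.5]
spikes_samples = [round(t * SAMPLE_RATE) for t in spikes_s]

bins_from_seconds = [bin_index(t, ST_BIN_MS) for t in spikes_s]
bins_from_samples = [bin_index(s, st_bin_corrected) for s in spikes_samples]

print(bins_from_seconds == bins_from_samples)
```

In practice this would mean passing `st_bin=ST_BIN_MS * SAMPLE_RATE` to `load_spike_times` when the spike times are in samples.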

---------------
The code is **correct**, but the documentation could state more prominently that **the spike times in `spike_times.npy` should be in seconds**, to avoid misunderstanding.

As far as I know, issue #33 runs into the same `st_bin` problem when trying to load `kilosort` result files directly into rastermap. I didn't find a hint in the GUI either. Besides, it can be difficult to work out how to change `st_bin` when this problem occurs.

Thus, for ease of use, I suggest adding a hint to both the `README.md` file and the GUI panel: "You may need to set `st_bin` to `ST_BIN_MS * SAMPLE_RATE` to load files whose spike times are counted in samples instead of seconds, e.g. `kilosort` outputs."


