singler_classic_markers
Classic marker detection for SingleR
singler_classic_markers Namespace Reference

Implementation of the classic SingleR marker detection.

Classes

struct  ChooseBlockedOptions
 Options for choose_blocked().
 
struct  ChooseOptions
 Options for choose().
 

Functions

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >
std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > > choose (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const ChooseOptions &options)
 
template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >
std::vector< std::vector< std::vector< Index_ > > > choose_index (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const ChooseOptions &options)
 
template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >
std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > > choose_blocked (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const Block_ *block, const ChooseBlockedOptions &options)
 
template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >
std::vector< std::vector< std::vector< Index_ > > > choose_blocked_index (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const Block_ *block, const ChooseBlockedOptions &options)
 
std::size_t default_number (std::size_t num_labels)
 

Detailed Description

Implementation of the classic SingleR marker detection.

Function Documentation

◆ choose()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >
std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > > singler_classic_markers::choose ( const tatami::Matrix< Value_, Index_ > & matrix,
const Label_ * label,
const ChooseOptions & options )

Implements the classic SingleR method for choosing markers from (typically bulk) reference datasets. We assume that we have a matrix of representative expression profiles for each label, typically computed by averaging across all reference profiles for that label. For the comparison between labels \(A\) and \(B\), we define the marker set as the top genes with the largest positive differences in \(A\)'s profile over \(B\)'s profile. Each difference can be interpreted as a log-fold change if the input matrix contains log-expression values.

Template Parameters
Stat_Floating-point type of the differences between medians.
Value_Numeric type of matrix values.
Index_Integer type of matrix row/column indices.
Label_Integer type of the label identity.
Parameters
matrixMatrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
labelPointer to an array of length equal to the number of columns in matrix. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
optionsFurther options.
Returns
Top markers for each pairwise comparison between labels. For a return value named output, the vector at output[i][j] contains the top markers for label i over label j. Each marker is represented by a pair containing the row index in matrix and the difference between medians. Each innermost vector is sorted by the differences between medians. All differences are guaranteed to be positive.
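The selection described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: it uses plain std::vector inputs instead of tatami::Matrix, and the names sketch_median and sketch_choose are hypothetical. It assumes that the representative profile for each label is the per-gene median (following the "differences between medians" wording) and that markers are ordered from the largest difference to the smallest, which the text above does not state explicitly.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of singler_classic_markers): median of a copy of the values.
inline double sketch_median(std::vector<double> values) {
    if (values.empty()) {
        return 0.0; // placeholder for labels without samples; an assumption of this sketch
    }
    std::sort(values.begin(), values.end());
    const std::size_t mid = values.size() / 2;
    return (values.size() % 2 == 1) ? values[mid] : (values[mid - 1] + values[mid]) / 2.0;
}

// Hypothetical sketch: genes[g][s] is the expression of gene g in sample s,
// label[s] lies in [0, num_labels), and top is the number of markers to keep.
inline std::vector<std::vector<std::vector<std::pair<int, double> > > > sketch_choose(
    const std::vector<std::vector<double> >& genes,
    const std::vector<int>& label,
    int num_labels,
    std::size_t top)
{
    const std::size_t ngenes = genes.size();

    // Step 1: one representative (median) profile per label.
    std::vector<std::vector<double> > profile(num_labels, std::vector<double>(ngenes));
    for (std::size_t g = 0; g < ngenes; ++g) {
        std::vector<std::vector<double> > by_label(num_labels);
        for (std::size_t s = 0; s < label.size(); ++s) {
            by_label[label[s]].push_back(genes[g][s]);
        }
        for (int l = 0; l < num_labels; ++l) {
            profile[l][g] = sketch_median(by_label[l]);
        }
    }

    // Step 2: for each ordered pair (i, j), keep genes with a positive difference.
    std::vector<std::vector<std::vector<std::pair<int, double> > > > output(
        num_labels, std::vector<std::vector<std::pair<int, double> > >(num_labels));
    for (int i = 0; i < num_labels; ++i) {
        for (int j = 0; j < num_labels; ++j) {
            if (i == j) {
                continue;
            }
            auto& current = output[i][j];
            for (std::size_t g = 0; g < ngenes; ++g) {
                const double diff = profile[i][g] - profile[j][g];
                if (diff > 0) {
                    current.emplace_back(static_cast<int>(g), diff);
                }
            }
            // Step 3: largest differences first (the ordering is an assumption), then truncate.
            std::sort(current.begin(), current.end(),
                [](const std::pair<int, double>& a, const std::pair<int, double>& b) {
                    return a.second > b.second;
                });
            if (current.size() > top) {
                current.resize(top);
            }
        }
    }
    return output;
}
```

With three genes whose median profiles are (1, 4, 0) for label 0 and (5, 2, 3) for label 1, the sketch reports gene 1 as the only marker for label 0 over label 1, and genes 0 and 2 (in that order) for label 1 over label 0.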

◆ choose_blocked()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >
std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > > singler_classic_markers::choose_blocked ( const tatami::Matrix< Value_, Index_ > & matrix,
const Label_ * label,
const Block_ * block,
const ChooseBlockedOptions & options )

Variant of choose() that handles multiple blocks (e.g., batches) in the reference dataset. Differences between medians are computed within each block and then combined across blocks to obtain a single statistic per gene in each pairwise comparison. The default method is to compute the mean of the per-block differences, but we can also compute the minimum for greater stringency.

Template Parameters
Stat_Floating-point type of the differences between medians.
Value_Numeric type of matrix values.
Index_Integer type of matrix row/column indices.
Label_Integer type of the label identity.
Block_Integer type of the block assignment.
Parameters
matrixMatrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
labelPointer to an array of length equal to the number of columns in matrix. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
blockPointer to an array of length equal to the number of columns in matrix. Each value of the array should specify the block for the corresponding column. Values should lie in \([0, B)\) for \(B\) unique blocks.
optionsFurther options.
Returns
Top markers for each pairwise comparison between labels. This is equivalent in structure to the return value of choose(), except that the combined difference between medians is reported for each marker.
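The combining step for a single gene in a single pairwise comparison might look like the following sketch. The SketchCombine enum and sketch_combine function are hypothetical names introduced for illustration; the actual option that selects between the mean and the minimum is not shown on this page, and the sketch assumes that every block contributes a difference.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical enum for this sketch; it is not the library's option type.
enum class SketchCombine { MEAN, MIN };

// Combines the per-block differences for one gene in one pairwise comparison.
// Assumes per_block is non-empty, i.e., at least one block contains both labels.
inline double sketch_combine(const std::vector<double>& per_block, SketchCombine how) {
    if (how == SketchCombine::MIN) {
        // The minimum is only large if the gene is upregulated in every block.
        return *std::min_element(per_block.begin(), per_block.end());
    }
    const double total = std::accumulate(per_block.begin(), per_block.end(), 0.0);
    return total / static_cast<double>(per_block.size());
}
```

For per-block differences of 3 and -1, the mean is 1 while the minimum is -1. Under the positive-difference requirement described for choose(), such a gene could still qualify as a marker with the mean but not with the minimum, which is the sense in which the minimum is more stringent.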

◆ choose_blocked_index()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >
std::vector< std::vector< std::vector< Index_ > > > singler_classic_markers::choose_blocked_index ( const tatami::Matrix< Value_, Index_ > & matrix,
const Label_ * label,
const Block_ * block,
const ChooseBlockedOptions & options )

Variant of choose_blocked() that only reports the indices of the top markers for each pairwise comparison. This can be used directly in singlepp functions.

Template Parameters
Stat_Floating-point type of the differences between medians.
Value_Numeric type of matrix values.
Index_Integer type of matrix row/column indices.
Label_Integer type of the label identity.
Block_Integer type of the block assignment.
Parameters
matrixMatrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
labelPointer to an array of length equal to the number of columns in matrix. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
blockPointer to an array of length equal to the number of columns in matrix. Each value of the array should specify the block for the corresponding column. Values should lie in \([0, B)\) for \(B\) unique blocks.
optionsFurther options.
Returns
Top markers for each pairwise comparison between labels. This is the same as the output for choose_blocked() except that only the row index is reported in the innermost vector.

◆ choose_index()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >
std::vector< std::vector< std::vector< Index_ > > > singler_classic_markers::choose_index ( const tatami::Matrix< Value_, Index_ > & matrix,
const Label_ * label,
const ChooseOptions & options )

Variant of choose() that only reports the indices of the top markers for each pairwise comparison. This can be used directly in singlepp functions.

Template Parameters
Stat_Floating-point type of the differences between medians.
Value_Numeric type of matrix values.
Index_Integer type of matrix row/column indices.
Label_Integer type of the label identity.
Parameters
matrixMatrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
labelPointer to an array of length equal to the number of columns in matrix. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
optionsFurther options.
Returns
Top markers for each pairwise comparison between labels. This is the same as the output for choose() except that only the row index is reported in the innermost vector.
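The relationship between the pair-based output and the index-only output can be sketched as a simple projection that drops the statistic from each pair. The function name sketch_strip_statistics is hypothetical, and the library may well produce the indices directly rather than converting an existing result.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: keeps only the row index of each (index, statistic) pair,
// preserving the order of markers within each pairwise comparison.
template<typename Index_, typename Stat_>
std::vector<std::vector<std::vector<Index_> > > sketch_strip_statistics(
    const std::vector<std::vector<std::vector<std::pair<Index_, Stat_> > > >& markers)
{
    std::vector<std::vector<std::vector<Index_> > > output(markers.size());
    for (std::size_t i = 0; i < markers.size(); ++i) {
        output[i].resize(markers[i].size());
        for (std::size_t j = 0; j < markers[i].size(); ++j) {
            output[i][j].reserve(markers[i][j].size());
            for (const auto& marker : markers[i][j]) {
                output[i][j].push_back(marker.first);
            }
        }
    }
    return output;
}
```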

◆ default_number()

std::size_t singler_classic_markers::default_number ( std::size_t num_labels)
inline

Default number of markers in choose() and choose_blocked().

The exact expression is defined as \(500 (\frac{2}{3})^{\log_2{L}}\) for \(L\) labels, which steadily decreases the number of markers per comparison as the number of labels increases. This aims to avoid an excessive number of features when dealing with references with many labels. At \(L=0\), the number of markers is set to zero.

Parameters
num_labelsNumber of labels in the reference(s).
Returns
An appropriate number of markers for each pairwise comparison.
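To make the formula concrete, the following sketch evaluates \(500 (\frac{2}{3})^{\log_2{L}}\) directly. The name sketch_default_number is hypothetical, and rounding to the nearest integer is an assumption, as the text above does not say how the fractional value is converted to std::size_t.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical re-derivation of the documented formula, not the library's code.
inline std::size_t sketch_default_number(std::size_t num_labels) {
    if (num_labels == 0) {
        return 0; // documented special case for L = 0
    }
    const double exponent = std::log2(static_cast<double>(num_labels));
    const double value = 500.0 * std::pow(2.0 / 3.0, exponent);
    return static_cast<std::size_t>(std::round(value)); // rounding is an assumption
}
```

Under this rounding assumption, the formula gives 500 markers for 1 label, 333 for 2 labels, 222 for 4 labels and 148 for 8 labels, so each doubling of the number of labels cuts the markers per comparison by a third.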