Implementation of the classic SingleR marker detection.

Classes
struct	ChooseBlockedOptions
	Options for `choose_blocked()`.

struct	ChooseOptions
	Options for `choose()`.

Functions
template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >
std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > >	choose (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const ChooseOptions &options)

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >
std::vector< std::vector< std::vector< Index_ > > >	choose_index (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const ChooseOptions &options)

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >
std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > >	choose_blocked (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const Block_ *block, const ChooseBlockedOptions &options)

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >
std::vector< std::vector< std::vector< Index_ > > >	choose_blocked_index (const tatami::Matrix< Value_, Index_ > &matrix, const Label_ *label, const Block_ *block, const ChooseBlockedOptions &options)

std::size_t	default_number (std::size_t num_labels)

Detailed Description

Implementation of the classic SingleR marker detection.

Function Documentation

◆ choose()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >

std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > > singler_classic_markers::choose	(	const tatami::Matrix< Value_, Index_ > &	matrix,
		const Label_ *	label,
		const ChooseOptions &	options )

Implements the classic SingleR method for choosing markers from (typically bulk) reference datasets. We assume that we have a matrix of representative expression profiles for each label, typically computed by averaging across all reference profiles for that label. For the comparison between labels \(A\) and \(B\), we define the marker set as the top genes with the largest positive differences in \(A\)'s profile over \(B\). This difference can be interpreted as the log-fold change if the input matrix contains log-expression values.

Template Parameters

Stat_	Floating-point type of the differences between medians.
Value_	Numeric type of matrix values.
Index_	Integer type of matrix row/column indices.
Label_	Integer type of the label identity.

Parameters

matrix	Matrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
label	Pointer to an array of length equal to the number of columns in `matrix`. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
options	Further options.

Returns: Top markers for each pairwise comparison between labels. Given the output, the vector at `output[i][j]` contains the top markers for label `i` over label `j`. Each marker is represented by a pair containing the row index in `matrix` and the difference between medians. Each innermost vector is sorted by the differences between medians. All differences are guaranteed to be positive.
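The description above can be illustrated with a minimal standalone sketch. This is not the library's implementation: it uses plain `std::vector` inputs instead of a `tatami::Matrix`, and the names `median_of` and `sketch_choose`, the explicit `num_labels` and `number` arguments, the descending sort order, and the tie-breaking by row index are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: median of a set of values (NaN if empty).
double median_of(std::vector<double> values) {
    if (values.empty()) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    std::size_t n = values.size(), half = n / 2;
    std::nth_element(values.begin(), values.begin() + half, values.end());
    double upper = values[half];
    if (n % 2 == 1) {
        return upper;
    }
    // Even count: average the two middle values.
    double lower = *std::max_element(values.begin(), values.begin() + half);
    return (lower + upper) / 2;
}

using MarkerList = std::vector<std::pair<int, double> >;

// Hypothetical sketch: rows[g][s] is the value of gene g in sample s.
std::vector<std::vector<MarkerList> > sketch_choose(
    const std::vector<std::vector<double> >& rows,
    const std::vector<int>& labels,
    int num_labels,
    std::size_t number) // assumed: maximum markers per comparison
{
    int ngenes = static_cast<int>(rows.size());

    // medians[l][g] holds the median of gene g across samples with label l.
    std::vector<std::vector<double> > medians(num_labels, std::vector<double>(ngenes));
    for (int g = 0; g < ngenes; ++g) {
        assert(rows[g].size() == labels.size());
        std::vector<std::vector<double> > by_label(num_labels);
        for (std::size_t s = 0; s < labels.size(); ++s) {
            by_label[labels[s]].push_back(rows[g][s]);
        }
        for (int l = 0; l < num_labels; ++l) {
            medians[l][g] = median_of(by_label[l]);
        }
    }

    std::vector<std::vector<MarkerList> > output(num_labels, std::vector<MarkerList>(num_labels));
    for (int i = 0; i < num_labels; ++i) {
        for (int j = 0; j < num_labels; ++j) {
            if (i == j) {
                continue;
            }
            MarkerList& current = output[i][j];
            for (int g = 0; g < ngenes; ++g) {
                double diff = medians[i][g] - medians[j][g];
                if (diff > 0) { // keep positive differences only
                    current.emplace_back(g, diff);
                }
            }
            // Assumed ordering: largest difference first, ties kept in row order.
            std::stable_sort(current.begin(), current.end(),
                [](const std::pair<int, double>& a, const std::pair<int, double>& b) {
                    return a.second > b.second;
                });
            if (current.size() > number) {
                current.resize(number);
            }
        }
    }
    return output;
}
```

With three genes, four samples and labels `{0, 0, 1, 1}`, a gene whose median is higher in label 0 appears only in `output[0][1]`, and a gene with equal medians appears in neither direction.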

◆ choose_blocked()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >

std::vector< std::vector< std::vector< std::pair< Index_, Stat_ > > > > singler_classic_markers::choose_blocked	(	const tatami::Matrix< Value_, Index_ > &	matrix,
		const Label_ *	label,
		const Block_ *	block,
		const ChooseBlockedOptions &	options )

Variant of choose() that handles multiple blocks (e.g., batches) in the reference dataset. Differences between medians are computed within each block and then combined across blocks to obtain a single statistic per gene in each pairwise comparison. The default method is to compute the mean of the per-block differences, but we can also compute the minimum for greater stringency.

Template Parameters

Stat_	Floating-point type of the differences between medians.
Value_	Numeric type of matrix values.
Index_	Integer type of matrix row/column indices.
Label_	Integer type of the label identity.
Block_	Integer type of the block assignment.

Parameters

matrix	Matrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
label	Pointer to an array of length equal to the number of columns in `matrix`. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
block	Pointer to an array of length equal to the number of columns in `matrix`. Each value of the array should specify the block for the corresponding column. Values should lie in \([0, B)\) for \(B\) unique blocks.
options	Further options.

Returns: Top markers for each pairwise comparison between labels. This is equivalent in structure to the return value of choose(), except that the combined difference between medians is reported for each marker.
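To make the combining step concrete, the following sketch reduces the per-block differences for one gene in one pairwise comparison to a single statistic. The names `SketchBlockPolicy` and `combine_block_differences` are hypothetical and do not correspond to fields of `ChooseBlockedOptions`; the sketch also assumes that every block contributes a difference, which may not match how the library treats blocks that lack one of the labels.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical policy names, for illustration only.
enum class SketchBlockPolicy { MEAN, MIN };

// Combine the per-block differences for a single gene and label pair.
double combine_block_differences(const std::vector<double>& per_block, SketchBlockPolicy policy) {
    assert(!per_block.empty());
    if (policy == SketchBlockPolicy::MIN) {
        // Stricter: the combined value is positive only if every block is positive.
        return *std::min_element(per_block.begin(), per_block.end());
    }
    // Mean of the per-block differences.
    double total = std::accumulate(per_block.begin(), per_block.end(), 0.0);
    return total / static_cast<double>(per_block.size());
}
```

Under the minimum, a gene with per-block differences of 2 and -1 gets a combined value of -1 and would not qualify as a marker, whereas the mean (0.5) would still be positive. This is the sense in which the minimum is more stringent.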

◆ choose_blocked_index()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ , typename Block_ >

std::vector< std::vector< std::vector< Index_ > > > singler_classic_markers::choose_blocked_index	(	const tatami::Matrix< Value_, Index_ > &	matrix,
		const Label_ *	label,
		const Block_ *	block,
		const ChooseBlockedOptions &	options )

Variant of choose_blocked() that only reports the indices of the top markers for each pairwise comparison. This can be used directly in singlepp functions.

Template Parameters

Stat_	Floating-point type of the differences between medians.
Value_	Numeric type of matrix values.
Index_	Integer type of matrix row/column indices.
Label_	Integer type of the label identity.
Block_	Integer type of the block assignment.

Parameters

matrix	Matrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
label	Pointer to an array of length equal to the number of columns in `matrix`. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
block	Pointer to an array of length equal to the number of columns in `matrix`. Each value of the array should specify the block for the corresponding column. Values should lie in \([0, B)\) for \(B\) unique blocks.
options	Further options.

Returns: Top markers for each pairwise comparison between labels. This is the same as the output for choose_blocked() except that only the row index is reported in the innermost vector.

◆ choose_index()

template<typename Stat_ = double, typename Value_ , typename Index_ , typename Label_ >

std::vector< std::vector< std::vector< Index_ > > > singler_classic_markers::choose_index	(	const tatami::Matrix< Value_, Index_ > &	matrix,
		const Label_ *	label,
		const ChooseOptions &	options )

Variant of choose() that only reports the indices of the top markers for each pairwise comparison. This can be used directly in singlepp functions.

Template Parameters

Stat_	Floating-point type of the differences between medians.
Value_	Numeric type of matrix values.
Index_	Integer type of matrix row/column indices.
Label_	Integer type of the label identity.

Parameters

matrix	Matrix containing a reference dataset. Each column should correspond to a sample while each row should represent a gene.
label	Pointer to an array of length equal to the number of columns in `matrix`. Each value of the array should specify the label for the corresponding column. Values should lie in \([0, L)\) for \(L\) unique labels.
options	Further options.

Returns: Top markers for each pairwise comparison between labels. This is the same as the output for choose() except that only the row index is reported in the innermost vector.
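The relationship between the two output formats can be sketched as a conversion that drops the statistic from each pair. The function name `strip_statistics` is hypothetical; it only illustrates the shape of the result and is not claimed to be how `choose_index()` is implemented.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: keep only the row index of each (index, statistic) pair.
template<typename Index_, typename Stat_>
std::vector<std::vector<std::vector<Index_> > > strip_statistics(
    const std::vector<std::vector<std::vector<std::pair<Index_, Stat_> > > >& markers)
{
    std::vector<std::vector<std::vector<Index_> > > output(markers.size());
    for (std::size_t i = 0; i < markers.size(); ++i) {
        output[i].resize(markers[i].size());
        for (std::size_t j = 0; j < markers[i].size(); ++j) {
            output[i][j].reserve(markers[i][j].size());
            for (const auto& marker : markers[i][j]) {
                output[i][j].push_back(marker.first); // order is preserved
            }
        }
    }
    return output;
}
```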

◆ default_number()

std::size_t singler_classic_markers::default_number ( std::size_t num_labels )

inline

Default number of markers in choose() and choose_blocked().

The exact expression is defined as \(500 (\frac{2}{3})^{\log_2{L}}\) for \(L\) labels, which steadily decreases the number of markers per comparison as the number of labels increases. This aims to avoid an excessive number of features when dealing with references with many labels. At \(L=0\), the number of markers is set to zero.

Parameters

num_labels	Number of labels in the reference(s).

Returns: An appropriate number of markers for each pairwise comparison.
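As a worked illustration of the expression \(500 (\frac{2}{3})^{\log_2{L}}\), the sketch below evaluates it for a given number of labels. The name `sketch_default_number` is hypothetical, and rounding to the nearest integer is an assumption, since the text above does not say how the real-valued result is converted to `std::size_t`.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical re-derivation of the documented formula; rounding is assumed.
std::size_t sketch_default_number(std::size_t num_labels) {
    if (num_labels == 0) {
        return 0; // documented special case for L = 0
    }
    double exponent = std::log2(static_cast<double>(num_labels));
    double value = 500.0 * std::pow(2.0 / 3.0, exponent);
    return static_cast<std::size_t>(std::round(value));
}
```

Under this rounding assumption, one label gives 500 markers, and each doubling of the number of labels scales the count by two thirds (roughly 333 for two labels and 222 for four).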
