EPCY: Evaluation of Predictive CapabilitY

EPCY is a method used to rank genes (features) according to their potential as predictive (bio)markers, using quantitative data (like gene expression).


Using PyPI

pip install epcy
epcy -h


As in differential expression analyses, EPCY takes two tabulated files as input:

  • a table which describes the comparative design of the experiment

  • a matrix which contains quantitative data for each gene (feature) and sample

Using these two input files, EPCY will evaluate the predictive capacity of each gene individually and return predictive scores, along with their confidence intervals.
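To make the relationship between the two inputs concrete, here is a minimal sketch of how a design table and a quantitative matrix could be paired up gene by gene. The column names (`sample`, `condition`, `id`), the group labels and the values are hypothetical and chosen only for illustration; they are not taken from the EPCY documentation, so check `epcy -h` for the layout the tool actually expects.

```python
import csv
import io

# Hypothetical design table: one row per sample, with its group label.
DESIGN_TSV = "sample\tcondition\ns1\tquery\ns2\tquery\ns3\tref\ns4\tref\n"

# Hypothetical quantitative matrix: one row per gene, one column per sample.
MATRIX_TSV = (
    "id\ts1\ts2\ts3\ts4\n"
    "geneA\t9.1\t8.7\t2.3\t2.9\n"
    "geneB\t5.0\t5.2\t4.9\t5.1\n"
)


def read_design(text):
    """Map each sample name to its group label."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["sample"]: row["condition"] for row in reader}


def read_matrix(text):
    """Map each gene to a {sample: value} dictionary."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        row["id"]: {s: float(v) for s, v in row.items() if s != "id"}
        for row in reader
    }


design = read_design(DESIGN_TSV)
matrix = read_matrix(MATRIX_TSV)


def values_by_group(gene):
    """Split one gene's values according to the design table."""
    groups = {}
    for sample, value in matrix[gene].items():
        groups.setdefault(design[sample], []).append(value)
    return groups
```

Calling `values_by_group("geneA")` groups that gene's values by condition, which is the per-gene view that an individual evaluation of each feature works from.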

To ensure the reliability of predictive scores, EPCY uses leave-one-out cross-validation to train multiple Kernel Density Estimation (KDE) classifiers and evaluate their performance on unseen samples (see method for more details).
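The general idea of leave-one-out evaluation with a KDE classifier can be sketched for a single gene as follows. This is an illustration only, not EPCY's implementation: the Gaussian kernel, the fixed bandwidth, the class-prior weighting and the use of plain accuracy as the score are all assumptions made for this example, and the function names are hypothetical.

```python
import math


def gaussian_kde_density(x, train, bandwidth):
    """Gaussian kernel density estimate at x from the training values."""
    if not train:
        return 0.0
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - t) / bandwidth) ** 2) for t in train)
    return total / (len(train) * norm)


def loo_kde_predictions(values, labels, bandwidth=1.0):
    """Predict each sample's label from KDEs fitted on all other samples."""
    predictions = []
    for i, x in enumerate(values):
        # Hold out sample i; everything else is training data.
        train = [(v, l) for j, (v, l) in enumerate(zip(values, labels)) if j != i]
        pos = [v for v, l in train if l]
        neg = [v for v, l in train if not l]
        # Weight each class density by its frequency in the training set.
        p_pos = gaussian_kde_density(x, pos, bandwidth) * len(pos) / len(train)
        p_neg = gaussian_kde_density(x, neg, bandwidth) * len(neg) / len(train)
        predictions.append(p_pos > p_neg)
    return predictions


def loo_accuracy(values, labels, bandwidth=1.0):
    """Fraction of held-out samples whose label was predicted correctly."""
    predictions = loo_kde_predictions(values, labels, bandwidth)
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)


# Made-up values for one gene: three samples per group, clearly separated.
values = [9.1, 8.7, 9.4, 2.3, 2.9, 2.5]
labels = [True, True, True, False, False, False]
score = loo_accuracy(values, labels)
```

Because every prediction is made for a sample that was excluded from training, the score reflects performance on unseen data rather than the fit to the training set.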


EPCY is a product of the Leucegene project and has been developed and tested specifically to analyse RNA-seq data from acute myeloid leukemia (AML) patients. However, the method implemented in EPCY is generic and should work on other types of quantitative data.


While we are finalizing work on the official paper, more details can be found in a poster presented at ISMB/ECCB 2019:

Audemard E, Sauvé L and Lemieux S. EPCY: Evaluation of Predictive CapabilitY for ranking biomarker gene candidates [version 1; not peer reviewed]. F1000Research 2019, 8(ISCB Comm J):1349(poster)