mimicDetector: a pipeline for protein motif mimicry detection in host-pathogen systems
Rich, K. D.; Wasmuth, J. D.
Abstract

Motivation: Molecular mimicry is a widespread strategy used by pathogens to evade the host immune system and manipulate other host cellular processes. Detecting these events, in which pathogen proteins resemble host molecules, is challenging because current bioinformatics tools are limited in sensitivity, specificity, and scalability. These challenges are most pronounced when identifying subtle similarities in short protein fragments.

Results: We present mimicDetector, an optimized bioinformatic pipeline for systematically identifying protein-level molecular mimicry between pathogens and their hosts. mimicDetector builds on existing k-mer-based approaches with three key improvements: (i) improved sensitivity for short-sequence alignments using the PAM30 substitution matrix and tuned BLASTP parameters; (ii) a revised k-mer filtering strategy based on bitscore differences rather than percent identity; and (iii) the removal of overly conservative homologue exclusion steps. Applied to 17 globally important pathogens, mimicDetector identified a broad and biologically plausible set of mimicry candidates, including helminth proteins mimicking components of the human complement system and a Leishmania infantum mimic of Reticulon-4, a regulator of immune cell recruitment.

Availability and implementation: mimicDetector is implemented in Python, is compatible with Unix-based systems, and is freely available at https://github.com/Kayleerich/mimicDetector/.
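
To make the k-mer and bitscore ideas concrete, the following is a minimal sketch and not the mimicDetector implementation. It assumes that pathogen proteins are split into overlapping k-mers, that each k-mer is aligned with BLASTP against a host database and a control database, and that a k-mer is kept when its best host bitscore exceeds its best control bitscore by some margin. The abstract does not state which scores are compared, so that reading, the function names, the k-mer length `k=12`, and the threshold `min_diff=2.0` are all hypothetical. The BLAST+ options shown (`-task blastp-short`, `-matrix PAM30`, `-outfmt 6`) exist, but the tuned parameter values used by the authors are not given here.

```python
from collections import defaultdict


def kmers(seq, k=12):
    """Yield (start, kmer) for every overlapping window of length k.
    k=12 is a hypothetical default, not a value taken from the paper."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def blastp_short_args(query_fasta, db, out_tsv):
    """Build a BLASTP argument list suited to short queries with PAM30.
    The list is only constructed here, not executed; any further tuning
    (e-value, word size, gap costs) is left out because it is unknown."""
    return ["blastp", "-task", "blastp-short", "-matrix", "PAM30",
            "-query", query_fasta, "-db", db,
            "-outfmt", "6", "-out", out_tsv]


def best_bitscores(tsv_lines):
    """Return the best bitscore per query from default tabular output
    (-outfmt 6), where the query ID is column 1 and bitscore is column 12."""
    best = defaultdict(float)
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        qid, bits = fields[0], float(fields[11])
        best[qid] = max(best[qid], bits)
    return dict(best)


def filter_by_bitscore_diff(host_best, control_best, min_diff=2.0):
    """Keep k-mers whose host bitscore beats the control bitscore by at
    least min_diff (hypothetical criterion). K-mers with no control hit
    are treated as having a control bitscore of 0."""
    return sorted(q for q, h in host_best.items()
                  if h - control_best.get(q, 0.0) >= min_diff)
```

Under these assumptions, a k-mer that scores equally well against host and control proteins is discarded as probable general conservation, while one that scores clearly higher against the host is retained as a mimicry candidate. Filtering on a score difference rather than on percent identity means that substitutions are weighted by the scoring matrix instead of being counted as simple mismatches.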