Design
The design module provides facilities for nucleic acid design. It is currently based on an artificial immune system algorithm with naive sequence sampling. Future versions will include smarter sampling algorithms and arbitrary optimization drivers (for example, ant colony optimization).
General usage
Basic usage of nimna.design requires two things: a fitness function and a set of constraints on the sequence space to be searched. Both are passed to a DesignEngine, which performs the actual sequence optimization.
Example
We want the following fold and sequence constraint:
[Figure: secondary-structure diagram of the target fold, running from the 5' end to the 3' end, with each position labelled by its IUPAC sequence constraint.]
Let's design such a sequence:
import strformat
import nimna, nimna/design

const
  structure  = "(((((((...)))...(((...))).))))"
  constraint = "NNNNNNNNNNGGCNNNNNAAUGHHHNCCUN"
  population = 100

let opts = settings(temperature = 37.0)

proc fitness(c: Compound): float =
  c.update(opts)
  let
    targetEnergy = c.eval(structure)
    (ensembleEnergy, _) = c.pf
  # We want a sequence for which the target
  # structure dominates the ensemble.
  result = targetEnergy - ensembleEnergy

var engine = newEngine(population, fitness)
engine.pattern = constraint
for idx in 0 ..< 5:
  engine.step(1000)
  # Read the current value from the exported Mutator field; only a setter
  # is documented for mutationProbability.
  engine.mutationProbability = engine.mutator.mutationProb - 0.1
echo fmt"Best candidate is {engine.best.sequence} with score {engine.score}"
Types
Mutator = object
  constraintString*: string
  backgroundProbs*: Table[char, float]
  totalProb*: Table[char, float]
  mutationProb*: float
  consistentProb*: float
  stringLength*: int
  pairConstraints*: seq[tuple[i: int, j: int]]
  freeConstraints*: seq[int]
- Object containing all constraints and parameters used for sequence mutation.
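The pairConstraints and freeConstraints fields suggest that a secondary structure is stored as a list of paired positions and a list of unpaired positions. How nimna fills them is not shown here; the following is a minimal sketch of one way to derive such lists from a dot-bracket string. The helper `parseDotBracket` is hypothetical, and zero-based indexing is an assumption.

```nim
# Hypothetical helper, not part of nimna: splits a dot-bracket string into
# paired positions (i, j) and unpaired positions, using zero-based indices.
proc parseDotBracket(structure: string): tuple[pairs: seq[tuple[i: int, j: int]], free: seq[int]] =
  var stack: seq[int] = @[]
  for idx, ch in structure:
    case ch
    of '(':
      # Remember the opening position until its partner is found.
      stack.add(idx)
    of ')':
      if stack.len == 0:
        raise newException(ValueError, "unbalanced ')' at position " & $idx)
      result.pairs.add((i: stack.pop(), j: idx))
    else:
      # Anything that is not a bracket is treated as unpaired.
      result.free.add(idx)
  if stack.len > 0:
    raise newException(ValueError, "unbalanced '(' at position " & $stack[^1])

when isMainModule:
  let parsed = parseDotBracket("((...)).")
  echo parsed.pairs  # pairs are emitted innermost first
  echo parsed.free
```

Pairs are collected in the order their closing brackets appear, so inner pairs come before the pairs that enclose them.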
DesignEngine = ref object
  population*: seq[Compound]
  populationSize: int
  best*: Compound
  score*: float
  scoringFunction: proc (c: Compound): float
  settings*: Settings
  mutator*: Mutator
- Object containing a population and scoring function for nucleic acid design.
Consts
concreteAlphabet = {'A', 'a', 'C', 'c', 'G', 'g', 'T', 't', 'U', 'u'}
abstractAlphabet = {'N', 'n', 'B', 'b', 'D', 'd', 'H', 'h', 'V', 'v', 'W', 'w', 'S', 's', 'R', 'r', 'Y', 'y'}
skipNucleotides = {
  'B': {'A', 'a'}, 'b': {'A', 'a'},
  'D': {'C', 'c'}, 'd': {'C', 'c'},
  'H': {'G', 'g'}, 'h': {'G', 'g'},
  'R': {'C', 'c', 'T', 't'}, 'r': {'C', 'c', 'T', 't'},
  'S': {'A', 'a', 'T', 't'}, 's': {'A', 'a', 'T', 't'},
  'V': {'T', 't'}, 'v': {'T', 't'},
  'W': {'G', 'g', 'C', 'c'}, 'w': {'G', 'g', 'C', 'c'},
  'Y': {'A', 'a', 'G', 'g'}, 'y': {'A', 'a', 'G', 'g'}
}.toTable
- Table with 16 entries mapping each abstract IUPAC code to the set of nucleotides it excludes. N has no entry.
Procs
proc newEngine(popSize: int; fitness: proc (c: Compound): float): DesignEngine {.raises: [], tags: [].}
- Creates a new DesignEngine with a population of size popSize and a fitness function fitness. For the fitness function, smaller is better.
proc background=(eg: DesignEngine; probs: Table[char, float]) {.raises: [KeyError], tags: [].}
- Sets the background probabilities of nucleotides.
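To illustrate what a table of background probabilities can be used for, here is a small sketch of weighted sampling of a single nucleotide. It is not taken from nimna: `sampleBase` is a hypothetical helper, and treating the values as relative weights that need not sum to 1 is an assumption of this sketch.

```nim
import std/[random, tables]

# Hypothetical helper, not part of nimna: draws one nucleotide with a
# probability proportional to its weight in `probs`.
proc sampleBase(rng: var Rand; probs: Table[char, float]): char =
  var total = 0.0
  for p in probs.values:
    total += p
  if total <= 0.0:
    raise newException(ValueError, "background weights must sum to a positive value")
  # Pick a point in 0.0 .. total and walk the table until it is covered.
  var threshold = rng.rand(total)
  for base, p in probs.pairs:
    if p <= 0.0:
      continue  # bases with zero weight are never drawn
    result = base
    threshold -= p
    if threshold <= 0.0:
      return

when isMainModule:
  var rng = initRand(42)
  let gcRich = {'A': 0.1, 'C': 0.4, 'G': 0.4, 'U': 0.1}.toTable
  var sampled = ""
  for _ in 1 .. 20:
    sampled.add(sampleBase(rng, gcRich))
  echo sampled
```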
proc mutationProbability=(eg: DesignEngine; prob: float) {.raises: [], tags: [].}
- Sets the probability of a base to be mutated at each step.
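The effect of a per-base mutation probability can be sketched as follows. This is an illustration under assumptions, not nimna's implementation: `pointMutate` is a hypothetical helper, the alphabet is fixed to the four RNA bases, and replacements are drawn uniformly.

```nim
import std/random

const bases = ['A', 'C', 'G', 'U']  # assumed RNA alphabet for this sketch

# Hypothetical helper, not part of nimna: every position is resampled
# independently with probability `prob`. A resampled position may receive
# the base it already had.
proc pointMutate(rng: var Rand; source: string; prob: float): string =
  result = source
  for idx in 0 ..< result.len:
    if rng.rand(1.0) < prob:
      result[idx] = rng.sample(bases)

when isMainModule:
  var rng = initRand(7)
  echo pointMutate(rng, "GGGGAAAACCCC", 0.3)
```

With prob = 0.0 the sequence is returned unchanged, which is why lowering the probability over time, as in the example above, narrows the search around the current candidates.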
proc consistentMutationProbability=(eg: DesignEngine; prob: float) {.raises: [], tags: [].}
- Sets the probability of a base pair or unpaired base to be mutated consistently with a set of proposed secondary structures.
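A mutation that is consistent with a structure has to change both partners of a base pair together, so that the two positions can still pair afterwards. The sketch below shows the idea for a single pair. `mutatePair` and `allowedPairs` are hypothetical names, and the choice of Watson-Crick plus G-U wobble pairs is an assumption of this sketch rather than something stated for nimna.

```nim
import std/random

# Assumed set of valid RNA pairs: Watson-Crick pairs plus G-U wobble.
const allowedPairs = [('A', 'U'), ('U', 'A'), ('G', 'C'),
                      ('C', 'G'), ('G', 'U'), ('U', 'G')]

# Hypothetical helper, not part of nimna: replaces positions i and j of
# `sequence` with a randomly chosen pair that can still form a base pair.
proc mutatePair(rng: var Rand; sequence: var string; i, j: int) =
  let (left, right) = rng.sample(allowedPairs)
  sequence[i] = left
  sequence[j] = right

when isMainModule:
  var rng = initRand(3)
  var candidate = "GAAAAAC"   # positions 0 and 6 are meant to pair
  mutatePair(rng, candidate, 0, 6)
  echo candidate
```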
proc pattern=(eg: DesignEngine; pattern: string) {.raises: [], tags: [].}
- Sets a constraint on the bases at each position in the population. The constraint is given as a pattern in IUPAC notation: for example, H stands for any one of A, C or T (U in RNA), and N stands for any base.
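As an illustration of IUPAC patterns, the following sketch checks whether a concrete sequence satisfies a pattern. Both procs are hypothetical helpers that are not part of nimna; the sketch covers only the codes listed in concreteAlphabet and abstractAlphabet plus N, and treating T and U as interchangeable is an assumption.

```nim
import std/strutils

# Hypothetical helper, not part of nimna: the bases an IUPAC code allows.
# T and U are treated as interchangeable (an assumption of this sketch).
proc allowedBases(code: char): set[char] =
  case code.toUpperAscii
  of 'A': result = {'A'}
  of 'C': result = {'C'}
  of 'G': result = {'G'}
  of 'T', 'U': result = {'T', 'U'}
  of 'R': result = {'A', 'G'}
  of 'Y': result = {'C', 'T', 'U'}
  of 'S': result = {'C', 'G'}
  of 'W': result = {'A', 'T', 'U'}
  of 'B': result = {'C', 'G', 'T', 'U'}
  of 'D': result = {'A', 'G', 'T', 'U'}
  of 'H': result = {'A', 'C', 'T', 'U'}
  of 'V': result = {'A', 'C', 'G'}
  of 'N': result = {'A', 'C', 'G', 'T', 'U'}
  else: result = {}  # unknown codes allow nothing

# Hypothetical helper: true if every base of `sequence` is allowed by the
# code at the same position of `pattern`.
proc matchesPattern(sequence, pattern: string): bool =
  if sequence.len != pattern.len:
    return false
  for idx in 0 ..< pattern.len:
    if sequence[idx].toUpperAscii notin allowedBases(pattern[idx]):
      return false
  true

when isMainModule:
  echo matchesPattern("GGCAU", "NGSHU")
  echo matchesPattern("G", "H")
```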
proc addStructure(eg: DesignEngine; structure: string) {.raises: [], tags: [].}
- Adds a structure for consistent mutation to the DesignEngine eg.
proc structure=(eg: DesignEngine; structure: string) {.raises: [], tags: [].}
- Sets a structure for consistent mutation for the DesignEngine eg.
proc mutate(mt: Mutator; source: string = ""): Compound {.raises: [KeyError, Exception], tags: [RootEffect].}
- Returns a mutated Compound derived from source, according to the parameters set in the Mutator mt.
proc populate(eg: DesignEngine) {.raises: [KeyError, Exception], tags: [RootEffect].}
- Populates a DesignEngine eg according to its set properties.
proc eval(eg: DesignEngine) {.raises: [Exception], tags: [RootEffect].}
- Evaluates all members of the population stored in eg and selects the best.
proc step(eg: DesignEngine; iterations: int = 1) {.raises: [KeyError, Exception], tags: [RootEffect].}
- Performs iterations steps of mutation and evaluation on the population in the DesignEngine eg.
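The general shape of a loop that alternates mutation and evaluation can be sketched with plain strings. This is a deliberately simplified toy and not nimna's algorithm: every name in it is hypothetical, each population member is regenerated as a mutated copy of the current best candidate, and the fitness simply counts non-G bases, with smaller values being better.

```nim
import std/random

const bases = ['A', 'C', 'G', 'U']  # assumed RNA alphabet for this sketch

# Toy fitness for this sketch: number of non-G bases (smaller is better).
proc toyFitness(s: string): float =
  for ch in s:
    if ch != 'G':
      result += 1.0

# Hypothetical stand-in for a mutate-and-evaluate step. Each iteration
# replaces every member with a mutated copy of the current best candidate
# and keeps track of the best score seen so far.
proc toyStep(rng: var Rand; population: var seq[string]; best: var string;
             bestScore: var float; prob: float; iterations = 1) =
  for _ in 1 .. iterations:
    for member in population.mitems:
      member = best
      for idx in 0 ..< member.len:
        if rng.rand(1.0) < prob:
          member[idx] = rng.sample(bases)
      let score = toyFitness(member)
      if score < bestScore:
        best = member
        bestScore = score

when isMainModule:
  var rng = initRand(1)
  var population = newSeq[string](10)
  var best = "AAAAAAAA"
  var bestScore = toyFitness(best)
  toyStep(rng, population, best, bestScore, prob = 0.2, iterations = 50)
  echo best, " ", bestScore
```

Because a candidate only replaces the best one when its score is strictly smaller, the best score never gets worse from one iteration to the next.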