miniMDS is a tool for inferring and plotting 3D structures using partitioned MDS, a novel approximation to multidimensional scaling (MDS). It produces a single 3D structure from a Hi-C BED file, representing an ensemble average of chromosome conformations within the population of cells. miniMDS processes high-resolution Hi-C data quickly and with limited memory requirements: human genome 3D structures can be inferred at kilobase resolution within several hours on a desktop computer. miniMDS also supports inter-chromosomal structural inference.
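To illustrate the underlying MDS idea only, here is a minimal sketch that places points in 3D so that their pairwise distances approach a target distance matrix. It uses plain gradient descent on the squared-error stress, which is a stand-in technique: it is not partitioned MDS and not miniMDS code. The function names, the learning rate, and the toy distance matrix are all assumptions made for the example, and the conversion from Hi-C contact counts to distances is not shown.

```python
import math


def pairwise_distance(a, b):
    """Euclidean distance between two coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mds_gradient_descent(target, init, lr=0.05, steps=2000):
    """Hypothetical helper: move 3D points so that their pairwise
    distances approach the symmetric matrix `target`.

    Minimizes sum over i<j of (dist(i, j) - target[i][j])**2.
    """
    coords = [list(map(float, p)) for p in init]
    n = len(coords)
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                d = pairwise_distance(coords[i], coords[j])
                if d == 0.0:
                    continue  # coincident points give no direction
                scale = 2.0 * (d - target[i][j]) / d
                for k in range(3):
                    g = scale * (coords[i][k] - coords[j][k])
                    grads[i][k] += g
                    grads[j][k] -= g
        for i in range(n):
            for k in range(3):
                coords[i][k] -= lr * grads[i][k]
    return coords


# Toy input (an assumption): three loci whose target distances
# form a 3-4-5 triangle.
target = [[0, 3, 4],
          [3, 0, 5],
          [4, 5, 0]]
init = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
coords = mds_gradient_descent(target, init)
```

The recovered coordinates are only defined up to rotation, reflection, and translation, so the meaningful check is on the pairwise distances rather than on the coordinates themselves.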
is a framework for characterizing class-discriminative motifs in a collection of genomic loci that have several (overlapping) annotation labels.
MultiGPS is a novel framework for analyzing collections of multi-condition ChIP-seq datasets and characterizing differential binding events between conditions. In analyzing multi-condition ChIP-seq datasets, MultiGPS encourages consistency in the reported binding event locations across conditions and provides accurate estimation of ChIP enrichment levels at each event. The MultiGPS manual and downloads are available here.
GPS & GEM predict protein-DNA interaction events at high spatial resolution from ChIP-seq or ChIP-exo data while retaining the ability to resolve closely spaced events that appear as a single cluster of reads. Both GPS and GEM model observed reads using a complexity-penalized mixture model and efficiently predict event locations with a segmented EM algorithm. GPS was our first-generation software package in this series. GEM extends the concept by linking binding event discovery and motif discovery with an integrated model of ChIP reads and proximal DNA sequences. GEM uses predicted binding events as a positional prior for a novel k-mer-based de novo motif discovery algorithm, and reciprocally improves the resolution of the binding event predictions using the discovered motifs as a positional prior.
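As a rough illustration of how a mixture model can separate closely spaced events, here is a minimal EM sketch over 1D read positions. It makes several simplifying assumptions that differ from the description above: each event is modeled as a Gaussian of fixed width rather than an empirical read distribution, there is no complexity penalty, and the EM is not segmented. The function name and the toy reads are hypothetical; this is not GPS or GEM code.

```python
import math


def em_event_positions(reads, init_positions, sigma=10.0, iterations=50):
    """Hypothetical helper: estimate event positions and mixing weights
    from read positions with a fixed-width Gaussian mixture."""
    positions = [float(p) for p in init_positions]
    weights = [1.0 / len(positions)] * len(positions)
    for _ in range(iterations):
        # E-step: responsibility of each event for each read.
        resp = []
        for r in reads:
            likes = [w * math.exp(-0.5 * ((r - p) / sigma) ** 2)
                     for w, p in zip(weights, positions)]
            total = sum(likes)
            if total == 0.0:
                # Read is far from every event: split it evenly.
                resp.append([1.0 / len(likes)] * len(likes))
            else:
                resp.append([v / total for v in likes])
        # M-step: re-estimate each event's position and weight.
        for k in range(len(positions)):
            mass = sum(row[k] for row in resp)
            if mass > 0.0:
                positions[k] = sum(row[k] * r
                                   for row, r in zip(resp, reads)) / mass
            weights[k] = mass / len(reads)
    return positions, weights


# Toy reads (an assumption): two clusters centered at 100 and 200.
reads = list(range(95, 106)) + list(range(195, 206))
positions, weights = em_event_positions(reads, init_positions=[90, 210])
```

A sparsity-inducing penalty on the mixing weights would additionally drive unsupported events to zero weight, which this sketch omits.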
STAMP is a webserver resource for aligning transcription factor DNA-binding motifs. Input motifs may be aligned against each other using a wide choice of comparison metrics and alignment strategies. A multiple alignment, familial binding profile, and similarity tree are also produced from the set of input motifs. STAMP also matches each of the input motifs against a choice of databases of known TF binding motifs. STAMP allows the input of many different motif formats, including the input of entire output files from a number of supported motif-finders. In this way, STAMP provides a valuable resource for those researchers who wish to interpret their motif-finding results; such users may simply analyze their results using STAMP to see if any of their newly discovered motifs are similar to any known binding preferences.
Code for a command-line version of STAMP is available here: https://github.com/seqcode/stamp
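To make the idea of motif alignment concrete, the sketch below scores two motifs column by column and slides one along the other to find the best ungapped offset. Pearson correlation is used as one example of a column comparison metric. The function names, the `min_overlap` parameter, and the toy motifs are assumptions for illustration; this is not STAMP's implementation, and it covers neither gapped alignment nor database matching.

```python
import math


def pearson(col_a, col_b):
    """Pearson correlation between two motif columns (A, C, G, T)."""
    mean_a = sum(col_a) / len(col_a)
    mean_b = sum(col_b) / len(col_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a flat column carries no preference
    return cov / math.sqrt(var_a * var_b)


def best_ungapped_alignment(motif_a, motif_b, min_overlap=2):
    """Hypothetical helper: slide motif_b along motif_a and return
    (score, offset) for the highest summed column correlation."""
    best = None
    first = -(len(motif_b) - min_overlap)
    last = len(motif_a) - min_overlap
    for offset in range(first, last + 1):
        score = 0.0
        overlap = 0
        for i, col in enumerate(motif_b):
            j = i + offset
            if 0 <= j < len(motif_a):
                score += pearson(motif_a[j], col)
                overlap += 1
        if overlap >= min_overlap and (best is None or score > best[0]):
            best = (score, offset)
    return best


# Toy motifs (an assumption): columns are frequencies of A, C, G, T.
A = [0.97, 0.01, 0.01, 0.01]
C = [0.01, 0.97, 0.01, 0.01]
G = [0.01, 0.01, 0.97, 0.01]
T = [0.01, 0.01, 0.01, 0.97]
score, offset = best_ungapped_alignment([A, C, G, T], [C, G])
```

Here the two-column motif aligns best one position into the longer motif, where both columns match.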
SOMBRERO is a motif-finder that is based on the Self-Organizing Map neural network algorithm. In contrast to other probabilistic motif discovery tools, SOMBRERO poses motif-finding as a clustering problem. As such, SOMBRERO simultaneously estimates all motif signals in the input sequences (regulatory signals are separated from others during post-processing), as opposed to estimating each significant signal one-by-one. This clustering approach to motif-finding is undoubtedly more computationally costly than more traditional approaches. However, the great advantage of the approach is that multiple instances of prior knowledge may be used to initialize the motif-search. Prior knowledge of the implanted motif has been shown to significantly improve the accuracy of motif-finders. Of course, in typical de novo motif searches, we do not know what type of signal we are looking for. Traditional motif-finders may only incorporate one prior at a time, so the application of priors to motif-finding has been limited to those rare cases where certain motif signals are expected. SOMBRERO is the first motif-finder that can incorporate knowledge of all known motifs at the start of the motif search.
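For readers unfamiliar with Self-Organizing Maps, the sketch below shows a single generic training step: find the node closest to an input, then pull that node and its grid neighbors toward the input. It works on plain numeric vectors arranged on a 1D grid, which is an assumption made for brevity; it does not reproduce how SOMBRERO represents motifs or sequence windows. In this picture, supplying prior knowledge corresponds to choosing the initial contents of `nodes`. All names and parameter values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def som_step(nodes, sample, lr=0.5, radius=1):
    """Hypothetical helper: one SOM update on a 1D grid of nodes.

    Updates `nodes` in place and returns the index of the
    best-matching unit (BMU).
    """
    bmu = min(range(len(nodes)),
              key=lambda i: squared_distance(nodes[i], sample))
    for i, node in enumerate(nodes):
        grid_dist = abs(i - bmu)
        if grid_dist <= radius:
            # Neighbors move less than the BMU itself.
            influence = lr / (1 + grid_dist)
            for k in range(len(node)):
                node[k] += influence * (sample[k] - node[k])
    return bmu


# Toy grid (an assumption): three nodes, one input vector.
nodes = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
bmu = som_step(nodes, sample=[1.2, 1.2])
```

Repeating this step over many inputs, usually with a shrinking learning rate and radius, is what lets all nodes, and therefore all clusters, be estimated at the same time.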
RescueNet uses the Self-Organizing Map neural network algorithm for codon usage analysis and gene prediction. In its gene prediction functionality, RescueNet can estimate multiple models of gene codon usage properties during training. This offers advantageous gene-finding performance in cases where a diverse range of codon usage patterns is displayed. Examples include metagenomic datasets and prokaryotic genomes where mutational pressure, translational efficiency and horizontal gene transfer have diversified the displayed codon usage patterns.
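As a small illustration of what a codon usage pattern is, the sketch below computes relative codon frequencies for a single in-frame coding sequence. The function name and the toy sequence are assumptions; how RescueNet actually encodes codon usage for training is not shown here.

```python
from collections import Counter


def codon_usage(sequence):
    """Hypothetical helper: relative frequency of each codon in an
    in-frame coding sequence. Trailing partial codons are ignored."""
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence (an assumption): ATG AAA AAA TAA
usage = codon_usage("atgaaaaaataa")
```

Frequency vectors like this one, computed per gene, are the kind of input on which several distinct usage models could be fitted.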