# Nasv **Repository Path**: fcnd007/Nasv ## Basic Information - **Project Name**: Nasv - **Description**: No description available - **Primary Language**: Python - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-09-08 - **Last Updated**: 2020-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # 1.Preparation Please install python3 and following module first: scipy,numpy,configparser and PyVcf. # 2.Pre-processing ## 2.1 LAST Installation NaSV was built based on LAST mapping data.LAST mapping LAST is able to map the reads in non-overlapping split segments.LAST can be download as the zip file from http://last.cbrc.jp/, and installed by following comands: ``` > gunzip last.zip > cd /path/to/lastdir > make ``` ## 2.2 Indexing Reference Genome First you need to index your reference genome by creating a lastal database: ``` > lastdb [referencedb] [reference.fa] ``` ## 2.3 Mapping Train LAST to get the best scoring parameters for your particular alignment. We typically use a subset of > 10,000 reads for this step: ``` > last-train -Q1 [referencedb] [reads_sample.fastq] > [my-params] ``` Map your fastq data to reference: ``` > lastal -Q1 -p [my-params] [referencedb] [reads.fastq] | last-split > [reads.maf] ``` All of the above commands can also be run at once using pipes: ``` > lastal -Q1 -p [my-params] [referencedb] [reads.fastq] | \ > last-split > [reads.maf] ``` # 3.SV calling using Nasv ``` NaSV usage > NaSV.py [-h] [-c CONFIG] [-o OUTPUT] [-i reads.maf] NaSV arguments and parameters: required arguments: -i, --maf : /path/to/reads.maf -o, --output : Give the full path to the output vcf file optional arguments: -h, --help : Show the help message and exit -c, --config : Give the full path to your own ini file [ default: config.ini ] ``` optional configuration parameters: Nasv uses a config.ini file which contains default settings for all running parameters. Users can change the parameters by creating their own config.ini file and provide this as a command line argument [-c] ``` **Reads segments options:** **[Filter options]** *Minimum mapping quality of the segment* mq_cutoff=10 *Maximum percentage of unaligned first bases for a qualified read* start_cutoff=0.2 **Parameters for tuning detection and clustering of breakpoints:** **[Detection options]** *Minimum number of breakpoint-junctions (i.e. split-read junctions) for clustering* read_cutoff=5 *Maximum distritution distance for a breakend position (assuming normal distribution)* merge_bp_cutoff=100 *Maximum distance for two breakend positions decided as homology* merge_cutoff=20 **Parameters for setting the FILTER flag in the vcf output:** **[Output filter options]** *Minimum reads percentage for a breakpoint decided as precise* precise_cutoff=0.5 ```