Pipeline for data clean-up
Getting the data
- Data were copied from Hollie’s disk to KITT: /RAID_STORAGE2/Shared_Data/20190819_RAD_EPIRAD/30-233732769/
- I then copied the data from there to my directory: tejashree/Moorea/raw_data/
Barcode files
- Barcode files were made from the CSV file on Google Drive: Moorea_2018_Sampling Adapter/Index Map
- Barcode files for renaming STACKS output files for dDocent: dDocent expects files named loc_sample.fq.gz, so a barcodes file is needed to rename the .fq.gz files. These barcode files were generated as before, but with the location name added from the shared Google Drive CSV Moorea_2018_Sampling. An acronym for each location was made as follows:
- West Opunohu Backreef: WOB
- East Opunohu Backreef: EOB
- Public Beach Fringe: PBF
- West Opunohu Fringe: WOF
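A minimal sketch of how such a barcodes file could be built. It is not the script actually used here: the CSV layout (columns sample, location, barcode1, barcode2) and the function name make_barcodes are assumptions for illustration, since the real sheet layout is not shown. It maps each location name to its acronym and writes tab-separated lines of barcode1, barcode2 and LOC_sample.

```shell
#!/bin/sh
# Sketch: build a barcodes file whose third column is LOC_sample.
# ASSUMPTION: the input CSV has a header row and the hypothetical
# columns sample,location,barcode1,barcode2 (no quoted commas).

make_barcodes() {
  # $1 = input CSV; the tab-separated barcodes file goes to stdout
  awk -F',' '
    BEGIN {
      OFS = "\t"
      loc["West Opunohu Backreef"] = "WOB"
      loc["East Opunohu Backreef"] = "EOB"
      loc["Public Beach Fringe"]   = "PBF"
      loc["West Opunohu Fringe"]   = "WOF"
    }
    NR == 1 { next }                 # skip the header row
    !($2 in loc) {                   # report locations with no acronym
      print "unknown location: " $2 > "/dev/stderr"
      next
    }
    { print $3, $4, loc[$2] "_" $1 } # barcode1, barcode2, LOC_sample
  ' "$1"
}
```

With a CSV exported to a placeholder file such as sampling.csv, this could be called as `make_barcodes sampling.csv > barcodes_ddocent.txt`. Rows whose location has no acronym are reported on stderr and skipped rather than silently misnamed.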
STACKS
Script used: stacks.sh (on GitHub)
- Clone filter (clone_filter): This step removes PCR clones by reducing each set of identical oligos to a single representative in the output.
- process_radtags: This program examines raw reads from an Illumina sequencing run. It first checks that the barcode and the RAD cut site are intact, then demultiplexes the data.
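The two steps above can be sketched as a dry run. This is not the content of stacks.sh. The helper function names are hypothetical, every path and the enzyme are placeholders supplied by the caller, and `--inline_inline` assumes an inline barcode on both reads. Flag spellings follow Stacks 2.x as of early 2020, so check `clone_filter --help` and `process_radtags --help` for your version.

```shell
#!/bin/sh
# Dry-run sketch: these functions only PRINT the Stacks commands.
# Every path and the enzyme name are placeholders chosen by the caller.

clone_filter_cmd() {
  # $1/$2 = raw R1/R2 reads (gzipped FASTQ), $3 = output directory
  printf 'clone_filter -1 %s -2 %s -i gzfastq -o %s\n' "$1" "$2" "$3"
}

process_radtags_cmd() {
  # $1/$2 = clone-filtered R1/R2, $3 = barcodes file,
  # $4 = restriction enzyme, $5 = output directory
  # ASSUMPTION: --inline_inline (inline barcode on both reads).
  # -c drops reads with uncalled bases, -q drops low-quality reads,
  # -r rescues barcodes and RAD-tags.
  printf 'process_radtags -1 %s -2 %s -i gzfastq -b %s -e %s --inline_inline -c -q -r -o %s\n' \
    "$1" "$2" "$3" "$4" "$5"
}
```

Printing the commands first makes it easy to inspect them before running anything on the cluster. The clone-filtered file names are passed in explicitly because they depend on how clone_filter names its output.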
dDocent
Renaming output files from Stacks: Rename_for_dDocent.sh (on GitHub)
Edited the script to handle paired-end barcodes
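A minimal sketch of what a paired-end rename could look like. It is not the edited Rename_for_dDocent.sh itself: the function name is hypothetical, and the input names sample_<barcode1>-<barcode2>.1.fq.gz and .2.fq.gz are an assumption about how the demultiplexed files are named. The target names follow dDocent's Pop_Sample.F.fq.gz / Pop_Sample.R.fq.gz convention for forward and reverse reads.

```shell
#!/bin/sh
# Sketch: rename demultiplexed paired-end files to dDocent's naming scheme.
# ASSUMPTION: inputs are named sample_<bc1>-<bc2>.1.fq.gz and .2.fq.gz.

rename_for_ddocent() (
  # $1 = barcodes file (barcode1 <TAB> barcode2 <TAB> LOC_sample)
  # $2 = directory holding the demultiplexed files
  # The body runs in a subshell, so these variables do not leak out.
  barcodes=$1
  dir=$2
  tab=$(printf '\t')
  bc1=""
  # "|| [ -n ... ]" also handles a last line without a trailing newline
  while IFS="$tab" read -r bc1 bc2 name || [ -n "$bc1" ]; do
    [ -n "$name" ] || continue        # skip blank or malformed lines
    src="$dir/sample_${bc1}-${bc2}"
    if [ -e "$src.1.fq.gz" ] && [ -e "$src.2.fq.gz" ]; then
      mv -- "$src.1.fq.gz" "$dir/$name.F.fq.gz"   # forward reads
      mv -- "$src.2.fq.gz" "$dir/$name.R.fq.gz"   # reverse reads
    else
      echo "missing read pair for $name" >&2
    fi
  done < "$barcodes"
)
```

Both files of a pair are checked before either is moved, so a sample with only one read file is reported and left untouched.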
Written on February 11, 2020