Backup of Conservation Genetics of Mangroves/Seminars/Data Analysis(No. 4)

Data Analysis

1. Sequence data editing by Sequence Scanner Software v1.0
Using this software, you can view, edit, print and export sequence text data generated by the Applied Biosystems Genetic Analyzers. This software also generates graphically expressive reports on results.
2. MEGA4 is an integrated tool for conducting automatic and manual sequence alignment, inferring phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, and testing evolutionary hypotheses.

3. Peak Scanner™ Software v1.0
Use this free software to perform DNA fragment analysis: it separates a mixture of DNA fragments according to their sizes, provides a profile of the separation, and precisely calculates the sizes of the fragments. The software allows you to view, edit, analyze, print, and export fragment analysis data generated using the Applied Biosystems Genetic Analyzers.

4. Notepad++ is a free (as in "free speech" and also as in "free beer") source code editor and Notepad replacement that supports several languages. It runs in the MS Windows environment, and its use is governed by the GPL.

5. jEdit is a mature programmer's text editor with hundreds of person-years of development behind it (counting the time spent developing plugins).

Sometimes you want to download sequences that were published in a paper. For example, look at this paper: Ted R. Schultz and Sean G. Brady. 2008. "Major evolutionary transitions in ant agriculture". PNAS 105(14): 5435-5440.
To download the sequences from GenBank, you first need a list of their accession numbers.
The accession numbers are listed in Table S2 of the supplementary information.
To obtain these sequences from GenBank, prepare the accession numbers as a comma-delimited list:
```
Accession number 1, Accession number 2, ....
```
- To do this, follow this workflow in the text editor:
  - Copy the data from Table S2.
  - Using jEdit, run search/replace with regular expressions.
- First, replace patterns that appear repeatedly, so that their spaces and dots become other characters:
```
no seq → no_seq
cf.  → cf_
```
- Replace runs of multiple white spaces with some other character.
- Change the file name into the form
```
 species name with accession
```
  using a replace command with a regular expression.
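The search/replace steps above can also be sketched in Python, for readers who prefer a script to the editor dialog. This is only a minimal sketch under stated assumptions: the tab as the replacement for runs of white space, the underscore-joined label format, the column order (genus, species, accession), and the sample row with its placeholder accession `XX000001` are all hypothetical and not taken from Table S2.

```python
import re

def clean_row(row):
    """Apply the search/replace steps to one row pasted from the table."""
    row = row.replace("no seq", "no_seq")    # protect the two-word placeholder
    row = re.sub(r"cf\.\s+", "cf_", row)     # "cf." plus spaces -> "cf_"
    row = re.sub(r"\s+", "\t", row.strip())  # runs of white space -> one tab (assumed delimiter)
    return row

def label(genus, species, accession):
    """Build a 'species name with accession' label (assumed format)."""
    return f"{genus}_{species}_{accession}"

# Hypothetical row; the names and the accession are placeholders.
row = "Examplegenus cf.  examplespecies   XX000001   no seq"
fields = clean_row(row).split("\t")
print(fields)
print(label(*fields[:3]))
```

The order of the replacements matters: the fixed patterns are rewritten before white space is collapsed, so that "no seq" and "cf. " are not split into separate columns.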
To prepare the list of accession numbers, copy them from Table S2, then search GenBank with the comma-delimited accession numbers. Save the sequences as FASTA-formatted files.
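As a rough illustration of this step, the sketch below pulls accession-like tokens out of pasted table text, joins them with commas, and builds a download URL for the NCBI E-utilities `efetch` service instead of using the GenBank web form. The accession pattern (one or two capital letters followed by five or six digits) is a simplifying assumption, and the sample text uses placeholder accessions. The script only builds the URL and does not contact NCBI.

```python
import re
from urllib.parse import urlencode

# Assumed pattern for classic nucleotide accessions, e.g. one or two
# capital letters followed by five or six digits.
ACCESSION = re.compile(r"\b[A-Z]{1,2}\d{5,6}\b")

def accession_query(table_text):
    """Return the unique accessions in the text as a comma-delimited string."""
    seen = []
    for acc in ACCESSION.findall(table_text):
        if acc not in seen:          # keep first occurrence, preserve order
            seen.append(acc)
    return ",".join(seen)

def efetch_url(query):
    """Build an E-utilities efetch URL that returns FASTA text."""
    params = {"db": "nuccore", "id": query, "rettype": "fasta", "retmode": "text"}
    return "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?" + urlencode(params)

# Placeholder text standing in for rows copied from the table.
text = "Taxon_a XX000001 XX000002 no_seq\nTaxon_b XX000003 XX000001"
query = accession_query(text)
print(query)
print(efetch_url(query))
```

The printed comma-delimited string can also be pasted directly into the GenBank search box, which matches the manual route described above.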
The resulting FASTA file can be aligned with ClustalW.
However, because the length of sample names is limited, it is better to edit the sample names first.
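One possible way to edit the sample names before alignment is sketched below. It truncates each FASTA header to a fixed length, keeps the names unique, and returns a table for restoring the original names afterwards. The default limit of 10 characters is an assumption chosen for illustration, not a documented ClustalW value, and the sequences and names are placeholders.

```python
def shorten_headers(fasta_text, max_len=10):
    """Shorten FASTA header names; return (new_text, {short_name: original})."""
    out, mapping, used = [], {}, set()
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            out.append(line)                      # sequence lines pass through
            continue
        original = line[1:].strip()
        # Use the first word of the header, cut to max_len characters.
        base = (original.split() or ["seq"])[0][:max_len]
        name, n = base, 1
        while name in used:                       # make duplicates unique
            suffix = str(n)
            name = base[:max_len - len(suffix)] + suffix
            n += 1
        used.add(name)
        mapping[name] = original
        out.append(">" + name)
    return "\n".join(out) + "\n", mapping

# Placeholder input; names and sequences are hypothetical.
fasta = (">Examplegenus_examplespecies_XX000001\nACGT\n"
         ">Examplegenus_examplespecies_XX000002\nGGCC\n")
short, names = shorten_headers(fasta)
print(short)
print(names)
```

Keeping the mapping lets you put the full "species name with accession" labels back on the tree or alignment after the analysis.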