CheckMatrix (py_matrix_2D_VXXX_RECBIT.py) evolved gradually into three versions: V087, V112 and V248. All three versions are broadly similar; the latest one, py_matrix_2D_V248_RECBIT.py, offers greater flexibility and functionality. Read this web page first, regardless of which version you are going to use.
py_matrix_2D_V087_RECBIT.py generates 2D plots of all markers versus all markers, displaying recombination scores as a color gradient (read this web page for details). Two input files are required: a map file and a matrix file with recombination scores. The matrix file can be generated using the Python_MadMapper_V112_RECBIT.py script.

py_matrix_2D_V112_RECBIT.py was extended to generate images with graphical genotyping. An additional file with raw marker scores is required to use this version. Representation of the genetic map as a circular graph was also implemented here. Read more about this version on the Genetic_Map_Raw_Scores.html web page (after reading the current web page, of course).

py_matrix_2D_V248_RECBIT.py has greater flexibility and functionality; its color scheme was designed to better highlight regions with negative linkage on the genetic map. Additional text output files are generated to help validate constructed genetic maps. Jump to Genetic_Map_Matrix_Plot_Art.html for details (after reading this web page).
The py_matrix_2D_V087_RECBIT.py Python script is designed
to visualize and validate genetic maps. It generates 2D plot images
of all markers (X axis) versus all markers (Y axis). Each dot on these plots is a recombination/linkage
value or another type of score (LOD, BIT) for a given pair of markers. Markers appear in the same order
as on the genetic map. The patterns of colored diagonals on these 2D plots can help to
understand and validate constructed genetic maps. Two input files are required. The first input file is
the genetic map data. The second input file is a "global" matrix file with recombination/linkage data for all
possible pairs of markers. The matrix file can be generated by the JoinMap program (jmrec), for
example, or by the accompanying Python_MadMapper_V112_RECBIT.py (PyMap)
Python script. Details about usage of the Python_MadMapper_V112_RECBIT.py script
can be found here.
A detailed description of the BIT scoring system
can be found here.
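The idea behind the plot (markers in map order on both axes, one score per marker pair) can be illustrated with a minimal sketch. This is not the code of py_matrix_2D_V087_RECBIT.py; the function name `build_score_matrix`, the in-memory dictionary of pair scores and the toy marker IDs are assumptions made for illustration only.

```python
def build_score_matrix(map_order, pair_scores, missing=None):
    """Arrange pairwise scores into a square matrix that follows map order.

    map_order   -- list of marker IDs in the order they appear on the map
    pair_scores -- dict mapping (marker_a, marker_b) to a score (REC, LOD or BIT)
    missing     -- value used when no score is available for a pair
    """
    matrix = []
    for row_marker in map_order:
        row = []
        for col_marker in map_order:
            # A pair may be stored in either orientation, so try both.
            score = pair_scores.get((row_marker, col_marker))
            if score is None:
                score = pair_scores.get((col_marker, row_marker), missing)
            row.append(score)
        matrix.append(row)
    return matrix


# Toy data: three hypothetical markers, not taken from the Arabidopsis set.
order = ["m1", "m2", "m3"]
scores = {("m1", "m2"): 0.05, ("m2", "m3"): 0.10, ("m1", "m3"): 0.15}
for row in build_score_matrix(order, scores):
    print(row)
```

With a well-ordered map, tightly linked pairs (low recombination values) sit close to the main diagonal of such a matrix, which is why diagonal patterns in the plot are informative.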
Data for the Arabidopsis genetic map,
based on genotyping of recombinant inbred lines (RILs) developed by Dean and Lister, were used
to illustrate usage of the py_matrix_2D_V087_RECBIT.py and Python_MadMapper_V112_RECBIT.py scripts.
An Excel spreadsheet with marker scores and map data
was downloaded from the NASC web site.
A modified version (usable by JoinMap) of the file with recombination scores:
DL_RIL_Data.may2001.loc
and map data for five Arabidopsis chromosomes:
ath-chrom1-map.txt
ath-chrom2-map.txt
ath-chrom3-map.txt
ath-chrom4-map.txt
ath-chrom5-map.txt
have been used for further analysis.
Note:
[Figure panels: recombination scores | LOD scores | recombination scores | BIT scores]
HOW IT WAS DONE OR
STEP BY STEP INSTRUCTIONS HOW TO GENERATE GENETIC MAP 2D PLOTS
STEP 1
Generation of "global" matrix file using Python_MadMapper_V112_RECBIT.py
script.
Input files: DL_RIL_Data.may2001.loc and optional list of framework markers
Dean_Lister_frame.IDs.
Execute from command line:
$python Python_MadMapper_V112_RECBIT.py DL_RIL_Data.may2001.loc DL_RIL_Data.may2001.out 0.2 100 0.25 Dean_Lister_frame.IDs
where:
DL_RIL_Data.may2001.loc - input file,
DL_RIL_Data.may2001.out - output file,
0.2 - recombination value cutoff,
100 - BIT score value cutoff,
0.25 - datapoints value cutoff,
Dean_Lister_frame.IDs - framework markers list (optional)
With the current dataset (1357 markers, 101 RILs) the script runs for one to two hours to perform
pairwise comparisons between all markers and clustering (1 GHz CPU, 2 GB of RAM).
Clustering (group analysis) is the final step of this script. The output of the program is a set
of 35 files. A detailed description of the output can be found here.
To generate the 2D matrix plot we
need only the DL_RIL_Data.may2001.out.pairs_all
file (53 MB), which contains recombination/linkage and
BIT scores for all pairs of markers.
[ Alternatively, you can use JoinMap output (jmrec) as the global matrix file.
How to work with JoinMap is outside the scope of this document. ]
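To get a feeling for why the all-versus-all comparison takes so long, the number of marker pairs can be estimated with a short calculation. This is a back-of-the-envelope sketch; it assumes each unordered pair of distinct markers is compared once, which may not match exactly how Python_MadMapper_V112_RECBIT.py counts or stores pairs.

```python
def count_marker_pairs(n_markers):
    """Number of unordered pairs of distinct markers: n * (n - 1) / 2."""
    return n_markers * (n_markers - 1) // 2


# 1357 markers is the size of the Arabidopsis dataset used on this page.
print(count_marker_pairs(1357))
```

Because the pair count grows quadratically, doubling the number of markers roughly quadruples the number of comparisons.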
STEP 2
py_matrix_2D_V087_RECBIT.py takes as input a matrix file,
a map file and three optional files: a framework markers list,
a list of IDs to highlight in red and a *.loc file with recombination data.
If the user provides a framework markers list (map of framework markers),
these markers are painted in purple on the 2D plot.
If the user provides a list of IDs to highlight in red,
these markers are painted in red on the 2D plot (we use it to highlight new markers on a map).
If the user provides a *.loc file with recombination scores (raw data for markers and RILs),
an allele composition plot is generated at the bottom of the image.
Program usage:
[matrix_file] [map_file] [output_file] [frame_marker_list] [red_list] [loc_file] [REC/BIT/LOD]
frame_marker_list is optional; if you do not have one, type X
red_list is a list of markers to highlight in red
red_list is optional; if you do not have one, type Y
loc_file is optional; if you do not have one, type Z
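The placeholder convention above (X, Y and Z standing in for omitted optional files) can be wrapped in a small helper when the script is called repeatedly. The sketch below only builds the argument list and does not run anything; the function name `build_plot_command` and the use of `python` as the interpreter name are assumptions, not part of the original script.

```python
def build_plot_command(matrix_file, map_file, output_file,
                       frame_marker_list=None, red_list=None, loc_file=None,
                       score_type="BIT",
                       script="py_matrix_2D_V087_RECBIT.py"):
    """Return the command line as a list of arguments.

    Omitted optional files are replaced by the placeholders X, Y and Z,
    following the usage description on this page.
    """
    if score_type not in ("REC", "BIT", "LOD"):
        raise ValueError("score_type must be REC, BIT or LOD")
    return [
        "python", script,
        matrix_file, map_file, output_file,
        frame_marker_list or "X",   # no framework markers list
        red_list or "Y",            # no list of markers to highlight in red
        loc_file or "Z",            # no *.loc file
        score_type,
    ]


command = build_plot_command(
    "DL_RIL_Data.may2001.out.pairs_all",
    "ath-chrom4-map.txt",
    "ath-chrom4-map.out.bit",
    frame_marker_list="Dean_Lister_frame.IDs",
    loc_file="DL_RIL_Data.may2001.loc",
)
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run` if the script and input files are present in the working directory.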
For example, for Arabidopsis chromosome 4 we can execute:
$python py_matrix_2D_V087_RECBIT.py DL_RIL_Data.may2001.out.pairs_all ath-chrom4-map.txt ath-chrom4-map.out.bit Dean_Lister_frame.IDs Y DL_RIL_Data.may2001.loc BIT
where:
DL_RIL_Data.may2001.out.pairs_all - matrix file (53 Mb)
ath-chrom4-map.txt - map file
ath-chrom4-map.out.bit - output file(s)
Dean_Lister_frame.IDs - map containing framework markers only
Y - no list of markers to highlight in red is used
DL_RIL_Data.may2001.loc - *.loc file with recombination data
BIT - "BIT" option, which tells the program to generate images with BIT scores
OUTPUT FILES:
Several image files and one text file are generated:
ath-chrom4-map.out.bit.large.png - large image (full size image)
ath-chrom4-map.out.bit.medium.png - medium size image (1000 x 750 pixels)
ath-chrom4-map.out.bit.small.png - small image (200 x 150 pixels)
ath-chrom4-map.out.bit.2000.png - image with size 2000 x 1500 pixels (optional)
ath-chrom4-map.out.bit.tab - 2D matrix text file
With the REC option (the last argument when you run the script), images with recombination scores are generated instead:
$python py_matrix_2D_V087_RECBIT.py DL_RIL_Data.may2001.out.pairs_all ath-chrom4-map.txt ath-chrom4-map.out.pymap Dean_Lister_frame.IDs Y DL_RIL_Data.may2001.loc REC
ath-chrom4-map.out.pymap.large.png - large image (full size image)
ath-chrom4-map.out.pymap.medium.png - medium size image (1000 x 750 pixels)
ath-chrom4-map.out.pymap.small.png - small image (200 x 150 pixels)
ath-chrom4-map.out.pymap.2000.png - image with size 2000 x 1500 pixels (optional)
ath-chrom4-map.out.pymap.tab - 2D matrix text file
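The output names listed above follow a simple pattern: the output_file argument is used as a prefix and a fixed suffix is appended. A minimal sketch for checking that a run produced everything expected is shown here; the helper name `expected_outputs` is hypothetical, and the suffix list is taken from the listings on this page rather than from the script source.

```python
def expected_outputs(output_prefix, include_2000=True):
    """List the file names the script is expected to write for a prefix."""
    suffixes = ["large.png", "medium.png", "small.png"]
    if include_2000:
        # The 2000 x 1500 pixel image is described as optional.
        suffixes.append("2000.png")
    suffixes.append("tab")
    return [f"{output_prefix}.{suffix}" for suffix in suffixes]


for name in expected_outputs("ath-chrom4-map.out.pymap"):
    print(name)
```

Combined with `os.path.exists`, such a list makes it easy to spot a missing image after a long batch of runs.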
STEP 3 (optional luxury)
It is possible to generate a genetic matrix 2D plot for the whole Arabidopsis genome.
Map data for all five chromosomes were concatenated into one file,
ath-chrom-all-map.map. The ath-chrom-all-map.map file
was then modified in an Excel spreadsheet so that the map positions of markers on all five chromosomes form
a contiguous sequence: ath-chrom-all-map.mad (check the third column).
The ath-chrom-all-map.mad file and the matrix file
DL_RIL_Data.may2001.out.pairs_all were used as input files for the py_matrix_2D_V087_RECBIT.py script:
$python py_matrix_2D_V087_RECBIT.py DL_RIL_Data.may2001.out.pairs_all ath-chrom-all-map.mad ath-chrom-all-map.out.bit Dean_Lister_frame.IDs Y DL_RIL_Data.may2001.loc BIT
Output for the whole Arabidopsis genome:
ath-chrom-all-map.out.bit.large.png - large image (full size image)
ath-chrom-all-map.out.bit.medium.png - medium size image (1000 x 750 pixels)
ath-chrom-all-map.out.bit.small.png - small image (200 x 150 pixels)
ath-chrom-all-map.out.bit.2000.png - image with size 2000 x 1500 pixels
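The spreadsheet step that makes map positions contiguous across chromosomes can also be scripted. The sketch below is one possible way to do it and is not how ath-chrom-all-map.mad was actually produced: it assumes each chromosome is given as a list of (marker ID, position) pairs sorted by position, and it shifts each chromosome by the end position of the previous one. The function name `make_positions_contiguous`, the `gap` parameter and the toy marker names are all hypothetical.

```python
def make_positions_contiguous(chromosome_maps, gap=0.0):
    """Concatenate per-chromosome maps into one map with increasing positions.

    chromosome_maps -- list of chromosomes; each chromosome is a list of
                       (marker_id, position) tuples sorted by position
    gap             -- optional distance inserted between chromosomes
    """
    combined = []
    offset = 0.0
    for chrom_map in chromosome_maps:
        for marker_id, position in chrom_map:
            combined.append((marker_id, offset + position))
        if chrom_map:
            # The next chromosome starts where this one ended (plus the gap).
            offset += chrom_map[-1][1] + gap
    return combined


# Toy example with two hypothetical chromosomes.
chr_a = [("a1", 0.0), ("a2", 10.0)]
chr_b = [("b1", 0.0), ("b2", 5.0)]
for marker_id, position in make_positions_contiguous([chr_a, chr_b]):
    print(marker_id, position)
```

Reading and writing the actual map files is left out on purpose, since the column layout of the .map and .mad files is not described in detail on this page.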
py_matrix_2D_V087_RECBIT.py was written for the CGPDB project
to assist in construction, validation and visualization of
the lettuce genetic map.
Arabidopsis data were used to check program functionality and to compare results with lettuce recombination data.
-------------------------------------------------------------------------
WORK IN PROGRESS!
Download the "Arabidopsis Genetic Map" poster (36 x 48 inches): ATH_GeneticMap_B1.ppt (PowerPoint format); images generated with py_matrix_2D_V087_RECBIT.py
Download the "Arabidopsis Five Linkage Groups" poster (36 x 48 inches): ATH_GeneticMap_A1.ppt (PowerPoint format); images generated with py_matrix_2D_V112_RECBIT.py