Friday, July 31, 2009

Using R to reshape data

Sometimes marker data are sent to us in a "long" format like this, where each row contains only one genotype for a particular marker and person:
 
ID Marker A1 A2 
1 M1 1 2 
2 M1 2 2 
3 M1 2 2 
1 M2 1 1 
3 M2 1 2 
 
However, to put these marker data into LINKAGE format, we need to reshape them so that each row contains all the marker data for a specific person. This is easily done with the reshape() function in R:
 
 
> a <- read.table("marker.txt", header=TRUE)
> a 
  ID Marker A1 A2
1  1     M1  1  2
2  2     M1  2  2
3  3     M1  2  2
4  1     M2  1  1
5  3     M2  1  2
> b <- reshape(a, idvar="ID", timevar="Marker", direction="wide")
> b 
  ID A1.M1 A2.M1 A1.M2 A2.M2
1  1     1     2     1     1
2  2     2     2    NA    NA
3  3     2     2     1     2
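
For anyone who wants to try this without a marker.txt file, here is a minimal self-contained sketch of the same reshape. The data frame is typed in by hand to match the example above. The last step, recoding missing genotypes as 0, is an assumption on my part about what the downstream LINKAGE-format file should contain (0 is the usual code for an unknown allele), so adjust it if your pipeline expects something else.

```r
# Build the example data inline so the sketch runs without marker.txt.
a <- data.frame(
  ID     = c(1, 2, 3, 1, 3),
  Marker = c("M1", "M1", "M1", "M2", "M2"),
  A1     = c(1, 2, 2, 1, 1),
  A2     = c(2, 2, 2, 1, 2)
)

# Long to wide: one row per person, allele columns named <allele>.<marker>.
b <- reshape(a, idvar = "ID", timevar = "Marker", direction = "wide")

# Assumption: untyped markers (NA after reshaping) are recoded as 0.
b[is.na(b)] <- 0

# Reset the row names carried over from the long table.
rownames(b) <- NULL

print(b)
```

Person 2 has no M2 genotype in the input, so that pair of columns is the only one affected by the recoding step.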
 
