There are two tab-separated structured text files. The chr column in AssociatedMarkers.txt is a logical foreign key pointing to the Chr column in DiseaseMarkers.txt.
We want to create a new structured text file with two columns. One comes from the snps_BCG24 column of AssociatedMarkers.txt. The other is a computed column whose values are produced by the following algorithm: if a value in the hg19pos column of AssociatedMarkers.txt falls between startLoc and endLoc in the DiseaseMarkers.txt record joined through that key, output inLocus; otherwise output an empty string. Selections of the two files are as follows:
AssociatedMarkers.txt
DiseaseMarkers.txt
esProc approach:
A1, A2: Import the files into memory. @t means the column names are imported as well.
A3: Perform the join operation. The result is as follows:
A4: Retrieve the desired columns from A3. The _1.hg19pos column corresponds to the hg19pos column of AssociatedMarkers.txt. The final result is as follows:
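For readers who want to check the logic outside esProc, the same steps can be sketched in Python. This is only an illustration, not the esProc script itself. The sample rows are hypothetical stand-ins for the two files. The helper names read_tsv and label_markers are made up for this sketch. It also assumes that the startLoc and endLoc bounds are inclusive, and that a marker counts as inLocus if any locus on the same chromosome contains its position.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file contents are not reproduced here.
ASSOCIATED = (
    "snps_BCG24\tchr\thg19pos\n"
    "rs0001\t1\t1500\n"
    "rs0002\t1\t9000\n"
    "rs0003\t2\t250\n"
)
DISEASE = (
    "Chr\tstartLoc\tendLoc\n"
    "1\t1000\t2000\n"
    "2\t300\t400\n"
)


def read_tsv(text):
    """Parse tab-separated text whose first row holds the column names."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def label_markers(markers, loci):
    """Return (snps_BCG24, flag) pairs; flag is 'inLocus' or ''."""
    # Group the locus ranges by chromosome so each marker only scans
    # the ranges that share its chr value (the logical foreign key).
    by_chr = defaultdict(list)
    for locus in loci:
        by_chr[locus["Chr"]].append(
            (int(locus["startLoc"]), int(locus["endLoc"]))
        )

    result = []
    for marker in markers:
        pos = int(marker["hg19pos"])
        # Assumption: bounds are inclusive on both ends.
        hit = any(
            start <= pos <= end
            for start, end in by_chr.get(marker["chr"], [])
        )
        result.append((marker["snps_BCG24"], "inLocus" if hit else ""))
    return result


rows = label_markers(read_tsv(ASSOCIATED), read_tsv(DISEASE))
for snp, flag in rows:
    print(f"{snp}\t{flag}")
```

To run this against the real files, replace the io.StringIO wrappers with open("AssociatedMarkers.txt", newline="") and open("DiseaseMarkers.txt", newline=""), and write the rows out with csv.writer using delimiter="\t".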