Tuesday, November 13, 2012

Sort lines in a file

To use a script, cut and paste the code from the light green or blue box into a terminal window, change the bold, red text as needed, and hit Enter.
See More Information for notes on using these tools.

Sort a file alphabetically

See General Sorting Notes for details on sorting order.

Sort alphabetically, in ASCENDING order (sort_alpha_asc)

Sort the lines in the file in ascending alphabetical order.
Input file(s)
Output file
perl -e ' while(<>) { push @lines, $_; } warn "\nSorted $. lines in ascending alphabetical order\n\n"; print sort @lines ' unsorted > sorted_asc
Example: Sort a gene list. Run the above script on a file called unsorted to get a file called sorted_asc:
Original file (unsorted):
 ap23   7
 ap23   30
 CG2500 3
 cxb7   2
 CG12345        9
 CG2500 3
Output file (sorted_asc):
 CG12345        9
 CG2500 3
 CG2500 3
 ap23   30
 ap23   7
 cxb7   2
Screen output:
 Sorted 6 lines in ascending alphabetical order

Sort alphabetically, in DESCENDING order (sort_alpha_desc)

Sort the lines in the file in descending alphabetical order.
Input file(s)
Output file
perl -e ' while(<>) { push @lines, $_; } warn "\nSorted $. lines in descending alphabetical order\n\n"; print sort { $b cmp $a } @lines ' unsorted > sorted_desc
Example: Sort a gene list. Run the above script on a file called unsorted to get a file called sorted_desc:
Original file (unsorted):
 ap23   7
 ap23   30
 CG2500 3
 cxb7   2
 CG12345        9
 CG2500 3
Output file (sorted_desc):
 cxb7   2
 ap23   7
 ap23   30
 CG2500 3
 CG2500 3
 CG12345        9
Screen output:
 Sorted 6 lines in descending alphabetical order

Sort alphabetically, ignoring case

Z and z will count as the same letter in these sorts. See General Sorting Notes for more details on sorting order.

Sort alphabetically, ignoring case, in ASCENDING order (sort_nocase_asc)

Sort the lines in the file in ascending alphabetical order, counting z and Z as the same.
Input file(s)
Output file
perl -e ' while(<>) { push @lines, $_; } warn "\nSorted $. lines in ascending alphabetical order, ignoring case\n\n"; print sort { lc($a) cmp lc($b) } @lines ' unsorted > sorted_nc_asc
Example: Sort a gene list. Run the above script on a file called unsorted to get a file called sorted_nc_asc:
Original file (unsorted):
 ap23   7
 ap23   30
 CG2500 3
 cxb7   2
 CG12345        9
 CG2500 3
Output file (sorted_nc_asc):
 ap23   30
 ap23   7
 CG12345        9
 CG2500 3
 CG2500 3
 cxb7   2
Screen output:
 Sorted 6 lines in ascending alphabetical order, ignoring case

Sort alphabetically, ignoring case, in DESCENDING order (sort_nocase_desc)

Sort the lines in the file in descending alphabetical order, counting z and Z as the same.
Input file(s)
Output file
perl -e ' while(<>) { push @lines, $_; } warn "\nSorted $. lines in descending alphabetical order, ignoring case\n\n"; print sort { lc($b) cmp lc($a) } @lines ' unsorted > sorted_nc_desc
Example: Sort a gene list. Run the above script on a file called unsorted to get a file called sorted_nc_desc:
Original file (unsorted):
 ap23   7
 ap23   30
 CG2500 3
 cxb7   2
 CG12345        9
 CG2500 3
Output file (sorted_nc_desc):
 cxb7   2
 CG2500 3
 CG2500 3
 CG12345        9
 ap23   7
 ap23   30
Screen output:
 Sorted 6 lines in descending alphabetical order, ignoring case

Sort numerically

See General Sorting Notes for more details on sorting order.

Sort numerically, in ASCENDING order (sort_num_asc)

Sort a simple list numerically.
Input file(s)
Output file
perl -e ' while(<>) { push @lines, $_; } warn "\nSorted $. lines in ascending numerical order\n\n"; print sort { $a <=> $b } @lines ' unsorted > sorted_num_asc
Example: Sort a list of numbers. Run the above script on a file called unsorted to get a file called sorted_num_asc:
Screen output:
 Sorted 6 lines in ascending numerical order

Sort numerically, in DESCENDING order (sort_num_desc)

Sort a simple list numerically, in descending order.
Input file(s)
Output file
perl -e ' while(<>) { push @lines, $_; } warn "\nSorted $. lines in descending numerical order\n\n"; print sort { $b <=> $a } @lines ' unsorted > sorted_num_desc
Example: Sort a list of numbers. Run the above script on a file called unsorted to get a file called sorted_num_desc:
Screen output:
 Sorted 6 lines in descending numerical order

Sort tabular data, based on the values in a given column

Sort numerically (ascending order) based on a given column (sort_on_column_num)

Sort lines in a tab-separated file in ascending order, based on the numerical value in a given column.
$column: Numerical column to sort by, counting from 0 (so 1 is the second column)
Input file(s)
Output file
perl -e ' $column=1; while(<>) { s/\r?\n//; @F=split /\t/, $_; push @sort_col, $F[$column]; push @lines, "$_\n"; } warn "\nSorted $. lines in ascending order, based on numerical values in column $column\n\n"; print @lines[sort { $sort_col[$a] <=> $sort_col[$b] } 0..$#sort_col] ' unsorted > sorted_col
Example: Sort a gene list based on the score in the second, tab-separated column. Run the above script on a file called unsorted to get a file called sorted_col:
Original file (unsorted):
 ap23   7
 ap23   30
 CG2500 3
 cxb7   2
 CG12345        9
 CG2500 3
Output file (sorted_col):
 cxb7   2
 CG2500 3
 CG2500 3
 ap23   7
 CG12345        9
 ap23   30
Screen output:
 Sorted 6 lines in ascending order, based on numerical values in column 1

Sort numerically (descending order) based on a given column (sort_on_column_num_desc)

Sort lines in a tab-separated file in descending order, based on the numerical values in a given column.
$column: Numerical column to sort by, counting from 0 (so 1 is the second column)
Input file(s)
Output file
perl -e ' $column=1; while(<>) { s/\r?\n//; @F=split /\t/, $_; push @sort_col, $F[$column]; push @lines, "$_\n"; } warn "\nSorted $. lines in descending order, based on numerical values in column $column\n\n"; print @lines[sort { $sort_col[$b] <=> $sort_col[$a] } 0..$#sort_col] ' unsorted > sorted_col
Example: Sort a gene list based on the score in the second, tab-separated column. Run the above script on a file called unsorted to get a file called sorted_col:
Original file (unsorted):
 ap23   7
 ap23   30
 CG2500 3
 cxb7   2
 CG12345        9
 CG2500 3
Output file (sorted_col):
 ap23   30
 CG12345        9
 ap23   7
 CG2500 3
 CG2500 3
 cxb7   2
Screen output:
 Sorted 6 lines in descending order, based on numerical values in column 1

Sort alphabetically (ascending order) based on a given column (sort_on_column_alpha)

Sort lines in a tab-separated file in ascending order, based on the text strings in a given column.
$column: Text column to sort by, counting from 0 (so 1 is the second column)
Input file(s)
Output file
perl -e ' $column=1; while(<>) { s/\r?\n//; @F=split /\t/, $_; push @sort_col, $F[$column]; push @lines, "$_\n"; } warn "\nSorted $. lines in ascending order, based on text strings in column $column\n\n"; print @lines[sort { $sort_col[$a] cmp $sort_col[$b] } 0..$#sort_col] ' unsorted > sorted_col
Example: Sort a gene list based on the gene name in the second, tab-separated column. Run the above script on a file called unsorted to get a file called sorted_col:
Original file (unsorted):
 1      ap23    7
 2      ap23    30
 3      CG2500  3
 4      cxb7    2
 5      CG12345 9
 6      CG2500  3
Output file (sorted_col):
 5      CG12345 9
 3      CG2500  3
 6      CG2500  3
 1      ap23    7
 2      ap23    30
 4      cxb7    2
Screen output:
 Sorted 6 lines in ascending order, based on text strings in column 1

Sort alphabetically (descending order) based on a given column (sort_on_column_alpha_desc)

Sort lines in a tab-separated file in descending order, based on the text strings in a given column.
$column: Text column to sort by, counting from 0 (so 1 is the second column)
Input file(s)
Output file
perl -e ' $column=1; while(<>) { s/\r?\n//; @F=split /\t/, $_; push @sort_col, $F[$column]; push @lines, "$_\n"; } warn "\nSorted $. lines in descending order, based on text strings in column $column\n\n"; print @lines[sort { $sort_col[$b] cmp $sort_col[$a] } 0..$#sort_col] ' unsorted > sorted_col
Example: Sort a gene list based on the gene name in the second, tab-separated column. Run the above script on a file called unsorted to get a file called sorted_col:
Original file (unsorted):
 1      ap23    7
 2      ap23    30
 3      CG2500  3
 4      cxb7    2
 5      CG12345 9
 6      CG2500  3
Output file (sorted_col):
 4      cxb7    2
 1      ap23    7
 2      ap23    30
 3      CG2500  3
 6      CG2500  3
 5      CG12345 9
Screen output:
 Sorted 6 lines in descending order, based on text strings in column 1
