evolution & bioinformatics

Thursday, May 14, 2015

Meng's Notes: Simple Enrichment Test -- calculate hypergeometric...

Wednesday, December 19, 2012

Simple Enrichment Test -- calculate hypergeometric p-values in R

Hypergeometric tests are useful for enrichment analysis. For example,
given a gene list, you might want to know which functions (GO terms)
are enriched among those genes. The hypergeometric test (or its
equivalent, the one-tailed Fisher's exact test) expresses that
enrichment as a p-value.

R provides the functions phyper and fisher.test for the hypergeometric
test and Fisher's exact test, respectively. However, it is tricky to
get the parameters right, so I spent some time working them out.

Here is a simple example:

Five cards are drawn from a well-shuffled 52-card deck, and

x = the number of diamonds selected.

We use a 2x2 table to represent the case:

                Diamond     Non-Diamond     Total
selected        x           5-x             5 sampled cards
left            13-x        34+x            47 cards left after sampling
total           13          39              52 cards

We are asking whether diamonds are enriched or depleted among the selected cards, compared with the background (the whole deck).

Here are the different parameters used by phyper and fisher.test:

phyper(x, 13, 39, 5, lower.tail=TRUE);

# Numerical parameters in order:

# (success-in-sample, success-in-bkgd, failure-in-bkgd, sample-size).

fisher.test(matrix(c(x, 13-x, 5-x, 34+x), 2, 2), alternative='less');

# Numerical parameters in order:

# (success-in-sample, success-in-left-part, failure-in-sample, failure-in-left-part).

The hypergeometric test compares the sample with the whole background,
while Fisher's exact test compares the sample with what is left of the
background after sampling without replacement. Both give the same
p-value because they assume the same distribution.

Here are the results:

hitInSample = 1 # x in the table above; could be 0~5

hitInPop = 13

failInPop = 52-hitInPop # 39 non-diamonds in a 52-card deck

sampleSize = 5

Test for under-representation (depletion)

# P[X <= hitInSample]: the lower tail already includes hitInSample, so no -1 here

phyper(hitInSample, hitInPop, failInPop, sampleSize, lower.tail= TRUE);

## [1] 0.6329532

fisher.test(matrix(c(hitInSample, hitInPop-hitInSample, sampleSize-hitInSample, failInPop-sampleSize +hitInSample), 2, 2), alternative='less')$p.value;

## [1] 0.6329532

Test for over-representation (enrichment)

phyper(hitInSample-1, hitInPop, failInPop, sampleSize, lower.tail= FALSE);

## [1] 0.7784664

fisher.test(matrix(c(hitInSample, hitInPop-hitInSample, sampleSize-hitInSample, failInPop-sampleSize +hitInSample), 2, 2), alternative='greater')$p.value;

## [1] 0.7784664

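Why hitInSample-1 when testing over-representation?

Because phyper returns P[X ≤ x] when lower.tail is TRUE (the default)
and P[X > x] otherwise. Enrichment needs P[X ≥ x], which equals P[X > x-1], so we pass x-1 with lower.tail=FALSE. For the lower tail, P[X ≤ x] is exactly what phyper returns for x.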
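
To make the tail bookkeeping concrete, here is the arithmetic for the card example with one diamond in the sample, written out by hand. This is only a worked sketch; the symbols K, N and n are my own labels for the 13 diamonds, the 52 cards and the 5 cards drawn.

```latex
\begin{align*}
P[X = k] &= \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}},
  \qquad K = 13,\; N = 52,\; n = 5 \\[4pt]
P[X \ge 1] &= 1 - P[X \le 0]
  = 1 - \frac{\binom{39}{5}}{\binom{52}{5}}
  = 1 - \frac{575757}{2598960} \approx 0.7784664 \\[4pt]
P[X \le 1] &= P[X = 0] + P[X = 1]
  = \frac{575757 + 13 \cdot 82251}{2598960}
  = \frac{1645020}{2598960} \approx 0.6329532
\end{align*}
```

Both values agree with the phyper and fisher.test output listed above for enrichment and depletion.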

So does fisher.test have any advantage over phyper, given that they return the same p-values?

Yes, fisher.test can do two other jobs: a two-sided test, and a
confidence interval for the odds ratio. Please refer to its manual for
details. For one-sided p-values the two give identical results, as
long as the parameters are set correctly.

Tuesday, May 12, 2015

How to change the alpha value of colours in R

30 Apr 2013 07:08 alpha channel , R , rgb , Tutorials

Often I like to reduce the alpha value (level of opacity) of colours to identify patterns of over-plotting when displaying lots of data points with R. So, here is a tiny function that adds an alpha value to a given vector of colours, e.g. an RColorBrewer palette. It uses col2rgb and rgb, which has an argument for alpha, in combination with the wonderful apply and sapply functions.

      
## Add an alpha value to a colour

add.alpha <- function(col, alpha=1){

  if(missing(col))

    stop("Please provide a vector of colours.")

  apply(sapply(col, col2rgb)/255, 2, 

                     function(x) 

                       rgb(x[1], x[2], x[3], alpha=alpha))  

}


The example below illustrates how this function can be used with colours provided in different formats, thanks to the col2rgb function.

      
# Source add.alpha function from Github

require(RCurl)

source(textConnection(getURL("https://gist.github.com/mages/5339689/raw/576263b8f0550125b61f4ddba127f5aa00fa2014/add.alpha.R")))

## Example

set.seed(1)

n <- 1200

dat <- data.frame(

  x = gl(n=4, k=n/4),

  y = rnorm(n)

)

myColours = c(1, "steelblue", "#FFBB00", rgb(0.4, 0.2, 0.3))

myColoursAlpha <- add.alpha(myColours, alpha=0.4)

## "#00000066" "#4682B466" "#FFBB0066" "#66334D66" 

op <- par(mfrow=c(1,2), mar=c(2,2,3,1))

boxplot(y ~ x, data=dat, outline=FALSE,

        axes=FALSE, main="alpha=1")

points(x=jitter(as.numeric(dat$x)), y=dat$y, 

       col=myColours[dat$x], pch=19)

box()

boxplot(y ~ x, data=dat, outline=FALSE,

        axes=FALSE, main="alpha=0.4")

points(x=jitter(as.numeric(dat$x)), y=dat$y, 

       col=myColoursAlpha[dat$x], pch=19)

box()

par(op)
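
The commented output in the example ("#4682B466" and so on) shows what adding alpha does to a colour string: the alpha level is scaled to 0-255 and appended as two hex digits, so 0.4 becomes 102, which is 66 in hex. As a quick cross-check outside R, here is a small shell sketch of that arithmetic. The helper names alpha_hex and add_alpha_hex are hypothetical, and the sketch only handles colours already written as #RRGGBB.

```shell
# alpha_hex ALPHA: print the two-digit hex code for an alpha level in [0, 1].
alpha_hex() {
  awk -v a="$1" 'BEGIN { printf "%02X\n", int(a * 255 + 0.5) }'
}

# add_alpha_hex "#RRGGBB" ALPHA: append the alpha byte to a hex colour.
add_alpha_hex() {
  printf '%s%s\n' "$1" "$(alpha_hex "$2")"
}

add_alpha_hex "#4682B4" 0.4   # steelblue at alpha 0.4
add_alpha_hex "#FFBB00" 0.4
```

The two printed strings can be compared with the second and third entries of the commented output of add.alpha.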


Session Info

sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RCurl_1.95-4.1 bitops_1.0-5

Friday, March 20, 2015

add a prefix to a column

## add a prefix to every line of a one-column file
awk -v PRE='./target_win_asso_ctr_win/' '{$0=PRE$0; print}' list_all >cor_5kb_100ctr_result_file_list
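
To see what the one-liner does, here is a small sketch on made-up input (the file names and the ./results/ directory are hypothetical). awk's -v option sets the variable before any input is read, and assigning to $0 rewrites the whole line. Reversing the order of concatenation appends the string instead of prepending it.

```shell
# Hypothetical one-column input: two result file names.
names=$(printf 'chr1.txt\nchr2.txt\n')

# Prepend PRE to every line.
prefixed=$(printf '%s\n' "$names" | awk -v PRE='./results/' '{ $0 = PRE $0; print }')

# Append SUF to every line: same idea, concatenation order reversed.
suffixed=$(printf '%s\n' "$names" | awk -v SUF='.gz' '{ $0 = $0 SUF; print }')

printf '%s\n' "$prefixed" "$suffixed"
```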

Thursday, March 19, 2015

How to Install Perl Modules on Mac OS X in 4 Easy Steps

Today at work, I learned how to install Perl modules using CPAN. It’s a lot easier than I thought.
You see, for the past couple of years, I’ve been a bit frustrated because OS X does not come with a whole lot of Perl modules pre-installed, and for all I googled, I couldn’t find an “idiot’s” guide for moderately-savvy-but-not-expert users like myself to install modules and dependencies on demand.
The only instructions I could find point to Fink, which basically installs modules in a path that isn’t included in the Perl @INC variable, meaning you have to manually specify the full path to the modules in every script — which is not a lot of fun if you’re developing on OS X and deploying on Red Hat, for instance.
Moreover, Fink doesn’t seem to make every module available, and it’s not very easy to determine which Fink package you need to install if you need a particular module.
So, with a script that called on several apparently unavailable modules, and a deadline looming, I finally decided to suck it up and figure out how to use CPAN to install them:

1) Make sure you have the Apple Developer Tools (XCode) installed.

These are on one of your install discs, or available as a huge but free download from the Apple Developer Connection [free registration required] or the Mac App Store. I thought I had them, but apparently when we upgraded that computer to Tiger, they went missing.
If you don’t have this stuff installed, your installation will fail with errors about unavailable commands.

1.5) Install Command Line Tools (Recent XCode versions only)

(Thank you to Tom Marchioro for informing me about this step.)
Older versions of XCode installed the command line tools (which are required to properly install CPAN modules) by default, but apparently newer ones do not. To check whether you have the command line tools already installed, run the following from the Terminal:
$ which make
This command checks the system for the “make” tool. If it spits out something like /usr/bin/make you’re golden and can skip ahead to Step 2. If you just get a new prompt and no output, you’ll need to install the tools:

Launch XCode and bring up the Preferences panel.
Click on the Downloads tab
Click to install the Command Line Tools

If you like, you can run which make again to confirm that everything’s installed correctly.
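
If you would rather script this check, for example at the top of a setup script, a minimal sketch is below. It uses command -v, the POSIX way to ask the shell where a command lives; have_tool is a hypothetical helper name.

```shell
# have_tool NAME: succeed if NAME is found on the PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool make; then
  echo "make found at: $(command -v make)"
else
  echo "make not found: install the Command Line Tools first" >&2
fi
```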

2) Configure CPAN.

$ sudo perl -MCPAN -e shell
cpan> o conf init
This will prompt you for some settings. You can accept the defaults for almost everything (just hit “return”). The two things you must fill in are the path to make (which should be /usr/bin/make or the value returned when you run which make from the command line) and your choice of CPAN mirrors (which ones you choose doesn’t really matter, but it won’t let you finish until you select at least one). If you use a proxy or a very restrictive firewall, you may have to configure those settings as well.
If you skip Step 2, you may get errors about make being unavailable.

3) Upgrade CPAN

$ sudo perl -MCPAN -e 'install Bundle::CPAN'
Don’t forget the sudo, or it’ll fail with permissions errors, probably when doing something relatively unimportant like installing man files.
This will spend a long time downloading, testing, and compiling various files and dependencies. Bear with it. It will prompt you a few times about dependencies. You probably want to enter “yes”. I agreed to everything it asked me, and everything turned out fine. YMMV of course. If everything installs properly, it’ll give you an “OK” at the end.

4) Install your modules. For each module….

$ sudo perl -MCPAN -e 'install Bundle::Name'
or
$ sudo perl -MCPAN -e 'install Module::Name'
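This will install the module and its dependencies. Nice, eh? Again, don’t forget the sudo. Note that Bundle::Name and Module::Name are placeholders: substitute the real name of whatever you are installing, so for the DateTime module you would type 'install DateTime'.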
The first time you run this after upgrading CPAN, it may prompt you to configure again (see Step 2). If you accept its offer to try to configure itself automatically, it may just run through everything without a problem.
There are a couple of potential pitfalls with specific modules (such as the LWP::UserAgent / HEAD issue), but most have workarounds, and I haven’t run into anything that wasn’t easily recoverable.
And that’s it!
Did you find this useful? Is there anything I missed?

Posted: Monday, November 27th, 2006
Filed under: os x, Perl

COMMENT: by Ken, January 15th, 2008

thank you!
similarly, i bet it would be also helpful for some people to see how to install perl packages with FC8 using yum…say you want to install the Perl Frontier::Client package (and its dependencies)…
$ su root
# yum -y install perl-Frontier-RPC
…all you do is append a ‘perl-’ to the package name and substitute the ‘::’ for a ‘-’ and you should be all set…
COMMENT: by Alex, March 14th, 2008

This gave me the courage to get to grips with CPAN. Very helpful, thanks!
COMMENT: by jpd, March 14th, 2008

Thanks. On Leopard there were some diffs but nothing too bad… The only thing I would say for newbies is that you don’t literally type sudo perl -MCPAN -e 'install Module::Name' – Module & Name are different depending on what module you are loading. If you want the LWP module for web manipulation use sudo perl -MCPAN -e 'install Bundle::LWP'
Hope that helps
COMMENT: by Noemi Millman, March 16th, 2008

Thanks, jpd. Would you mind sharing the differences you encountered on Leopard?
COMMENT: by Albert, March 18th, 2008

Excellent information. I have just got it working on Leopard. Cheers, Albert
COMMENT: by Michele, July 7th, 2008

Very helpful. I installed on Leopard without too much trouble. Only problem I ran into is that some of the standard config’d locations is not where my programs were located, and didn’t really feel like doing a system-wide grep. Google helped me locate them, and the install went great after that.
COMMENT: by Phoenix2Life, September 26th, 2008

Extremely well written and perfectly working CPAN guide I have come across. It has since long I always used to download RPMs/TAR.GZs and used to install/configure Perl on my linux boxes. This tutorial has helped me to setup nice automated way which I have been looking for. Thanks.
COMMENT: by nobighair, January 14th, 2009

Thanks for these instructions, certainly got me up and running.
I had a list of modules to install. Plus there were a couple I needed to force install. So I found it easier to split it up:
> sudo perl -MCPAN -eshell
To get the CPAN shell. Then in the shell:
> install XML::Writer
or
> force install XML::Writer
Cheers
COMMENT: by Guizard Sébastien, May 5th, 2009

hello,
I had stop the process at the step 2 when you have to enter the adress of the miror cpan(for searching this adress).
Now, when I restart the command, it don’t ask me for the Cpan miror and the command make was not created. What can I do ? It’s my first macbook, I’ve bought it 3 day ago. I don’t know what can I do ! i’m thinking to re instal Mac OS X. I don’t it’s good idea ! if you can help me I will be very glad ! ! !
PS : I’m sory if my english is not very good, I’m learning right now in USA ^^
COMMENT: by Noemi Millman, May 6th, 2009

Guizard, have you tried simply running through Step 2 (above) again?
COMMENT: by Mac, June 20th, 2009

Now how do you get back to a regular shell to run scripts?
COMMENT: by Noemi Millman, June 23rd, 2009

Mac: try typing “exit” or “quit”
COMMENT: by Nick, July 30th, 2009

thank you so much! i needed this
COMMENT: by Simon, November 17th, 2009

Thanks so much! This is exactly what I was looking for.
COMMENT: by nod, January 4th, 2010

Thanks muchly! This was very helpful and I’m so glad I came across it. New to Mac and there is a lot to get used to.
COMMENT: by Christian, June 8th, 2010

I am on 10.6.3
I followed these instructions and everything went well until I tried to install a module. Maybe I just misread the post, but instead of ‘install Module::Name’ I had to use ‘install Name’
COMMENT: by Noemi Millman, June 8th, 2010

Christian, you don’t use the word “Module” — it varies depending on what specifically you’re trying to install. See JPD’s notes above.
COMMENT: by Christian, June 8th, 2010

Right, but if I wanted to install the DateTime module, I would use ‘install DateTime’ and not ‘install Module::DateTime’. That was just unclear to me from the post.
COMMENT: by JM, August 31st, 2010

Thanks for your tutorial. It was very helpful, and enabled me to install seven Perl mods, as well as the Expat C library from Sourceforge.
Like Christian though, I only had to use “sudo cpan ModuleName” at the Terminal prompt to install most of them. I think there was only one where I had to prefix the command with “Bundle::”.
This is on a 27″ iMac running Snow Leopard 10.6.4.
I also installed them on a ten-year-old G4 mini-tower running Tiger.
YMMV
Thanks again!
COMMENT: by Pierre, January 10th, 2011

Nice! Exactly what I needed…using MP3::Tag
COMMENT: by Dan, May 20th, 2011

Thank you! Very helpful! Worked perfectly.
COMMENT: by Richard Uschold, July 17th, 2011

So, EXACTLY WHERE does CPAN put the perl modules it installs?
I still get the error: “Can’t locate SOAP/Lite.pm in @INC (…)”
COMMENT: by Richard Uschold, July 17th, 2011

Never mind! I figured it out. I have two different versions of perl installed. I had to do the CPAN install for both versions of perl.
All is good, now!
COMMENT: by Noemi Millman, July 18th, 2011

Richard, how did you end up with two versions of Perl? Was one installed via Fink or Macports or something?
COMMENT: by bert, November 4th, 2011

I tried to install LWP and Mechanize but it constantly lead to this when i also try to get the dependencies do you Noemi know what i can do? Regarding CPAN there are few places that have clear explanation as this site. I want to web scrape but it’s damn hard because nobody explains which specific problem i have. I have a mac xcode is installed although i dont know if that means i also have mac developer tools:(
GAAS/libwww-perl-6.03.tar.gz
/usr/bin/make install -- NOT OK
----
You may have to su to root to install the package
(Or you may want to run something like
o conf make_install_make_command 'sudo make'
to raise your permissions.
Warning (usually harmless): 'YAML' not installed, will not store persistent state
COMMENT: by Noemi Millman, November 4th, 2011

Bert, I’m not an expert in this, but it sounds like one of the dependencies may need to compile and install a binary somewhere on your filesystem that the user you’re logged in as doesn’t have permissions for — possibly when attempting to install a YAML module. You can elevate your permissions using sudo / su but I won’t promise that that’s safe solution. Best of luck!
COMMENT: by bert, November 6th, 2011

Thanks a lot:) I have nearly figured it all out:P
COMMENT: by Mandy, July 5th, 2012

Thank you so much! I am relatively new to perl and was having so much trouble installing XML::RSS, and this solved it first time!
COMMENT: by Cliff, July 9th, 2012

thank you..I finally got this working and the modules installed
COMMENT: by John Wooten, Ph.D., July 10th, 2012

The instructions don’t appear to work for OS X Lion 10.7.4. I did every step and after a long time of recursively descending down to more and more routines, it fails on almost everything.
Has anyone installed PDL on OS X Lion 10.7.4? If so, how?
COMMENT: by Tom Marchioro, July 23rd, 2012

Noemi,
Really clear and useful instructions. You should be proud (and I’m an Eli of an age who doesn’t easily praise a Tiger).
BUT, as John Wooten notes, the instructions need a slight updating for the current state of Appledom. I doubt this is Lion specific, but the new XCode has turned into a standalone app that does NOT come with the standard command line tools by default, so your instructions should now be:
1. Make sure you have the Apple Developer Tools Installed.
2. Launch XCode and bring up the Preferences panel.
3. Click on the Downloads tab and then click to install the Command Line Tools (otherwise CPAN cannot access a working version of make).
After that I think Noemi’s instructions work perfectly (at least for DBI and LWP). thanks!
Hope this helps — tom
COMMENT: by Noemi Millman, July 23rd, 2012

Thanks, Tom. That’s very good to know, and I’ll update the instructions accordingly.
COMMENT: by laura, July 30th, 2012

thanks for this! i have one tweak to add if you are trying to do this in Mountain Lion – the install commands all failed for me (at the ftp step) until i added "env FTP_PASSIVE=1" to the command. so this:
sudo perl -MCPAN -e 'install Bundle::Name'
becomes this:
sudo env FTP_PASSIVE=1 perl -MCPAN -e 'install Bundle::Name'
(tip found at http://hints.macworld.com/article.php?story=20090716132354455)
laura
COMMENT: by Noemi Millman, July 31st, 2012

Good tip, Laura. It sounds like you may be behind a stricter firewall than most.
COMMENT: by John, August 10th, 2012

Thanks for this info. Just what was required to fix the problem I was having.
COMMENT: by Anarcissiea, August 24th, 2012

Worked for me (OS X 10.6.8). Thanks!
COMMENT: by vogen, September 1st, 2012

Great!! this is what i was looking for. Works with 10.7.4
Just installed Module Prima
Thanks heaps
COMMENT: by Avita, November 20th, 2012

Preciate it man! I was pissed off trying to install modules from cpan on my mac. Thanks a lot for wonderful documentation of the steps.
COMMENT: by Bretfort, January 1st, 2013

Jazakallah, brief and to the point
COMMENT: by Ezmyrelda, March 5th, 2013

Excellent tutorial! very helpful.
COMMENT: by Collin Dyer, May 13th, 2013

Thank you very much – worked great for this perl noob!
COMMENT: by Gaelle, June 26th, 2013

Thank you very much: perfectly clear tutorial. I saved an hour thanks to you. Note that CPAN configuration is much simpler now (autoselect mirror for instance)
COMMENT: by Susanne, July 8th, 2013

Thank you very much for this excellent howto – it certainly saved me a lot of time! Everything worked right away (Mac OS 10.8.3).
COMMENT: by Ryan, July 21st, 2013

Great article! It helps a lot!
COMMENT: by Jack, November 8th, 2013

very helpful of CPAN
COMMENT: by Anonymous, December 17th, 2013

after running the upgrade command (sudo perl -MCPAN -e 'install Bundle::CPAN') and starting the long compiling/configuring/testing process, it eventually stopped at
"t/lock.t …………… 1/4"
It’s been stuck on that for about 20 minutes now. Is this normal?
COMMENT: by Anonymous, December 17th, 2013

well I ended up just killing the make test process and it seemed to proceed with the rest of the installation. So far it hasn’t given me any problems. I guess it was just some weird bug
COMMENT: by Flo, June 18th, 2014

Still relevant for Maverick, thanks a lot!

Wednesday, October 22, 2014

count redundant values in an array

my @array = qw(foo bar foo bar baz foo baz bar foo);
my %counts;
# Tally how many times each value occurs.
for (@array) {
    $counts{$_}++;
}
# Hash order is not defined, so the lines may print in any order.
foreach my $key (keys %counts) {
    print "$key = $counts{$key}\n";
}
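
For a quick check from the shell, the same tally can be sketched in awk, whose associative arrays play the role of the Perl hash. This is an illustrative aside rather than part of the original snippet; the final sort is only there to make the output order predictable.

```shell
# One value per line, same data as the Perl array above.
values=$(printf '%s\n' foo bar foo bar baz foo baz bar foo)

# count[$0]++ tallies each distinct line; END prints "value = count".
counts=$(printf '%s\n' "$values" \
  | awk '{ count[$0]++ } END { for (v in count) print v " = " count[v] }' \
  | sort)

printf '%s\n' "$counts"
```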

Unique values in an array in Perl

In this part of the Perl tutorial we are going to see how to make sure we only have distinct values in an array.
Perl 5 does not have a built in function to filter out duplicate values from an array, but there are several solutions to the problem.

List::MoreUtils

Depending on your situation, probably the simplest way is to use the uniq function of the List::MoreUtils module from CPAN.


use List::MoreUtils qw(uniq);
 
my @words = qw(foo bar baz foo zorg baz);
my @unique_words = uniq @words;

A full example is this:


use strict;
use warnings;
use 5.010;
 
use List::MoreUtils qw(uniq);
use Data::Dumper qw(Dumper);
 
my @words = qw(foo bar baz foo zorg baz);
 
my @unique_words = uniq @words;
 
say Dumper \@unique_words;

The result is:

$VAR1 = [
        'foo',
        'bar',
        'baz',
        'zorg'
      ];

For added fun the same module also provides a function called distinct, which is just an alias of the uniq function.
In order to use this module you'll have to install it from CPAN.

Home made uniq

If you cannot install the above module for whatever reason, or if you think the overhead of loading it is too big, there is a very short expression that will do the same:


my @unique = do { my %seen; grep { !$seen{$_}++ } @data };

This, of course can look cryptic to someone who does not know it already, so it is recommended to define your own uniq subroutine, and use that in the rest of the code:


use strict;
use warnings;
use 5.010;
 
use Data::Dumper qw(Dumper);
 
my @words = qw(foo bar baz foo zorg baz);
 
my @unique = uniq( @words );
 
say Dumper \@unique;
 
sub uniq {
  my %seen;
  return grep { !$seen{$_}++ } @_;
}

Home made uniq explained

I can't just throw this example here and leave it like that. I'd better explain it. Let's start with an easier version:


my @unique;
my %seen;
 
foreach my $value (@words) {
  if (! $seen{$value}) {
    push @unique, $value;
    $seen{$value} = 1;
  }
}

Here we are using a regular foreach loop to go over the values in the original array, one by one. We use a helper hash called %seen. The nice thing about the hashes is that their keys are unique.
We start with an empty hash so when we encounter the first "foo", $seen{"foo"} does not exist and thus its value is undef, which is considered false in Perl, meaning we have not seen this value yet. We push the value to the end of the new @unique array where we are going to collect the distinct values.
We also set the value of $seen{"foo"} to 1. Actually any value would do as long as it is considered "true" by Perl.
The next time we encounter the same string we already have that key in the %seen hash and its value is true, so the if condition will fail, and we won't push the duplicate value in the resulting array.

Shortening the home made unique function

First of all we replace the assignment $seen{$value} = 1; by the post-increment operator $seen{$value}++. This does not change the behavior of the previous solution - any positive number is going to be evaluated as TRUE, but it will allow us to include the setting of the "seen flag" within the if condition. It is important that this is a postfix increment (and not a prefix increment) as this means the increment only takes place after the boolean expression was evaluated. The first time we encounter a value the expression will be TRUE and the rest of the times it will be FALSE.


my @unique;
my %seen;
 
foreach my $value (@data) {
  if (! $seen{$value}++ ) {
    push @unique, $value;
  }
}

This is shorter, but we can do even better.

Filtering duplicate values using grep

The grep function in Perl is a generalized form of the well known grep command of Unix.
It is basically a filter. You provide an array on the right hand side and an expression in the block. The grep function will take each value of the array one-by-one, put it in $_, the default scalar variable of Perl and then execute the block. If the block evaluates to TRUE, the value can pass. If the block evaluates to FALSE the current value is filtered out.
That's how we got to this expression:


my %seen;
my @unique = grep { !$seen{$_}++ } @words;
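
The same post-increment trick works outside Perl too. As an aside, this shell sketch applies it with awk, where !seen[$0]++ is true only the first time a line is encountered, so duplicates are dropped and the original order is kept.

```shell
# Same words as in the Perl examples, one per line.
words=$(printf '%s\n' foo bar baz foo zorg baz)

# A bare pattern with no action prints the line whenever the pattern is true.
unique=$(printf '%s\n' "$words" | awk '!seen[$0]++')

printf '%s\n' "$unique"
```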

Wrapping it in 'do' or in 'sub'

The last little thing we have to do, is wrapping the above two statements in either a do block


my @unique = do { my %seen; grep { !$seen{$_}++ } @words };

or, better yet, in a function with an expressive name:


sub uniq {
  my %seen;
  return grep { !$seen{$_}++ } @_;
}

Home made uniq - round 2

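Prakash Kailasa suggested an even shorter way of implementing uniq, for cases where there is no requirement to preserve the order of elements. His original version called keys directly on a hash reference, which perl 5.14 allowed as an experimental feature; that feature was removed again in perl 5.24, so the code below dereferences the anonymous hash explicitly with %{ ... }, which works on every version. (The leading + tells perl that the inner braces are an anonymous hash, not a block.)
Inline:


my @unique = keys %{ +{ map { $_ => 1 } @data } };

or within a subroutine:


my @unique = uniq(@data);
sub uniq { return keys %{ +{ map { $_ => 1 } @_ } } }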

Let's take this expression apart:
map has a similar syntax to grep: a block and an array (or a list of values). It goes over all the elements of the array, executes the block and passes the result to the left.
In our case, for every value in the array it will pass the value itself followed by the number 1. Remember =>, aka. fat comma, is just a comma. Assuming @data has ('a', 'b', 'a') in it, this expression will return ('a', 1, 'b', 1, 'a', 1).


map { $_ => 1 } @data

If we assigned that expression to a hash, we would get the original data as keys, and the number 1-es as values. Try this:


use strict;
use warnings;
 
use Data::Dumper;
 
my @data = qw(a b a);
my %h = map { $_ => 1 } @data;
print Dumper \%h;

and you will get:

$VAR1 = {
          'a' => 1,
          'b' => 1
        };

If, instead of assigning it to a hash, we wrap the above expression in curly braces, we will get a reference to an anonymous hash.


{ map { $_ => 1 } @data }

Let's see it in action:


use strict;
use warnings;
 
use Data::Dumper;
my @data = qw(a b a);
my $hr = { map { $_ => 1 } @data };
print Dumper $hr;

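This will print the same output as the previous one, barring any change in order in the dumping of the hash.
Finally, we need the keys of that anonymous hash. From perl 5.14 to 5.22 the keys function could be called directly on a hash reference, as in keys { map { $_ => 1 } @data }, but that was an experimental feature and it was removed in perl 5.24. Dereferencing the anonymous hash explicitly works on every version:


my @unique = keys %{ +{ map { $_ => 1 } @data } };

and we'll get back the unique values from @data. The + in front of the inner braces tells perl to treat them as an anonymous hash rather than a block.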

Exercise

Given the following file print out the unique values:
input.txt:

foo Bar bar first second
Foo foo another foo

expected output:

foo Bar bar first second Foo another

Exercise 2

This time filter out duplicates regardless of case.
expected output:

foo Bar first second another

Written by
Gabor Szabo

Published on 2012-09-20

sort hash by value and return the associated key

To sort by numeric hash values and visit the associated keys in that order, you'd write:

foreach my $key (sort { $hash{$a} <=> $hash{$b} } keys %hash) {
    # do something with $key and $hash{$key}
}
