Arrays and Hashes
Download
Report
Transcript Arrays and Hashes
Hashes
1
2
Hashes
• cover hashes before 2-D arrays because 2-D
arrays require a new concept (references)
• hash (aka associative array)
– a data structure similar to an array
– can hold any number of values like an array
• array values are indexed by integers/ numbers
• hash values are indexed by name or key
– the indices are not numbers, they are arbitrary unique
strings
– keys must always be unique, but values may be
duplicated
3
Array VS Hash
array
hash
values
indices
keys
values
0
35
"two"
35
1
12.4
"hi"
12.4
2
"hello, world"
"2"
"hello, world"
3
1.72e30
"3.5"
1.72e30
4
"bye\n"
"Fred"
"bye\n"
5
69
"betty"
69
This is misleading because there is no real order to a hash – there if no first
element, and no last element.
4
Hash
A hash is more like a
"bucket" of data, where
each value has a tag
attached. The tag is the
key.
"3.5"
"2"
"hello, world"
Retrieving a specific item
requires knowing the key.
You can also reach into the
barrel, pull out any key, and
see what value is attached
(or all the keys, or all the
values)
1.72e30
"Fred" "bye\n"
"betty" 69
"hi"
12.4
"two"
35
5
Hash VS array
• arrays are linear and ordered (0,1,2,3,
etc.)
– searching an array for a value takes time
proportional to the size of the array
• hashes map a key to a value
– a hash item may be accessed in constant
time
• generally, hashes are used when one set
of data has a natural relationship to
another set of data
6
Example Hash Datasets
• Last name, first name (Braun maps to Terry)
• Name, address (Terry Braun maps to 814 20th
Ave)
• Host name, IP address (texas.eng.uiowa.edu
maps to 128.255.22.207)
• word, word count (the, 42)
• user name, disk usage (tabraun, 27.2 gig)
• codon, amino acid (ATG, M)
• CCA=>P, CCG=>P, CCC=>P, CCT=>P
• P=> ???
• Subjects, genotypes
• Genes, expression values
7
Hash Syntax
$hash{$some_key} VS $array[$index]
Note: Larry Wall says that curly braces are
used for hashes because something
fancier than array indexing is being
performed
8
Accessing Hash Elements
$name{"fred"} = "flintstone";
$name{"barney"} = "rubble";
$name{"dino"} = "flintstone";
@first = qw/ fred barney dino/;
foreach $first_name (@first) {
print "I've met $first_name $name{$first_name}\n";
}
.
9
Scalars, Arrays, Hashes, and
Subroutines, oh my!
$person = "terry"; #scalar
$person = &person; #subroutine call
$person[1] = "bob"; #array
$person{$person} = "braun"; #hash
# note that the "key" can be any expression
# and not just a string literal
$person[0] = “braun”;
$person{$person} = $person[0];
$test = $person{"jill"}; #accessing outside the hash returns
an UNDEF.
.
10
Hash Variable - %
%hash = ("foo", 35, "bar", 12.4, 2.5, "hello",
"wilma", 1.72e30, "betty", "bye\n");
- key-value pairs
@any_array = %some_hash;
# This "unwinds" the key-value pairs into an array
– the keys and values are placed into the array
#Note that the order may be different than the
order in which they were placed into the hash
11
More Hashing
%inverse_hash = reverse %any_hash;
A hash has no particular ordering. However, in this
example, the reverse command unwinds the hash into
an array context, of key-value pairs (key, value, key,
value, …)
Then reverse turns the list end-for-end, so the list is now
(value, key, value, key, …)
This is then stored in %inverse_hash – the list is "rehashitized" -- so now values are the keys.
This only works if the values are unique, or the "last one
wins," meaning the mapping of P=>CCC and P=>CCT
depends on which one was processed last.
12
Another Way to
Construct a Hash: =>
my %name = (
"fred" => "finstone",
"dino" => undef,
"Barney" => "rubble",
"betty" => "rubble", ); #extra comma harmless
# easier to read if you put each key-value on its own line
.
13
Hash Table Example -- another way
%aminos = ( "TTT", "F", #key first
"TTC", "F",
"TTA", "L",
"TTG", "L",
"CTT", "L",
"CTC", "L",
"CTA", "L",
"CTG", "L",
"ATT", "I",
"ATC", "I",
"ATA", "I",
"ATG", "M",
"GTT", "V",
"GTC", "V",
"GTA", "V",
"GTG", "V",
"TCT", "S",
"TCC", "S",
"TCA", "S",
"TCG", "S“)
14
Hash Functions
keys HASH
-- returns all keys of a hash
values HASH
-- returns all values of a hash
%my_hash = (a => 1, b=> 2, c=>33);
@keys = keys %my_hash;
@values = values %my_hash;
print "@keys @values\n";
#c a b 33 1 2
Note that the order of the keys will correspond to the order
of the values
This remains true unless the hash is manipulated (add or
remove elements)
Perl may re-order the hash for efficiency.
15
Hash Functions
Scalar context – keys returns the number of key-value pairs
$count = keys %hash;
exists $HASH{KEY}
-- returns true if the HASH contains the given KEY
if (exists $aminos{CCC}) {
print "The amino acid for codon CCC is $aminos{CCC}\n";
} # CCC->P
16
Hash Functions
delete $HASH{KEY}
-- remove a key-value pair from the hash
$codon = "CCC";
$aminos{$codon} = "P";
delete $aminos{$codon};
each %HASH
-- returns a key-value pair as a two-element list
while ( ( $key, $value) = each %hash) {
print "$key => $value\n";
}
17
Using foreach with a Hash
foreach $key (sort keys %hash) {
$value = $hash{$key};
print ”$key => $value\n";
}
#same as
foreach $key (sort keys %hash) {
print "$key => $hash{$key}\n";
}
18
#same as
@keys = keys %hash;
@keys = sort @keys;
foreach $key (@keys) {
print "$key => $hash{$key}\n";
}
19
Hash Element Interpolation
or lack thereof
$books{"tom"}=5;
$books{"bob"}=32;
foreach $person (sort keys %books) {
if($books{$person}) { #hash of book count
print "$person has $books{$person} items\n"; #bob has 3 items
}
}
print "The hash is %books\n"; #this does NOT interpolate
# The hash is %books
20
Example
• Generate random numbers
• Use hash to count numbers (to validate
‘randomness’)
– Print count for each number
– Print average
21
#!/usr/bin/perl
# countLoopSrand.pl
#
srand(time() ^ ($$ + ($$ << 15)) );
print "Enter sequence length (nucleotides): ";
$length = <STDIN> ; #length of sequence to generate -- from screen
if(!($length =~ m/^\d+/)) { #match only an integer at the beginning
die ("Invalid input: $&\n");
}
while($length) { # stay in loop until have generated enough sequence
$rand = int rand(4); # Interger number between (0-3) inclusive
$length--; #decrease loop counter
print "$rand ";
}
RUN EXAMPLE
22
#!/usr/bin/perl
# countLoopSrand-hash.pl
#
srand(time() ^ ($$ + ($$ << 15)) );
print "Enter sequence length (nucleotides): ";
$length = <STDIN> ; #length of sequence to generate -- from screen
if(!($length =~ m/^\d+/)) { #match only an integer at the beginning
die ("Invalid input: $&\n");
}
while($length) { # stay in loop until have generated enough sequence
$rand = int rand(4); # Interger number between (0-3) inclusive
$length--; #decrease loop counter
$count{$rand}=$count{$rand}+1; #a Hash to count random numbers
print "$rand ";
}
@keys = keys %count;
print "\n";
foreach $key (@keys) {
print "count:$key $count{$key}\n";
$total = $total + $count{$key};
}
print "Average = ".$total/(keys %count)."\n";
RUN EXAMPLE
23
End
24