January 05, 2005
Perl 6: A Cool Programming Language
Perl may inspire strong feelings of love and hate in people, but it is unarguably a unique language. On the surface it sort of looks like C and was initially meant to be an awk-killer. However, you quickly realize that except for the semicolons, braces and a few function names, Perl has little in common with C. Perl was designed by Larry Wall, a linguist by training. If you read Perl scripts you'll quickly realize that they read like some other languages you may know, like *gasp* English! Heck, Perl is probably the only programming language that has a culture of poetry!
Programming in Perl can be a LOT of fun (more on that in the extended entry). Nevertheless, Perl 5, which is the latest and greatest Perl out there, suffers from some stodgy constructs that take some of the fun out of hacking up a quick and dirty Perl script. Perl 6 goes a long way toward fixing some of them, as can be seen in this great article. This blog entry is about why programming in Perl is fun and about what I think are the best new features of Perl 6!
Perl is fun!
Just compare
open(INPUT, $file) or die("Cannot open $file!");
with
try {
    File f = new File(filename);
    FileInputStream stream = new FileInputStream(f);
} catch(FileNotFoundException e1) {
    // Handle this exception
} catch(SecurityException e) {
    // Handle this exception
}
The code excerpt on the top is a line of Perl, while the code excerpt below accomplishes the same thing in Java. It is clear that the Perl reads like something you would say in everyday speech. "Open the file or die trying!" The Java example has much more structured exception handling and a more formal way of opening a stream to a file. The Perl script, on the other hand, Does What You Mean (tm) and worries about the error handling in a different way entirely.
Perl has a type system that may seem weird if you come from a C/C++/Java/C# background. However, very soon you'll learn to like it because Perl's types correspond roughly to how our linguistic system is wired. Perl has a scalar type, which could be a number, a word or "one of anything". In Perl you would say, "Pass me the ball"; in C you'd be saying "Pass me the ball which is exactly 14.75 inches in diameter". Do you really care, so long as it's clear you want just one ball?
Perl also has an array type, which is "something made of other things". In Perl, you'd be passing around a basket of apples. The basket may be empty or full and if you need to put in more apples at any time, you just add in the apples. What's more, you can also put in apples, pears, cupcakes, knives and anything that could go into a basket. In C and C++, you'd be required to indicate exactly how many apples that basket can take before filling it up with apples. If you have to put in an extra apple, you either can't do it or you move all the apples from this basket into an entirely new basket on your own. If you didn't put in enough apples to fill up the basket, random undesirable things may appear in the space that is left over. But no, you can put in only apples and not anything that could go into a basket.
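Here's a minimal sketch of that basket in Perl 5. The contents are made up for illustration; the point is only that the array grows on demand and doesn't care what goes in.

```perl
use strict;
use warnings;

# The basket starts out with two apples.
my @basket = ('apple', 'apple');

# Need more room? Just add things; Perl resizes the array for you.
push @basket, 'apple';

# Anything that fits in a scalar can go in, not only apples.
push @basket, 'pear', 'cupcake', 42;

# An array in scalar context yields its element count.
my $count = scalar @basket;
print "The basket holds $count things\n";
```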
Perl has a hash type, which is "a bunch of pairs of things" which are associated with each other in some way. You could use a hash to indicate the concept of relationships. You could use a hash to say "George Bush is related to America in the same way as Queen Elizabeth is related to the United Kingdom." And later, you could say "Oh wait! Jacques Chirac is related to France in the same way too!" In C, unless you build or borrow a special data structure called a hashtable, you'd have to come up with your own way of keeping track. You could keep two arrays of things where a country and its head of state are stored at the same place in each array, but that doesn't make their relationship explicit enough.
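The heads-of-state example might look something like this in Perl 5. It's a sketch; the hash name %head_of_state is my own choice.

```perl
use strict;
use warnings;

# A hash pairs each key (a country) with a value (its head of state).
my %head_of_state = (
    'America'        => 'George Bush',
    'United Kingdom' => 'Queen Elizabeth',
);

# "Oh wait!" Adding another pair later takes a single assignment.
$head_of_state{'France'} = 'Jacques Chirac';

# Walk every relationship, in alphabetical order of country.
for my $country (sort keys %head_of_state) {
    print "$head_of_state{$country} is related to $country\n";
}
```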
Perl deals in things, groups of things and relationships between things. It doesn't much care what the thing is, how it is stored and so on. Contrast this with a C-like language, where the language imposes terms on what a thing could mean. Is the thing stored using 16 bits, 32 bits or 64 bits? Or, is it a letter instead? Or *gasp*, a word? Perl is this magic language, but ultimately it runs on real computers that unfortunately deal in ugly things like 16, 32 and 64 bit chunks. But rather than expose all that to you, it runs around pasting pieces of duct tape everywhere so your program holds together.
Another big feature of Perl is that it is a programming language with pronouns. This fact plays a big role in the natural feeling you get when you read Perl, because we use a lot of pronouns to refer to the immediate context of a sentence, when we understand and process English. In a set of instructions like "Soak shirt in laundry detergent for two hours. Wash. Rinse. Repeat until stain is gone.", there is an implicit 'it' after the verbs at the start of the last three sentences. We know exactly what each 'it' refers to. The first two its refer to the shirt and the last it refers to the set of instructions itself. Consider what would happen if we were really explicit about everything and did not use the implicit 'it's at all. The instructions would read "Soak shirt in laundry detergent for two hours. Wash the shirt. Rinse the shirt. Repeat 'Soak shirt in laundry detergent for two hours. Wash the shirt. Rinse the shirt.' until the stain is gone." Sounds a bit convoluted, doesn't it?
If you wanted to cut every apple in a basket of apples in C, your code would look something like this:
Apple basket[10];
// ...
for(int j = 0; j < 10; ++j)
{
    Apple an_apple = basket[j];
    cut(an_apple);
}
In English, your code would read something like "Counting from 0 upwards until 10 (because that is *exactly* how many apples this basket can hold), let us call the current count J. Pick up the Jth apple from the basket and set it on the table. Let's call this apple A. Now, cut A."
In Perl, your code would read like this:
my @basket;
foreach(@basket)
{
    cut;
}
You can practically read it out aloud and figure out what it says! Notice that you don't ever explicitly refer to the apple you're cutting right now. All you say is "for each apple in the basket, cut it". Perl has a magic variable called the "topic variable", to which it automatically assigns the apple you're working on right now. Its value can be accessed through the variable name $_. The same $_ acts as the 'it': many built-in functions fall back on it when you give them no argument, and a subroutine like 'cut' can read it directly. See how simple and natural it is? So natural that even a C-style language like C# has picked up a foreach loop of its own, in addition to the usual for, while and do...while loops.
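To make the 'it' concrete, here's a small runnable sketch. The cut subroutine is a hypothetical stand-in I wrote for illustration: when called with no argument it falls back on $_, much as many built-ins do.

```perl
use strict;
use warnings;

# Hypothetical cut(): uses its argument if given one,
# otherwise the current topic variable $_.
sub cut {
    my $apple = @_ ? $_[0] : $_;
    return "sliced $apple";
}

my @basket = ('Fuji', 'Gala', 'Honeycrisp');
my @slices;

# foreach with no loop variable aliases each element to $_ in turn.
foreach (@basket) {
    push @slices, cut();    # no variable named: cut reads $_
}

# print's statement-modifier form also works on $_.
print "$_\n" for @slices;
```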
Perl also has flexible clause ordering, a distinctive feature that makes it seem chatty and colloquial compared to other programming languages that mandate a fixed order of if(condition) action(); You can write valid Perl that reads like so:
print("\$variable not defined!") if !defined($variable);
or
print("$variable has a value!") unless !defined($variable);
Perl allows an if or a while checking for a condition to be placed after the action associated with that condition. The statement has exactly the same effect as if the if or while were placed before the action. This flexibility mimics how we occasionally put ifs or whiles at the end of our sentences when we speak English. Note that this flexibility parallels English in another subtle way. If we are about to issue a complicated set of instructions for some condition, we usually express the condition before the actions. For example, we might say "If the cabin pressure goes down, grab the oxygen masks that appear. Put the mask over your mouth and nose. Continue breathing normally." However, if we're describing only one action to take when a condition is true, we might say "Turn off the A/C if it gets too cold." Perl supports the post-action if only if the action consists of one statement.
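Here's a short sketch of the trailing if, unless and while in Perl 5. The advice subroutine and the 18-degree threshold are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns advice for a temperature in Celsius.
sub advice {
    my ($temp) = @_;
    return 'No reading' unless defined $temp;    # trailing unless
    return 'Turn off the A/C' if $temp < 18;     # trailing if
    return 'Leave it on';
}

# A trailing while repeats one statement until the condition fails.
my $countdown = 3;
my @ticks;
push @ticks, $countdown-- while $countdown > 0;

print advice(15), "\n";
print "@ticks\n";
```

Each modifier governs exactly one statement; anything longer needs the usual if (...) { ... } block.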
Another reason Perl is so much fun to write is that the wonderful hacker culture around Perl has given rise to remarkable function names, such as croak, carp, warn and bless. Variables can be "my" variables or "our" variables. The language really starts to be fun when you think you can get away with common words like my and our in something as official, boring and formal as a computer language.
So by now, you must be convinced Perl is a lot of fun to write, but it is definitely not fun to read, especially if you didn't write the code. Perl has a reputation as a write-only language. Code once written, even by you, can be really hard to figure out if you're trying to make sense of it later. Perl also lacks important constructs like C's switch..case, which picks a set of actions based on the value of a variable. The same effect can be achieved with an onerous sequence of ifs and elses, but that's the best Perl can offer in terms of language support, without going into complicated constructs like hashes of function references. This language, with a culture of poetry, is also one with a tradition of programmers competing to write the most obfuscated code.
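For the curious, the "hash of function references" workaround looks roughly like this in Perl 5. It's a sketch: the names %continent_for and continent, and the sample countries, are my own.

```perl
use strict;
use warnings;

# Each key is a "case"; each value is an anonymous subroutine to run.
my %continent_for = (
    'U.S.'   => sub { 'North America' },
    'India'  => sub { 'Asia' },
    'France' => sub { 'Europe' },
    'Egypt'  => sub { 'Africa' },
);

# Look up the handler for a country and call it,
# falling back to a default when there is no matching case.
sub continent {
    my ($country) = @_;
    my $handler = $continent_for{$country};
    return $handler ? $handler->() : 'unknown';
}

print continent('India'), "\n";
```

It works, but you can see why it doesn't read nearly as naturally as a real switch.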
And Along Comes Perl 6...
Perl 6 is the next redesign of Perl. It is a rather ambitious departure from the currently available versions of Perl. It adds a few new language constructs, changes the feel of the language a little bit and makes it easier for programmers to avoid common mistakes. I'll briefly mention a few things I love about Perl 6 and let you read about the rest at http://www.linux-mag.com/2003-04/perl_6_01.html.
The given...when construct
Perl 5 lacks a switch...case, but Perl 6 has one. It's called given...when. If you want to take different courses of action based on the values of a variable, you can write Perl 6 code that reads like so:
given $country
{
    when 'U.S.' { print 'North America' }
    when 'India' { print 'Asia' }
    when 'France' { print 'Europe' }
    when 'Egypt' { print 'Africa' }
}
See how wonderfully and naturally it reads? The designers of Perl 6 have clearly prioritized a natural feel for this construct over sticking with the analogous switch...case construct in Perl's forebears.
Junctions
Perl 6 introduces a new fundamental type in the language, called a junction, which converts long and onerous boolean conditions in other programming languages to statements that practically read like English! Let's say you wanted to see if any of the numbers in a list was greater than 10 and print a message if that were the case. In classic Perl, you could do something like
foreach(@numbers) { print "Too big!" if $_ > 10; }
or if you wanted to get fancier,
map { print "Too big!" if $_ > 10 } @numbers;
But, with Perl 6, you can say
print "Too big" if(any(@numbers) > 10);
The function any returns a disjunction (a collective OR) of all the variables in the list, so that each number, in effect, is compared against 10 and the condition is true if any one of them is greater than 10. Another kind of junction is a conjunction, which lets you write code like
print "$num is the biggest so far!" if(all(@numbers) < $num);
What I am most excited about is that this will let you reduce the somewhat clunky
if($foo > 1 && $foo > 2 && $foo > 3)
{
    # do something
}
to the considerably less clunky
if($foo > all(1, 2, 3))
{
    # do something
}
where you don't keep mentioning $foo repeatedly. If you aren't impressed, try an abjunction:
if(one(@roots) == 0)
{
    print "Exactly one root of this polynomial is zero"
}
Now, if reading code with junctions doesn't remind you of another language you know (*cough* English *cough*), then who knows, maybe you don't know it that well? ;) Junctions are also ideal candidates for optimization, because the comparisons and other operations a script performs on a junction are inherently parallel, which lets a compiler allocate them to different threads or processors as it sees fit. This junction stuff packs a powerful punch.
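For comparison, here's roughly how the same three checks look in Perl 5, which has no junctions. This is only a sketch of the closest equivalent I know of: it leans on any and all from the core List::Util module (they need version 1.33 or later), and the sample data is made up.

```perl
use strict;
use warnings;
use List::Util qw(any all);    # any/all require List::Util 1.33+

my @numbers = (3, 12, 7);
my $num     = 20;
my @roots   = (0, 2, 5);

# Perl 6: any(@numbers) > 10
my $too_big = any { $_ > 10 } @numbers;

# Perl 6: all(@numbers) < $num
my $biggest = all { $_ < $num } @numbers;

# Perl 6: one(@roots) == 0; here we count the matches ourselves.
my $one_zero = scalar(grep { $_ == 0 } @roots) == 1;

print "Too big!\n" if $too_big;
print "$num is the biggest so far!\n" if $biggest;
print "Exactly one root is zero\n" if $one_zero;
```

Every check needs its own block and its own $_, which is exactly the repetition that junctions fold away.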
Consistent sigils
This is a fairly major change that helps Perl programmers avoid common and hard to spot mistakes and brings the language more in line with our own linguistic intuitions. This would have to be my favorite feature of them all.
As you may know, Perl uses different prefixes on variables depending on the context in which the variable is used. For example, you'd declare scalars, arrays and hashes as follows:
my $scalar;
my @apples;
my %hash;
These prefixes are known as sigils. Perl programmers: did you even know they were called sigils? A sigil in a non-Perl context means a symbol or a sign.
However, when referring to the 5th element of @apples, you might think you could use @apples[4], but Perl 5 makes you use $apples[4]. In other words, you are supposed to use the scalar sigil even if you are referring to an element from an array. As you will see explained in the Perl 6 article mentioned above, sigils are like demonstrative pronouns. $ would roughly correspond to 'that' and @ would roughly correspond to 'those'. $number would be 'that number' and @apples would be 'those apples'. In English you'd refer to an apple as 'one of those apples'. However, Perl 5, in its infinite wisdom, makes you refer to it roughly as 'one of that apples'. This convention in Perl 5 isn't bad once you get used to it, but it is counterintuitive to begin with. This counterintuitiveness alone lets inadvertent mistakes slip in every once in a while. Sometimes these mistakes are caught by the Perl compiler. Otherwise, depending on which sigil you mix up with which other, these mistakes can cause your script to fail far away from the site of the mix-up, or worse, produce utterly unpredictable behavior.
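Here's a small Perl 5 sketch of the rules just described. The variable names and apple varieties are made up for illustration.

```perl
use strict;
use warnings;

my @apples = ('Fuji', 'Gala', 'Honeycrisp', 'Braeburn', 'Jonagold');
my %color  = (Fuji => 'red', Gala => 'pink');

# One element of an array: the sigil switches to $.
my $fifth = $apples[4];

# Several elements at once (a slice): the sigil stays @.
my @pair = @apples[0, 1];

# One value out of a hash: the sigil is $ here too.
my $shade = $color{Fuji};

print "$fifth, @pair, $shade\n";
```

So in Perl 5 the sigil describes what you get back, not the kind of variable you're reaching into.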
Perl 6 fixes sigils to bring them in line with our linguistic intuitions. In other words, you can refer to the fifth apple in a basket of @apples as @apples[4]!
Summary and conclusion
I hope the above gave you some flavor of why Perl is a fun programming language in which to write code. Perl 5, the latest Perl out there, is undoubtedly fun, but also a bit rough around the edges. Perl 6 is an ambitious redesign of the language which adds a few new language features, fixes some of the rough edges and adds considerable power.
Posted by Vishy at January 5, 2005 09:31 PM