Step One: point your browser at www.squeak.org
download a copy of Squeak Smalltalk for your platform.
install it, open up a morphic world, find the menu with the web browser in it.
come back here.
Step Two:
There are quite a few people who
read Wiki regularly
would like to learn Smalltalk Language
haven't the faintest idea where to start
or, need challenges that will help them discover the language.
See wiki.cs.uiuc.edu and wiki.cs.uiuc.edu
This was posted some time ago to the Silicon Valley Patterns Group mailing list by James Saywer... There's a pdf version available at: www.jera.com
My presentation is based on the way I tackle any language:
examine the character set and tokens
examine the reserved words
examine each unique syntactic form
examine each unique semantic form
examine the libraries this is the 20% that requires 80% of the effort, Hints And Exercises please
So here goes.
Standard character set. The special characters are:
: ^ . []|() ' " ; #
The tokens are:
{identifier}
{number}
{string}
{comment}
{binaryOperator}
{keyword}
{specialToken}
Identifiers are the same as you'd expect, except that we useCapitalLettersLikeThis rather_than_underscores.
Numbers are also as you'd expect.
'Strings are enclosed in single quotes'
"Comments are enclosed in double quotes"
Binary operators are composed of one or two characters. The characters which can form a {binaryOperator} vary a little bit between implementations, but for the purpose of *reading* Smalltalk, you can assume that any non-alphaNumeric character which is not in the above list of special characters forms a {binaryOperator}.
+ is a {binaryOperator}.
++ is a {binary operator}.
?* is a {binaryOperator}.
-> is a {binaryOperator}.
A {keyword} is just an identifier with a colon on the end of it, e.g.
anyIdentifierLikeThis:
is a keyword. In Smalltalk, a keyword is only special in the sense that it is part of a "keyword message". It is a distinct kind of token (different from an identifier or a string) but it's meaning is not special. Some languages have 'keywords' like BEGIN and END with builtin special meanings. 'Keyword' in Smalltalk is not this sort of thing, it's strictly a syntactic form.
{Special Tokens} are just the remaining tokens, the separator characters for parsing the language. openParenthesis is a '{specialToken}', carat is a {specialToken}, etc.
there are five reserved words:
nil
false
true
self
super
nil is the value of any variable which hasn't yet been initialized. It is also the value of any variable whose initialization has been forgotten. It *should* be used to mean "I have no idea", "Has never had a value", or "If it ever had a value, someone has since asked that we behave as if it never had one; therefore - I have no idea".
It is sometimes incorrectly used for things that should be Null Objects or Exceptional Values
true and false are singleton instances of the classes True and False, respectively.
self refers to the object whose class contains the method you are presently reading, when you are reading one and encounter the word 'self'. If the object's class has no such method, you must be reading the 'nearest' superclass which does have such a method.
super refers to the same object as self.
Read that last sentence 100 times, until you accept it as fact, then move on.
So why have two names for one thing? This is a little hard to follow until you get used to it. 'super' is the same object as self, but when you try to figure out which method the object will execute in response to the message being sent, pretend the object's class didn't have such a method. In other words, if the object's class does have a method for the message you're sending, don't use it. *Always* start looking for the method in the object's superclass.
This is so you can extend your superclass' behavior without having to rewrite it. For example,
do the same thing as the superclass, and then some:
>>aMethod
super aMethod. self doSomeMoreStuff.
or, do some new stuff, follow it up with whatever the superclass does:
>>aMethod
self doSomeStuff. super aMethod.
These are all reserved because the compiler, optimizer, and VM know about them.
there is one overriding, but previously unfamiliar pair of concepts at work in Smalltalk:
everything is an object.
all code takes the single conceptual form:
object messageSentToIt.
If you want to continue working in C++, Java, etc.then make very certain you do not understand what this means. If it starts to make sense to you then by all means stop reading Smalltalk, you are in serious danger. More on this later
there are 6 syntactic forms:
1 unary message send
object isSentThisUnaryMessage.
2 binary message send
object {isSentThisBinaryOperator} withThisObjectAsOperand.
3 keyword message send
object isSentThisKeywordMessage:
withThisObjectAsParameter.
object isSent:
thisObject and: thisOtherObject.
object is:
sent this: message with: 4 parameters: ok.
object is:
sent this message: with parameters: (1 + 2).
object is:
(sent this) message: (with) parameters: (3).
These are a little bit weirder, until you catch on.Keyword messages written as C function calls would look like this:
isSentThisKeywordMessage(object,andParameter); isSentAnd(object,thisObject,thisOtherObject); isThisWithParameters(object,sent,message,4,ok); isMessageParameters(object,this(sent),with,(1+2)); isMessageParameters(object,(this(sent)),(with),(3));
Which is sort of why we *refer* to keyword messages, descriptively, like this:
isSentThisKeywordMessage: isSent:and: is:this:with:parameters: is:message:parameters:
even though we actually write them as shown earlier.
Note that a parameter, or the operand of a binary message, can be either an object, or the result of sending a message to an object.
Just as in C, where a parameter, or the operand of an operator, can be either
:an object a literal, a constant, a variable, a pointer :the result of... an expression, or a function call
4 a block (a.k.a. closure, or block closure}
[thisObject willGetThisUnaryMessageSentToIt]
[:anyObject| anyObject willGetThisMessage]
[:first :second| thisObject gets:
first and: second]
[:first :second| first gets:
thisObject and: second]
A block can be thought of as the only instance of an impromptu class with no superclass and exactly one method.(Not actually true, but think of it this way until you really need to understand otherwise).
Q What is the 'one method'?
A Depends on the number of parameters.If the block has no parameters,
[ "parameter less block" ]
then its only known method is:
value
If the block has one parameter,
[:x| "a one parameter block"]
then its only known method is:
value:
actualParameter
If the block has two parameters:
[:x :y| "a two parameter block"]
then its only known method is:
value:
firstActual value: secondActual
and so on.
Examples:
[object messageSent]
When this block receives the unary 'value' message, the unary message 'messageSent' will be sent to the object 'object'.
[some code] value.
The 'value' message causes the block to 'execute' "some code".
[:one| "any code can be in here"] value:
object
The 'value: object' message causes the formal parameter 'one' to be bound with the actual parameter 'object', and the code then "executes".
5 return a value (a.k.a. early out)
^resultingObject.
Every method contains at least one of these, even if you can't see it. Usually, you can see it, and it is the last line of the method. If you can't see it, pretend you saw
^self.
as the last line of the method. The other use for this thing is the "early out", as in
object isNil ifTrue:
[^thisObject].
object getsThisMessage. ^self.
6 method definition
When using a browser, you don't actually see this syntactic form, but when Smalltalk is being described outside it's own environment, the following syntax is used to indicate the definition of a method:
ClassName>>methodSelector
anObject getsThisMessage. ^resultingObject.
This means that the class named "Class Name" has a method definition for the unary message "methodSelector" and it's definition is as shown.
ClassName>>+ operand
instanceVariable := instanceVariable + operand. ^self.
This means that the class named "Class Name" has a method definition for the binary message "+ operand" and it's definition is as shown.
Class Name>>keyword:
object message: text
Transcript nextPut: object ; nextPut: ' ' ; nextPutAll: text ; nextPut: cr .
This means that the class named "Class Name" has a method definition for the 2 parameter keyword message "keyword:message:" and it's definition is as shown.
7 Ok, I lied, there are seven syntactic forms.In that last example you see some semi-colons. The semi-colon is shorthand for
send this next message to the same object (the one that received the last message).
Hence, the above example means
send the "nextPut:" keyword message to the object named "Transcript",
then send another "nextPut:" message to the same object (i.e. Transcript),
then send a "nextPutAll:" message to that same object,
then send another "nextPut:" message to it.
Finally, return yourself as the result of this method.
That's it!
Let me repeat - That's it! That's the entire language.The only thing left is to learn the library, and the tricks and idioms of the language.
Now the astute reader will likely say something like
Wait a minute. You didn't cover Control-Flow.You didn't cover variables, types, allocation and deallocation, pointers, templates, virtual member functions, etc. etc. etc.
Well, such a reader would be wrong. I covered all of that. Ok, ok, you win. I never said anything about variables. That's because they have no syntactic form, other than assignment
instVar1 := 'aString'.
and the notation for temporaries
| aTemp anotherTemp |
Other than this, you define instanceVariables by typing their names into a special place in the browser, and classVariables into a different special place. There is no syntactic form that goes with it, as it's not part of the "code".
There are no types, and no 'builtin' syntactic specialties like arithmetic, casting, dereferencing, etc. There is allocation, but it is always a message send: Class Name new and there is no deallocation. When the last reference to an object ceases to exist, the object is garbage collected. You couldn't cause a *(VOID *)(0) if you wanted to.
None of the rest of that stuff exists either.
False, you say. You didn't go over the special syntax for control flow.
Yes I did. There isn't any. Turns out you don't need such a concept as control flow littering up your syntax.
Oh, don't be ridiculous, of course you do. It's completely special.
Sorry to disappoint you. Remember when I said "think of blocks as if they only have one method"? Here's where the truth comes out. Blocks also respond to a few other messages, like:
[ ] whileTrue:
[ ]
Which means:
send a message to an object.
Literally
send the keyword message
whileTrue:
(with it's parameter (the second block)) to an object (the first block).
What do you suppose a block does when it gets such a message?
The first block evaluates itself. (sends itself the 'value' message). If the result is true, it sends a 'value' message to the second block, and then starts over. Otherwise, it just quits, and returns 'false'.
Of course Booleans also have methods for similar looking messages:
False>>ifTrue:
aBlock
^false
False>>ifFalse:
aBlock
^aBlock value
False is a class, which has methods for these two messages. Since every object which is an instance of class False is by definition logically false, there is nothing to test. It effectively ignores requests to do something "ifTrue:" and always does the thing when asked to do something "ifFalse:".
(Don't think about this one too much, it will hurt you. You'll start to think Smalltalk might not be as slow as some think it is.)
Checkout the library to see how variations on this simple theme build up every control structure you've ever thought of.
Except one.
Nobody ever put a SWITCH/CASE type of semantic form into the library.
Drives beginners nuts.
Later you discover that your methods are always too short to care about such a thing, and when they seem to want for one, it means your design is not taking advantage of polymorphism the way it should.
So you fix that instead...
One last piece of syntactic sugar to deal with.
'ThisIsAString'
#ThisIsASymbol
These behave pretty much the same, except that the latter is guaranteed to be a Singleton, with a unique hash value. Useful for table lookups and such, but otherwise you can ignore it.
Hope this helps your attempts to read Smalltalk.
But be careful! The minute you get an inkling of what this all means, you'll find it very difficult to continue to use whatever language you're using now... Bar none.
You've been warned ;-)
Once upon a time there was a book entitled "A Quick Trip to Objectland" that had the all-time best Smalltalk tutorial for newbies. The book is unfortunately out of print, and was based on Smalltalk/V when it was still $100, and it was (maybe annoyingly) quirky. I no long have a copy. But the idea was a series of exercises mostly done in workspaces where Smalltalk told you how it worked. The exercises made maximum use of the interactivity and reflectiveness of the Smalltalk environment.
The book has a second addition which came out in December 2001. It is now based on Squeak and was a very good browse in the bookstore this morning. I'll probably go back for it soon. --Sean Oleary
Squeak: A Quick Trip to Object Land by Gene Korienek, Tom Wrensch, Doug Dechow. ISBN: 0201731142
Check out the tutorial at: www.object-arts.com
It has one of the clearest descriptions of the Smalltalk language's concepts and constructs that I've seen. I tried many of the samples in the first section in Squeak, and they worked.
See original on c2.com