What Is Encapsulation? (and does it matter?)

A fundamental feature of OOP that means different things to different people...
by Huw Collingbourne
Tuesday 22 July 2008.

"No component in a complex system should depend on the internal details of any other component."
Dan Ingalls (Smalltalk Architect)

A small discussion about ‘encapsulation’ broke out a few months back in response to an article I wrote in my series, ‘Ruby The Smalltalk Way’. In the course of that article, I discussed the principle of encapsulation. I gave this description of encapsulation, taken from the ‘Smalltalk/V Tutorial’:

“Related data and program pieces are encapsulated within a Smalltalk object, a communicating black box. The black box can send and receive certain messages. Message passing is the only means of importing data for local manipulation within the black box.”

I pointed out that encapsulation is broken when a programmer assigns a variable (let’s call it x) and passes it to a method (e.g. someOb.someMethod( x )) inside which the value of x is modified and the programmer then uses this modified value - e.g.

x = 10
someOb.someMethod( x ) # <= let’s suppose this method changes x to 20
y = x * 2 #<= Now y is 40!

My assertion that this broke encapsulation turned out to be somewhat controversial. For example, in a thread on reddit, one writer commented: “Modifying objects passed into a method by calling methods on them does not break encapsulation, it’s the heart of what OO is about, creating side effects.”

The statement that ‘creating side effects’ is at the heart of Object Orientation surprised me (to put it mildly). However, the same writer later went on to expand upon his views, commenting: “That a programmer may implement a method outside of an object that modifies its state says nothing about the language or its ability to enforce encapsulation. The visitor pattern violates encapsulation in exactly this manner on purpose as its primary goal.” You can read the rest of this thread here: http://www.reddit.com/info/6d2p0/comments/

The mention of the ‘Visitor pattern’ in the comment above sent me scurrying away for my copy of the well-known ‘programming pattern’ reference, “Design Patterns: Elements of Reusable Object-Oriented Software” by Gamma, Helm, Johnson and Vlissides. First I looked up their definition of ‘encapsulation’ in the glossary. Here it is:

“encapsulation: The result of hiding a representation and implementation in an object. The representation is not visible and cannot be accessed directly from outside the object. Operations are the only way to access and modify an object’s representation.”

Yes, well, that seems a pretty good definition to me. So now let’s see what they have to say about the ‘Visitor pattern’. This is what I find on page 337:

“Breaking encapsulation: ... the pattern often forces you to provide public operations that access an element’s internal state, which may compromise its encapsulation.”

OK, so far so good. We have a workable definition of encapsulation. We also agree that the visitor pattern violates this. But I still disagree (profoundly) with the statement: “Modifying objects passed into a method by calling methods on them does not break encapsulation.”

Note: Some of the quotations in this article are taken from BYTE magazine, August 1981.

Others are taken from a number of classic Smalltalk books which are now available for download from:
Stéphane Ducasse’s :: Free Online Books

Implementation-Hiding

In my view, encapsulation necessarily means that the internal representation of an object (both its data and its methods) is hidden from the world outside. In the August 1981 ‘Smalltalk special’ issue of BYTE magazine (which I bought when it first appeared in 1981 and which still sits, tatty and much thumbed, here on the shelf next to my desk), Dan Ingalls says this:

“No component in a complex system should depend on the internal details of any other component.”

(The Dan Ingalls article is available online HERE)

That simple, succinct sentence tells you everything you need to know about encapsulation.

What this implies is that it should be possible substantially to rewrite the implementation of a method without having any effects (what I would call ‘side-effects’) on any code that ‘calls’ that method.

Here are a couple more quotes on the same theme, once again taken from that classic issue of BYTE:

“A message must be sent to an object to find out anything about it... This is needed because we don’t want the form of an object’s inside known outside of it.”
(‘Object-Oriented Software Systems’, David Robson, Xerox PARC, BYTE Magazine, August 1981)

“Modularity: No component in a complex system should depend on the internal details of any other component....

The message-sending metaphor provides modularity by decoupling the intent of a message (embodied in its name) from the method used by the recipient to carry out the intent. Structural information is similarly protected because all access to the internal state of an object is through this same message interface.”
(‘Design Principles Behind Smalltalk’, Daniel H. H. Ingalls, BYTE Magazine, August 1981)

In other words, the key, the central idea of what we now call ‘encapsulation’ is not merely data-hiding, but implementation-hiding. You don’t need simply to hide information (variables) from the world beyond the object - you also want to hide behaviour (methods). If the implementation details of a method have any effect of any sort on code outside of that object, then encapsulation is broken.
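As a minimal sketch of implementation-hiding (the class names TallyA and TallyB, and their methods, are hypothetical and invented purely for illustration), here are two versions of the same kind of object. They answer the same messages in the same way but keep their state quite differently. Code that relies only on the messages should not be able to tell them apart:

```ruby
# Hypothetical example: two implementations behind one message protocol.

# Version 1 keeps every value it is given in an array.
class TallyA
  def initialize
    @items = []
  end

  def add( n )
    @items << n
    return self
  end

  def total
    return @items.inject( 0 ) { |sum, n| sum + n }
  end
end

# Version 2 keeps only a running sum; there is no array at all.
class TallyB
  def initialize
    @sum = 0
  end

  def add( n )
    @sum += n
    return self
  end

  def total
    return @sum
  end
end

# A caller that sends only 'add' and 'total' works with either version.
def use_tally( tally )
  tally.add( 1 ).add( 2 ).add( 3 )
  return tally.total
end

p use_tally( TallyA.new )   #<= 6
p use_tally( TallyB.new )   #<= 6
```

Any caller that reached inside for @items, on the other hand, would stop working the moment TallyA was swapped for TallyB.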

Simple Ways To Break Encapsulation

Here are a few examples of ways in which you can easily break encapsulation (that is, you can make external code dependent on the internal implementation details of objects) in Ruby:

- 1. Modifying the value of an argument inside a method breaks encapsulation

class C
 def aMethod( aVar )
   aVar << "hello"
   return aVar.reverse
 end
end

ob1 = C.new
mystring = ["world"]

In the above, the author of class C appends “hello” to the argument, aVar and reverses the array when returning it. So, when a C object is used as the author intended, this is the result:

p ob1.aMethod( mystring ) #<= ["hello", "world"]

But, since the argument, aVar is modified inside the aMethod() method, it is quite possible for a programmer to ‘hang onto’ the ingoing variable and use its modified value instead of the returned value, giving this result:

p mystring #<= ["world", "hello"]

In other words, the same method produces different results according to how it is invoked. By ‘hanging onto’ the ingoing values (in the second example above), a programmer’s code becomes implementation-dependent. If the author of class C reimplements aMethod, a programmer who uses the explicit return value will see no change (encapsulation seems to be working!) whereas the code of the programmer who uses the value of the ingoing argument will now behave differently (contrary to the intentions of the author of class C).
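As a sketch of what such a reimplementation might look like (this is an assumption for illustration, not code from the original class, and the name CRevised is hypothetical), here is a version of aMethod that builds a new array instead of appending to its argument:

```ruby
# Hypothetical rewrite of class C: same return value, but the argument
# is no longer modified.
class CRevised
  def aMethod( aVar )
    return ( aVar + ["hello"] ).reverse   # Array#+ builds a new array
  end
end

ob1 = CRevised.new
mystring = ["world"]

p ob1.aMethod( mystring )   #<= ["hello", "world"]  (same as before)
p mystring                  #<= ["world"]  (no longer ["world", "hello"])
```

The programmer who used the return value is unaffected. The programmer who ‘hung onto’ mystring now gets a different array, even though the method’s name, argument and return value are all unchanged.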

- 2. Dynamic Programming Breaks Encapsulation

class C        
end

ob1 = C.new
ob1.instance_variable_set(:@a, 100 )
p ob1   #<=  #<C:0x2ab0584 @a=100>

In Ruby and many other ‘dynamic’ languages you can create or modify classes and objects at runtime. In many cases, this lets you tinker with the internal details of objects ‘from the outside’. The above example in Ruby code is a case in point - it creates and initializes an instance variable, @a, which is then poked into the object, ob1. The Ruby class documentation describes this method thus: “Sets the instance variable names by symbol to object, thereby frustrating the efforts of the class’s author to attempt to provide proper encapsulation.” Getting dynamic programming and encapsulation to live together in peace poses a very tricky problem!
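The same door opens in the other direction too. In this sketch (the Account class and its variable names are hypothetical), outside code reads an instance variable with instance_variable_get and so comes to depend on a name, and a unit of measurement, that the author of the class never published:

```ruby
# Hypothetical class: the only published message is 'balance'.
class Account
  def initialize( amount )
    @pennies = amount * 100   # internal detail: stored in pennies
  end

  def balance
    return @pennies / 100
  end
end

acct = Account.new( 5 )

p acct.balance                              #<= 5
# Reaching past the protocol ties this line to the name and the units
# of @pennies...
p acct.instance_variable_get( :@pennies )   #<= 500
# ...so it would quietly return nil if the author renamed the variable.
p acct.instance_variable_get( :@cents )     #<= nil
```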

- 3. Global Variables Break Encapsulation

$x = 100

class C
 def aMethod
   $x = 200
 end
end

class C2
 def anotherMethod
   return $x * 2
 end
end

ob1 = C.new
ob2 = C2.new

In the above, ob2 and ob1 are objects of different classes. Calling methods of one object should have no effect when calling methods of the other. But they do, thanks to the reference to the global variable, $x. In fact, the results of my code change according to the order in which the methods are called...

puts ob1.aMethod   #<= 200
puts ob2.anotherMethod   #<= 400

# The same calls in the opposite order (starting afresh, with $x = 100)
puts ob2.anotherMethod   #<= 200
puts ob1.aMethod   #<= 200

That, in effect, ‘exposes’ the implementation details of each class’s methods. If I change the code inside them, the effects will ripple through to unrelated objects!
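For contrast, here is a sketch of the same two classes with the value handed over as an ordinary argument rather than read from $x (the class names D and D2 are hypothetical, and this is only one of several ways it could be done). Neither object can now affect the other, so the order of the calls no longer matters:

```ruby
# Hypothetical rewrite without the global variable.
class D
  def aMethod
    return 200            # no longer writes to shared state
  end
end

class D2
  def anotherMethod( x )
    return x * 2          # depends only on what it is given
  end
end

ob1 = D.new
ob2 = D2.new

puts ob2.anotherMethod( 100 )   #<= 200
puts ob1.aMethod                #<= 200
puts ob2.anotherMethod( 100 )   #<= 200 (unchanged by the call to ob1.aMethod)
```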

I won’t labour the point any further. Suffice to say that in most mainstream languages there are many ways in which an object’s encapsulation (that is, the strict privacy of its internal structure - either its data or code) may be broken. For example, you can do so by extending or modifying base classes (from which other classes are derived) or by passing internal details from one object to another (via, for example, ‘lambda functions’, ‘blocks’ or ‘closures’). I’m not saying that these activities are always or necessarily bad - but, nonetheless, they do have important consequences for encapsulation.
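As a small sketch of the closure case (the Counter class is hypothetical), a lambda created inside a method carries access to the object’s instance variables along with it. Code outside the object can then change its internal state simply by calling the lambda:

```ruby
# Hypothetical class that hands out a closure over its own state.
class Counter
  def initialize
    @count = 0
  end

  def count
    return @count
  end

  # Returns a lambda that closes over this object's @count.
  def incrementer
    return lambda { @count += 1 }
  end
end

c = Counter.new
inc = c.incrementer

inc.call
inc.call
p c.count   #<= 2
```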

Most modern OOP languages - C++, C#, Delphi, Java et al - don’t pay much attention to the data-hiding part of encapsulation. They generally consider this to be an optional extra, something you can enforce to a greater or lesser degree by using ‘privacy’ keywords, voluntarily adhering to certain coding standards or just documenting your intentions. This may explain why many programmers regard ‘encapsulation’ as a description of the ‘wrapping up’ inside an object of locally scoped variables and the methods to act upon them but do not consider it to imply the hiding of information (data) and implementation details.
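Ruby offers a similar illustration of privacy as an optional extra. In this sketch (the Safe class is hypothetical), the private keyword stops an ordinary method call, but the standard send method walks straight past it:

```ruby
# Hypothetical class with one private method.
class Safe
  def initialize
    @combination = 1234
  end

  private

  def combination
    return @combination
  end
end

s = Safe.new

begin
  s.combination               # an ordinary call is refused...
rescue NoMethodError => e
  puts "refused: #{e.class}"  #<= refused: NoMethodError
end

p s.send( :combination )      #<= 1234 (...but send ignores privacy)
```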

Here are a couple more quotes that address the vital area of ‘implementation hiding’:

“Encapsulation is a great bonus from the point of view of the user of an object - they do not need to know anything about the object’s implementation, only what its published protocols are.”
(Smalltalk and Object Orientation: An Introduction - John Hunt)

“The separation between the internal and external views of an object is fundamental to the programming philosophy embodied in Smalltalk. To use an object, it is necessary to understand only its protocol or external view. The fundamental advantage of this approach is that, provided the message protocol or external view is not changed, the internal view may be changed without impacting users of the object.”
(Inside Smalltalk, by Wilf R LaLonde and John R Pugh, 1990)

Since it is clearly the case that ‘encapsulation’ means different things to different people, I can’t help thinking that it might be clearer if I were to use the term ‘modularity’ instead. Indeed, in the early literature of Smalltalk and OOP, ‘modularity’ was the more commonly used term. These days, however, even the word ‘modularity’ is ambiguous. For example, a Modula-2 module bears no resemblance to a Ruby module. Moreover, ‘encapsulation’ has become one of the three great tenets of OOP: inheritance, polymorphism and encapsulation. So, for better or worse, we are stuck with the word!

In the next article in this series, I will explain exactly what I mean by modularity, why I believe it to be so important, and why I think it will become increasingly important during the coming decade.


Huw Collingbourne is one of the architects of Sapphire - a new OOP language for the DLR which is currently being developed by SapphireSteel Software. One of the fundamental design principles of Sapphire is rigorous encapsulation/modularity.

Keywords: general programming, Sapphire, smalltalk
  • What Is Encapsulation? (and does it matter?)
    17 September 2008

    [...] tinker with the internet details of objects [...]

    Tinker with your proofreading :p

    • What Is Encapsulation? (and does it matter?)
      17 September 2008, by Huw Collingbourne

      That was a deliberate error to check if people are reading this.... :-)

      Now fixed.

  • What Is Encapsulation? (and does it matter?)
    25 July 2008, by Tony Marston

    I have to disagree with every statement made in this article. The one and only definition of encapsulation that is worth bothering about is this:

    “Encapsulation is the act of placing data and the operations that perform on that data in the same class. The class then becomes the ’capsule’ or container for the data and operations.”

    Note that there are no "rules" as to what you can and can’t do with that data, nor what you can and can’t do with the operations. Those are not "rules" at all, just personal preferences, and as such anyone can choose to ignore them with utter impunity.

    And another thing, "implementation hiding" is NOT the same thing as "information hiding". There is no "rule" in the definition of encapsulation which prohibits data from being visible to the outside world, thus I can access a piece of data either with or without the use of an operation or method should I wish to do so.

    Implementation hiding is nothing more than hiding the CODE behind each method signature - all the outside world can see is the method signature (its name and its arguments) but not the code behind that signature, its internal workings. Thus I am free to change the code within a method without having any effect on any outside code which references that method.

  • What Is Encapsulation? (and does it matter?)
    22 July 2008, by Chris

    The problem with "pure" encapsulation is that code contains bugs and is poorly documented, so you need to look at the source to see what is going on anyway.

    “Encapsulation is a great bonus from the point of view of the user of an object - they do not need to know anything about the object’s implementation, only what its published protocols are.”

    That’s just not reality, it’s a Utopian view of software development. I’ve come to prefer Python’s view on encapsulation, we’re all adults so just do the right thing. I do believe you can view most if not all of the source of Smalltalk distributions (it’s been a long time) so I think that I agree with the spirit of but not actually the letter of the law.

    I think you have some valid points in the examples. The rules laid out here are just common sense for good software development.

  • What Is Encapsulation? (and does it matter?)
    22 July 2008, by jungle

    Your examples are specifically designed to support your special definition of encapsulation by omitting significant method names. Would a method called "multiplyArgumentByTwo()" break encapsulation if it did what your first example does? Unless you want to live within the constraints of pure functional programming, you have to admit that it doesn’t, and the same applies to all your other examples. Encapsulation means that it doesn’t matter how the method doubles its argument, not that it can’t have side effects.

    • What Is Encapsulation? (and does it matter?)
      24 July 2008, by Thom Parkin

      Precisely. Encapsulation means the ’internal workings’ of the object are hidden. And although those internals can change without any effect on those who depend upon it, the result (side effect) MUST remain unchanged. A class would be useless by your definition of ’pure’ encapsulation.

      If I put a cup of water into a microwave oven and cook it for 1 minute, I expect the water to be hot (changed). Regardless of how the microwave oven works internally - what specific parts are used, what brand of components control the electronics - the output MUST always be the same. If I replace any one of those parts, and the output remains unchanged, it demonstrates encapsulation.

      • What Is Encapsulation? (and does it matter?)
        24 July 2008, by Huw Collingbourne

        Was that comment addressed to me? If so, I see nothing to disagree with. Of course the internal state of an object must be amenable to change in response to messages which are sent to it. Is there anything I have written which suggests that I hold a contrary view?

        I’ll discuss why ’encapsulation’ by the dictionary definition is neither possible nor desirable in the next article in this series. In the meantime, the theme of the present article relates to the hiding of the internal workings of one object from that of any other object. That, I maintain, is a requirement for true modularity/encapsulation.

© SapphireSteel Software 2014