Monday, December 29, 2014

Java Collections Framework

This post will give a brief knowledge on Java Collections framework. This may not cover each and every implementation but most commonly used classes.

Before Java Collections framework was introduced, Arrays, HashTables and Vectors were the standard methods for grouping and manipulate collections of objects. But these implementations used different methods and syntax for accessing members (arrays used [], Vector used elementAt() methods while HashTable used get() and put() method. So you can't wrap one object to another easily). They were too static (changing the size and the type is hard) and had few built-in functions. Further most of the methods in Vector class were marked as final (so that you can't extend is behavior to implement a similar sort of collection), and most importantly none of them implements a standard interface. As programmers develop algorithms (such as sorting) to manipulate collections, they have to decide on what object to pass to the algorithm. Should they pass an Array or Vector or implement both interfaces?

The Java Collections framework provides a set of classes and interfaced to handle collections of objects. Most of the implementations can be found in java.util package. The Collections interface extends Iterable interface which make the Iterable interface (java.lang.Iterable) one of the root interfaced of the java collection classes. Thus all the subtypes of Collections also implements the Iterable interface. It has only one methd: iterator().

There are mainly 2 groups in the framework: Collections and Maps.

Collection

We have collection interface at the top that is used to pass collections around and manipulate then in the most generic way. All other interfaces List, Set etc extends this and make more customized objects.

Figure 1:Overview of Collection
List, Set, SortedSet, NavigableSet, Queue, Deque extends the Collection interface. The Collection interface just defines a set of  basic behaviors (methods such as adding, removing etc) that each of these collection sub-types share.
Both Set and List implements the Collections interface. The Set interface is identical to the Collection interface except for an additional method toArray() which converts a Set to an object array. The List interface also implements the Collections interface but provide more accessors that use and integer index into the List. For instance get(), remove(), and set() all take an integer that affects the indexed element in the list.

Map

The Map interface is not derived from Collection, but provides an interface similar to the methods in java.util.Hashtable.

Figure 2:Map interface
Keys are used to put and get values. This interface is implemented with two new concrete implementations, the TreeMap and the HashMap. The TreeMap is a balanced tree implementation that sorts elements by the key.


Benefits


  • The main advantage is it provides a consistent and a regular API. This reduces the effort required to design and implement APIs. 
  • Reduces programming effort - by providing data structures and algorithms so you don't have to write them yourself. 
  • Increases performance -  by providing high performance implementations of data structures and algorithms. As various implementations of each interface are interchangeable, programs can be tuned by switching implementations.  


List

The 2 commonly used implementations are LinkedList and ArrayList. List allow duplicates in the collections and use indexes to access elements. When deciding between these two implementations, we should consider whether the list is volatile (grows or shrinks often) and whether the access to items is random or ordered.

Following are 2 ways of initializing an ArrayList. Which method is correct?


List<String> values = new ArrayList<String>();

ArrayList<String> values = new ArrayList<String>();

Both approaches are acceptable. But the first option is more suitable. The main reason you'd do the first way is to decouple your code from a specific implementation of the interface and it allows you to switch between different implementations of the List interface with ease. eg: method public List getList() will allow you return any type that implements List interface. An interface gives you more abstraction, and make the code more flexible and resilient to changes as you can use different implementations.

ArrayList - A widely used collections framework in java. We can specify the size while initializing if not a default size will be used initially. We can add and retrieve items using add() and get() methods. But incase of removing items have to use with care. Internally this maintains an array to store items. So if you remove an item at the end, it will just remove the last element. But if you remove the first element, the operation will be very slow. (As it will remove first and copy all the following items back to fill the gap).

LinkedList - Internally maintains a doubly linked list, so that each element has a reference to the previous and next element. Thus retrieving an item from the list will be slow compared to the ArrayList as it traverse elements from the beginning.

The rule is if you want to add or remove items from the end of your list, it is efficient to use ArrayList, and to add or remove items from anywhere else it is efficient to use LinkedList.

Below sample will show using List implementations.




Set

Set does not allow duplicates in the collection. Adding a duplicate add nothing to a Set.  Sets allow mathematical operations such as intersection, union, difference.



Map

Maps store elements as key-value pairs. Can store any kind of objects. If you need to store customized objects as the keys, then you need to implement your own hashcode. The keys have to be unique. The inbuilt method map.keySet() returns a Set and does not have duplicates in the result. Thus if you have added different values with same keys, older will be replaced by the new one
If you iterate and retrieve items from the map, this will not maintain any order. Hence will get random order when iterate.

Two commonly used implementations are LinkedHashMap and TreeMap. LinkedHashMap will return values in order that they were inserted, while the treeMap is going to sort the keys


References:

[1]This is a very useful set of video tutorials that I could found on - Java Collections Framework







No comments:

Post a Comment