Monday, December 29, 2014

Java Collections Framework

This post will give a brief knowledge on Java Collections framework. This may not cover each and every implementation but most commonly used classes.

Before Java Collections framework was introduced, Arrays, HashTables and Vectors were the standard methods for grouping and manipulate collections of objects. But these implementations used different methods and syntax for accessing members (arrays used [], Vector used elementAt() methods while HashTable used get() and put() method. So you can't wrap one object to another easily). They were too static (changing the size and the type is hard) and had few built-in functions. Further most of the methods in Vector class were marked as final (so that you can't extend is behavior to implement a similar sort of collection), and most importantly none of them implements a standard interface. As programmers develop algorithms (such as sorting) to manipulate collections, they have to decide on what object to pass to the algorithm. Should they pass an Array or Vector or implement both interfaces?

The Java Collections framework provides a set of classes and interfaced to handle collections of objects. Most of the implementations can be found in java.util package. The Collections interface extends Iterable interface which make the Iterable interface (java.lang.Iterable) one of the root interfaced of the java collection classes. Thus all the subtypes of Collections also implements the Iterable interface. It has only one methd: iterator().

There are mainly 2 groups in the framework: Collections and Maps.

Collection

We have collection interface at the top that is used to pass collections around and manipulate then in the most generic way. All other interfaces List, Set etc extends this and make more customized objects.

Figure 1:Overview of Collection
List, Set, SortedSet, NavigableSet, Queue, Deque extends the Collection interface. The Collection interface just defines a set of  basic behaviors (methods such as adding, removing etc) that each of these collection sub-types share.
Both Set and List implements the Collections interface. The Set interface is identical to the Collection interface except for an additional method toArray() which converts a Set to an object array. The List interface also implements the Collections interface but provide more accessors that use and integer index into the List. For instance get(), remove(), and set() all take an integer that affects the indexed element in the list.

Map

The Map interface is not derived from Collection, but provides an interface similar to the methods in java.util.Hashtable.

Figure 2:Map interface
Keys are used to put and get values. This interface is implemented with two new concrete implementations, the TreeMap and the HashMap. The TreeMap is a balanced tree implementation that sorts elements by the key.


Benefits


  • The main advantage is it provides a consistent and a regular API. This reduces the effort required to design and implement APIs. 
  • Reduces programming effort - by providing data structures and algorithms so you don't have to write them yourself. 
  • Increases performance -  by providing high performance implementations of data structures and algorithms. As various implementations of each interface are interchangeable, programs can be tuned by switching implementations.  


List

The 2 commonly used implementations are LinkedList and ArrayList. List allow duplicates in the collections and use indexes to access elements. When deciding between these two implementations, we should consider whether the list is volatile (grows or shrinks often) and whether the access to items is random or ordered.

Following are 2 ways of initializing an ArrayList. Which method is correct?


List<String> values = new ArrayList<String>();

ArrayList<String> values = new ArrayList<String>();

Both approaches are acceptable. But the first option is more suitable. The main reason you'd do the first way is to decouple your code from a specific implementation of the interface and it allows you to switch between different implementations of the List interface with ease. eg: method public List getList() will allow you return any type that implements List interface. An interface gives you more abstraction, and make the code more flexible and resilient to changes as you can use different implementations.

ArrayList - A widely used collections framework in java. We can specify the size while initializing if not a default size will be used initially. We can add and retrieve items using add() and get() methods. But incase of removing items have to use with care. Internally this maintains an array to store items. So if you remove an item at the end, it will just remove the last element. But if you remove the first element, the operation will be very slow. (As it will remove first and copy all the following items back to fill the gap).

LinkedList - Internally maintains a doubly linked list, so that each element has a reference to the previous and next element. Thus retrieving an item from the list will be slow compared to the ArrayList as it traverse elements from the beginning.

The rule is if you want to add or remove items from the end of your list, it is efficient to use ArrayList, and to add or remove items from anywhere else it is efficient to use LinkedList.

Below sample will show using List implementations.

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
/**
* This class demonstrates a comparison of two List implementations in java collections framework
* ArrayList and LinkedList
*/
public class ListSample1 {
public static void main(String[] args) {
//List objects implements the List interface
List<Integer> arrayList = new ArrayList<Integer>();
List<Integer> linkedList = new LinkedList<Integer>();
//Basic operations use indexes to access items
//Adding
arrayList.add(10);
arrayList.add(90);
arrayList.add(78);
linkedList.add(22);
linkedList.add(43);
//retrieving
System.out.println(arrayList.get(0));
System.out.println(linkedList.get(0));
//removing
arrayList.remove(0); //slow
arrayList.remove(arrayList.size() - 1); //fast
linkedList.remove(0); //fast
linkedList.remove(linkedList.size() - 1); //slow
}
}
view raw ListSample1 hosted with ❤ by GitHub
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
/**
* This class demonstrates a comparison of two List implementations in java collections framework
* ArrayList and LinkedList
*/
public class ListSample {
public static void main(String[] args) {
//List objects implements the List interface
List<Integer> arrayList = new ArrayList<Integer>();
List<Integer> linkedList = new LinkedList<Integer>();
System.out.println("Timing for ArrayList ..... ");
calculateTime(arrayList);
System.out.println("Timing for LinkedList ...... ");
calculateTime(linkedList);
}
//I have set paramter to List so that I can pass any List implementation (eg: LinkedList, ArrayList)
private static void calculateTime(List<Integer> list) {
//adding items to the list
for (int i = 0; i < 1E5; i++) {
list.add(i);
}
long start = System.currentTimeMillis();
addItems(list);
long end = System.currentTimeMillis();
System.out.println("Time taken to add values: " + (end - start) + " ms");
start = System.currentTimeMillis();
addToFirstItem(list);
end = System.currentTimeMillis();
System.out.println("Time taken to add first item: " + (end - start) + " ms");
start = System.currentTimeMillis();
addToMiddleItem(list);
end = System.currentTimeMillis();
System.out.println("Time taken to add middle item: " + (end - start) + " ms");
}
private static void addItems(List<Integer> list){
for (int i = 0; i < 1E5; i++) {
list.add(list.size() - 1, i);
}
}
private static void addToFirstItem(List<Integer> list){
for (int i = 0; i < 1E5; i++) {
list.add(0, i);
}
}
private static void addToMiddleItem(List<Integer> list){
for (int i = 0; i < 1E5; i++) {
list.add(100, i);
}
}
}
view raw ListSample2 hosted with ❤ by GitHub
Timing for ArrayList .....
Time taken to add values: 9 ms
Time taken to add first item: 4377 ms
Time taken to add middle item: 6254 ms
Timing for LinkedList ......
Time taken to add values: 17 ms
Time taken to add first item: 8 ms
Time taken to add middle item: 18 ms
view raw output hosted with ❤ by GitHub



Set

Set does not allow duplicates in the collection. Adding a duplicate add nothing to a Set.  Sets allow mathematical operations such as intersection, union, difference.

Hash set:............
[Three, One, Two, Four]
Linked Set:.....
[Four, One, Two, Three]
Tree Set............
[Four, One, Three, Two]
view raw output1 hosted with ❤ by GitHub
map.............
1 : Bob
2 : Mike
3 : Alen
2 : Mike
set.............
1 : Bob
2 : Mike
3 : Alen
2 : Mike
view raw output2 hosted with ❤ by GitHub
map.............
1 : Bob
2 : Mike
3 : Alen
set.............
1 : Bob
2 : Mike
3 : Alen
view raw output3 hosted with ❤ by GitHub
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
/**
* This class demonstrates the Set interface in java collections
*/
public class SetSample {
public static void main(String[] args) {
//Sets keep unique values
//Hashset does not retain in any order
Set<String> set = new HashSet<String>();
//Keep values in order
Set<String> linkedSet = new LinkedHashSet<String>();
Set<String> treeSet = new TreeSet<String>();
System.out.println("\nHash set:............");
testSet(set);
System.out.println("\nLinked Set:.....");
testSet(linkedSet);
System.out.println("\nTree Set............");
testSet(treeSet);
}
private static void testSet(Set<String> set) {
set.add("Four");
set.add("One");
set.add("Two");
set.add("Three");
System.out.println(set);
}
}
view raw SetSample1 hosted with ❤ by GitHub
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
/**
* This class demonstrates the Set interface in java collections
*/
public class SetSample {
public static void main(String[] args) {
//Sets keep unique values
//Hashset does not retain in any order
Set<String> set = new HashSet<String>();
//Keep values in order
Set<String> linkedSet = new LinkedHashSet<String>();
Set<String> treeSet = new TreeSet<String>();
createCustomKeys();
}
private static void createCustomKeys() {
//I have created two Person objects with same name and id.
//generally we think the 2 persons with the same name & id should be same.
//but in objects they are different
Person p1 = new Person(1, "Bob");
Person p2 = new Person(2, "Mike");
Person p3 = new Person(3, "Sue");
Person p4 = new Person(2, "Mike");
Map<Person, Integer> personMap = new LinkedHashMap<Person, Integer>();
Set<Person> set = new LinkedHashSet<Person>();
personMap.put(p1, 1);
personMap.put(p2, 2);
personMap.put(p3, 3);
personMap.put(p4, 5);
set.add(p1);
set.add(p2);
set.add(p3);
set.add(p4);
//this prints both equal objects
//Set or Map does not analyse the content of the object to see if they are equal
System.out.println("map.............");
for (Person key : personMap.keySet()) {
System.out.println(key.toString());
}
System.out.println("set.............");
for (Person p : set) {
System.out.println(p.toString());
}
}
}
class Person {
private int id;
private String name;
public Person(int id, String name) {
this.id = id;
this.name = name;
}
public String toString() {
return id + " : " + name;
}
}
view raw SetSample2 hosted with ❤ by GitHub
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
/**
* This class demonstrates the Set interface in java collections
*/
public class SetSample {
public static void main(String[] args) {
//Sets keep unique values
//Hashset does not retain in any order
Set<String> set = new HashSet<String>();
//Keep values in order
Set<String> linkedSet = new LinkedHashSet<String>();
Set<String> treeSet = new TreeSet<String>();
createCustomKeys();
}
private static void createCustomKeys() {
//I have created two Person objects with same name and id.
//as I have implemented hashcode so that Person objects having same id and name are equal.
Person p1 = new Person(1, "Bob");
Person p2 = new Person(2, "Mike");
Person p3 = new Person(3, "Sue");
Person p4 = new Person(2, "Mike");
Map<Person, Integer> personMap = new LinkedHashMap<Person, Integer>();
Set<Person> set = new LinkedHashSet<Person>();
personMap.put(p1, 1);
personMap.put(p2, 2);
personMap.put(p3, 3);
personMap.put(p4, 5);
set.add(p1);
set.add(p2);
set.add(p3);
set.add(p4);
//this does not print both equal objects
System.out.println("map.............");
for (Person key : personMap.keySet()) {
System.out.println(key.toString());
}
System.out.println("set.............");
for (Person p : set) {
System.out.println(p.toString());
}
}
}
class Person {
private int id;
private String name;
public Person(int id, String name) {
this.id = id;
this.name = name;
}
public String toString() {
return id + " : " + name;
}
//generate hashcode with the variables you think that are important when deciding if the two objects are equal
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Person)) return false;
Person person = (Person) o;
if (id != person.id) return false;
if (name != null ? !name.equals(person.name) : person.name != null) return false;
return true;
}
//produces an id
@Override
public int hashCode() {
int result = id;
result = 31 * result + (name != null ? name.hashCode() : 0);
return result;
}
}
view raw SetSample3 hosted with ❤ by GitHub


Map

Maps store elements as key-value pairs. Can store any kind of objects. If you need to store customized objects as the keys, then you need to implement your own hashcode. The keys have to be unique. The inbuilt method map.keySet() returns a Set and does not have duplicates in the result. Thus if you have added different values with same keys, older will be replaced by the new one
If you iterate and retrieve items from the map, this will not maintain any order. Hence will get random order when iterate.

Two commonly used implementations are LinkedHashMap and TreeMap. LinkedHashMap will return values in order that they were inserted, while the treeMap is going to sort the keys

import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
/**
* This class demonstrates the Map interface and its implementations
*/
public class MapSample {
public static void main(String[] args) {
//HashMap has no order
Map<Integer, String> map = new HashMap<Integer, String>();
//Below Map implementations keep values in order
Map<Integer, String> linkedHashMap = new LinkedHashMap<Integer, String>();
Map<Integer, String> treeMap = new TreeMap<Integer, String>();
//add key-value pairs
map.put(1, "One");
map.put(3, "Three");
map.put(2, "Two");
map.put(5, "Five");
map.put(6, "Six");
map.put(7, "Seven");
map.put(8, "Eight");
map.put(0, "Zero");
//retrieving
//if value does not exist, it will return null
System.out.println(map.get(4));
//adding same key will overwrites the first (no duplicate keys)
map.put(1, "This is a duplicate-key");
System.out.println(map.get(1));
}
}
view raw MapSample1 hosted with ❤ by GitHub
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
/**
* This class demonstrates the Map interface and its implementations
*/
public class MapSample {
public static void main(String[] args) {
//HashMap has no order
Map<Integer, String> map = new HashMap<Integer, String>();
//Keep values in order
Map<Integer, String> linkedHashMap = new LinkedHashMap<Integer, String>();
Map<Integer, String> treeMap = new TreeMap<Integer, String>();
System.out.println("\nHashMap ... ");
testMap(map);
System.out.println("\nLinked Hash Map...");
testMap(linkedHashMap);
System.out.println("\nTree Map...");
testMap(treeMap);
}
private static void testMap(Map<Integer, String> map) {
//adding values to the Map
map.put(8, "Eight");
map.put(1, "One");
map.put(3, "Three");
map.put(2, "Two");
map.put(5, "Five");
map.put(6, "Six");
map.put(7, "Seven");
map.put(0, "Zero");
//retrieving values
for (Map.Entry<Integer, String> entry : map.entrySet()) {
System.out.println(entry.getKey() + " : " + entry.getValue());
}
}
}
view raw MapSample2 hosted with ❤ by GitHub
HashMap ...
0 : Zero
1 : One
2 : Two
3 : Three
5 : Five
6 : Six
7 : Seven
8 : Eight
Linked Hash Map...
8 : Eight
1 : One
3 : Three
2 : Two
5 : Five
6 : Six
7 : Seven
0 : Zero
Tree Map...
0 : Zero
1 : One
2 : Two
3 : Three
5 : Five
6 : Six
7 : Seven
8 : Eight
view raw output hosted with ❤ by GitHub

References:

[1]This is a very useful set of video tutorials that I could found on - Java Collections Framework