Thursday, September 8, 2016

Behavior parameterization in Java 8 with lambda expressions and functional interfaces

Java 8 introduces several new language-level features. In this blog post I hope to give an introduction to behavior parameterization, with samples using lambda expressions. I will first describe a simple scenario and give a solution with Java 7 features, and then improve that solution with Java 8 features.

What is behavior parameterization?

Behavior parameterization is a technique that improves the ability to handle changing requirements by allowing the caller of a function to pass custom behavior as a parameter. Put simply, you can pass a block of code as an argument to another method, and that method's behavior is then parameterized by the code block it receives.

Sample scenario:

Assume a company with a set of employees. The management of the company needs to analyze the details of the employees in order to categorize them into groups (eg: grouping based on age, gender, position etc).


Below is sample code for categorizing employees based on age and gender with Java 7.

Solution 1 - Using Java 7
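The original listing is not reproduced in this text, so what follows is only a minimal sketch of what the two Java 7 methods could look like. The Employee class, its fields, and the "female" filter value are assumptions made for illustration; the age threshold of 30 is the one used later in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model class; the fields are assumptions for illustration.
class Employee {
    private final String name;
    private final int age;
    private final String gender;

    Employee(String name, int age, String gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }

    String getName() { return name; }
    int getAge() { return age; }
    String getGender() { return gender; }
}

class EmployeeFilterJava7 {

    // Returns the employees younger than 30.
    static List<Employee> filterByAge(List<Employee> inventory) {
        List<Employee> result = new ArrayList<>();
        for (Employee employee : inventory) {
            if (employee.getAge() < 30) {
                result.add(employee);
            }
        }
        return result;
    }

    // Same loop as filterByAge(); only the condition inside the if differs.
    static List<Employee> filterByGender(List<Employee> inventory) {
        List<Employee> result = new ArrayList<>();
        for (Employee employee : inventory) {
            if ("female".equals(employee.getGender())) {
                result.add(employee);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Employee> inventory = Arrays.asList(
                new Employee("Nipuni", 28, "female"),
                new Employee("John", 35, "male"));
        System.out.println(filterByAge(inventory).size());
        System.out.println(filterByGender(inventory).size());
    }
}
```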


There are 2 methods, filterByAge() and filterByGender(), which follow the same pattern and differ only in the condition inside the if statement. If we can parameterize the behavior inside the if block, we can use a single method to perform both filtering options. That lets a single method accept different strategies as parameters and apply them internally as the requirements change.

Let's try to reduce the code to a single method using anonymous classes. We are still using Java 7; no Java 8 features are used.

Solution 2 - Improved with anonymous classes

Instead of maintaining two methods, we have introduced a new method “filterEmployee” which takes 2 arguments: an employee inventory and an EmployeePredicate. EmployeePredicate is a custom interface that has a single abstract method, test(), which takes an Employee object and returns a boolean. We then pass 2 implementations of the EmployeePredicate interface as anonymous classes to supply the behavior required in each case.
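As a rough sketch of the approach described above (the original listing is not available here), the single filterEmployee method and its two anonymous-class callers might look like this. The Employee class, the helper method names youngerThan30() and females(), and the "female" value are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model class; the fields are assumptions for illustration.
class Employee {
    private final String name;
    private final int age;
    private final String gender;

    Employee(String name, int age, String gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }

    String getName() { return name; }
    int getAge() { return age; }
    String getGender() { return gender; }
}

// Custom interface with a single abstract method.
interface EmployeePredicate {
    boolean test(Employee employee);
}

class EmployeeFilterAnonymous {

    // One filtering method; the condition is supplied by the caller.
    static List<Employee> filterEmployee(List<Employee> inventory,
                                         EmployeePredicate predicate) {
        List<Employee> result = new ArrayList<>();
        for (Employee employee : inventory) {
            if (predicate.test(employee)) {
                result.add(employee);
            }
        }
        return result;
    }

    // Behavior passed as an anonymous class: age below 30.
    static List<Employee> youngerThan30(List<Employee> inventory) {
        return filterEmployee(inventory, new EmployeePredicate() {
            @Override
            public boolean test(Employee employee) {
                return employee.getAge() < 30;
            }
        });
    }

    // Behavior passed as an anonymous class: gender equals "female".
    static List<Employee> females(List<Employee> inventory) {
        return filterEmployee(inventory, new EmployeePredicate() {
            @Override
            public boolean test(Employee employee) {
                return "female".equals(employee.getGender());
            }
        });
    }
}
```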

We have changed our program from solution-1 to solution-2 with the following steps:
  1. We have reduced 2 methods to a single method and improved that method to accept a behavior. (This is an improvement)
  2. We had to introduce a new custom interface, EmployeePredicate, and use anonymous classes to pass the behavior. (This is verbose and not good enough. We need to improve this)

Functional Interfaces

A functional interface is an interface that has exactly one abstract method (similar to the EmployeePredicate interface we introduced in the previous solution). A functional interface can also have default methods (another new feature introduced in Java 8), as long as it declares only a single abstract method.

Sample functional interfaces:
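The original samples are not included in this text. As a small illustration, assuming a made-up Greeter interface (not part of the JDK), a functional interface with one abstract method and one default method could be written like this:

```java
// Hypothetical functional interface, used only for illustration.
// Existing JDK examples include java.lang.Runnable and java.util.Comparator.
@FunctionalInterface
interface Greeter {
    // The single abstract method.
    String greet(String name);

    // A default method does not count towards the single abstract method.
    default String greetPolitely(String name) {
        return greet(name) + ", nice to meet you";
    }
}

class FunctionalInterfaceDemo {
    static String demo() {
        // Implemented here with an anonymous class.
        Greeter greeter = new Greeter() {
            @Override
            public String greet(String name) {
                return "Hello " + name;
            }
        };
        return greeter.greetPolitely("Nipuni");
    }
}
```

The @FunctionalInterface annotation is optional; it only asks the compiler to reject the interface if it ever gets a second abstract method.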


In our latest solution with anonymous classes, we had to create our own interface with a method that accepts an object of our choice and returns some output. Java 8 introduces a set of generic functional interfaces that we can reuse to pass different behaviors without creating our own interfaces. I have listed some of them below:

  1. The java.util.function.Predicate<T> interface has one abstract method, “test()”, that takes an object of type T and returns a boolean.
    1. Eg: in our scenario, we take an Employee object and return a boolean to indicate whether the employee’s age is less than 30.
  2. The java.util.function.Consumer<T> interface has one abstract method, “accept()”, that takes an object of type T and does not return anything.
    1. Eg: assume we need to print all the details of a given Employee without returning anything. We can use the Consumer interface.
  3. The java.util.function.Function<T,R> interface has one abstract method, “apply()”, that takes an object of type T and returns an object of type R.
    1. Eg: assume we need to take an Employee object and return the employee ID as an integer.
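The three examples in the list above can be sketched as follows, using anonymous classes since lambda expressions have not been introduced yet. The Employee class and the constant names are assumptions for illustration; Predicate, Consumer and Function are the real java.util.function interfaces.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical model class; the fields are assumptions for illustration.
class Employee {
    private final int id;
    private final String name;
    private final int age;

    Employee(int id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    int getId() { return id; }
    String getName() { return name; }
    int getAge() { return age; }
}

class BuiltInInterfacesDemo {

    // Predicate<T>: takes an Employee, returns a boolean.
    static final Predicate<Employee> IS_UNDER_30 = new Predicate<Employee>() {
        @Override
        public boolean test(Employee employee) {
            return employee.getAge() < 30;
        }
    };

    // Consumer<T>: takes an Employee, returns nothing.
    static final Consumer<Employee> PRINT_DETAILS = new Consumer<Employee>() {
        @Override
        public void accept(Employee employee) {
            System.out.println(employee.getId() + " " + employee.getName()
                    + " " + employee.getAge());
        }
    };

    // Function<T, R>: takes an Employee, returns the ID as an Integer.
    static final Function<Employee, Integer> TO_ID = new Function<Employee, Integer>() {
        @Override
        public Integer apply(Employee employee) {
            return employee.getId();
        }
    };
}
```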

Which functional interface should we use to improve our existing solution in the employee categorizing scenario?
Let's try the Predicate functional interface and improve the solution.

Solution 3 - Using Java 8 functional interfaces
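The original listing is missing from this text. A minimal sketch of this step, assuming the same hypothetical Employee class as before, replaces the custom EmployeePredicate with java.util.function.Predicate<Employee> while still passing anonymous classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model class; the fields are assumptions for illustration.
class Employee {
    private final String name;
    private final int age;
    private final String gender;

    Employee(String name, int age, String gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }

    String getName() { return name; }
    int getAge() { return age; }
    String getGender() { return gender; }
}

class EmployeeFilterPredicate {

    // The custom interface is gone; the JDK's Predicate<T> is used instead.
    static List<Employee> filterEmployee(List<Employee> inventory,
                                         Predicate<Employee> predicate) {
        List<Employee> result = new ArrayList<>();
        for (Employee employee : inventory) {
            if (predicate.test(employee)) {
                result.add(employee);
            }
        }
        return result;
    }

    // Still an anonymous class, so the call site remains verbose.
    static List<Employee> youngerThan30(List<Employee> inventory) {
        return filterEmployee(inventory, new Predicate<Employee>() {
            @Override
            public boolean test(Employee employee) {
                return employee.getAge() < 30;
            }
        });
    }

    static List<Employee> females(List<Employee> inventory) {
        return filterEmployee(inventory, new Predicate<Employee>() {
            @Override
            public boolean test(Employee employee) {
                return "female".equals(employee.getGender());
            }
        });
    }
}
```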



So far we have changed our program from solution-2 to solution-3 with the steps below:
  1. We have removed the custom interface EmployeePredicate and used the existing Predicate functional interface from Java 8. (This is an improvement: we have one interface less to maintain)
  2. We still use anonymous classes. (Still verbose and not good enough)

Lambda expressions:

We can use a lambda expression in any place where a functional interface is expected. Like an anonymous class, a lambda expression can represent behavior and pass code around. It has a list of parameters, a body and a return type.
We can describe a lambda expression as a combination of the 3 parts below:

  1. List of parameters
  2. An arrow
  3. Method body

Consider the sample lambda expression below:

(Employee employee) -> employee.getAge() < 30

This sample shows how we can pass a behavior to the Predicate interface that we used in solution-3. Let's first analyze the anonymous class we used there.
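The original listing is not available here. As a sketch, assuming a hypothetical Employee class with a getAge() method, the anonymous class and the equivalent lambda expression can be placed side by side:

```java
import java.util.function.Predicate;

// Hypothetical model class; only the age is needed for this comparison.
class Employee {
    private final int age;

    Employee(int age) { this.age = age; }

    int getAge() { return age; }
}

class AnonymousVersusLambda {

    // Anonymous class: the interface name, method name, parameter list
    // and return statement are all spelled out.
    static final Predicate<Employee> ANONYMOUS = new Predicate<Employee>() {
        @Override
        public boolean test(Employee employee) {
            return employee.getAge() < 30;
        }
    };

    // Lambda expression: only the parameter list, the arrow and the body remain.
    static final Predicate<Employee> LAMBDA =
            (Employee employee) -> employee.getAge() < 30;
}
```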




We are implementing the Predicate functional interface, which has a single abstract method. This abstract method takes an object as its parameter and returns a boolean value. In the lambda expression we have used:

  1. (Employee employee) : The parameters of the abstract method of the Predicate interface
  2. -> : The arrow separates the list of parameters from the body of the lambda
  3. employee.getAge() < 30 : This is the body of the abstract method of the Predicate. The result of the body is a boolean value, hence the above lambda expression returns a boolean value.

Sample lambda expressions:
  1. (Employee employee) -> System.out.println("Employee name : " + employee.name + "\n Employee ID: " + employee.id)
    1. This is a possible implementation of the Consumer functional interface, which has a single abstract method that accepts an object and returns void.
  2. (String s) -> s.length()
    1. This is a possible implementation of the Function functional interface, which has a single abstract method that accepts an object and returns another object.
  3. () -> new Integer(10)
    1. This lambda expression fits a functional interface that has a single abstract method with no arguments and returns an integer (for example java.util.function.Supplier<Integer>).
  4. (Employee employee, Department dept) -> {
         if (dept.getEmployeeList().contains(employee.getID())) {
             System.out.println("Employee : " + employee.getName());
         }
     }
    1. This lambda expression fits a functional interface that has a single abstract method with 2 object arguments and returns void (for example java.util.function.BiConsumer<Employee, Department>).
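To check that the four sample expressions compile against real target types, they can be assigned to java.util.function interfaces as in the sketch below. The Employee and Department classes are hypothetical, and it is an assumption here that getEmployeeList() returns a list of employee IDs. Integer.valueOf(10) is used in place of new Integer(10), since that constructor is deprecated in newer Java versions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical model classes, for illustration only.
class Employee {
    final int id;
    final String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getID() { return id; }
    String getName() { return name; }
}

class Department {
    // Assumption: a department keeps the IDs of its employees.
    private final List<Integer> employeeIds;

    Department(Integer... employeeIds) {
        this.employeeIds = Arrays.asList(employeeIds);
    }

    List<Integer> getEmployeeList() { return employeeIds; }
}

class SampleLambdas {

    // 1. Consumer<T>: accepts an object, returns void.
    static final Consumer<Employee> PRINT_EMPLOYEE = (Employee employee) ->
            System.out.println("Employee name : " + employee.name
                    + "\n Employee ID: " + employee.id);

    // 2. Function<T, R>: accepts an object, returns another object.
    static final Function<String, Integer> LENGTH = (String s) -> s.length();

    // 3. Supplier<T>: no arguments, returns a value.
    static final Supplier<Integer> TEN = () -> Integer.valueOf(10);

    // 4. BiConsumer<T, U>: two arguments, returns void.
    static final BiConsumer<Employee, Department> PRINT_IF_MEMBER =
            (Employee employee, Department dept) -> {
                if (dept.getEmployeeList().contains(employee.getID())) {
                    System.out.println("Employee : " + employee.getName());
                }
            };
}
```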
                              

Let's rewrite the solution using lambda expressions.
Solution 4 - Using lambda expressions
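The original listing is missing from this text. A minimal sketch of this last step, again assuming a hypothetical Employee class, keeps the filterEmployee method from solution-3 and replaces each anonymous class with a lambda expression:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model class; the fields are assumptions for illustration.
class Employee {
    private final String name;
    private final int age;
    private final String gender;

    Employee(String name, int age, String gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }

    String getName() { return name; }
    int getAge() { return age; }
    String getGender() { return gender; }
}

class EmployeeFilterLambda {

    // Unchanged from the Predicate-based version.
    static List<Employee> filterEmployee(List<Employee> inventory,
                                         Predicate<Employee> predicate) {
        List<Employee> result = new ArrayList<>();
        for (Employee employee : inventory) {
            if (predicate.test(employee)) {
                result.add(employee);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Employee> inventory = Arrays.asList(
                new Employee("Nipuni", 28, "female"),
                new Employee("John", 35, "male"));

        // Each behavior is now a one-line lambda expression.
        List<Employee> young =
                filterEmployee(inventory, (Employee employee) -> employee.getAge() < 30);
        List<Employee> females =
                filterEmployee(inventory, employee -> "female".equals(employee.getGender()));

        System.out.println(young.size() + " " + females.size());
    }
}
```

The second call leaves out the parameter type; the compiler infers it from Predicate<Employee>.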


Solution-4 can be further improved using method references. I have not discussed method references in this blog post, so I will not include that solution here.

Wednesday, August 10, 2016

Using NoSQL databases


Databases play a vital role when it comes to managing data in applications. RDBMSs (Relational Database Management Systems) are commonly used to store and manage data and transactions in application programming.
By design, RDBMSs have some limitations when it comes to managing big, dynamic or unstructured data.
  • RDBMSs use tables, join operations and references/foreign keys to make connections among tables. It is costly to handle complex operations that involve multiple tables.
  • It is hard to restructure a table (eg: every entry/row in the table has the same set of fields). If the data structure changes, the table has to be changed.
In contrast, there are applications that process large-scale, dynamic data (eg: geospatial data, data used in social networks). Due to the limitations above, an RDBMS may not be the ideal choice for them.

What is No-SQL?

No-SQL (Not only SQL) refers to non-relational database management systems, which differ significantly from RDBMSs. As the name suggests, No-SQL databases do not rely on SQL as their only query language; several of them (eg: MongoDB, CouchDB) use JavaScript-based query interfaces instead. JSON is frequently used when storing records.

No-SQL databases have some key features that make them more flexible than RDBMSs:
  1. The database, tables and fields need not be pre-defined when inserting records. If the data structure is not present, the database will create it automatically when inserting data.
  2. Each record/entry (or row, in terms of RDBMS tables) need not have the same set of fields. We can create fields when creating the records.
  3. Nested data structures are allowed (eg: arrays, documents).
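The three features above can be imitated with plain Java collections. The sketch below is only an in-memory illustration and not the API of any real No-SQL database; the class name, method names and the "people" collection are made up for this example, and the Occupation value for the second record is illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a schemaless document store.
class SchemalessCollectionSketch {

    // Collection name -> documents. Nothing is declared up front.
    private final Map<String, List<Map<String, Object>>> collections =
            new LinkedHashMap<>();

    // Feature 1: the collection is created automatically on first insert.
    void insert(String collection, Map<String, Object> document) {
        collections.computeIfAbsent(collection, name -> new ArrayList<>())
                   .add(document);
    }

    List<Map<String, Object>> find(String collection) {
        return collections.getOrDefault(collection, Collections.emptyList());
    }

    static SchemalessCollectionSketch sample() {
        SchemalessCollectionSketch store = new SchemalessCollectionSketch();

        // Feature 3: a nested array of sub-documents.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("Id", "133");
        first.put("Name", "Nipuni");
        first.put("Education", Arrays.asList(
                Collections.singletonMap("secondary-education", "University of Moratuwa"),
                Collections.singletonMap("primary-education", "St.Pauls Girls School")));
        store.insert("people", first);

        // Feature 2: a second record with a different set of fields.
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("Id", "345");
        second.put("Name", "John");
        second.put("Occupation", "Software Engineer");
        store.insert("people", second);

        return store;
    }
}
```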
Different types of No-SQL data:

  1. Key-Value:
    1. A simple way of storing records with a key (from which we can look up the data) and a value (which can be a simple string or a JSON value)
    Key  | Value
    1234 | Nipuni
    1345 | "{Name: Nipuni, Surname: Perera, Occupation: Software Engineer}"

  2. Graph:
    1. Used when data can be represented as interconnected nodes.     
  3. Column:
    1. Uses a flat table structure similar to the one used in RDBMSs, but the data is keyed and stored by column rather than by row.
    ID   | 234    | 345  | 456   | 567
    Name | Nipuni | John | Smith | Bob

  4. Document:
    1. Stored in a format like JSON or XML.
    2. Each document can have a unique structure. (The document type is used when storing objects and fits well with OOP)
    3. Each document usually has a specific key, which can be used to retrieve the document quickly.
    4. Users can query data by the tagged elements (eg: Id, Name and Education in the sample document below). The result can be a string, array, object etc.
    5. A sample document that stores personal details may look like the one below:
      {
        "Id": "133",
        "Name": "Nipuni",
        "Education": [
          { "secondary-education": "University of Moratuwa" },
          { "primary-education": "St.Pauls Girls School" }
        ]
      }

Possible applications of No-SQL
  1. No-SQL is commonly used in web applications that involve dynamic data. As per the data type descriptions above, No-SQL is capable of storing unstructured data, which makes it a strong candidate for handling big data.
  2. There are many implementations available for No-SQL (eg: CouchDB, MongoDB) that serve different types of data structures.
  3. No-SQL can be used to retrieve a full data set that would involve multiple tables in an RDBMS. Eg: the details of a customer in a financial company may include different levels of information (eg: personal details, transaction details, tax/income details). No-SQL can save all this data in a single entry with a nested data type (eg: a document), so the complete data set can be retrieved without any complex join operation.
The decision on which scheme to use depends on the requirements of the application. Generally,

  1. Structured, predictable data can be handled with → RDBMS
  2. Unstructured, big, complex and rapidly changing data can be managed with → No-SQL (But there are different implementations of No-SQL that provide different capabilities. No-SQL is just a concept for database management systems.)


No-SQL with ACID properties



Relational databases usually guarantee ACID properties. ACID is a set of rules that guarantees transactions are handled in a way that keeps the data safe. How far these properties are guaranteed in No-SQL depends on which No-SQL implementation you choose.



  • Atomicity - When you do something to change a database, the change should work or fail as a whole. In document stores such as MongoDB, atomicity is typically guaranteed at the level of a single document: a write cannot be partially applied to an inserted document.
  • Consistency - The database should remain consistent. Support for this feature depends on your chosen No-SQL implementation. As No-SQL databases mainly target distributed systems, consistency and availability may not both be achievable at the same time.
  • Isolation - If multiple transactions are processing at the same time, they shouldn't be able to see each other's intermediate state. There are No-SQL implementations that use read/write locks to support isolation, but this too depends on the implementation.
  • Durability - If there is a failure (hardware or software), the database needs to be able to pick itself back up. No-SQL implementations support different mechanisms (eg: MongoDB supports journaling. With journaling, when you do an insert operation MongoDB keeps the change in memory and also writes it to a journal.)

Limitations of No-SQL


  1. There are different databases available that use No-SQL; you need to evaluate them and find out which one fits your requirements best.
  2. Possibility of duplication of data.
  3. ACID properties may not be supported by all implementations.

I have mainly worked with RDBMSs, and have a general idea about the No-SQL concept. There are significant differences between RDBMS and No-SQL database management systems. The choice depends on the requirements of the application and on the No-SQL implementation to use. IMHO the decision should be taken after a proper evaluation of the requirements and of the limitations that the system can afford.