First, we will focus on paradigms and use cases for Hadoop systems and how they have evolved over the last few years. Originally designed for batch processing only, these systems have grown into a much more general platform that also supports real-time use cases. Such platforms are often prerequisites for further data analysis.
The second part of the talk will be dedicated to predictive modelling. Big data in many domains (especially banking and telecommunications) consists mostly of very large transaction datasets. Each transaction (payment, call, text, message, etc.) is a small piece of evidence of a hidden link between the subjects on both sides of the transaction. By processing the data carefully, we can aggregate this evidence and find those links. Where there are links, there is a network, one built on implicitly expressed preferences. In our research program, we focus on mining business knowledge from such networks.
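As a rough illustration of the aggregation idea described above, the following minimal Python sketch counts transactions between each pair of subjects and keeps only pairs with enough accumulated evidence. It is not the method presented in the talk: the function name `build_links`, the `min_count` threshold, the choice to treat links as undirected, and the sample data are all assumptions made for the example.

```python
from collections import Counter


def build_links(transactions, min_count=2):
    """Aggregate (sender, receiver) transactions into weighted links.

    Hypothetical helper: each transaction adds one unit of evidence to the
    undirected pair of subjects; pairs below `min_count` are discarded.
    """
    evidence = Counter()
    for sender, receiver in transactions:
        if sender == receiver:
            continue  # a self-transaction says nothing about a link
        # Sort the pair so that A->B and B->A count toward the same link.
        edge = tuple(sorted((sender, receiver)))
        evidence[edge] += 1
    # Keep only links supported by enough transactions.
    return {edge: n for edge, n in evidence.items() if n >= min_count}


# Made-up sample data: (sender, receiver) pairs.
transactions = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("bob", "alice"),
    ("dave", "dave"),
]

links = build_links(transactions)
print(links)
```

The resulting dictionary can be read as a weighted edge list of the network. In practice the weight would likely depend on more than a raw count (for example amounts, recency, or direction), and at the data volumes mentioned above the same aggregation would run as a distributed job rather than in memory.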