Computational Intelligence (CI) commonly refers to a variety of bio-inspired and/or human-like techniques that can be applied to optimisation, learning and modelling problems. Broadly speaking, CI comprises Artificial Neural Networks, Fuzzy Sets and Fuzzy Logic, and Evolutionary Computation. In the era of big data, CI, in conjunction with data mining techniques, is expected to help uncover useful knowledge, as these techniques are well suited to dealing with the intrinsic veracity and variety of big data. However, they are challenged by its volume and velocity, which typically limit their application in this context.
In this talk, I will start with an introduction to machine learning and big data, discussing how to parallelise computation effectively across a number of computing nodes using PySpark. I will then present some examples of the application of CI to big data scenarios, including Citizen Science and Real-Time Hot Spots Detection. First, I will discuss an approach to handling uncertainty in Citizen Science data, with a case study in galaxy image classification. Second, I will discuss a dynamic bio-inspired approach to detecting hot spots in big data streams, with a case study on identifying incidents on the roads caused by heavy goods vehicles.