Luca Fontanili

Technical Product Manager for IBM Process Mining @ IBM

About Me


Currently Technical Product Manager for IBM Process Mining at IBM; former Head of Innovation, Data Scientist and Back-End developer.

Big Data lover, eager learner, teammate.

Experience

JEM Tech

http://www.jemtech.it/

Technical Consultant

December 2021 - Present

Business or sector Computer Vision, Health

IBM

https://www.ibm.com/it-it

Technical Product Manager

June 2021 - Present

Business or sector Process Discovery & Mining

Cedacri

https://www.cedacri.it

Innovation Area Manager

February 2020 - June 2021

Business or sector Financial Services, Big Data, R&D, Cloud

myInvenio

https://www.my-invenio.com

Senior Engineer ML - Data Scientist

February 2019 - February 2020

Business or sector Process Discovery & Mining

Ubiq S.R.L. - SIA Group

https://www.sia.eu

Computer Engineer – Software Development Leader

March 2015 - February 2019

Business or sector Financial Services, Big Data

Datalogic ADC Inc

https://www.datalogic.com/eng/index.html

Software Engineering Intern

August 2014 - February 2015

Business or sector Computer Vision

Education

Data Science and AI group in Parma

Member and speaker

October 2017 - Present

Speaker at public events on Machine Learning and Data Science topics

University of Bologna

Master's Degree in Computer Engineering (EQF level 7)

September 2012 - March 2015

110/110 with Honours

Thesis Development of a parallel processing approach to compute the Zero-mean Normalized Cross-Correlation (ZNCC). The CUDA framework was used to build a fast method for detecting characters in images through Template Matching based on the ZNCC measure
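
To illustrate the measure at the heart of this thesis, here is a minimal plain-Python sketch of ZNCC template matching. It is not the CUDA implementation from the thesis: the function names `zncc` and `match_template` and the toy data are hypothetical, chosen only for illustration. ZNCC subtracts the mean from both the image patch and the template and divides by the product of their norms, so the score lies in [-1, 1] and is unaffected by a brightness offset or a positive contrast scaling.

```python
import math


def zncc(patch, template):
    """Zero-mean Normalized Cross-Correlation of two equally sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch or template: correlation is undefined, report 0
    return sum(x * y for x, y in zip(da, db)) / denom


def match_template(image, template):
    """Slide the template over the image; return (best_score, row, col)."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)  # ZNCC lies in [-1, 1], so -2 is below any real score
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = zncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best


# Toy data (hypothetical): the template pattern appears at row 1, col 2,
# with a different brightness and contrast (pixel = 20 * template + 10).
template = [[0, 1],
            [1, 0]]
image = [[5, 5, 5, 5, 5],
         [5, 5, 10, 30, 5],
         [5, 5, 30, 10, 5],
         [5, 5, 5, 5, 5]]
score, row, col = match_template(image, template)
print(score, row, col)
```

Each window position is scored independently of the others, which is what makes this computation a natural fit for a parallel GPU implementation.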

University of Parma

Bachelor's Degree in Computer Engineering (EQF level 6)

September 2009 - July 2012

107/110

Thesis A Machine Learning project based on arm gesture recognition and humanoid imitation. Arm gestures are acquired by inertial motion sensors and modelled in an OpenGL 3D virtual space. A new method based on Functional Principal Component Analysis is used in MATLAB for both a supervised clustering of training data and gesture recognition. Recognized gestures are imitated by a small humanoid robot

A Little More About Me

PERSONAL SKILLS

Languages Levels: A1/2: Basic user - B1/2: Independent user - C1/2 Proficient user - MT Mother Tongue

Communication Skills I gained good communication skills by speaking and interacting with customers and clients. I'm also part of the Data Science and Artificial Intelligence Group in Parma, and I give a talk on Machine Learning topics once every few months, usually in front of dozens of people.

Organisational/managerial Skills Excellent social/organisational skills acquired during my experience as Team Leader/Area Manager.

Other Skills

Sport:

  • American Football, 4 years’ experience in the Parma Panthers, interaction with American staff and players. Participation in European Championship with the Italian National American Football team

Driving License: B (own car)



import matplotlib.pyplot as plt
import numpy as np
import pandas as pd  # needed below for pd.read_sql_query
%matplotlib inline

import nltk
from nltk.corpus import stopwords
nltk.download('stopwords', quiet=True)
nltk.download('punkt', quiet=True)
from nltk.tokenize import word_tokenize
import string

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

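# `conn` is assumed to be an open database connection created earlier in the notebook.
# String literals use single quotes, as standard SQL expects.
working_experiences = pd.read_sql_query(
    "SELECT * FROM work_experiences "
    "WHERE employer IN ('Datalogic', 'Ubiq', 'MyInvenio', 'Cedacri', 'IBM')",
    conn,
)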
descriptions = working_experiences.drop(['employer', 'keyword'], axis=1)
keywords = working_experiences.drop(['employer', 'description'], axis=1)
employers = working_experiences.drop(['description', 'keyword'], axis=1)

def split_and_clean(input_string):
    return word_tokenize(str(input_string).lower().translate(str.maketrans('', '', string.punctuation)))

# Build the stop-word set once; calling stopwords.words() for every token is very slow.
stop_words = set(stopwords.words())

words = []
for idx in range(len(descriptions)):
    # Tokens from the description plus tokens from the keyword, minus stop words.
    tokens = split_and_clean(descriptions.iloc[idx]['description'])
    tokens += split_and_clean(keywords.iloc[idx]['keyword'])
    words.append([word for word in tokens if word not in stop_words])

hvec = HashingVectorizer(lowercase=False, analyzer=lambda l:l, n_features=2**10)
X = hvec.fit_transform(words)
tsne = TSNE(perplexity=5, method='exact')
X_embedded = tsne.fit_transform(X.toarray())

working_experiences['x'] = X_embedded[:,0]
working_experiences['y'] = X_embedded[:,1]

k = 7
kmeans = KMeans(n_clusters=k)
y_pred = kmeans.fit_predict(X_embedded)

from bokeh.models import ColumnDataSource, HoverTool, LabelSet, RadioButtonGroup, CheckboxGroup, Slider
from bokeh.transform import linear_cmap
from bokeh.palettes import Blues9
from bokeh.io import output_file, save, show
from bokeh.plotting import figure
from bokeh.layouts import column 
from bokeh.models.callbacks import CustomJS

output_file("html/skills_scatter_plot.html")

y_labels = y_pred

# data sources
source = ColumnDataSource(data=dict(x= X_embedded[:,0], y= X_embedded[:,1], x_backup = X_embedded[:,0],
                                    y_backup = X_embedded[:,1], desc= y_labels, descriptions= working_experiences['description'],
                                    employers = working_experiences['employer'], keywords = working_experiences['keyword'], 
                                    level = working_experiences['managerial_level']
                                    ))

# map colors
mapper = linear_cmap(field_name='desc', palette=Blues9,low=min(y_labels), high=max(y_labels))

# hover over information
hover = HoverTool(tooltips=[
    ("Employer", "<b>@employers{safe}</b>"),
    ("Keyword", "<b>@keywords{safe}</b>"),
    ("Description", "@descriptions{safe}")
    ],
    point_policy="follow_mouse")

p = figure(width=650, height=650, tools=[hover], title="Cluster of all Working Experiences", toolbar_location=None)
p.background_fill_color = "#323A48"
p.border_fill_color = "#323A48"
p.xaxis.major_label_text_color = None
p.yaxis.major_label_text_color = None
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
p.title.text_color = "white"

# plot
p.scatter('x', 'y', size=10, source=source, fill_color=mapper, line_alpha=0.3, line_color="black")

labels = LabelSet(x="x", y="y", text="keywords", y_offset=-20, text_font_size="9pt", text_color="#FFFFFF",
                      source=source, text_align='center')
p.add_layout(labels)

work_labels = ['All'] + list(working_experiences.employer.unique())

option = RadioButtonGroup(labels=work_labels, active=0)
skill_type_slider = Slider(start=1, end=3, value=1, step=1, title="Managerial Level")

callback = CustomJS(args=dict(p=p, source=source, level=skill_type_slider, option=option), code="""
    var radio_value = option.active;
    const current_level = level.value;
    
    var x = source.data.x;
    var y = source.data.y;
    var x_orig = source.data.x_backup;
    var y_orig = source.data.y_backup;
    var employers = source.data.employers;
    var levels = source.data.level;

    for (var i = 0; i < x.length; i++) {
        if((levels[i] >= current_level) && (employers[i] == option.labels[radio_value] || option.labels[radio_value] == 'All')) {
            x[i] = x_orig[i]
            y[i] = y_orig[i]
        } else {
            x[i] = undefined
            y[i] = undefined
        }
    }
    
    p.title.text = 'Cluster of Working Experiences in ' + option.labels[radio_value]
    
    source.change.emit();

""")

option.js_on_change('active', callback)
skill_type_slider.js_on_change('value', callback)

checkbox_group = CheckboxGroup(labels=['Labels'], active=[0])
checkbox_group.js_on_change('active', CustomJS(args=dict(labels=labels), code="""
    // `active` is an array of checked indices; index 0 is the 'Labels' box.
    labels.visible = this.active.includes(0);
    labels.change.emit();
"""))

show(column(option, checkbox_group, skill_type_slider, p))


import os
from PIL import Image

import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_gradient_magnitude

from wordcloud import WordCloud, ImageColorGenerator, STOPWORDS

# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = os.getcwd()

# text to render; `long_text` is assumed to be defined earlier in the notebook
text = long_text

# load image. This has been modified in gimp to be brighter and have more saturation.
background_color = np.array(Image.open(os.path.join(d, "images/background.jpeg")))
# subsample by factor of 3. Very lossy but for a wordcloud we don't really care.
background_color = background_color[::3, ::3]

# create mask  white is "masked out"
background_mask = background_color.copy()
background_mask[background_mask.sum(axis=2) == 0] = 255

# some finesse: we enforce boundaries between colors so they get less washed out.
# For that we do some edge detection in the image
edges = np.mean([gaussian_gradient_magnitude(background_color[:, :, i] / 255., 2) for i in range(3)], axis=0)
background_mask[edges > .08] = 255

# use a distinct name so the nltk `stopwords` module imported above is not shadowed
wc_stopwords = set(STOPWORDS)
wc_stopwords.add("different")
wc_stopwords.add("development")

# create wordcloud. A bit sluggish, you can subsample more strongly for quicker rendering
# relative_scaling=0 means the frequencies in the data are reflected less
# accurately but it makes a better picture
wc = WordCloud(max_words=2000, mask=background_mask, max_font_size=40, random_state=42,
               relative_scaling=0, stopwords=wc_stopwords,
               background_color="rgba(255, 255, 255, 0)", mode="RGBA")

# generate word cloud
wc.generate(text)

# create coloring from image
image_colors = ImageColorGenerator(background_color)
wc.recolor(color_func=image_colors)
plt.figure(figsize=(15, 10))
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
wc.to_file("images/word_cloud.png")
plt.show()


def show_skills_plot(x_label, output_file_name, skills=None):
  """Render a bar chart of skill levels and write it to output_file_name."""
  if skills is None:
    raise ValueError("you should have at least one skill, shouldn't you?")
  from bokeh.io import show, output_file
  from bokeh.plotting import figure
  from bokeh.palettes import Blues9
  from bokeh.transform import factor_cmap
  from bokeh.models import ColumnDataSource
  output_file(output_file_name)

  
  source = ColumnDataSource(skills)
  
  TOOLTIPS = """
      <div style="width:300px">
          <table>
              <tr><td style="color:DodgerBlue;">Name</td><td style="text-align: center;"><b>@description</b></td></tr>
              <tr><td style="color:DodgerBlue;">Type</td><td style="text-align: center;">@subtype</td></tr>
              <tr><td style="color:DodgerBlue;">Details</td><td style="text-align: center;">@tooltip</td></tr>
          </table>
      </div>
  """

  p = figure(width=700, height=400, title="Expertise level",
             x_range=list(skills.description), toolbar_location=None, tooltips=TOOLTIPS)
      
  index_cmap = factor_cmap('subtype', palette=Blues9, factors=sorted(skills.subtype.unique()), end=1) 
  p.vbar(x='description', top='level', width=1, source=source,
         line_color="white", fill_color=index_cmap)

  p.y_range.start = 0
  p.y_range.end = 10
  p.x_range.range_padding = 0.1
  p.xgrid.grid_line_color = None
  p.xaxis.axis_label = x_label
  p.xaxis.major_label_orientation = 1.2
  #p.xaxis.major_label_text_color = 'white'
  #p.yaxis.major_label_text_color = 'white'
  #p.xaxis.axis_label_text_color = 'white'
  p.outline_line_color = None
  #p.background_fill_color = "#323A48"
  #p.border_fill_color = "#323A48"
  #p.title.text_color = "white"

  show(p)

job_skills = pd.read_csv('legacy/job_skills.csv', sep=';')
# fall back to the skill type where no subtype is given
job_skills['subtype'] = job_skills['subtype'].fillna(job_skills['type'])


ADDITIONAL INFORMATION

Publications

Arm Gesture Recognition and Humanoid Imitation Using Functional Principal Component Analysis,
J. Aleotti, A. Cionini, L. Fontanili, S. Caselli,
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
Tokyo, November 2013

Challenges

Generali Data Challenge: top 10 finalist

Speaking

BPMNext 2019: Process Mining and DTO — How to Derive Business Rules and ROI from the Data

Cloud, the “New Normal”: Casi concreti di trasformazione e innovazione nel settore dei servizi finanziari (Concrete cases of transformation and innovation in the financial services sector)

Honours and awards

Bronze medal of sporting merit given by CONI

Projects

  • Development of projects of personal interest using current mobile and web technologies (Android, HTML, CSS, SQL, JavaScript, Python) and tools (Nutch crawler, SonarQube, Tesseract).
  • Parma Panthers Official: development of the official Android application of the Parma Panthers, the American Football team that has won the top Italian championship four times.
  • pysqoop: a Python package that lets you import data from an RDBMS into HDFS using Apache Sqoop (installable via pip).
  • Coinbase trading bot: a serverless Python bot running on AWS Lambda that trades crypto on Coinbase and can be controlled through a Telegram bot.
  • Member & Speaker of the Data Science & AI Group in Parma