Industry Day

Programme

Find the full conference programme here

Thursday the 2nd of April (Lecture Hall EI9)
8:00 Registration Opens
9:00 – 9:30 Welcome by Jussi Karlgren
9:30 – 10:30 Keynote Speaker: William Stevens (Europe Unlimited)
10:30 – 11:00 Coffee Break (Foyer)
11:00 – 12:30 Discussion Panel: The Idea Reactor
12:30 – 13:30 Lunch
13:30 – 16:30 Startup & Technology Talks
Catalyst – focus on an existing real world problem
Jeremy Pickens, Catalyst, Denver
Signal: The journey so far…
Miguel Martinez and Udo Kruschwitz, Signal, London
Thomson Reuters: Challenges and Evolution of an Information Company over time
Vassilis Plachouras, Thomson Reuters, London
Seznam.cz – the story of a successful web search engine
Jiří Materna, Seznam, Prague
From Last.fm to Lumi – Helping people find great content based on implicit userdata
Martin Stiksel, Lumi, London
Task selection based on topic induction from an IR perspective
João Graça, Unbabel, Lisbon
Spinque: a story of adaptation
Arjen P. de Vries, Spinque, Utrecht
15:00 – 15:30 Coffee Break (Foyer)
16:30 – 17:15 Summing up, final comments, and closing

Industry Day Keynote

William Stevens (Europe Unlimited)

A Belgian national, William Stevens graduated from EAP-European School of Management (Paris, Oxford, Berlin – Diplôme de Grande Ecole de Gestion – Diplom Kaufmann) in 1989 and previously from the Catholic University of Brussels in Economics. He embarked on his professional career at the European Venture Capital Association (EVCA), where he was appointed Secretary General at the age of 25. He launched several ambitious initiatives, including one that led to the creation of EASDAQ (which became Nasdaq Europe), while significantly growing revenues, profits and membership. William founded Europe Unlimited in 1998 to be a much-needed European hub for fast-growing entrepreneurs raising their profile with venture capital investors. Today, Europe Unlimited has achieved that difficult mission and is a profitable company. Europe Unlimited organises 25 international venture and technology partnering forums with over 1,000 presenting entrepreneurs every year, attracting a real network of venture capital investors, corporate partners, university tech transfer groups, innovation policy makers and deal makers. William also founded the International Venture Club (www.iventureclub.com), a collaborative platform of leading venture investors, involving independent venture funds and corporate, institutional and government investors. He also partnered at a strategic level with www.techtour.com. William speaks English, French, Dutch and German and gets by in Italian. His interests are his family and friends, reading politics, travelling and hiking. He is married and has two daughters.

Abstract: The core activities of Europe Unlimited and Tech Tour are international investment events where innovative entrepreneurs present to, and meet with, venture investors and corporate partners. These events are hosted by development agencies and clusters in association with strong local partners fostering investment, innovation and entrepreneurship. Today the group, including Tech Tour and the International Venture Club, organises 25 international events where 1,000 selected entrepreneurs meet with venture capital investors and corporate partners. The Tech Tour group, with offices in Brussels and Geneva and a new location, has the ambition to become the number one community platform facilitating world-class innovative entrepreneurship and investment.

Startup & Technology Talks

Jeremy Pickens (Catalyst, Denver)

Jeremy Pickens is a Senior Applied Research Scientist at Catalyst Repository Systems. Dr. Pickens has spearheaded the development of Catalyst’s Insight Predict, a platform for recall-oriented search and review. His ongoing research and development focuses on methods of using iterative, exploratory, and collaborative techniques to achieve higher recall, more precise results in the e-discovery domain. Dr. Pickens earned his doctorate at the University of Massachusetts, Amherst, Center for Intelligent Information Retrieval. He conducted his post-doctoral work at King’s College, London. As part of the OMRAS project (Online Music Recognition and Searching), he helped organize the first Music Information Retrieval (ISMIR) conference in Plymouth, Mass. Before joining Catalyst, Dr. Pickens spent five years as a research scientist at FX Palo Alto Lab, where his major research themes included video search and collaborative exploratory search.

Catalyst – focus on an existing real world problem

Abstract: In the technology space, many companies or spinoffs are started as a result of the invention of cool or interesting technology. Applications for that technology are sought and, if the application is not a good match, the company then pivots to a new application using much of the same core technology (e.g. Flickr, IBM’s Watson). Often, the novelty of the technology is used to drive entirely new market segments, to create needs where none had existed previously (e.g. Pandora, Twitter). These approaches are the glamorous side of startups; they are the model that many technologists strive for when inventing new technology. There is, however, a less glamorous but equally viable approach. That is to focus on an already existing real-world problem (preferably a high-value, currently underserved problem) and continue to pivot the technology itself, rather than the application of that technology, in an attempt to get closer and closer to solving that problem. This is the approach that Catalyst, a technology company working in the legal market, has taken. In this presentation I will detail our company’s startup trajectory, as a spinoff from a law firm rather than from a university research lab or Silicon Valley garage, and offer recommendations for others wishing to follow a similar path.

Miguel Martinez-Alvarez (Signal, London)

Dr. Miguel Martinez-Alvarez is a researcher and a developer, mainly in the fields of Text Analytics and Information Retrieval. He is also the co-founder and Head of Research of Signal, a revolutionary platform for analysing text and discovering business intelligence. His main role is to investigate the best possible algorithms and methods created in different fields of the academic world and apply them in a large-scale, commercially viable product.

Signal: The journey so far…

Abstract: While search technology has matured significantly in recent years, progress in information filtering has been limited. This is puzzling given that both areas appear to be different sides of the same coin. Signal focuses on the information filtering field and delivers real-time relevant information based on personalised feeds. Such feeds are defined using keywords, locations, entities, topics and industries, and the system processes more than 3 million documents a day from 65,000 traditional sources and 3.5 million blogs.

It all started two years ago when David Benigson (now CEO of Signal) noticed that all the existing marketing intelligence tools, i.e. information filtering applications, suffered from either being very expensive or delivering poor precision and recall, while offering poor user experience and long integration efforts. David realised that he needed support to develop and scale the solution that would make Signal a reality. He hired an experienced software architect (Wesley Hall) as Signal’s CTO, and he was also awarded a KTP (Knowledge Transfer Partnership) grant with the University of Essex that allowed Miguel Martinez to become the Head of Research at Signal. This defined the philosophical backbone of the company: to transform cutting-edge research emerging from an academic context into commercially viable, large-scale systems packaged up as simple and elegant products.

Signal has grown from a three-man company in a family garage in North-West London to a business with 15 employees (including several data scientists), with investment from VCs and paying customers. We are now working on a new product with a long list of corporations and professionals in different sectors waiting for it. And this is only the beginning of the journey…

Vassilis Plachouras (Thomson Reuters)

Vassilis Plachouras is a Senior Research Scientist at Thomson Reuters Corporate Research and Development Group, and is based in London, UK. He received his PhD in computing science from the University of Glasgow, where he was also involved in the design and development of Terrier, an open source search engine. Before joining Thomson Reuters in 2014, he worked at Yahoo! Research (Spain), PRESANS, an open innovation startup company (France), and Athena Research Center (Greece). His main research interests are in the area of Information Retrieval and applied Machine Learning, and he was the co-recipient of the best-paper award in the 18th CIKM conference.

Thomson Reuters: Challenges and Evolution of an Information Company over time

Abstract: Thomson Reuters is the world’s leading source of intelligent information for professionals and enterprises. Created in 2008 through the acquisition of Reuters Group PLC by The Thomson Corporation, it has a history of more than 150 years, evolving from a news reporting agency and newspaper publisher to a provider of high quality information for professionals in a broad range of sectors, namely finance & risk, legal, tax & accounting, intellectual property & science, and REUTERS news. Since 2008, Thomson Reuters has launched a series of flagship products, including WestlawNext, the leading system for legal research; Eikon, a platform for accessing and analyzing financial data; Elektron, a suite of data and trading propositions; and Cortellis, a powerful intelligence platform for the life science market.

First, I will present an overview of the history of Thomson Reuters from Victorian times until today, providing insight into the way in which the company has transformed and adapted in response to a rapidly changing environment. I will outline some of the challenges that the company faced in the past and the key decisions that enabled Thomson Reuters to successfully overcome those challenges. Second, I will describe how Thomson Reuters’s services and products can enable third parties to build applications that leverage the wealth of accurate and unbiased information provided by Thomson Reuters for a range of industry sectors.

Jiří Materna (Seznam.cz)

Jiří Materna is the Head of the Research department at Seznam.cz. He is a passionate researcher and software engineer, mostly focusing on challenging problems from the field of machine learning and natural language processing. After graduating from the Faculty of Informatics at Masaryk University, where he later got a Ph.D. in NLP, he briefly worked for a British lexicographical company, Lexical Computing Ltd. After that, he joined the full-text search group at Seznam.cz, where he is currently employed.

Seznam.cz – the story of a successful web search engine

Abstract: Seznam.cz, founded in 1996, is the biggest Czech Internet company. Combining a media house with state-of-the-art technological solutions, it generates 60% of all Czech web page views and visits and is therefore perceived by many Czechs as a synonym for the Internet. The uniqueness of Seznam is supported by its web search engine; the Czech Republic is the only country in the world where the Latin alphabet is used and where a local company successfully prevents global search engines from dominating the local market.

In this talk, I would like to briefly outline the history of our search engine, which has naturally evolved from a simple directory of web pages. I will describe the technological difficulties we have encountered during the last decade as well as their practical solutions. Later, I would like to focus on models of cooperation with universities that help us to stay in touch with recent academic trends in information retrieval and machine learning.

Martin Stiksel (Lumi.do, London)

After a stint in journalism and music production, Martin Stiksel created the music recommendation website Last.fm together with Felix Miller and Richard Jones in London in 2002. With its unique system of recommending music based on the songs people had previously listened to, Last.fm became the place for new music for over 40M music fans. Since 2011 Martin has been working on Lumi, a website where people can use their browsing history to find things they are interested in. Again working with Felix Miller, he hopes to bring the Last.fm concept of effortless discovery not just to music, but to all sorts of online content.

From Last.fm to Lumi – Helping People Find Great Content Based on Implicit Userdata

Abstract: Last.fm had a simple proposition: it kept a record of what music you listened to and, based on this music profile, found you new music that you would like, too.

The lifeblood of this music recommendation service was its data: the play counts of what music people actually listened to, straight from their personal iTunes or from their account on YouTube.

It allowed everybody to feel like a music specialist, helping them find new music without having to do the legwork of going to specialist shops or reading the right music blogs.

Lumi is aiming to do something similar: Helping users discover new online content based on the webpages they have visited before, taking their browsing data as the starting point. The personalised news app Lumi is currently in beta on the Android store.

João Graça (Unbabel)

João Graça is currently the CTO of Unbabel. He was previously the data scientist and natural language processing expert at Dezine and Flashgroup. João did his PhD in Natural Language Processing and Machine Learning at Instituto Superior Técnico together with the University of Pennsylvania with Professors Fernando Pereira, Ben Taskar and Luísa Coheur. He is the author of several papers in the area; his main research topics are machine learning with side information, unsupervised learning and machine translation. João is one of the co-founders of the Lisbon Machine Learning Summer School.

Task Selection Based on Topic Induction from an IR Perspective

Abstract: Within a crowdsourced translation platform, the process of assigning tasks to users has a tremendous impact on the quality of the translation. At Unbabel, an AI-powered translation platform, users’ tasks consist of editing and correcting translations previously done by either other users or a machine translation engine. The problem of user selection is particularly important in this case, since selecting the user with the right expertise will mean the difference between a quality translation and a poor one. Other constraints, such as the SLA of the translation and the uncertainty of which users will be available at which times, make it hard to use a straightforward retrieval model.

In this talk I will describe the current approach at Unbabel, where we create user profiles based on past history and ratings of previous tasks and use this information to guide the assignment process, while respecting time constraints. We will discuss advantages and disadvantages and show initial results on the impact this approach has had on speed and quality. Initial results show an improvement of 30% in the average quality of the produced translations.

Arjen P. de Vries (Spinque, Utrecht)

Arjen P. de Vries leads the Information Access research group at the Centrum Wiskunde & Informatica (CWI) in Amsterdam. He also holds a part-time full professor position at Delft University of Technology. In November 2009, he co-founded CWI spin-off company Spinque to satisfy his interest in the integration of information retrieval and databases. Spinque develops novel search solutions based on “Search by Strategy”, an iterative two-stage search process that separates search strategy definition (the how) from actual searching and browsing the collection (the what). This way, information specialists can reclaim their expertise in a time dominated by a “do-it-yourself” attitude to search. The technology builds on research in information retrieval (probabilistic relational algebra) and database architecture (column-stores) to turn the engineering of tailored search engines into a simple, flexible and efficient process.

Spinque: a story of adaptation

Abstract: Spinque is a spin-off company from CWI that builds on research into the integration of Databases and Information Retrieval. We build tailor-made search engines over connected datasets. Our technology allows users to compose what we call “search strategies” out of pre-defined building blocks, and compile these strategies into custom, interactive search engines. With Spinque, information specialists can determine the way they approach their search tasks, and adapt their search strategy to the task at hand.

Our main selling point today is to provide diversity thanks to flexibility. Relying on our unique approach to generate search engine technology from declarative specifications, we can provide the search engine for “search-based applications” at a low cost, tackling problems in core search but also related domains such as business intelligence, recommendation and ontology alignment.

In our initial plan, we had a launching customer who required patent search solutions, and we would have been their search technology partner. We have always been very technology oriented; until we asked outsiders for an honest opinion, we used the code name 5F, which represented the five occurrences of the letter “f” in the key technical benefits of our envisioned product: Flexibility, eFFiciency, and eFFectiveness. While our ideas have evolved, the essence of our product remains that we create technical backend solutions to address complex search needs.

Technically, the Spinque platform relies on the most advanced solutions taken from various academic disciplines: columnar database architecture, query optimisation, probabilistic reasoning, information retrieval. Commercially, the challenge of selling search by strategy is not trivial. First, almost every organisation already has at least one solution for search. Ideally, we would replace their current solutions with our own framework and then build better search solutions on top of it, but replacing existing software is usually not within reach. Another complication is the strong market position of software solutions that appear to be free, supported by large established developer communities. Once we convince management to choose Spinque, we then have to convince developers that they should not program low-level search features, but rely on a declarative approach. Just as with traditional database technology, many developers prefer to run their own solutions rather than having to learn a tool suite developed by others. Finally, potential customers do not always understand that their problem is really a complex one. Our challenge here is to avoid oversimplifying either the problem or the tools, while making the overall solution feel achievable.

Looking forward, our ambition is to become the first choice for supplying “middleware search technology”. Thanks to the advances made in our search platform, we can define, generate and deploy search engine APIs with very low overhead. Keeping this platform up-to-date with advances in search and related disciplines requires a dedicated team with high technical expertise. We aim to create a partner network for developing the end-to-end search solution, and to build a pool of developers who realise the power of our search platform.

Industry Day Chairs

Jussi Karlgren (Gavagai and KTH, Sweden) and Paul Ogilvie (LinkedIn, USA)