Meet team 3: DEKX (DERI Knowledge Xcavator)
Team members (from left to right):
Vit Novacek - DERI, National University of Ireland, Galway, Ireland
Siegfried Handschuh - DERI, National University of Ireland, Galway, Ireland
Tudor Groza - DERI, National University of Ireland, Galway, Ireland
Project Title: CORAAL – Dive into Publications, Bathe in the Knowledge
Semi-final report as submitted to the judges
Extended version of our semifinal report
INTERVIEW WITH TEAM 3
>> Academic background of team members
Vit Novacek - contact point and knowledge representation enchanter; technical
responsibilities within the challenge: knowledge acquisition, KR&R,
Vit holds BA degrees in media studies and political philosophy and BSc and MSc
degrees in theoretical computer science and AI from Masaryk University, Czech Republic. Since 2007, he has been a PhD student at DERI, National University of Ireland Galway. His primary research concerns novel methods for representation and exploitation of emergent (e.g., automatically extracted) knowledge. He is also interested in applications of his research in the context of life sciences and bioinformatics. So far, Vit has been the primary author of more than 10 peer-reviewed international publications concerning the above topics (including high-profile journal and magazine articles and two book chapters).
Siegfried Handschuh - scientific supervisor, semantic annotation and
Dr. Siegfried Handschuh is an SFI Stokes Lecturer at the National University of Ireland, Galway (NUIG) and research leader of the Semantic Collaboration team at the Digital Enterprise Research Institute (DERI). Siegfried holds Honours Degrees in both Computer Science and Information Science and a PhD from the University of Karlsruhe. He has published over 70 works, including books, journal articles, book chapters, and conference and workshop contributions, mainly in the areas of Annotation for the Semantic Web, Knowledge Acquisition and Social Semantic Collaboration.
Tudor Groza - hacking advisor and document metadata wizard; technical
responsibilities within the challenge: document annotation, decomposition,
interlinking and visualisation infrastructure
Tudor received his BSc and MSc degrees in Computer Science from the Technical University of Cluj-Napoca, Romania, in partnership with DaimlerChrysler AG Berlin. Since 2006, he has been a PhD student at the Digital Enterprise Research Institute, National University of Ireland, Galway. His main research interests include methods for the representation and extraction of shallow and deep metadata from scientific publications, argumentation modeling and the Social Semantic Desktop. He has authored and co-authored a number of peer-reviewed international publications (including a book chapter) on the above-mentioned topics.
>> Current research interests related to Elsevier Grand Challenge
One of our primary interests is more efficient automated extraction and
integration of the knowledge present within legacy life science resources
that still somehow resist truly deep and meaningful machine processing.
By resources we mean mainly relevant publications, but also patient record
databases and similar semi-structured data like experimental protocols.
We have investigated and implemented a novel emergent knowledge representation and reasoning framework that will enable meaningful and effortless exploitation of automatically extracted knowledge at large scale. We believe this is very pertinent to the challenge tag-line, and we will do our best to prove it when playing with the Elsevier data.
Besides the automated knowledge extraction and processing, we are
involved in another major research topic - design and development of
frameworks for enriching scientific publications with semantic annotations
and a semantic structure. We investigate unobtrusive methods for the
manual acquisition of the semantic annotations, as well as respective
NLP-based automated annotation techniques. For the challenge purposes,
we are going to combine the architecture and interfaces for browsing and
searching in semantically annotated and interlinked document
repositories with the above-mentioned knowledge extraction and inference
>> Why were you inspired to enter the Grand Challenge?
Within our research on semantic authoring and publishing, we have long
followed the general vision of changing the way science is communicated
and published. Another dream we have
started to fulfil recently is to make the knowledge hidden in legacy
unstructured resources more amenable to efficient and meaningful
computer-aided exploitation. When we noticed that Elsevier shares these
visions of ours and is interested in realising them together with selected
researchers, we immediately decided to participate in this exciting
endeavour in order to refine and showcase a selection of the fruits of our
work so far.
We have had another very relevant motivation, since we do not do
computer science research just because we like to play with gadgets and
bleeding-edge technologies. We would also like to see the results of
our work helping people, no matter how naive or clichéd it may
sound. Since we are quite lazy (like most computer science
researchers, who want to make machines do their work for them), we are
interested only in the most straightforward solutions. And we can hardly
think of a more direct "philanthropic" application of our research than
making global biomedical and life science knowledge more efficiently
accessible to the experts, who would use it to treat
diseases better and do other nice stuff we non-experts do not know much
Turning to the less dreamy reasons for our decision to participate, Elsevier
has provided a lot of real-world data for the challenge that can be used
to test our tools and make them work better. Also, if there were a
possibility of further long-term cooperation with Elsevier,
even better. Elsevier is a huge publishing house with a large
community of expert users, who could tell us how they like the services
we research and what else they would suggest in order to improve them.
This is an opportunity that is usually not that easily achievable in the
context of testing a research prototype.
>> What do you see as the greatest challenge in finalizing your Grand Challenge project? (whether substantive, logistical, team composition, working solo etc.)
We foresee three major issues that will keep us really busy when tackling
our Grand Challenge. First, we have to deal with the scalability of the
document set processing (since we want to process thousands of articles,
and more if at all feasible, already at the prototype stage). We must
carefully decide which features to sacrifice, if any, while still
maintaining a sufficient lead over the state of the art. Second,
there are some engineering issues related to the software integration of
the knowledge representation and reasoning framework and the
semantically-enhanced document repository architecture. And third, we
want to design and implement the application interfaces and data
presentation principles in as elaborate and user-friendly a way as possible
already for the prototype to be delivered, which may prove to be a
>> What would you do with the prize money?
First we were thinking like, wow, that would be a nice used car for Vit,
a lot of cool toys for Siegfried's children and a bunch of little useless
thingies to pamper our girlfriends and wives! But after a few days of picturing
this hedonistic spree, we realised, to our great surprise, that this would be
a bit inconsistent with our primary motivations for entering the challenge.
Therefore we (very sadly, though) decided to revise the original plan
and propose a solution that would maintain our plain lives of humble
We got together and agreed on yet another crazy vision we would try to realise if we managed to be wonderfully surprised by the eventual victory. How about getting rid of all the money then? How about using the money for establishing a foundation, which would support AI- and Semantic Web-powered advances in the somewhat challenging domain of cancer research and treatment? Such a foundation could perhaps support an elaboration of our challenge project results
in cooperation with a community of interested life science research partners (among other things). It could also help to establish respective fruitful interdisciplinary research partnerships and/or provide funding for technical resources required in order to make further progress in bridging AI, Semantic Web and cancer research internationally.
>> Does your team have a name? If not, what would best personify your Grand Challenge equip?
The name of our team is DERI Knowledge Xcavator (abbreviated as DEKX, pronounced like "decks") - no matter how deep a piece of knowledge is
hidden in the huge sediment deposits of legacy resources, we will dig it
out, put it together with other pieces and deliver them to the right hands.
Further information on our project is posted at:
You can also check out our group's blog: