The 3rd International Workshop on Concept Discovery in Unstructured Data (CDUD 2016)
NEW: Venue
The workshop session will be held at the main venue (Myasnitskaya 11) in the Conference Hall, 5th floor.
Call for papers
Concept discovery is a subdomain of Knowledge Discovery (KDD) that uses human-centered techniques such as Formal Concept Analysis (FCA), Topic Modeling, Visual Text Representations, Conceptual Graphs, etc. to gain insight into the underlying conceptual structure of the data. Traditional machine learning techniques focus mainly on structured data, whereas most available data resides in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments actively engage the domain expert in the discovery process.
This workshop welcomes papers describing innovative research on data discovery techniques. Moreover, this workshop intends to provide a forum for researchers and developers of data mining instruments working on issues associated with analyzing unstructured data. First, we are interested in methods for transforming unstructured information into semi-structured information. Unstructured information such as texts or images can be tagged, keywords can be extracted from texts by means of Natural Language Processing methods, etc. For example, so-called Learning Representations such as Text Vectors or Visual Words have recently gained much attention in the domain of unstructured data. Second, in this workshop we also particularly welcome research on using human-centered instruments such as FCA to analyze unstructured and semi-structured data. Applications in which we are interested include but are not limited to Text Mining and Web Mining, including forums, blogs, social sharing systems like Twitter and Facebook, mining sociological interviews, etc. We are also interested in innovative instruments for dealing with knowledge incompleteness and asymmetry.
Subject coverage
- Applications of FCA for discovery purposes
- Association Rules and Frequent Closed Itemsets
- Biclustering and Multimodal clustering
- Conceptual Clustering
- Dealing with knowledge incompleteness and asymmetry
- Deep Learning for Text Representations
- Discovery techniques for conceptual models
- Efficient indexing and structuring algorithms
- Formal Concept Analysis
- Graph Mining
- Knowledge discovery and representation
- Natural Language Processing
- Ontology Learning from text
- Probabilistic concept discovery
- Text Kernels
- Text Mining
- Topic Modeling
- Visual Analytics
Workshop chairs
Jaume Baixeries, Universitat Politècnica de Catalunya, Catalonia
Dmitry Ignatov, National Research University Higher School of Economics, Russia
Dmitry Ilvovsky, National Research University Higher School of Economics, Russia
Alexander Panchenko, Technische Universität Darmstadt, Germany
Program committee
Simon Andrews, Sheffield Hallam University, United Kingdom
Jaume Baixeries, Barcelona, Spain
Alexei Buzmakov, National Research University Higher School of Economics, Russia
Víctor Codocedo, LORIA, Nancy
Florent Domenach, University of Nicosia, Cyprus
Bernd Fischer, Stellenbosch University, South Africa
Gillian Greene, Stellenbosch University, South Africa
Dmitry Ilvovsky, National Research University Higher School of Economics, Russia
Martin Trnecka, UPOL, Czech Republic
Sergey Zykov, National Research University Higher School of Economics, Russia
Mehdi Kaytoue, LORIA, Lyon, France
Francesco Kriegel, TU Dresden, Germany
Sergei O. Kuznetsov, National Research University Higher School of Economics
Jan Konecny, Dept. Computer Science, Palacky University, Olomouc
Natalia Loukachevitch, Lomonosov Moscow State University, Russia
Dmitry Mouromtsev, National Research University of Information Technologies, Mechanics and Optics, Russia
Xenia Naidenova, Military Medical Academy, Russia
Amedeo Napoli, LORIA, Nancy, France
Alexey Neznanov, National Research University Higher School of Economics, Russia
Artem Revenko, TU Dresden, Germany
Pablo Cordero, Universidad de Málaga, Spain
Inma P. Cabrera, Universidad de Málaga, Spain
Alexander Panchenko, Technische Universität Darmstadt, Germany
Uta Priss, Edinburgh Napier University, United Kingdom
Jan Outrata, Dept. Computer Science, Palacky University, Czech Republic
Dmitry Ustalov, Ural Federal University, Russia
Barish Sertkaya, Frankfurt University of Applied Sciences, Germany
Dominik Slezak, University of Warsaw, Poland
Rustam Tagiew, Polarez GmbH, Germany
Jesus Medina Moreno, University of Cádiz, Spain
Important dates
Submission deadline: May 31, 2016
Notification of acceptance: June 18, 2016
Camera-ready due: June 23, 2016
Workshop: July 18, 2016
All accepted papers will be included in the workshop’s proceedings to be published online on the CEUR-Workshop web site in a volume with ISSN and indexed by Scopus. Two previous editions are available at the CEUR-Workshop web site: and
Submission Procedure
An electronic version of the full paper, complete with authors’ affiliations, should be submitted through the conference electronic submission system.
Use the submission link
Manuscripts must be prepared with LaTeX or Microsoft Office and should follow the Springer format available at
The workshop’s registration charge covers at most 3 accepted papers per individual author. Papers longer than 12 pages are not allowed.
Accepted Papers
- Valentin Malykh and Alexey Ozerin, Reproducing Russian NER Baseline Quality Without Additional Data
- Mikhail Kreines and Elena Kreines, Topic Modeling without Generative Probabilistic Model: Approach and its Validation (research proposal)
- Oksana Dereza and Vladislav Tushkanov, Verb-Noun Collocation and Government Model Extraction from Large Corpora (research proposal)
- Gillian Greene and Bernd Fischer, Single-Focus Broadening Navigation in Concept Lattices
- Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya, Alexandr Sboev, Dmitry Gudovskikh, Ivan Moloshnikov and Roman Rybka, Predicting The Gender of an Author of a Russian Text Using Regression and Classification Techniques
- Umme Hafsa Billah, Sheikh Muhammad Sarwar and Abdullah-Al-Mamun, Personalized Language Models for Computer-mediated Communication
- Abdus Satter, Amit Seal Ami and Kazi Sakib, Identification of Dead Fields by Analyzing Usage of Setup Fields and Field Dependency in Test Code
- Bato Merdygeyev and Sesegma Dambaeva, The Evaluation of the Quality of Ontology Based on Analysis of Relations on Concept Lattice (research proposal)
- Mikhail Bogatyrev and Kirill Samodurov, Framework for Conceptual Modeling on Natural Language Texts
- Ekaterina Chernyak and Dmitry Ilvovsky, Annotated Suffix Trees for Text Clustering
Workshop program
- 9:40 - 10:20. Invited talk: Natalia Loukachevitch, Sentiment analysis of Twitter messages: tasks, approaches and results
- 10:20 - 10:40. Valentin Malykh and Alexey Ozerin, Reproducing Russian NER Baseline Quality Without Additional Data
- 10:40 - 11:00. Gillian Greene and Bernd Fischer, Single-Focus Broadening Navigation in Concept Lattices
- 11:00 - 11:20. Coffee-break
- 11:20 - 11:40. Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya, Alexandr Sboev, Dmitry Gudovskikh, Ivan Moloshnikov and Roman Rybka, Predicting The Gender of an Author of a Russian Text Using Regression and Classification Techniques
- 11:40 - 12:00. Ekaterina Chernyak and Dmitry Ilvovsky, Annotated Suffix Trees for Text Clustering
- 12:00 - 12:20. Mikhail Bogatyrev and Kirill Samodurov, Framework for Conceptual Modeling on Natural Language Texts
- 12:20 - 12:40. Bato Merdygeyev and Sesegma Dambaeva, The Evaluation of the Quality of Ontology Based on Analysis of Relations on Concept Lattice
- 12:40 - 13:00. Umme Hafsa Billah, Sheikh Muhammad Sarwar and Abdullah-Al-Mamun, Personalized Language Models for Computer-mediated Communication
- 13:00 - 14:00. Lunch
- 14:00 - 14:20. Abdus Satter, Amit Seal Ami and Kazi Sakib, Identification of Dead Fields by Analyzing Usage of Setup Fields and Field Dependency in Test Code
- 14:20 - 14:30. Mikhail Kreines and Elena Kreines, Topic Modeling without Generative Probabilistic Model: Approach and its Validation (research proposal, 5-minute poster intro)
- 14:30 - 14:40. Oksana Dereza and Vladislav Tushkanov, Verb-Noun Collocation and Government Model Extraction from Large Corpora (research proposal, 5-minute poster intro)