The 3rd International Workshop on Concept Discovery in Unstructured Data (CDUD 2016)

NEW: Venue

The workshop session will be held at the main venue (Myasnitskaya 11) in the Conference Hall, 5th floor.

Call for papers

Concept discovery is a subdomain of Knowledge Discovery (KDD) that uses human-centered techniques such as Formal Concept Analysis (FCA), Topic Modeling, Visual Text Representations, Conceptual Graphs, etc. to gain insight into the underlying conceptual structure of the data. Traditional machine learning techniques mainly focus on structured data, whereas most available data resides in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments actively engage the domain expert in the discovery process.

This workshop welcomes papers describing innovative research on data discovery techniques. Moreover, it intends to provide a forum for researchers and developers of data mining instruments working on issues associated with analyzing unstructured data. First, we are interested in methods for transforming unstructured information into semi-structured information. Unstructured information such as texts or images can be tagged, keywords can be extracted from texts by means of Natural Language Processing methods, etc. For example, so-called Learning Representations such as Text Vectors or Visual Words have recently gained much attention in the domain of unstructured data. Second, we particularly welcome research on using human-centered instruments such as FCA to analyze unstructured and semi-structured data. Applications in which we are interested include, but are not limited to, Text Mining and Web Mining (including forums, blogs, and social sharing systems like Twitter and Facebook), mining sociological interviews, etc. We are also interested in innovative instruments for dealing with knowledge incompleteness and asymmetry.

Subject coverage

- Applications of FCA for discovery purposes

- Association Rules and Frequent Closed Itemsets

- Biclustering and Multimodal clustering

- Conceptual Clustering

- Dealing with knowledge incompleteness and asymmetry

- Deep Learning for Text Representations

- Discovery techniques for conceptual models

- Efficient indexing and structuring algorithms

- Formal Concept Analysis

- Graph Mining

- Knowledge discovery and representation

- Natural Language Processing

- Ontology Learning from text

- Probabilistic concept discovery

- Text Kernels

- Text Mining

- Topic Modeling

- Visual Analytics

Workshop chairs

Jaume Baixeries, Universitat Politècnica de Catalunya, Catalonia

Dmitry Ignatov, National Research University Higher School of Economics, Russia

Dmitry Ilvovsky, National Research University Higher School of Economics, Russia

Alexander Panchenko, Technische Universität Darmstadt, Germany

Program committee

Simon Andrews, Sheffield Hallam University, United Kingdom

Jaume Baixeries, Barcelona, Spain

Alexei Buzmakov, National Research University Higher School of Economics, Russia

Víctor Codocedo, LORIA, Nancy

Florent Domenach, University of Nicosia, Cyprus

Bernd Fischer, Stellenbosch University, South Africa

Gillian Greene, Stellenbosch University, South Africa

Dmitry Ilvovsky, National Research University Higher School of Economics, Russia

Martin Trnecka, UPOL, Czech Republic

Sergey Zykov, National Research University Higher School of Economics, Russia

Mehdi Kaytoue, LORIA, Lyon, France

Francesco Kriegel, TU Dresden, Germany

Sergei O. Kuznetsov, National Research University Higher School of Economics, Russia

Jan Konecny, Dept. Computer Science, Palacky University, Olomouc

Natalia Loukachevitch, Lomonosov Moscow State University, Russia

Dmitry Mouromtsev, National Research University of Information Technologies, Mechanics and Optics, Russia

Xenia Naidenova, Military Medical Academy, Russia

Amedeo Napoli, LORIA, Nancy, France

Alexey Neznanov, National Research University Higher School of Economics, Russia

Artem Revenko, TU Dresden, Germany

Pablo Cordero, Universidad de Málaga, Spain

Inma P. Cabrera, Universidad de Málaga, Spain

Alexander Panchenko, Technische Universität Darmstadt, Germany

Uta Priss, Edinburgh Napier University, United Kingdom

Jan Outrata, Dept. Computer Science, Palacky University, Czech Republic

Dmitry Ustalov, Ural Federal University, Russia

Barish Sertkaya, Frankfurt University of Applied Sciences, Germany

Dominik Slezak, University of Warsaw, Poland

Rustam Tagiew, Polarez GmbH, Germany

Jesus Medina Moreno, University of Cádiz, Spain

Important dates

Submission deadline: May 31, 2016

Notification of acceptance: June 18, 2016

Camera-ready due: June 23, 2016

Workshop: July 18, 2016

Proceedings

All accepted papers will be included in the workshop proceedings, which will be published online on the CEUR-Workshop web site in a volume with an ISSN and indexed by Scopus. Two previous editions are available at the CEUR-Workshop web site: http://ceur-ws.org/Vol-757/ and http://ceur-ws.org/Vol-871/.

Submission Procedure

An electronic version of the full paper, complete with the authors' affiliations, should be submitted through the conference electronic submission system.
Use the submission link http://www.easychair.org/conferences/?conf=cdud2016.
Manuscripts must be prepared with LaTeX or Microsoft Office and should follow the Springer format available at http://www.springer.de/comp/lncs/authors.html.
The workshop's registration charge covers at most 3 accepted papers by an individual author. Papers longer than 12 pages are not allowed.

Accepted Papers

Valentin Malykh and Alexey Ozerin, Reproducing Russian NER Baseline Quality Without Additional Data
Mikhail Kreines and Elena Kreines, Topic Modeling without Generative Probabilistic Model: Approach and its Validation (research proposal)
Oksana Dereza and Vladislav Tushkanov, Verb-Noun Collocation and Government Model Extraction from Large Corpora (research proposal)
Gillian Greene and Bernd Fischer, Single-Focus Broadening Navigation in Concept Lattices
Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya, Alexandr Sboev, Dmitry Gudovskikh, Ivan Moloshnikov and Roman Rybka, Predicting The Gender of an Author of a Russian Text Using Regression and Classification Techniques
Umme Hafsa Billah, Sheikh Muhammad Sarwar and Abdullah-Al-Mamun, Personalized Language Models for Computer-mediated Communication
Abdus Satter, Amit Seal Ami and Kazi Sakib, Identification of Dead Fields by Analyzing Usage of Setup Fields and Field Dependency in Test Code
Bato Merdygeyev and Sesegma Dambaeva, The Evaluation of the Quality of Ontology Based on Analysis of Relations on Concept Lattice (research proposal)
Mikhail Bogatyrev and Kirill Samodurov, Framework for Conceptual Modeling on Natural Language Texts
Ekaterina Chernyak and Dmitry Ilvovsky, Annotated Suffix Trees for Text Clustering

Program

9:40 - 10:20. Invited talk: Natalia Loukachevitch, Sentiment analysis of Twitter messages: tasks, approaches and results
10:20 - 10:40. Valentin Malykh and Alexey Ozerin, Reproducing Russian NER Baseline Quality Without Additional Data
10:40 - 11:00. Gillian Greene and Bernd Fischer, Single-Focus Broadening Navigation in Concept Lattices
11:00 - 11:20. Coffee break
11:20 - 11:40. Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya, Alexandr Sboev, Dmitry Gudovskikh, Ivan Moloshnikov and Roman Rybka, Predicting The Gender of an Author of a Russian Text Using Regression and Classification Techniques
11:40 - 12:00. Ekaterina Chernyak and Dmitry Ilvovsky, Annotated Suffix Trees for Text Clustering
12:00 - 12:20. Mikhail Bogatyrev and Kirill Samodurov, Framework for Conceptual Modeling on Natural Language Texts
12:20 - 12:40. Bato Merdygeyev and Sesegma Dambaeva, The Evaluation of the Quality of Ontology Based on Analysis of Relations on Concept Lattice
12:40 - 13:00. Umme Hafsa Billah, Sheikh Muhammad Sarwar and Abdullah-Al-Mamun, Personalized Language Models for Computer-mediated Communication
13:00 - 14:00. Lunch
14:00 - 14:20. Abdus Satter, Amit Seal Ami and Kazi Sakib, Identification of Dead Fields by Analyzing Usage of Setup Fields and Field Dependency in Test Code
14:20 - 14:30. Mikhail Kreines and Elena Kreines, Topic Modeling without Generative Probabilistic Model: Approach and its Validation (research proposal, 5-minute poster intro)
14:30 - 14:40. Oksana Dereza and Vladislav Tushkanov, Verb-Noun Collocation and Government Model Extraction from Large Corpora (research proposal, 5-minute poster intro)