Prof. Osvaldo N. Oliveira Jr.
São Carlos Institute of Physics and Center for Computational Linguistics (NILC)
University of São Paulo, Brazil
Speech Title: Complex networks in natural language processing
Speech Abstract: This talk presents an overview of the applications of complex networks in natural language processing tasks, including summarization, semantic disambiguation, essay evaluation, authorship recognition, evaluation of machine translation and semi-automatic surveys. These applications are based on the finding that the topology and dynamics of networks built from text, e.g. with a word co-occurrence strategy, may correlate with structural and semantic features of the text. In assessing the quality of essays written by high-school students, for instance, departure from linear behavior in the dynamics of node linking correlates with poorer writing. For authorship recognition, the highest performance is achieved with an unsupervised machine learning algorithm whose input features include metrics of the network topology and the semantics of the most important nodes. Node importance can be quantified because text networks are scale-free, with a degree distribution that follows a power law. This scale-free property also explains why preserving the network topology in the target language is key to successful machine translation. The flexibility of the network-based approach is exemplified by using sentences as nodes when creating networks for text summarization, and by establishing semantic fields in a corpus of scientific literature to develop semi-automatic surveys. As in many other areas, natural language processing is bound to rely increasingly on deep learning, and this will be discussed in the context of automatic scoring of written essays.
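As a rough illustration of the word co-occurrence strategy mentioned in the abstract, the sketch below builds a small network in which words are nodes and an edge links words that appear next to each other, then ranks nodes by degree as one simple proxy for node importance. It is only a minimal sketch under stated assumptions, not the speaker's method: the function names `cooccurrence_network` and `degree_ranking`, the `window` parameter, and the lowercase alphabetic tokenization are all hypothetical choices made for this example.

```python
import re
from collections import defaultdict


def cooccurrence_network(text, window=2):
    """Build an undirected word co-occurrence network (illustrative only).

    Assumption: tokens are lowercase alphabetic runs, and two distinct words
    are linked when they occur within `window` tokens of each other
    (window=2 links adjacent words only).
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    adjacency = defaultdict(set)
    for i, word in enumerate(tokens):
        # Look ahead at the next (window - 1) tokens.
        for other in tokens[i + 1 : i + window]:
            if other != word:  # skip self-loops
                adjacency[word].add(other)
                adjacency[other].add(word)
    return dict(adjacency)


def degree_ranking(adjacency):
    """Return (word, degree) pairs, highest degree first, ties alphabetical."""
    pairs = ((word, len(neighbors)) for word, neighbors in adjacency.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    network = cooccurrence_network("the cat sat on the mat")
    print(degree_ranking(network)[:3])  # [('the', 3), ('cat', 2), ('on', 2)]
```

In a real corpus one would typically remove stopwords and lemmatize before linking words, and would examine the full degree distribution rather than a short ranking; those steps are left out here to keep the sketch short.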
A short introduction to Prof. Osvaldo N. Oliveira Jr.:
Osvaldo N. Oliveira Jr. is a professor at the São Carlos Institute of Physics, University of São Paulo, Brazil. He obtained his BSc and MSc from the University of São Paulo, a PhD from the University of Wales, Bangor (1990), and an honorary doctorate (Honoris Causa) from the Federal University of Mato Grosso do Sul in 2019. Prof. Oliveira is a member of the Latin American Academy of Sciences, a former president of the Brazilian Materials Research Society, and executive editor of ACS Applied Materials & Interfaces. He has led research into the fabrication of novel materials in the form of ultrathin films obtained with the Langmuir-Blodgett and self-assembly techniques. Most of this work has concerned fundamental properties of ultrathin films with molecular control, but technological aspects have also been addressed in specific projects. One example is an electronic tongue whose response to a number of tastants is considerably more sensitive than that of the human gustatory system. In recent years, Prof. Oliveira has pioneered the combined use of methods from distinct fields of science, merging methods of statistical physics and computer science to process text, and using information visualization to enhance the performance of sensing and biosensing. This pioneering work is associated with the merging of nanotechnology with Big Data analytics and machine learning, which is bound to yield technological developments such as computer-aided diagnosis systems. Prof. Oliveira has also developed strategies for scientific writing, targeted especially at non-native users of English. As of October 2020, he had published over 580 papers in international journals and 3 books, and had filed close to a dozen patents. This body of work has received ca. 13,700 citations (h = 55, Web of Science) and ca. 20,500 citations (h = 67, Google Scholar). He was awarded the Scopus Prize from Elsevier in 2006 as one of the most productive Brazilian scientists.