International Workshop
November 16, 2012
Fundação Calouste Gulbenkian, Room 2
Lisbon
White Paper
The launching of the following White Paper took place during the workshop:
António Branco, Amália Mendes, Sílvia Pereira, Paulo Henriques, Thomas Pellegrini, Hugo Meinedo, Isabel Trancoso, Paulo Quaresma, Vera Lúcia Strube de Lima and Fernanda Bacelar, 2012, A Língua Portuguesa na Era Digital / The Portuguese Language in the Digital Age, White Paper Series, Berlin, Springer, ISBN 978-3-642-29592-8 (printed book), ISBN 978-3-642-29593-5 (ebook).
Back cover quotes:
"This book presents an overview of the language technology area with a focus on the Portuguese language. Although written for a non-technical audience, the presentation is sound, what comes as no surprise from a set of authors where the most internationally recognized researchers in this area in Portugal are to be found. This is a must-read book for anyone wishing to understand the importance of this area."
— Prof. Doutor Miguel Filgueiras, Emeritus Professor (University of Oporto)
"The processing of written and spoken languages is a crucial area for the new modalities of human-computer natural interaction. In an accessible yet scientific and rigorous way, this book presents the state of the art in the digital age of the computational processing of the Portuguese language, one of the languages with more rapid expansion and more economic and technological importance in the western world."
— Dra. Daniela Braga, International Program Manager (Microsoft, Redmond WA, USA)
"The research carried out in the area of language technology is of utmost importance for the consolidation of Portuguese as a language of global communication in the information society."
— Dr. Pedro Passos Coelho, Prime-Minister of Portugal
Click here to open the eBook version of the White Paper
"The Portuguese Language in the Digital Age".
Registration
Participation in the workshop is free of charge and open to everyone interested. Since attendance is limited to the number of seats available, please secure your place by registering here:
Objective
With new and increasingly powerful technologies, we are communicating with more people, more often and more easily. Critically, these new technologies are not only providing new supports or extended carriers for the exchange of linguistic information: they are inducing a deep technological shock in the way languages can be digitally processed and used.
In a sharp break with the past, we will be using new technological solutions to communicate instantly in our mother tongue with people speaking a different language, and to access information encoded in languages that we do not speak. And we will be using natural language to interact with all sorts of artificial devices and services in the rapidly unfolding information society.
What are the new conditions of usage for natural languages? Which ones are being technologically equipped to face the digital shock and which ones are not? Which will thrive and which will lose relevance, or eventually become extinct, in the globalized world?
And in particular: What are the specific challenges for the Portuguese language? What strategic responses can be devised?
Seeking views, answers and strategies that help to handle these questions is the central goal of this workshop.
Program
8h30 Registration opens
9h00 Welcome address
9h10 - 10h20 Session P1 - Presentations
9h10 Strategies of Camões, IP for the Promotion of the Portuguese Language in the Internet
Rui Vaz, Camões, IP (former Instituto Camões)
9h35 Multilateral Linguistic Policies for Portuguese and for Multilingualism in the Cyberspace (in Portuguese)
Gilvan Müller Oliveira, Instituto Internacional da Língua Portuguesa (IILP/CPLP)
10h00 Languages and Technologies within the European Digital Agenda - A broad overview
Roberto Cencioni, European Commission
10h25 - 10h50 Session D1 - Debates
Standup multilateral debates start (fuelled by coffee and cookies)
10h25 - 10h50 Session C1 - Posters of the Portuguese CLARIN network
10h50 - 11h40 Session P2 - Presentations
10h50 The Portuguese Varieties at the Transition to Digital Support
José Afonso Furtado, Fundação Calouste Gulbenkian
11h15 Ideas, Doubts and Realities
Helder Coelho, Universidade de Lisboa
11h40 - 12h05 Session D2 - Debates
Standup multilateral debates continue (around coffee)
11h40 - 12h05 Session C2 - Posters of the Portuguese CLARIN network
12h05 - 13h20 Session P3 - Presentations
12h05 CLARIN: Language Technology for the Humanities
Steven Krauwer, CLARIN Research Infrastructure
12h30 A Strategic Research Agenda for Multilingual Europe
Hans Uszkoreit, German Research Center for Artificial Intelligence (DFKI) and
META-NET European R&D Network of Excellence
12h55 The Portuguese Language in the Digital Shock: challenges and opportunities
António Branco, Universidade de Lisboa and
METANET4U Project
13h20 - 13h30 Wrapping address
13h30 - 14h30 Session D3 - Debates
Standup multilateral debates conclude (supported by a light meal)
13h30 - 14h30 Session C3 - Posters of the Portuguese CLARIN network
14h30 Farewell
Speakers
Rui Vaz
Head of the Division for Programming, Certification and Education
Camões, IP (former Instituto Camões)
Rui Vaz is a major driving force at Camões in exploring ICT to support language learning and dissemination.
He is the Coordinator of the Camões Virtual Center, a reference online center of services and resources for Portuguese culture and language dissemination.
Gilvan Müller Oliveira
Executive Director
Instituto Internacional da Língua Portuguesa (IILP), of the Comunidade dos Países de Língua Portuguesa (CPLP)
Gilvan Oliveira is a pioneer in designing and implementing language policies for multilingualism and endangered languages in Brazil.
He was born in Santa Maria, Rio Grande do Sul, Brazil. He graduated in Linguistics from Unicamp in 1985. In 1990, he completed his Master's degree in Linguistics, Philosophy and History at the Universität Konstanz, in Germany. In 2004, he received his Ph.D. in History of the Portuguese Language at Unicamp, and he carried out postdoctoral research on the promotion of international languages at the Universidad Autónoma Metropolitana Iztapalapa, in Mexico.
He has been a professor at the Federal University of Santa Catarina since 1994 and, in 1999, he was one of the founders of IPOL - Institute for Research and Development in Language Policy in Florianopolis, Brazil, which he ran for six years. He was involved in the Working Group on Linguistic Policies of the Educational Sector of Mercosul (SEM). Since 2010, he has headed the International Institute of the Portuguese Language, of the Community of Portuguese Language Countries (CPLP), in Praia, Cape Verde.
Roberto Cencioni
Adviser to the Director General of Directorate General for Communications Networks, DG Connect (former Directorate General for the Information Society, DG Infso)
European Commission, DG CONNECT
Roberto Cencioni is the leading figure and a most respected voice in the European Commission on research and innovation policies for multilingualism and for speech and language technologies.
He graduated from the University of Rome in 1974. He joined the European Commission in 1977 and worked initially on a large-scale machine translation project. He then managed several teams developing distributed office and communication systems until the early 1990s, when he was charged with the coordination of research programmes in the area of speech and language technologies.
Roberto Cencioni headed the unit entrusted with R&D activities in the field of online content, interactive media and knowledge technologies until June 2008. He then managed the human-language technologies unit of the Directorate General for the Information Society (DG INFSO) until mid-2012, when he was appointed adviser to the director general.
José Afonso Furtado
Member of the Advisory Board
Fundação Calouste Gulbenkian and
Professor
Universidade Católica Portuguesa
José Afonso Furtado is a renowned expert in digital edition, libraries, publishing and the information society. He is the author of several books on these subjects, published both in Portugal and abroad, and is also well known for being considered by the Times Magazine as one of the most interesting people to follow on Twitter (ranked 50th worldwide), where he shares much appreciated information on books, the publishing world and new technologies.
He graduated in Philosophy at the Universidade de Lisboa, was for many years the Director of the Art Library of the Fundação Calouste Gulbenkian and remains a member of the Advisory Board for Digital Reading of this Institution.
He teaches in the postgraduate program “Edition - Books and new digital supports” at the Universidade Católica Portuguesa. He is a member of the High Commission for the National Reading Plan.
Helder Coelho
Full Professor of Artificial Intelligence
Universidade de Lisboa, Departamento de Informática
Helder Coelho is a well-known and distinguished pioneer of the field of Artificial Intelligence in Portugal.
He has been a full professor at the University of Lisbon, in the Department of Informatics of the Faculty of Sciences, since August 1995. He has worked in Nuclear Physics (LFEN), Civil Engineering (LNEC), Electronics (IST/UTL), Informatics (LNEC), and Economics and Management (ISEG/UTL).
He is a permanent and elected member of the National Academy of Engineering (since 1999); European Coordinating Committee for Artificial Intelligence (ECCAI) fellow (since 2002); Chair of the Executive Board of the Iberoamerican Society for Artificial Intelligence (IBERAMIA) in 1996-2010, and Member of its Advisory Council (since 2010); Editor of the International Journal of Artificial Intelligence and of the Progress in Artificial Intelligence journals; Head of LabMAC and LabMAg, two R&D units supported by the Portuguese FCT; and President of the Institute for Complexity Sciences (in 2004-08).
Steven Krauwer
Coordinator
European Research Infrastructure: "CLARIN - Common Language Resources and Technology Infrastructure for the Humanities and Social Sciences"
Steven Krauwer is the leader of key European organizations in the research area of language resources and technology.
He is the Executive Director of the recently created intergovernmental legal entity CLARIN ERIC, which is the governing body of CLARIN - the Common Language Resources and Technology Infrastructure for the Humanities and Social Sciences.
He is the Coordinator and Chairman of the Board of ELSNET, the European Network in Human Language Technologies and Member of the Executive Committee of the Foundation for Endangered Languages (FEL). He is a Researcher and Project Manager in language and speech technology at the Utrecht institute of Linguistics UiL OTS, Utrecht University, The Netherlands.
Hans Uszkoreit
Coordinator
META-NET European R&D Network of Excellence
Hans Uszkoreit is an internationally leading scholar and entrepreneur in research and innovation on natural language and knowledge technologies.
He is Professor and Chair of the Department of Computational Linguistics and Phonetics at Saarland University, Germany; Scientific Director at the German Research Center for Artificial Intelligence (DFKI), where he heads the DFKI Language Technology Lab; Professor of the Computer Science Department and Principal Investigator of the Cluster of Excellence on Multimodal Computing and Interaction of the DFG (German Science Foundation).
He is Permanent Member of the International Committee of Computational Linguistics (ICCL); Honorary Professor at Technische Universität Berlin; Coordinator of the International Erasmus Mundus Masters Program in Language and Communication Technologies; Member of the European Academy of Sciences; Past Member of the Board of the European Language Resources Association (ELRA); Past President of the European Association for Logic, Language and Information; Past Member of the Executive Board of the European Network of Language and Speech (ELSNET); and serves on several international editorial and advisory boards.
He is a co-founder of XtraMind Technologies GmbH, Saarbruecken (now part of attensity inc.), acrolinx gmbh, Berlin, and Yocoy Technologies GmbH, Berlin. From 2005 to 2011, he served as Chairman of the Board of Directors of the international initiative dropping knowledge.
António Branco
Coordinator
METANET4U European R&D project
António Branco is a leading researcher on the Computational Processing of the Portuguese Language.
He obtained his PhD in Informatics in 1999 from the University of Lisbon. He is the (co-)author of over 100 publications in the area of language science and technology and has participated in over 20 national and international R&D projects, in five of which he was the Principal Investigator. He is a member of several program committees for international conferences and a reviewer for international conferences and journals. Currently he is the coordinator of the European project METANET4U, which is part of the R&D network of excellence META-NET. He is the coordinator of the MSc program on Cognitive Science of the University of Lisbon and a faculty member of the Department of Informatics, Faculty of Sciences, University of Lisbon.
He was the founder and is the head of research of the NLX Group, the Natural Language and Speech Group of the Department of Informatics, University of Lisbon. Recently he has been mostly involved in the development of a deep linguistic processing grammar of Portuguese and its companion set of dynamic treebanks. He is the national coordinator of the Portuguese CLARIN research infrastructure project.
Venue
Fundação Calouste Gulbenkian
Room 2, Headquarters Building
Avenida de Berna, 45A
1067-001 Lisboa
Getting there:
(stations visible in the map below)
METRO (Subway): S. Sebastião station (Blue and Red lines); Praça de Espanha (Blue); Campo Pequeno (Yellow)
CARRIS (Buses) numbers: 716, 718, 726, 742, 746, 756
Trains: Entrecampos station
Organization
Contact
"contact" concatenated with "@", and then with "metanet4u.eu"
Organization committee
Ana Tavares, Universidade de Lisboa, METANET4U Project
António Branco, Universidade de Lisboa, Departamento de Informática (DI/FCUL)
Scientific committee
António Branco, Universidade de Lisboa, Departamento de Informática (DI/FCUL)
Amália Mendes, Universidade de Lisboa, Centro de Linguística (CLUL)
Isabel Trancoso, INESC-ID, IST
This is an event in the International Series of Workshops on New Technologies and the Future of Languages.
It is organized by the European project METANET4U.
This is a project in the European R&D Network of Excellence META-NET.
Funding
Background
What is Language Technology?
Language technology, sometimes also referred to as human language technology, is an emerging technology comprising computational systems that are specialized for analyzing, producing or modifying text and speech. It is the engineering branch of an intensely interdisciplinary scientific area at the confluence of a number of disciplines and their sub-disciplines, such as Computer Science, Linguistics, Electrical Engineering, Psychology, Artificial Intelligence, Computational Linguistics, Machine Learning, Speech Technology, Logic, Philosophy of Language and Psycholinguistics, among many others.
Computers that communicate with people
This technology is being deployed to improve human-machine interaction. Today's computers do not yet understand our natural languages, and ever-evolving specialized computer languages are difficult for laypersons and ordinary users to master. Even if the language fragment that machines understand and its domain of discourse are very restricted, the unconstrained use of human language increases the productivity of computational systems and greatly empowers their users with better solutions for their work and everyday life.
Friendly software that listens and speaks
Natural language interfaces enable users to communicate with computers in Portuguese, English, Chinese, or any other human language. Applications of such interfaces include database queries, information retrieval from texts, expert systems, mobile information services, and robot control. Coupled with current advances in the recognition and production of unrestricted speech, these interfaces are improving and widening the usability of such systems.
Helping people to communicate with each other
Much older than the communication problems between human beings and machines are those between people with different mother tongues. One of the aims of language technology has always been fully automatic translation between human languages. Although we are still far from achieving the ambitious goal of translating unrestricted texts, it is already possible to create software systems that simplify the work of human translators and clearly improve their productivity. Less-than-perfect automatic translations produced on the spot are also of great help, e.g. to information seekers who have to search through large amounts of text in foreign languages, or to travellers in a foreign country whose language they do not speak, among endless other use cases.
Exploring language as a gateway to the web
The rapid growth of the Internet/WWW and the emergence of the information society offer exciting challenges and opportunities to language technology. Although the new media combine text, graphics, sound and movies, the whole cyberspace of multimedia information and social networks can only be structured, indexed and navigated through language. For browsing, navigating, filtering and processing the multilingual information on the web, we need software that gets at the contents of documents. Systems for crosslingual information and knowledge management supported by language technology help to surmount language barriers for e-commerce, education and international cooperation.
Your mother language and your citizenship in the information society
In past technological revolutions involving natural language (e.g. the advent of writing systems, the printing press, etc.), many human languages lost their relevance, and some eventually became extinct, as their native speakers could not benefit from those technological breakthroughs. For a language to thrive in the forthcoming digital age, it must be technologically equipped so that it can be used to access all the people, facilities and services that will be made available only in the information society. Language technology is the new disruptive factor supporting the next technological revolution for natural language, which will be of unprecedented magnitude. Only language technology developed for and adapted to the specifics of your mother language will ensure that it thrives in the digital age and that you and your culture have full citizenship in the information society.
Ambitious goals and useful applications
While the fully successful simulation of human language competence and performance is the ultimate goal, there are numerous realistic short-term applications involving the design, realization and maintenance of systems which facilitate everyday work, including e.g. grammar checkers for word processing programs, automatic video subtitling, document categorization software, machine translation, automatic text summarization, tools for opinion mining in the social web, systems for question answering from the web, among many others.
What is Language Technology?
Language technology (LT) is fostered by the growing need for user-friendly software and for innovative solutions for multilingualism. It spans a wide spectrum of ambitious tasks, ranging from the scientific study of human language and thought, via the development of novel computational techniques, all the way to the marketing of profitable innovative solutions, services and products that will help to unfold the information society.
It is only starting now.