Mark James Carman


email:
phone: +41 58 666 4310

location:
University of Lugano
(Università della Svizzera italiana)

Faculty of Informatics
Via Giuseppe Buffi, 13
CH-6904 Lugano, Switzerland


About me

I am a postdoctoral researcher in the Information Retrieval group within the Faculty of Informatics at the University of Lugano, Switzerland.

I work primarily on the problem of discovering, modeling and personalizing access to news feeds, blogs, and Hidden Web databases. To tackle this problem I combine ideas from machine learning, distributed information retrieval and data integration. In particular I make use of supervised and semi-supervised learning techniques, as well as language modeling and topic modeling techniques.

In general, my research interests are quite varied.

Prior to moving to Lugano, I studied in the Information Integration Group at the Information Sciences Institute of USC and also at the Fondazione Bruno Kessler (formerly IRST). I completed my PhD thesis at the University of Trento. For my thesis, I developed a system for learning semantic descriptions of online information sources. The aim of the work was to allow new sources to be discovered and integrated automatically into existing integration systems (such as information mediators or simple mashups).

News!

9 Apr, 2009:
I presented a paper at ECIR09 called A Topic-based Measure of Resource Description Quality for Distributed Information Retrieval that Mark Baillie, Fabio Crestani and I wrote. Here are the slides from the talk.
8 Apr, 2009:
Shima Gerani presented a paper we wrote together at the European IR conference (ECIR09) called Investigating Learning Approaches for Blog Post Opinion Retrieval.
30 Oct, 2008:
Fabio Crestani presented a paper on Tag Data and Personalized Information Retrieval at the CIKM 2008 Workshop on Search in Social Media that I wrote together with him and Mark Baillie. Here are the slides that he presented.
5 Jan, 2008:
I have started blogging(?!) about interesting research articles I read, as well as my musings on research in Artificial Intelligence / Information Retrieval / Databases / etc. I will also post any presentations I give or papers I write. You can check it out here. Feel free to agree/disagree with me in the comments!
29 Nov, 2007:
I gave a quick talk today (along with Giovanni Toffetti and Monica Landoni) to the new PhD students at the University of Lugano on life during and after a PhD. Finally, my chance to tell the new students all the stuff I wish somebody had told me when I started! I'm posting my advice/musings online in case somebody else finds them useful.
11 Sep, 2007:
Craig Knoblock and I published an article "Learning Semantic Definitions of Online Information Sources" in the Journal of Artificial Intelligence Research (JAIR). The article provides a more detailed description of our work on inducing service descriptions that we presented at IJCAI.

Publications

Below are some selected publications of mine. If you can't find the paper you're after, please contact me. Here is a bibtex file with most of my publications. Some of my more recent papers can also be found on the IR group website.
A Statistical Comparison of Tag and Query Logs
Mark Carman, Mark Baillie, Robert Gwadera and Fabio Crestani.
32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009), 2009. (to appear)
Blog Distillation using Random Walks
Mostafa Keikha, Mark Carman and Fabio Crestani.
32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009), 2009. (poster, to appear)
A Topic-based Measure of Resource Description Quality for Distributed Information Retrieval
Mark Baillie, Mark Carman and Fabio Crestani.
31st European Conference on Information Retrieval (ECIR 2009), Toulouse, France, 2009
Investigating Learning Approaches for Blog Post Opinion Retrieval
Shima Gerani, Mark Carman and Fabio Crestani.
31st European Conference on Information Retrieval (ECIR 2009), Toulouse, France, 2009
Tag Data and Personalized Information Retrieval
Mark J. Carman, Mark Baillie and Fabio Crestani.
CIKM 2008 Workshop on Search in Social Media (SSM 2008), 2008
Learning Semantic Definitions of Online Information Sources
Mark James Carman and Craig A. Knoblock.
Journal of Artificial Intelligence Research (JAIR), volume 30, pages 1-50, 2007
Learning Semantic Descriptions of Web Information Sources
Mark James Carman and Craig A. Knoblock.
Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07). Hyderabad, India, January 2007
Learning Semantic Definitions of Information Sources on the Internet
Mark James Carman.
Doctoral Thesis (Advisors: Paolo Traverso and Craig A. Knoblock),
Department of Information and Communication Technologies, University of Trento, August 2006
Inducing Source Descriptions for Automated Web Service Composition
Mark James Carman and Craig A. Knoblock.
AAAI 2005 Workshop on Exploring Planning and Scheduling for Web Services, Grid, and Autonomic Computing, 2005
Web Service Composition as Planning
Mark Carman, Luciano Serafini and Paolo Traverso.
ICAPS'03 Workshop on Planning for Web Services, Trento, Italy, June 2003
Planning for Web Services the Hard Way
Mark Carman and Luciano Serafini.
SAINT'03 Workshop on Service Oriented Computing, Orlando, USA, January 2003
Towards an Economy-Based Optimisation of File Access and Replication on a Data Grid
Mark Carman, Floriano Zini, Luciano Serafini and Kurt Stockinger.
International Workshop on Agent-based Cluster and Grid Computing at the International Symposium on Cluster Computing and the Grid (CCGrid'2002), Berlin, Germany, May 2002

Software

EIDOS (Efficiently Inducing Definitions for Online Sources) is a system for learning semantic descriptions of online information sources (such as these RSS feeds). The descriptions are used to automatically integrate the sources into mediator-based information integration systems. A complete description of the purpose and functionality of the system can be found in my thesis. You can also have a look at the slides I presented at my defense. The software can be downloaded from the ISI website. It is royalty-free for research purposes and comes with all the source code. Here is the latest documentation. Feel free to contact me with installation questions.

Blogs

I have a research blog, where I occasionally post my papers, presentations and thoughts. I also recommend the blogs of a number of other computer science researchers: Greg Linden, Matthew Hurst, Paolo Massa, Jonathan Elsas, Alon Halevy, William Cohen, Panos Ipeirotis.

A very brief Bio

I grew up in Adelaide, Australia and received an Australian Student's Prize upon graduating from high school in 1995.
In 1998, I spent six months studying in Stuttgart, Germany, where I met my wife, Daniela (who comes from Piacenza, Italy).
I received a Bachelor's Degree (with First Class Honours) in Electrical and Electronic Engineering and Arts from the University of Adelaide in 1999.
In 2000, I worked in the E-Commerce division of Telstra Research Labs in Sydney, Australia.
In 2001, I moved to Trento, Italy, and started working at IRST (now the Fondazione Bruno Kessler) in the Automated Reasoning Systems division.
In July 2006, I received a Ph.D. in Computer Science from the University of Trento, Italy.
In August 2007, I moved to Lugano, Switzerland to take up a postdoc position in the Faculty of Informatics at the University of Lugano.