
Greater New York Area DB/IR Day

April 28, 2006

Hosted and Sponsored by
Rutgers University
Department of Computer Science and
School of Communication, Information & Library Science
&
AT&T

The Greater New York Area DB/IR Day will bring together database and information retrieval researchers and students from academic and research institutions across the tristate area and beyond for an exciting technical workshop and the opportunity to network.

The DB/IR Day will be hosted by Rutgers University and AT&T on Friday, April 28, 2006 from 9:45 am to 5:01 pm.

Provisional schedule:

Date: April 28, 2006
Time: 9:45am to 5:01pm
Place: The Auditorium in the Fiber Optics Lab at Rutgers University. Parking is available without permit in lots 54, 59 and 68. See directions.
Agenda:
  9:45am - 10:15am Registration and coffee
  10:15am - 10:45am Welcome and introductions: brief overviews of participating research groups
  10:45am - 12:00 noon Invited Talk - Dennis Shasha (NYU)
  12:00 noon - 2:00pm Lunch and posters
  2:00pm - 3:15pm Invited Talk - David Karger (MIT)
  3:15pm - 3:45pm Coffee break and poster prizes
  3:45pm - 5:00pm Invited Talk - William Cohen (CMU)
  5:00pm - 5:01pm Closing remarks

Invited Talks:

Dennis Shasha (NYU)

Safe Data Management with Untrusted Servers

Imagine that you and your friends want to share information in a database because you want concurrency control, recovery, and query processing, but you don't trust the database administrator. You want to protect data from being observed (privacy). You want to make unauthorized modifications evident (a form of safety). You want to force the server to deliver a consistent picture to all honest users or be discovered (a form of liveness). Encryption and signatures make the first two possible. Liveness is another matter since the database administrator could "fork" the database into several copies, keeping some of your friends ignorant of your latest updates and you ignorant of theirs. In joint work with David Mazieres and some great students, we have worked out how to achieve these properties for file systems. This talk presents a design for database systems that integrates these goals with query processing, concurrency control, and recovery.

David Karger (MIT)

Why everyone should be their own database administrator, UI designer, application developer, and web site builder, and how they can (PDF)

The desktop applications and web sites we use often seem ill-matched to our particular information management tasks. They fail to display, or even to record, some aspect of the information that we need. They clutter their presentations with distracting inessential aspects. Information is fragmented over multiple applications and sites, making it hard to record, visualize, or navigate important connections. The operation we want to apply to data locked inside some web site is only available at another site, or inside one of our applications.

End users can fix many of these problems themselves, if they are given the right levers for reshaping information management tools and repositories to suit their needs. The Haystack system offers three such levers: a simple, structured-but-sloppy information model for holding whatever information a particular user considers important; a user-interface framework that can flex to present that kind of information; and tools that let end users "edit" (rather than program) and share information views and workspaces that are appropriate for the data they want to record and the tasks they want to perform. Our approach offers a way to build personalized information management applications for users' own data, and to create useful aggregations and visualizations of information dispersed over the standard and Semantic Web.

William Cohen (CMU)

On Beyond Hypertext: Searching in Graphs Containing Documents, Words, and Actual Data (PDF)

Similarity measures for text have historically been an important tool for solving information retrieval problems. Here I will describe similarity metrics for graphs containing a heterogeneous mixture of textual and non-textual objects. These similarity metrics are based on a lazy graph walk, and they allow certain types of queries to be easily formulated by a naive user. In one instantiation of this framework, a user's personal information is represented as a graph containing messages, calendar information, social network information, and a timeline. Here graph-based similarity search can be used to find identifiers for people mentioned in an email, or people likely to attend a meeting. In another instantiation, a graph is built that contains Medline abstracts, the output of a gene-name recognizer on these abstracts, a dictionary of gene synonyms, and previously curated abstracts. Here graph-based search can be used to find identifiers of genes mentioned in an abstract. In each of these cases, performance of the graph-based similarity search is competitive with or better than baseline approaches to the problem. In many cases, performance can be further improved by using appropriate learning techniques.

This is joint work with Einat Minkov and Andrew Ng.


Organizing Committee:

To contact the organizing committee use:
dbir2006 sakai.rutgers.edu

Registration:

To help us estimate the number of participants and order lunch, please register for this event. If you register before April 25, you are guaranteed a free lunch. You can also register on the day, but in that case lunch is not guaranteed.

We encourage students to submit posters (either the poster itself or a one- to two-page description of it); at least the lead author must be a student. Accepted posters will be entered in a poster competition with prizes.

Representatives of participating research groups will have the opportunity to present a brief overview of their group and research projects. Between April 24 and 27, they will be able to upload a PowerPoint file with a couple of slides to support their presentation. These slides will be made public on this website after the event.

*** Register !!

Although on-the-spot registration will be available on the day of the event, we would appreciate early online registration (ideally by Monday, April 24), so that we can estimate how much food and coffee is needed.

*** Submit poster or group slides !!

Submit either the poster file or an abstract for review. The poster dimensions should be 3' x 4'.
Participating groups can submit their PowerPoint file (with 1-2 slides) using the same submission page; if something goes wrong, please email the file to the organizing committee.


Accommodation

If you intend to stay in New Brunswick overnight, before and/or after the workshop, a couple of hotels to consider are:

Posters

Authors Contact email Poster title Prize
Sumeet Bajaj, Radu Sion - Stony Brook University
sumeetbajaj@yahoo.com
N3S – Networked Secure Searchable Storage
Best poster
Paul Fodor - Stony Brook University
pfodor@cs.sunysb.edu
Dynamic Portlet Wrappers using JavaScript and Portal Voice Interface using RSS/Atom Streams
2nd place
Siddharth Bhatt, Radu Sion - Stony Brook University
sbhatt@cs.sunysb.edu
Personal Digital Rights Management in Mobile Frameworks
3rd place
Wisam Dakka - Columbia University
wisam.dakka@gmail.com
Summarization-aware search for Online News Articles
4th place (tie)
Nicholas Taylor, Zachary Ives - U. Pennsylvania
netaylor@seas.upenn.edu
ORCHESTRA - Reconciliation of Dynamic Shared Data
4th place (tie)
Wei Zhuang, Graham Cormode, S. Muthukrishnan - Rutgers University; Telcordia
weiz@paul.rutgers.edu
What’s Different: Distributed Continuous Monitoring of Duplicate-Resilient Aggregates on Data Streams
4th place (tie)
Eric Silfen, Chintan Patel, Eneida Mendonca, Carol Friedman - Columbia University
cop2101@columbia.edu
ZebraHunter: Searching Rare Medical Diagnoses and Retrieving Relevant Citations  
Palakorn Achananauparp, Hyoil Han - Drexel University
pkorn@drexel.edu
Using Semantic Similarity to Improve User Modeling in Web Personalization Systems
 
Eric C. Jensen, Steven M. Beitzel, Ophir Frieder, Abdur Chowdhury - Illinois Institute of Technology
ej@ir.iit.edu
A Framework for Determining Necessary Query Set Sizes to Evaluate Web Search Effectiveness
 
Lan Nie, Brian D. Davison - Lehigh University
lan2@lehigh.edu
Topical Link Analysis
 
Xiaoguang Qi, Brian D. Davison - Lehigh University
xiq204@lehigh.edu
Knowing a Web Page by the Company It Keeps
 
Gal Oestreicher-Singer - New York University
goestrei@stern.nyu.edu
Network Structure in Electronic Commerce
 
Bing Bai, Nicu Cornea, Paul Kantor, Deborah Silver - Rutgers University
bbai@cs.rutgers.edu
IR Principles for Brain Image Libraries
 
Yuelin Li, Xiangmin Zhang - Rutgers University
lynnlee@scils.rutgers.edu
Trained vs. Untrained Searchers’ Interaction with Search Features in Digital Libraries
 
Yihua Wu, S. Muthukrishnan, Eric van den Berg - Rutgers University
yihwu@cs.rutgers.edu
Sequential Change Detection
 
Mohamed A. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis, Kirk Pruhs - University of Pittsburgh
msharaf@cs.pitt.edu
Metrics and Algorithms for Processing Multiple Continuous Queries  

 

Previous DB/IR Days: