Invitation to the Disputation

25 April 2013 at 14:30 7 comments

In January I finally submitted my dissertation, and I will defend it on Friday, 31 May. The disputation takes place at 4 p.m. in the Jacob-und-Wilhelm-Grimm-Zentrum, in video conference room 1 ’312 (see also the official invitation [PDF]). The title of my dissertation is Describing data patterns. A general deconstruction of metadata standards. My reviewers are Prof. Dr. Stefan Gradmann, Prof. Dr. Felix Sasaki, and Prof. Dr. William L. Honig.

The event is public, but the room is not very large and it is located inside the library (i.e. jackets, coats, bags etc. must be left at the cloakroom). Since Prof. Honig is in Chicago, the talk will be transmitted via video conference and recorded. I do not yet know whether additional participants can join (via H.239/H.323) or whether and when the recording can be put online. The thesis will subsequently be published in the course of the year, probably on the document and publication server of the HU and possibly via print-on-demand. Here, for now, are the abstract and the summary of the thesis:

Many methods, technologies, standards, and languages exist to structure and describe data. The aim of this thesis is to find common features in these methods to determine how data is actually structured and described. Existing studies are limited to notions of data as recorded observations and facts, or they require given structures to build on, such as the concept of a record or the concept of a schema. These presumed concepts have been deconstructed in this thesis from a semiotic point of view. This was done by analysing data as signs, communicated in form of digital documents. The study was conducted by a phenomenological research method. Conceptual properties of data structuring and description were first collected and experienced critically. Examples of such properties include encodings, identifiers, formats, schemas, and models. The analysis resulted in six prototypes to categorize data methods by their primary purpose. The study further revealed five basic paradigms that deeply shape how data is structured and described in practice. The third result consists of a pattern language of data structuring. The patterns show problems and solutions which occur over and over again in data, independent from particular technologies. Twenty general patterns were identified and described, each with its benefits, consequences, pitfalls, and relations to other patterns. The results can help to better understand data and its actual forms, both for consumption and creation of data. Particular domains of application include data archaeology and data literacy.

This thesis addresses the question of how data is fundamentally structured and described. In contrast to existing treatments of data in the sense of stored observations or facts, data is understood here semiotically, as signs. These signs are communicated in the form of digital documents and are structured and described by means of numerous standards, formats, languages, encodings, schemas, techniques etc. This diversity of means is analysed in its entirety for the first time, using the phenomenological research method. The aim is to arrive at the general nature of data structuring and description through a precise experience and description of the means used to structure and describe data. The results of this thesis consist of three parts. First, six prototypes emerge that categorize the described means by their primary purpose. Second, there are five paradigms that fundamentally influence how means of structuring and describing data are understood and applied. Third, this thesis presents a pattern language of data structuring. Twenty patterns document typical problems and solutions that occur again and again in structuring and describing data, independent of particular technologies. The results of this thesis can help to improve the understanding of data, that is, of digital documents and their metadata in all their forms. Particular areas of application include data archaeology and data literacy.

Now I just have to start preparing the talk…

TPDL 2011 Doctoral Consortium – part 3

25 September 2011 at 17:36 No comments

See also part 1 and part 2 of conference-blogging and #TPDL2011 on twitter.

My talk about general patterns in data was received well and I got some helpful input. I will write about it later. Steffen Hennicke, another PhD student of my supervisor Stefan Gradmann, then talked about his work on modeling archival finding aids, which may be expressed in EAD. The structure of EAD is often not suited to answering user needs. For this reason Hennicke analyses EAD data and reference questions in order to develop better structures that users can follow to find what they are looking for in archives. This is done in CIDOC-CRM as a high-level ontology, and the main result will be an expanded EAD model in RDF. To me the problem of “semantic gaps” is interesting, and I am thinking about using some of Hennicke's data as an example to explain data patterns in my work.

The last talk, by Rita Strebe, was about the aesthetic user experience of websites. One aim of her work is to measure the significance of aesthetic perception. In particular, her hypotheses to be evaluated by experiments are:

H1: On a high level, the viscerally perceived visual aesthetics of websites effects approach behaviour.
H2: On a low level, the viscerally perceived visual aesthetics of websites effects avoidance behaviour.

Methods and preliminary results look valid, but the relation to digital libraries seems weak, and so was the participants' expertise regarding Strebe’s motivation and methods. I suppose her work fits better into Human-Computer Interaction.

After the official part of the program Vladimir Viro briefly presented his music search engine peachnote.com, which is based on scanned musical scores. If I were working in or with music libraries, I would not hesitate to contact Viro! I also thought about a search for free musical scores within the Wikimedia framework. The Doctoral Consortium ended with a general discussion about dissertations, science, libraries, users, and everything, as it should be :-)

TPDL 2011 Doctoral Consortium – part 2

25 September 2011 at 12:42 No comments

The TPDL 2011 Doctoral Consortium, which I already blogged about in part 1, continued after a 15-minute delay: Christopher Gibson also talked about eBooks. I wonder why his talk was not combined with Luca Colombo’s work on eBook reading experiences. Gibson’s specific topic is eBook lending services in UK public libraries. To quote the research questions from his paper:

Q1. How have public libraries addressed ebook service provision in the UK?
Q2. What challenges and opportunities exist in incorporating ebook lending into other reader services?
Q3. Is it feasible to lend ebook reading devices from public libraries?
Q4. How can the effectiveness of ebook lending services be measured?
Q5. How do library users view the provision of ebook lending services?
Q6. How can effective ebook lending services be developed?

To me an interesting aspect of his methodology was the use of targeted FOI (freedom of information) requests to gather data about eBook lending services. I cannot imagine this in Germany, where “Informationsfreiheit” (freedom of information) is still in its infancy. One result from another survey done by Gibson: most eBooks are not included in library catalogs. I think this failure is found in German libraries too. In summary, the PhD project looked very thorough, with some real practical value for libraries. On the other hand, the theoretical contribution, for instance the question of what “lending” can mean in digital library work, was only added in the discussion afterwards.

The next presenting PhD student was Adam Sofronjievic. I am sorry that I could not fully concentrate on his talk about a New Paradigm of Library Collaboration although it seemed very interesting. My talk is next :-)

TPDL 2011 Doctoral Consortium

25 September 2011 at 11:21 2 comments

Today the International Conference on Theory and Practice of Digital Libraries 2011 started with tutorials and a Doctoral Consortium, in which I am participating with a talk. The seven talks and discussions on ongoing PhD topics were rather diverse and interesting. I have tried to briefly summarize at least some of them.

Luca Colombo started with his work on developing and evaluating the eBook reading experience for children. Reading “traditional” books has been extensively investigated; this is not true for eBooks. Children in particular are rarely involved in eBook studies. Colombo explained how the eBook reading experience is different because it directly involves searching, browsing, sharing, and recommending, among other things. A good reading experience results in a “flow state” where the reader gets positively lost in a book. Colombo’s method is a cooperative inquiry. It is not clear whether, and in what way, eBooks are more engaging to children (age 9-11 in this study) than traditional books; maybe this PhD will show. The following discussion was dominated by the participating mentors Jose Borbinha, Milena Dobreva, Stefan Gradmann and Giuseppina Vullo.

In the second talk Krassimira Ivanova presented her dissertation on (content-based) image retrieval utilizing color models. Image retrieval on art images is difficult because it includes very different aspects (artistic styles, depicted objects etc.). Even the aspects of color (contrasts, intensity, diversity, harmony etc.) are manifold; maybe this is why the philosophy of color has a long history. Nevertheless Ivanova developed several machine learning methods for these color aspects that can be used for image retrieval. I am not sure whether the resulting APICAS system (“Art Painting Image Colour Aesthetics and Semantics”) has been evaluated with a user study. Similar to the first talk, the focus could be improved by narrowing the topic down and making the specific contribution clear. Finally we had some real discussion, but little time.

