ULB presentation
Latest revision as of 09:41, 3 May 2006
Evolution through homepage
- Evolution of Constant parallels the evolution of its website and the different datastructures produced during this evolution.
Constant's visitor numbers have grown over time, and the site currently attracts ca. 2000 p/day.
The website is Constant's 'window display', marking changes in the use of technology as much as changes in the relation of content to visualisation and representation.
- Constant? [Association/foundation for art and media]
Constant is a non-profit association, based and active in Brussels since 1997 in the fields of feminism, copyright alternatives and working through networks. Constant develops radio, electronic music and database projects, by means of migrating from cultural work to work places and back again.
- We want to discuss an approach to datastructure which is not pre-planned, which is organic and that reorganises itself within 'chaos'. Which is learned by doing rather than learned and applied. Which allows for experiment and follows the life of an organisation through its changes rather than through its fixed image.
- We made use of archive.org's Wayback Machine, a service that allows people to visit archived versions of web sites.
"The original idea for the Internet Archive Wayback Machine began in 1996, when the Internet Archive first began archiving the web. Now, five years later, with over 100 terabytes and a dozen web crawls completed, the Internet Archive has made the Internet Archive Wayback Machine available to the public. The Internet Archive has relied on donations of web crawls, technology, and expertise from Alexa Internet and others. The Internet Archive Wayback Machine is owned and operated by the Internet Archive." "The Internet Archive Wayback Machine contains approximately 1 petabyte of data and is currently growing at a rate of 20 terabytes per month. This eclipses the amount of text contained in the world's largest libraries, including the Library of Congress."
http://www.archive.org/web/web.php
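Archived snapshots are addressed by putting a 14-digit timestamp (YYYYMMDDhhmmss) in front of the original address, as in the web.archive.org reference used for the 'Site as platform' period. A minimal sketch of building such an address follows; the function name `wayback_url` is our own illustration and not part of any archive.org tool.

```python
from datetime import datetime

WAYBACK_PREFIX = "http://web.archive.org/web"


def wayback_url(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for a snapshot taken at `when`."""
    # Snapshots are addressed by a 14-digit timestamp: YYYYMMDDhhmmss.
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


# Rebuild the address of the March 2005 homepage snapshot.
snapshot = wayback_url("www.constantvzw.com/index.php", datetime(2005, 3, 24, 6, 2, 5))
print(snapshot)
```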
Leaflet era
- 1999-mid 2003
- top categories correspond to organised events in 'real space'
- events are organised by a central team
- one web designer, who is not part of the team
- all content is static
- produced with proprietary software (Dreamweaver) and using commercial plugins (QuickTime, RealPlayer, Flash)
Site as platform
- Constant hosts
- mid 2003 - mid 2005
- ref:http://web.archive.org/web/20050324060205/www.constantvzw.com/index.php
- top categories are lower on the page and the feeds of the different weblogs take precedence.
- weblogs as private spaces of connected souls.
- An image pulled at random from the weblogs appears on the homepage and links directly to a blog index page.
- Own server -> cyberfeminism
- Free software -> copy.cult
- Dynamic content
- In house webdesigners + programming -- self-taught
- DIY
Site as project space
- Site becomes a place for production and research
- ref:http://www.constantvzw.com/westenberg/
- weblogs that started as publications become projects that are 'produced' in real space.
- weblogs as project scratchpads
- example of Urban Anomalies: hosted space / weblog turned into our collaboration with Peter that influenced VJ9 and will continue in Routes and routines.
Projects have their own archiving mechanism
- Start to understand better the relation between archive, production, presentation and representation.
- Theme VJ7: "Imagine someone who uses digital means to produce sound or visual creations. What are his/her choices on a technical level? in terms of collective or individual work? Where does s/he place her body in relation to this technology, does s/he show it, hide it? Is s/he a thief, a collector, an author, a dancer, a copier ? What are his/her choices for circulation: broadcasting, on line, on stage, via the public ? What contracts does s/he have with the public, within the group ? What are their sources : meetings, chance meetings, corresponding, accidents?"
http://www.constantvzw.com/vj7/
- ref:http://www.constantvzw.com/cuisine
- Cuisine Interne, the structure of the project is a list of questions
- The project is conceived through an understanding of what a database is.
- A culinary jukebox of some sorts
"We mean what a work, an organisation is made of: the components (ingredients), the tools (utensils), the workplace, and the work and creation processes (recipes)"
The Cuisine Interne Keuken project was initiated in 2004 at Jonctions/Verbindingen 7, a yearly festival around art, technology and ethics organised by Constantvzw, Brussels. We selected 15 questions around the ingredients and recipes of cultural work. Some of these questions are quite straightforward, and some leave more space for interpretation or even evasion. The idea is to put practice, tools and conditions at the same level, so as to question their interrelation.
Cuisine Interne Keuken started out of the desire to render visible the internal organisation of the cultural world we work in, with its written and unwritten laws, decision making processes and value systems. In our thinking about interdisciplinary cultural practices, we did not want to leave out the question of economy.
We started out by simply interviewing each guest at the Jonctions/Verbindingen festival, but after a while we found ourselves branching out to other contacts or people we met at conferences and lectures. The list of people interviewed therefore forms in a way the internal kitchen of the organisation Constant itself, and continues to expand and change.
Using the same set of questions over and over again helped us to focus a series of intimate conversations, and allowed a chain of interviewers in different settings and in multiple languages to bring their own background to the discussions. As a result, in some cases answers were combined, questions not asked, or questions not answered, but these 'flaws' actually became a marker of a rich and multi-layered process of research.
With more than 35 interviews soon accessible on this site, Cuisine Interne Keuken combines the rigor you might expect from an anthropological survey with the intimacy of private conversations. The adaptation of cooking language forms a deliberate move away from a more sensational "look behind the scenes", and from the efficiency-driven or technocratic terms in which design and art practices are mostly described, if at all. It smells, it's messy, tickles the taste buds, fills the stomach but there's always a lot of dishes too. We offer them here to you for consulting, remixing, inspiration, learning and listening pleasure: a culinary jukebox of some sorts.
Living archive
Mixing Past and Present
- past and present are together on the homepage
- Finding ways to horizontally traverse the site's content
- Trying to connect without losing the richness of the material produced (rich = content, language, software used, code)
Method 1: interlinking
- Show connections between VJ8, CIK and Participants
http://www.constantvzw.com/cn_core/vj8/index.php >> someone in vj8 who was interviewed for CIK: Mammique >> http://www.constantvzw.com/cn_core/vj8/guests.php?id=104
http://www.constantvzw.com/cn_core/guests/g2.php?&var_in=49 + Link to CIK
Method 2: tours
- Subjective
- Across timeline
- Across projects
Kris Rutten wrote a series of view points that link together different parts of the past/present activities of Constant. They are starting points for jumping across Constant's timeline and finding one's way across the different projects.
ref:http://www.constantvzw.com/view_points.php?id=25&mode=detail
Method 3: harvesting/mining the website
- This is where we are now....
- ref:http://www.constantvzw.com/cn_core/dict/dictionary_view.php
- using the search as a dictionary, using the search to learn about our data, using the search to build a taxonomy
snipped from: http://twenteenthcentury.com/uo/index.php/LivingArchiveWorkshop2Minutes
Nicolas: We are trying to create an archive for Constant. We have lots of different interests and projects that have become multi-layered over the years, and it's now very difficult to envisage a 'general purpose' system to manage this heterogeneous, distributed information.
Since different people have contributed to this mess, we don't even know what's there to begin with. The first step we need to take, then, is to discover what we already have.
Step 1: Try to optimise the search engine: to see what's been indexed by the search engine, to see what's been found, and how to extract a negative/shadow structure from the data itself without having to index our site, page by page, content-type by content-type. We're not using the search engine to automatically produce a finished taxonomy, but to find some information, some clues to help us start chasing one.
We are trying to define our needs, then define the skills we will need, then trying to structure the development, so there's nothing finished to present right now, but hopefully you'll be able to see where we're going.
The first interface is our search engine, which we've had installed on the site for years. It's an open source programme called 'phpdig' - it generates a flat index of the site, it ranks results and we've been using it straight 'out of the box'.
http://www.phpdig.net/
The first experiment was to try to analyse it and understand how it works a bit better. We analysed the way it indexed the data: it creates a 'dictionary' of all the terms that recur regularly on separate pages in separate sections of the site, a word frequency analysis on separate directories. The relevance is weighted towards term repetition across different directories. Repetitions beyond 10 per document are no longer considered relevant.
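To make the idea concrete, here is a rough sketch of such a dictionary. It is not phpdig's actual code: the function name, the sample pages and the weighting formula (capped frequency multiplied by the number of directories a term appears in) are our own assumptions, chosen only to illustrate the cap of 10 repetitions per document and the preference for terms spread across directories.

```python
import re
from collections import defaultdict

MAX_REPS_PER_DOCUMENT = 10  # cap taken from the description above


def build_dictionary(pages):
    """pages maps a site path to its text; returns {term: (weight, directories)}."""
    counts = defaultdict(int)
    directories = defaultdict(set)
    for path, text in pages.items():
        directory = path.rsplit("/", 1)[0] if "/" in path else ""
        per_page = defaultdict(int)
        # [^\W\d_]+ matches runs of letters, including accented ones.
        for word in re.findall(r"[^\W\d_]+", text.lower()):
            per_page[word] += 1
        for word, n in per_page.items():
            # Repetitions beyond the cap add nothing to the term's count.
            counts[word] += min(n, MAX_REPS_PER_DOCUMENT)
            directories[word].add(directory)
    # Hypothetical weighting: capped frequency times directory spread.
    return {w: (counts[w] * len(directories[w]), sorted(directories[w])) for w in counts}


# Invented sample pages, for illustration only.
pages = {
    "vj7/index.html": "archive archive database",
    "cuisine/index.html": "archive recipe",
    "cuisine/questions.html": "recipe " * 12,
}
dictionary = build_dictionary(pages)
for term, (weight, dirs) in sorted(dictionary.items()):
    print(term, weight, dirs)
```

In this toy example 'recipe' is repeated often but only inside one directory, while 'archive' appears less often but in two, which is the kind of difference the dictionary is meant to expose.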
At the moment, the idea is not to make a 'better search engine', but to dig out a kind of vocabulary from the incidental action of the search engine. Whether we should include words like 'constant', 'avec', 'http' etc. is part of a discussion to be had later on. Since it's not humanly possible to tag each page, we have to find a starting point that can be automated. We are also concentrating on documenting our process, so that we can give away a process for how others might do the same thing. Just knowing where to start will be the main output.
We have tried to find different ways to visualise this data. Because this is just a starting point, we need to be able to generate overviews that allow us to discriminate between approaches through a number of views. At the moment the result from this data gathering is of dubious value; we need to test it in context and see what comes of it. Again, this is the basis for discussion, not an attempt to generate objective descriptions or truisms about the site and our data.
Graph 1: Weight vs. Path
- The vertical axis is the frequency of word occurrence
- The horizontal axis is the distribution across folders on the site
This relies on the search engine itself, so we went a bit further to see what other kinds of information we can find about what's on our site. At the moment this is described by a Venn Diagram:
- circle 1: internal queries: our phpdig installation
- circle 2: external queries: webalizer search stats
- circle 3: the dictionary
In the intersections and lists of exclusions we find out what (to follow the metaphor) we are saying and not saying, and what people are hearing and not hearing. This process allows us to discuss the differences between what we're communicating and what we think we're communicating.
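The intersections and exclusions of the three circles can be computed with plain set operations. The sketch below is only an illustration: the function name and the sample terms are invented, and nothing here reads real phpdig or webalizer data.

```python
def venn_regions(internal, external, dictionary):
    """Split three term sets into the seven regions of a Venn diagram."""
    return {
        "all_three": internal & external & dictionary,
        "internal_and_external_only": (internal & external) - dictionary,
        "internal_and_dictionary_only": (internal & dictionary) - external,
        "external_and_dictionary_only": (external & dictionary) - internal,
        "internal_only": internal - external - dictionary,
        "external_only": external - internal - dictionary,
        "dictionary_only": dictionary - internal - external,
    }


# Invented sample terms, for illustration only.
internal_queries = {"archive", "festival", "radio"}
external_queries = {"archive", "copyright", "radio"}
dictionary_terms = {"archive", "copyright", "database", "festival"}

regions = venn_regions(internal_queries, external_queries, dictionary_terms)
for name, terms in regions.items():
    print(name, sorted(terms))
```

A term in 'dictionary_only' is something the site says that nobody searches for; a term in 'internal_and_external_only' is something people look for that the dictionary does not contain.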
We could also extend this to more sets of results - for example: Google searches co-branded with Google and limited to results from our site (this is a service Google offers), which would give a different set of results (searches people make on our site using Google) - but this isn't a factor so far.
That's the extent of our preparatory discussions and research: evaluating search processes.
The next step is to see what we can do to generate classifications: making keywords. This is based on the occurrence of a term, for example 'image': if 'photo' and 'JPG' occur regularly on the same pages, we can try to group these kinds of things by topic. That's just a kind of automated output - we can learn from it, accepting it's kind of stupid, but it's a start.
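A first, admittedly naive, way to find such groups is to count how often two terms appear on the same page. The sketch below assumes we already have a list of terms per page; the function name, the threshold `min_pages` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(page_terms, min_pages=2):
    """Return term pairs that appear together on at least `min_pages` pages."""
    pair_counts = Counter()
    for terms in page_terms.values():
        # Each pair counts once per page, however often the terms repeat.
        for pair in combinations(sorted(set(terms)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_pages}


# Invented sample data, for illustration only.
page_terms = {
    "vj7/gallery.html": ["image", "photo", "jpg"],
    "vj8/gallery.html": ["image", "photo", "jpg", "festival"],
    "cuisine/index.html": ["recipe", "image"],
}
pairs = cooccurring_pairs(page_terms)
print(pairs)
```

Here 'image', 'photo' and 'jpg' end up paired with each other, while 'festival' and 'recipe' do not reach the threshold.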
The other item under discussion is how to define relationships between, say, different items in the database which are not bilaterally linked A to B. For example, when we had a commission for a website the most interesting bit was to try and link participants, events and venues and define the kinds of relationships they could have. The idea initially was to make a direct kind of link - 'participant can be related to programs'.
The problem is that we had a kind of flat model. All the links were anonymous. A is linked to B in an overly direct way.
(IMAGE: snapshot of flat data visualisation)
We had lots of data, multiply interlinked in a non-useful way.
The discussion we then had was how to qualify the meaning of the links - to differentiate them. The model we were looking for was inspired by the semantic web (RDF) notion that you can define relationships with triples: Subject - Predicate - Object.
http://www.w3.org/TR/rdf-concepts/#section-triples
We tried to create qualifiers (Predicates) to bridge the links between events, publications, collaborators etc. For example, if you describe a cinema programme we can say that 'A cinema is composed of films, and films have authors'. With only a list of films, we have relationships only with film makers. On the website, however, films are usually gathered with people in precisely defined roles: runner, director, editor etc. Having this kind of information allows us to infer a lot more about the cinema programme than if we just have titles of films and names of directors.
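As an illustration of the difference between anonymous links and qualified ones, the cinema example can be written down as Subject - Predicate - Object triples. This is a plain Python sketch rather than real RDF, and the film and person names and the predicate wording are placeholders.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Placeholder data: every link carries a predicate that says what it means.
triples = [
    Triple("Cinema programme", "is composed of", "Film A"),
    Triple("Cinema programme", "is composed of", "Film B"),
    Triple("Film A", "has director", "Person X"),
    Triple("Film A", "has editor", "Person Y"),
    Triple("Film B", "has director", "Person Y"),
]


def objects_of(subject, predicate):
    """Everything the subject points to through one kind of link."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


def links_to(obj):
    """Every (subject, predicate) that points at the given object."""
    return [(t.subject, t.predicate) for t in triples if t.obj == obj]


print(objects_of("Cinema programme", "is composed of"))
print(links_to("Person Y"))
```

With a flat model we would only know that Person Y is linked to two films; with predicates we can tell that the same person edited one and directed the other.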
We added a relationship tool to enable the graph to grow in a more diversified way. Until now, the commissioner has made only low-level use of the tool. In their use of the tool, they tended to flatten relationships to generic roles. The relationship tool has a 'role' index allowing them to create a controlled vocabulary, and then apply it to relationships between types of objects.
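A controlled vocabulary of this kind can be sketched as a set of allowed roles that every new relationship is checked against. This is not the actual tool: `ROLE_INDEX`, `relate` and the sample names are assumptions made for the example.

```python
# Hypothetical role index: the only roles a relationship may use.
ROLE_INDEX = {"director", "editor", "runner"}


def relate(graph, subject, role, obj):
    """Add a (subject, role, object) relationship if the role is in the index."""
    if role not in ROLE_INDEX:
        raise ValueError(f"unknown role: {role!r}")
    graph.append((subject, role, obj))


graph = []
relate(graph, "Person X", "director", "Film A")
relate(graph, "Person Y", "editor", "Film A")
print(graph)
```

A generic role such as 'participant' would be rejected unless it is first added to the index, which makes any flattening of roles a visible, deliberate choice.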
The hope was, at the end of this project, to generate a kind of map, or world view of how the commissioner works and understands itself.