Digital Public Space prototype demonstration

Seminar 3: Creative use of archive
‘Digital Public Space prototype demonstration’
–Jake Berger, Programme Manager, Digital Public Space, BBC
Bill Thompson, Chair and Head of Partnership Development, BBC Archives:
Now I’d like to ask Jake Berger from the BBC to come and talk to us about the
digital public space data model and what we’ve been working on.
Jake Berger, Programme Manager, Digital Public Space, BBC:
It’s, er... hopefully it’ll be a little bit more exciting than it sounds!
Bill Thompson:
The digital public space data model is so exciting. You and I love meta-data.
Jake Berger:
We do, yeah.
Bill Thompson:
These people just don’t understand the sheer glory of triples...
Jake Berger:
Yes.
Bill Thompson:
And linked open data.
Jake Berger:
And I promise I won’t mention the word meta-data, or triples, in this entire talk. But
erm, it’s quite interesting. I’m the fourth man in a dark suit to stand up in front of
you today, but you can tell we’re from the creative sector ‘cause none of us are
actually wearing a tie.
Right, that’s me, that’s what I do, that’s who I work for.
I’d like you to imagine that every museum, archive, gallery, library, theatre, and
studio in the country could all be found next to each other. And that they each had
every single item in their collections on display. And imagine if the smallest
organisation’s archives and objects had exactly the same level of visibility and
accessibility as the big nationals. And imagine that all of this material and
information were linked together. Now, hold that thought for a moment. This is a
picture of the internet.
Now, the web finds us stuff, it shows us loads of stuff, and it links to loads of other
stuff. So it’s great at linking things together, but it’s not yet great at making real
meaningful connections between all of the things. That’s left up to the humans to
do. It’s not very good at saying that this thing is like this other thing, or this thing is
different to this other thing. These are not the same things. Paris Hilton is not the
same as the Paris Hilton. But, if you try and find that picture by typing in ‘Paris
Hilton’ or ‘Hilton in Paris’, you have to work your way, as I found a couple of days
ago, through about a hundred thousand of these before you get that. Most people
are probably looking for the one on the left, but I was looking for the one on the
right. We need to do something about this. It shouldn’t just be what’s popular that
is always first. But all of this is possible. This is about as technical as I will get
today; this is the vision of linked data and the semantic web.
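As a minimal sketch of the linked-data idea described above: if each entity carries a statement of what kind of thing it is, the two "Paris Hilton" results can be told apart regardless of popularity. The identifiers, predicate names and data below are hypothetical, chosen only for illustration; they are not taken from the BBC prototype.

```python
# Each fact is a (subject, predicate, object) statement.
# All identifiers and predicates here are invented for illustration.
FACTS = [
    ("ex:paris_hilton_person", "label", "Paris Hilton"),
    ("ex:paris_hilton_person", "type", "Person"),
    ("ex:hilton_paris_hotel", "label", "Paris Hilton"),
    ("ex:hilton_paris_hotel", "type", "Place"),
    ("ex:hilton_paris_hotel", "locatedIn", "ex:paris"),
]


def find(label, entity_type):
    """Return the entities that have this label AND are of this type."""
    labelled = {s for s, p, o in FACTS if p == "label" and o == label}
    typed = {s for s, p, o in FACTS if p == "type" and o == entity_type}
    return sorted(labelled & typed)


print(find("Paris Hilton", "Place"))
print(find("Paris Hilton", "Person"))
```

The same search words give a different, unambiguous answer depending on whether a place or a person is wanted.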
It’s a bit hard to see here but that says ‘door’.
Now if we can tell the web what each thing is and what each thing isn’t... rule
number one: never let Rene Magritte do the tagging.
[Audience laughter]
And if we tell it what set, if you remember back to your Venn diagrams at school,
or group of things it’s part of, and how all these sets relate to each other, then we
can ask new kinds of questions and we can get new kinds of answers.
Such as [shows audience picture] or [shows audience picture]. Probably get zero
results for that, but it’d be worth a try. Or more simply [shows audience picture].
So you should get the idea.
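The Venn-diagram point can be sketched roughly as follows, assuming each item is filed in one set and each set can sit inside a larger one. The set names and members are hypothetical examples, not the contents of the slides.

```python
# Which set sits inside which larger set (child -> parent). Hypothetical.
SUBSET_OF = {
    "ballets": "performances",
    "operas": "performances",
    "performances": "events",
}

# Items filed directly in each set. Hypothetical.
MEMBERS = {
    "ballets": {"Swan Lake"},
    "operas": {"Tosca"},
    "events": {"a trade fair"},
}


def members_of(set_name):
    """All members of a set, including members of the sets nested inside it."""
    found = set(MEMBERS.get(set_name, set()))
    for child, parent in SUBSET_OF.items():
        if parent == set_name:
            found |= members_of(child)
    return found


print(sorted(members_of("performances")))
```

Nobody filed "Swan Lake" under "events", yet asking for events still finds it, because the relationship between the sets is known.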
So, how can we tell the web what these things are? Well let’s call them entities.
Now I think, and I’m very happy to be proved wrong, that we can understand every
entity in the world as being either a person, a place, a collection, an event or a
thing. Now, things are the kind of ‘get out of jail free’ card because it captures
everything that doesn’t fit into all the previous ones. These entities are often
associated with a time, a moment or period in the past, the present, or the future,
and assertions that are made about these things by people and by machines.
These asserters are all assigned various levels of authority and perspective, so
a curator, an expert, a witness, a creator. I’m sure some of you will be used to
making assertions, and I’m sure some people will actually believe them. But all
entities have some sort of physical representation in the real world, whether that’s
a statue, a recording, a video, some ones and zeros on a memory stick, or a
server, or a flash card. And some of these entities are going to have emotional
states associated with them. These can be very different depending on which
character you play in the story.
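One way to picture the entities and assertions just described is the sketch below. It assumes only what the talk states: five kinds of entity, an optional associated time, and assertions made by someone in a particular role. The class and field names are illustrative assumptions, not the actual data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    PERSON = "person"
    PLACE = "place"
    COLLECTION = "collection"
    EVENT = "event"
    THING = "thing"  # the 'get out of jail free' category


@dataclass
class Assertion:
    statement: str
    asserter: str
    role: str  # e.g. "curator", "expert", "witness", "creator"


@dataclass
class Entity:
    name: str
    kind: Kind
    time: Optional[str] = None  # a moment or period, if one applies
    assertions: List[Assertion] = field(default_factory=list)


def by_role(entity, role):
    """Return only the assertions made by asserters in the given role."""
    return [a for a in entity.assertions if a.role == role]


# Hypothetical example data.
churchill = Entity("Winston Churchill", Kind.PERSON)
churchill.assertions.append(Assertion("appears in this recording", "archive staff", "curator"))
churchill.assertions.append(Assertion("I was in the crowd that day", "a member of the public", "witness"))
print(len(by_role(churchill, "witness")))
```

Keeping the asserter and role alongside each statement lets a reader decide how much weight to give it.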
Each physical representation is held somewhere – that’s the Amazon warehouse
by the way, if you wonder where all your stuff comes from – or is displayed
somewhere. And all of these entities will sit somewhere on a spectrum of
availability and affordability, somewhere between free and open, or closed and
expensive.
So, how can we make this vision of connected availability happen?
Starting with the material that we have in our own archives, in our collections, and
the data, if we can classify or tag all of it, if we can digitise it, do this in a semantic
web friendly manner, following some very basic, simple rules and approaches –
there’s nothing more complex than the grammar you would learn in your first few
years of secondary school – make them available and open, then people can find
our stuff. They can make their own assertions about it; they can relate it to other
things. They can tell us things that we don’t know about our own material, which
then adds to the find-ability, the interestingness and the usefulness of it. It’s a
positive feedback loop, it’s a positive cycle. You’re probably thinking this, and,
quite rightly so.
So what are we doing? Well we, the BBC in conjunction with partners, many of
whom are represented in this audience, are trying to create a framework that
makes all of this feasible for any organisation, small or large. We’ve drafted
an overarching data model in conjunction with a number of organisations – this lot
at the moment, but we have many more who are interested. The data model
simply brings together a whole load of different catalogues, classifies and identifies
them in a consistent way, picks out themes, types, sets and relationships within
them, and maps out those connections.
Now this next slide. If you are of a nervous disposition I’d ask you to look away
now, but I’ll only keep it up there for a couple of seconds. This is the data model
which you can’t see there, thank you lights.
[Audience laughter]
Bill’s actually got it tattooed on his inner thigh if you’re interested, and I’m selling
posters at the end at very reasonable prices, so come and see me.
So, turning this vision into something that’s useable and interesting, well we’ve
created a prototype system that aggregates all of these data sets, and
translates them into the categories of people, place, collections, events and things,
and starts to make connections between them, and will eventually enable all of the
other things that I’ve talked about. But at the moment it’s relatively basic.
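The translation step might look something like this in outline, assuming each contributing catalogue uses its own record types and that these are mapped onto the five shared categories. The source names, record types and field names are hypothetical; the prototype's real mappings are not given in the talk.

```python
# Mapping from (source catalogue, its own record type) to a shared category.
# Every entry here is an invented example.
TYPE_MAP = {
    ("opera_house", "production"): "event",
    ("opera_house", "venue"): "place",
    ("library", "author"): "person",
    ("broadcaster", "series"): "collection",
}


def translate(source, record):
    """Convert one catalogue record into the shared shape.

    Record types with no known mapping fall back to 'thing'.
    """
    category = TYPE_MAP.get((source, record["type"]), "thing")
    return {"category": category, "label": record["title"], "source": source}


print(translate("opera_house", {"type": "production", "title": "Swan Lake"}))
print(translate("library", {"type": "manuscript", "title": "An unmapped item"}))
```

Once every record is in the same shape, connections can be made across collections that were never designed to work together.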
So I’d like to show you this system, but I’m afraid I can’t because my developer
broke it last week while ingesting 10 million records from the National Archives. I’ll
have something that I can show you soon. I can show you a slightly shaky
version of it in the break, or come and talk to us afterwards.
But actually the visible bit isn’t the important bit of this; the important bit is bringing
all of these data sets together, being able to translate them. The really clever bit is
a few algorithms that create and associate all of these different things in ways
that a human being could do if they had, I don’t know, 10 million years at their
disposal.
What I was going to show you is a couple of example interfaces that we’ve built
over the last few months that demonstrate the kind of thing you can do on this
platform. They would have looked like this, so here’s the view of the Royal Opera
House; it shows a few things you can explore. Don’t know why Southend Pier is up
there, but there you go.
This is a person page for Winston Churchill. You see it’s just pulling in information
from other sources.
A place. A thing. An event.
And if you can see at the top, it’s beginning to group these things together so the
event is part of tourism ceremonies, trade events, Royal Festival Hall. None of this
has been hand created or curated, all of this is linked, structured data and
algorithms that are saying, ‘this thing here is probably like that thing, and if that
thing’s related to those things then this thing’s probably related to those other
ones’.
[Laughter]
I’ll draw you a diagram.
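In place of the diagram, here is a rough sketch of that reasoning. The talk does not say how the prototype's algorithms work, so this swaps in a simple stand-in: two items count as "probably alike" when they share enough tags (Jaccard overlap), and an item then inherits suggested links from the items it resembles. All names, tags and the 0.5 threshold are assumptions.

```python
def jaccard(a, b):
    """Overlap between two tag sets: shared tags divided by all tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def probably_related(item, tags, related, threshold=0.5):
    """Suggest links for `item` by borrowing the links of similar items."""
    suggestions = set()
    for other, other_tags in tags.items():
        if other != item and jaccard(tags[item], other_tags) >= threshold:
            suggestions |= related.get(other, set())
    # Do not suggest the item itself or links it already has.
    return suggestions - {item} - related.get(item, set())


# Hypothetical example data.
tags = {
    "event_a": {"ceremony", "london", "1951"},
    "event_b": {"ceremony", "london", "1953"},
    "event_c": {"opera"},
}
related = {"event_b": {"Royal Festival Hall"}}

print(probably_related("event_a", tags, related))
```

Here event_a shares two of four tags with event_b, so it picks up event_b's link as a suggestion, while event_c contributes nothing.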
We also wanted to have a bit more of a kind of video-friendly version of it, so
here’s an interface which finds a whole load of videos related to Enid Blyton; when
we’ve got this hooked up with the British Library’s collection it will show you books.
This is a time view which lets you jump from millennium to century, to decade, to
year, to month, to day, and pulling bits of information from everyone’s collections
that relate to that particular moment in time.
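The time view can be sketched as grouping dated records at whichever level of detail is selected. This is an assumed approach, not the prototype's code; the records are invented, and millennia, centuries and decades are simply rounded down to the nearest 1000, 100 or 10 years.

```python
from datetime import date

# Span in years for the coarser levels.
SPANS = {"millennium": 1000, "century": 100, "decade": 10, "year": 1}


def time_key(d, level):
    """Reduce a date to the part that matters at the chosen level."""
    if level in SPANS:
        span = SPANS[level]
        return (d.year // span * span,)
    if level == "month":
        return (d.year, d.month)
    if level == "day":
        return (d.year, d.month, d.day)
    raise ValueError("unknown level: " + level)


def items_at(records, level, moment):
    """Titles of all records that fall in the same period as `moment`."""
    key = time_key(moment, level)
    return [title for title, d in records if time_key(d, level) == key]


# Hypothetical records pooled from several collections.
records = [
    ("item A", date(1951, 5, 3)),
    ("item B", date(1958, 11, 20)),
    ("item C", date(1963, 1, 1)),
]

print(items_at(records, "decade", date(1955, 6, 1)))
```

Jumping from decade to day is then just a matter of calling the same function with a finer level.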
Here are the results from the first database for Swan Lake, breaking them down
into things, events, collections, places. This has only got about probably half of 1%
of the amount of material that it will eventually have. So if we can get some interest
in connections across different people’s collections with the 1%, imagine what
happens when we multiply that by a factor of one hundred.
And then this just lets you kind of create your own view of it, or see what other
people are interested in, in a kind of ‘my favourite things’ page.
But this can only work if it’s much, much bigger and broader than the BBC. All
we’re really trying to do is create standards, frameworks and tools for other people
to use. We can do this because we are funded by you and, you know, 60 million
other people. We should do this because we have engineers, we have archivists,
we have producers, and they’re all generally pretty busy and there’ll be a few less
of them today after today’s announcement, but we feel it’s a fundamental thing that
the BBC should be doing, in the same way that it makes sure that your radio would
work from the peak of the highest Scottish mountain to the lowest valley, maybe.
It must work for everyone – for the smallest organisation or individual, you
know, down, or up, up to the biggest behemoth. So we want people to contribute
data and media to make it available, we can help you understand easy ways to do
that. We need people to play with what we’re creating, try and break it, tell us how
to make it better. Tell us, ‘Ooh, if only it did this thing, suddenly that would fit my
world’.
And we want people to think about how they could use what we’re creating to
supplement the stuff that you’re already creating. Everything that we would pull
together here we would like to be usable by, you know, small websites, by small
exhibitions, by school kids’ projects through to massive national projects.
If you’re still interested, then come and talk to us. I didn’t realise Tony was gonna
be here, otherwise I wouldn’t have used that picture, but sorry Tony, it was the
second one that came up on Google.
If you’re really important, then you can talk to Roly.
Thank you for listening.
[Applause]