Electronic publishing

James Danforth Quayle
Wikipedia – Dan Quayle entry
………..“Contributing greatly to the perception of Quayle's incompetence
was his tendency to make public statements which were either
Self-contradictory
"We don't want to go back to tomorrow, we want to go forward"
Obvious
"For NASA, space is still a high priority"
Geographically wrong
"I love California. I practically grew up in Phoenix."
Fallacious
"It's time for the human race to enter the solar system"
Or painfully confused and inappropriate, as when he addressed the United Negro College Fund,
whose slogan is "A mind is a terrible thing to waste,"
and said
"You take the United Negro College Fund model that what a waste it is to lose one's mind or
not to have a mind is being very wasteful. How true that is."
The Rehabilitation of Dan Quayle
Prediction is very difficult, especially
about the future
Niels Bohr (1885 - 1962)
The future will be better tomorrow
Dan Quayle (1947 - )
Electronic Publishing
Pratt Library School
Univ College London
Chris Beckett VP Sales and Marketing Atypon
Leader in the Provision of E-publishing Solutions
www.atypon.com
Agenda
 The perils of prediction
 What is e-content hosting and how does it fit
into the publishing landscape?
 What do companies like Atypon do?
 Understanding technical aspects of hosting
 How readers navigate to content
 Information overload today
 Morphing content
 The Scale challenge
 Future trends
The Perils of Prediction
Electronic publishing
 ……Paper can be replaced as the storage and
transport mechanism. We can move to
electronic publishing and distribution of
journals, newsletters, indexes, bibliographies,
and so forth. This transition will increase the
speed and lower the cost of dissemination.
……
 The journey from electronic publishing to multimedia publishing is short. Journal articles of
the future could include audio and video clips
when they are appropriate for the situation.
Richard Watson, "Creating and Sustaining a Global Community of Scholars",
Management Information Systems Quarterly, Vol. 18, No. 3, September 1994
Capabilities and the need to invest
 http://www.misq.org/
Multimedia capabilities
University of Chicago press site
 http://www.journals.uchicago.edu/doi/full/10.1086/343751
Annual Reviews site
 http://arjournals.annualreviews.org/doi/suppl/10.1146/annurev.biochem.75.103004.142647
What is required to achieve the potential of
electronic publishing?
INVESTMENT
How Much?
Open Journal Systems (OJS)
 “$400,000 U.S….. is enough to purchase
hosting and support services, using Open
Journal Systems , for 785 journals!
 $509 U.S. per year

Heather Morrison, The Imaginary Journal of Poetic Economics,
Thursday, January 25, 2007: "Stop fighting the inevitable - and
free funds for OA!"
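The per-journal figure quoted above follows from simple division. A minimal sketch of that arithmetic in Python, using only the two numbers on the slide; the variable names are illustrative and not taken from the source.

```python
# Figures quoted on the slide (from Heather Morrison's blog post).
total_budget_usd = 400_000   # total spend on hosting and support
journal_count = 785          # number of journals covered

# Annual cost per journal, before any rounding.
cost_per_journal = total_budget_usd / journal_count

# The slide quotes the whole-dollar part of this value.
print(f"Cost per journal per year: ${cost_per_journal:.2f}")
```

The same division can be repeated for any of the other price points in this section to compare them on a per-title, per-year basis.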
International Consortium for the
Advancement of Academic Publications
 Journal creation from $3,600 Cdn (about
$3,000 U.S.); ongoing journal hosting,
maintenance and conversion from $1,840 Cdn
per year (about $1,500 U.S.)
 About $1,500 U.S. per year
The Aggregators
 MetaPress, Ingenta Connect, Atypon Link
 Probably in the area of $2,000-$5,000 per year
per title
 May include a percentage of any ecommerce
revenues
 Share a single platform
 Share functionality
 Limited bespoke capabilities at this price point.
The customised builders
 HighWire, Scitation (AIP), Atypon, MetaPress
 Pricing more mysterious
 In the case of Atypon at least, pricing is based on a
combination of functional complexity and
amount of content.
 With any of these it is possible to spend $500k-$1m
a year on a site, and it might host only one title
So why the disparity?
$500 - $500k?
Capabilities and scalability
CAPABILITIES
What is the electronic content partner
offering?
Providing sophisticated and flexible
e-publishing solutions
Widen readership
Create new business models
Grow revenue
Take control of your content
Interact with the reader more
effectively
Print vs. Electronic Publishing
E-publishing provides an entirely new set of challenges and
opportunities when compared to print publishing:
No physical storage required in library
No postage or shipping
No constraints on issue size or how articles are “packaged”
Users can access from anywhere (no need to visit library)
Many users can access simultaneously
Content is easier to find due to wide metadata distribution
Creates richer, faster research environment (e.g. easy to follow
both forward and backward references/citations to other
content)
Tech maintenance, changes to production process, archiving
responsibilities, new skill sets required for staff, costs, changes
to library processes
Slide courtesy of Gary Coker – MetaPress
What is a hosting “platform”?
 The core online system upon which hosting
features are built
 Hardware and software
 Differentiated by:
 Integration with production systems,
 types of content supported
 features offered beyond just basic content
access
Slide courtesy of Gary Coker – MetaPress
Example platforms
Wiley InterScience (home grown)
Elsevier ScienceDirect (home grown)
Blackwell Synergy (Atypon)
SpringerLink (MetaPress)
OECD (Ingenta)
OUP Journals (HighWire)
New England Journal of Medicine (HighWire –
moving to Atypon)
Services Offered by Hosting Providers
 Processing of e-content metadata and full text
 Normalization & validation
 Reference parsing & link generation
 Metadata distribution to 3rd parties
 Content indexing
 Browse and search user interfaces
 Access website design, customization, and
branding
 Access control and license management
Slide courtesy of Gary Coker – MetaPress
Services Offered by Hosting Providers….
 Online sales of subscriptions and individual content
objects
 Articles, figures, chapters, chapter sections, books,
video
 Reports: usage, accounting, sales
 Online package management, online subscriber
management
 Marketing features, such as subscriber messaging and
RSS feeds
 Subscriber (library) features, such as authentication,
linking, alerting, and usage reporting
Slide courtesy of Gary Coker – MetaPress
Management Tools
 On the desktop
 Real Time
 Control of
 Business models: the matching of content
bundles, users and contracts
 Look and feel
 Content
 Loading
 Marketing
Model differences – Off the peg vs. Bespoke
Vanilla vs. Tutti Frutti
 The platforms
 Atypon Link; Ingenta Connect; MetaPress;
HighWire(?)
 The bespoke builders
 Atypon, HighWire, Scitation, MetaPress (?);
Ingenta(?)
Some examples…..
www.ingenta.com
www.metapress.com
www.springerlink.com
www.atypon-link.com
http://highwire.stanford.edu/
http://content.nejm.org/
http://www.annualreviews.org/
http://www.cfapubs.org/
http://www.aluka.org/
Processing Digital Content
 The publisher’s production process produces metadata
and full text files, which are transmitted to the hosting
provider:
 Files are produced in standard formats that the
hosting provider can process:
 Metadata: SGML, XML, etc.
 Full text: PDF, HTML, TeX, MathML, etc.
 There is a great deal of variation in the specific
metadata fields provided by the publisher and how
those fields are processed by the hosting provider,
requiring a high level of integration between the
publisher’s production process and the hosting
provider
Slide courtesy of Gary Coker – MetaPress
Metadata
 Metadata is all the information attributes that describe an
article
 Article title, author names, author affiliations, journal
name, chronology / enumeration, abstract, subjects,
keywords, etc.
 NOT the full text
 Most users discover content via metadata, so the richer
the metadata provided, the more likely users will find the
content
 Trends:
 Wider distribution of metadata (CrossRef, Google
Scholar, etc.)
 User-generated metadata (tagging, ratings, reviews,
etc.)
Slide courtesy of Gary Coker – MetaPress
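As an illustration of what an article metadata record might look like once a hosting provider has read it from a publisher's XML, here is a minimal Python sketch. The XML element names, the sample values and the `parse_article` helper are all hypothetical; real publisher feeds vary widely, as noted in the Processing Digital Content slide, and this is not any particular vendor's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical publisher metadata file; element names are assumptions.
SAMPLE_XML = """<article>
  <title>Example Article Title</title>
  <journal>Example Journal</journal>
  <volume>18</volume>
  <issue>3</issue>
  <author><name>A. Author</name><affiliation>Example University</affiliation></author>
  <keyword>hosting</keyword>
  <keyword>metadata</keyword>
  <abstract>  A short   abstract. </abstract>
</article>"""

# Fields this sketch treats as mandatory (an assumption for illustration).
REQUIRED_FIELDS = ("title", "journal")


def normalize(text):
    """Collapse runs of whitespace; treat a missing element as empty."""
    return " ".join((text or "").split())


def parse_article(xml_text):
    """Turn one metadata file into a normalized, validated record."""
    root = ET.fromstring(xml_text)
    record = {
        "title": normalize(root.findtext("title")),
        "journal": normalize(root.findtext("journal")),
        "volume": normalize(root.findtext("volume")),
        "issue": normalize(root.findtext("issue")),
        "abstract": normalize(root.findtext("abstract")),
        "authors": [normalize(a.findtext("name")) for a in root.findall("author")],
        "keywords": [normalize(k.text) for k in root.findall("keyword")],
    }
    # Validation: reject records that lack a required field.
    missing = [field for field in REQUIRED_FIELDS if not record[field]]
    if missing:
        raise ValueError(f"missing required metadata: {missing}")
    return record


record = parse_article(SAMPLE_XML)
print(record["title"], record["keywords"])
```

Note that the record holds descriptive attributes only; the full text would be stored and indexed separately.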
Full Text
Full text refers to the entire contents of an e-journal article,
including the actual text as well as images, graphs, tables, etc.
Adobe’s PDF format is the de facto standard full text format
Exactly duplicates the print journal, providing consistency
Self-contained (a single downloadable file, unlike HTML)
But easier to share with others in violation of license
agreements
Trends:
Move towards FT XML in the production process allows for
disaggregated product delivery.
Content-specific formats that provide richer media possibilities
Slide courtesy of Gary Coker – MetaPress
Processing Digital Content
 For content that is not “born digital”, such as historical
backfiles, digitization of print is often necessary
 Digitization can be performed by the publisher, by the
hosting provider, or by a 3rd party digitization service
 Human- vs. computer-based digitization
 Metadata and full text files are transferred regularly from
the publisher (or their production provider) to the hosting
provider:
 Most transfers occur online, such as via FTP
 Large “initial loads” may be delivered via physical
media, such as DVDs
 Speed of processing by the hosting provider is critical
Slide courtesy of Gary Coker – MetaPress
Processing Digital Content – Production
 Once metadata and full text files are in-house
at the hosting provider, they must be processed
into a form that allows online access:
 Normalization and validation software
transforms text files into records in the
hosting platform’s content database
 A metadata index is created for end user
browsing and searching of metadata
 A full text index is created for end user
searching of full text content
 Reference linking is enabled
Slide courtesy of Gary Coker – MetaPress
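The metadata index mentioned above can be pictured as an inverted index: a map from each term to the records that contain it. The following Python sketch is only an illustration under simple assumptions (lower-cased word tokens, AND semantics for multi-word queries, made-up record IDs and titles); production hosting platforms use dedicated search engines rather than anything this small.

```python
import re
from collections import defaultdict

# Hypothetical normalized records keyed by an internal ID.
records = {
    "art-1": {"title": "Hosting scholarly journals online", "keywords": ["hosting"]},
    "art-2": {"title": "Metadata distribution and discovery", "keywords": ["metadata", "hosting"]},
}


def tokenize(text):
    """Split text into lower-case alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each term in the title or keywords to the IDs of matching records."""
    index = defaultdict(set)
    for record_id, record in records.items():
        terms = tokenize(record["title"])
        for keyword in record["keywords"]:
            terms.extend(tokenize(keyword))
        for term in terms:
            index[term].add(record_id)
    return index


def search(index, query):
    """Return IDs of records containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


index = build_index(records)
print(sorted(search(index, "hosting")))
print(sorted(search(index, "metadata hosting")))
```

A full text index works on the same principle but is built from the article body, which is why it is typically much larger than the metadata index.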
Processing Digital Content – Discoverability
 Distribution of processed content to 3rd parties
 Full text or metadata
 Aggregators, e.g. database vendors, e-content gateways (a mix of full text and
metadata)
 Search engines (e.g. Google)
 Updates to subscribers’ access control records
 Generation of alerts and RSS feeds to inform
users of the content’s availability
 Creation and indexing according to taxonomies
Discoverability – Recommended articles
Marketing
 Inclusion of additional content
 Supplementary data files, video, podcast
 Provision of sophisticated business model
support, beyond simple subscriptions, in order
to support non-institutional sales channels
 Branding and site editing capabilities
 Portal building capabilities.
 E-commerce capabilities
Marketing - Multimedia capabilities
University of Chicago press site
 http://www.journals.uchicago.edu/doi/full/10.1086/343751
Annual Reviews site
 http://arjournals.annualreviews.org/doi/suppl/10.1146/annurev.biochem.75.103004.142647
Access Control & License Management
 Access control ensures that only authorized users gain
access to only the content to which they are entitled
 Hosting providers provide support for a multitude of
access control models, including support for free content
and pay-per-view
 User name and password
 IP range
 Shibboleth
 Athens
 OpenID
 Integration with 3rd party access control entities
(subscription agents, libraries, Athens, etc.)
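Of the access control models listed, IP range checking is the simplest to sketch. The Python example below uses the standard library `ipaddress` module; the licensee name, the address ranges (taken from reserved documentation blocks) and the `licensee_for_ip` helper are hypothetical and only illustrate the idea, not how any particular hosting platform implements it.

```python
import ipaddress

# Hypothetical licence table: institution -> registered IP ranges (CIDR).
LICENSED_RANGES = {
    "example-university": ["192.0.2.0/24", "198.51.100.0/25"],
}


def licensee_for_ip(client_ip, licensed_ranges=LICENSED_RANGES):
    """Return the licensee whose registered range contains client_ip, else None."""
    address = ipaddress.ip_address(client_ip)
    for licensee, cidrs in licensed_ranges.items():
        for cidr in cidrs:
            if address in ipaddress.ip_network(cidr):
                return licensee
    return None


# An on-campus address is recognised; an off-campus one is not.
print(licensee_for_ip("192.0.2.17"))
print(licensee_for_ip("203.0.113.5"))
```

A request that matches no range would then fall through to another model from the list, such as user name and password, Shibboleth or pay-per-view.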
How Readers Navigate to Content
Simon Inger and Chris Beckett
Gateways and Hosts
[Diagram: Primary Scholarly Content, reached via Google or other search engine; Subject A&I; Web Gateway (Sub Agent, Ingenta, Portals); Library Web space]
When you need to find a specific online journal article, and you already have a reference or citation, where
do you start your search?
[Chart: relative score, on a scale of 0 to 4.5, for each starting point: a specialist bibliographic database; library web pages; a specialist site for your subject…; a publisher’s web site….; the journal’s homepage; a journals gateway….; a general web search engine; a scholarly society web page]
When you wish to view the latest issues of your core journals, how do you navigate on the web to those
journals?
[Chart: relative score, on a scale of 0 to 5, for each route: a specialist bibliographic database; library web pages; a specialist site for your subject…; a key research group…; a departmental listing…; a publisher’s web site….; email based alerts; the journal’s homepage; a journals gateway….; a general web search engine; a scholarly society web page]
When you need to do a search for articles on a specific subject, where on the web do you start that search?
[Chart: relative score, on a scale of 0 to 5, for each starting point: a specialist bibliographic database; library web pages; a specialist site for your subject…; a key research group…; a departmental listing…; a publisher’s web site….; email based alerts; the journal’s homepage; a journals gateway….; a general web search engine; a scholarly society web page]
Subject A&I Gateway: CSA Illumina
http://www.csa.com
Search Engine Gateway: Google
http://www.google.com
Sub Agent Gateway: SwetsWise
http://www.swetswise.com
Library Gateway: Chicago, USA
http://www.lib.uchicago.edu/e/db/eresources.html
Three Types of “Aggregator”
 Hosts – the content owners’ or their service
providers’ web sites
 Gateways – web sites that index and link to the
hosts
 Full text aggregators – companies that license
publishers’ content, host it and sell it as a
collection
 See paper on aggregators:
http://dx.doi.org/10.1087/095315101753141383
Wrap-Up
 The world has settled on
 A relatively small number of scholarly
content hosts: aggregations of journals and
books representing many publishers, and
publisher web sites
 A relatively small number of gateways to
content with different unique selling points
 Library web sites, which are effectively
gateways themselves
The Role of the Librarian
 For most scholarly e-resources, the great
majority of usage comes from library-enabled
site-licences
 There are two distinct points to be made here:
 Library web environments are a major
starting point for readers
 Libraries are also responsible for arranging
licences so that the content appears to be
free of charge to the user
 The user can access the content in
addition to merely discovering it
What Librarians are Trying to Achieve
(Broadly)
 Provide and direct patrons to appropriate
resources
 Appropriate, in this context, means quality,
reliable sources of information
 Maximise usage and return on investment in
these resources
Information overload today
 The first learned societies and journals could encompass all
disciplines
 Today:
 ~90% of the scientists who have ever lived are alive today
 c. 3% annual growth in articles since 1981
 c. 1.5M articles published in 2007 (Mabe, 2007)
 Specialisation is inevitable
 Even then, in most fields a researcher can’t cover it all
Slide courtesy of Sally Morris – Morris Associates
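To get a feel for what c. 3% annual growth implies, the Python sketch below computes the doubling time and the cumulative growth between 1981 and 2007. It assumes the rate held constant and compounded once a year, which the slide does not state, so the results are indicative only.

```python
import math

annual_growth = 0.03        # c. 3% per year, as quoted on the slide
start_year, end_year = 1981, 2007

# Years needed for annual article output to double at a constant rate.
doubling_years = math.log(2) / math.log(1 + annual_growth)

# Cumulative growth factor over the period under the same assumption.
growth_factor = (1 + annual_growth) ** (end_year - start_year)

print(f"Doubling time: {doubling_years:.1f} years")
print(f"Growth factor {start_year}-{end_year}: {growth_factor:.2f}x")
```

Under these assumptions output doubles in a little under a quarter of a century, so a researcher late in a career faces roughly twice the annual volume of new articles compared with when they started.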
Output of science & technology articles
[Chart: article output per year, on a scale of 0 to 800,000, from 1995 to 2005, for European Union, United States, Asia-10 and World total]
Slide courtesy of Mike Mabe - STM
Making it easier to do what you already do
 THEN: Letter, phone / NOW: Email
 THEN: Paper submission, post to/from publisher and reviewers / NOW: Online journal submission and peer review
 THEN: Print journals / NOW: E-journals
 THEN: Printed indexes / NOW: Online search engines
 THEN: Looking up citations / NOW: Citation linking
 THEN: Recommendation, citation / NOW: Human (or machine) tagging
Slide courtesy of Sally Morris – Morris Associates
Making it possible to do completely new
things
 Distributed data collection, sharing and
analysis
 New forms of communication: blogs, wikis etc
 Developing ideas as a group/community
 The possibilities of behavioural tracking
Slide courtesy of Sally Morris – Morris Associates
The changing role of data
 Data collection
 Distributed/international
 Previously unimaginable scale
 E.g. Global Ocean Observing System
(http://www.aoml.noaa.gov/phod/goos.php)
 Data sharing
 E.g. Dataverse Project (http://thedata.org/)
 Data analysis
 Previously unimaginable speed and complexity
 E.g. SETI project (http://setiathome.berkeley.edu)
 Data (and text) mining (e.g. BioText,
http://biotext.berkeley.edu/)
Slide courtesy of Sally Morris – Morris Associates
Where do journals and articles fit in all of this?
 The individual article
 Should we insist on a single, fixed, ‘version of
record’?
 …or should it be an evolving agglomeration
of data, descriptive text, discussion, and other
relevant media?
Slide courtesy of Sally Morris – Morris Associates
The end
cbeckett@atypon.com