New Frontiers in Auto-translation: The HAH* Solution
An ISyracuseHigh Joint Initiative
Helen Szigeti, ISI
Abby Goodrum, Syracuse University
Helen Atkins, Highwire Press
* HAH: Helen, Abby, and another Helen

Issue: Citedness
• Why aren’t JASIST authors more highly cited than they are?

Problem: Incomprehensibility
• No one can understand articles in JASIST
• Hence, no one cites JASIST
• JASIST authors do not receive large amounts of grant money, lucrative speaking engagements, a smooth path to tenure, or invitations for guest appearances on Oprah

Evidence of the problem
• 1999 JASIS article by HB Babs, HMS Trix, and A Bala: “The synthesis of specialty narratives from co-citation clusters. Part 1: Utilization of a real-time self organizing approach to term co-occurrence and word frequency analysis through collaborative filtering of multidimensional databases.”

Hypothesis: Comprehension is time-consuming
• By the time a reader reaches the end of a JASIST article with a full understanding of the ideas and issues presented, s/he has forgotten why s/he was reading the article in the first place

Goal: Reduce the time needed to understand a JASIST article

Solution: HAH Trans-JASIST Device℠
• Automatically parses out pseudo-scholarly info-babble, leaving only root concepts, stop words, and thinly veiled polysyllabic expletives.*
• “Corporate” Version (2.0; in beta) can also reverse-translate from a simple executive memorandum to a quality scholarly paper suitable for publication in any information science journal.
* Note: ISyracuseHigh is currently working on a related parser that will be capable of capitalizing on these expletives as a means of generating a new method of relevance ranking.

Elements of the Solution: part 1
• HAH Redundancy Reducer (HAR-HAR)
  - Occupational tendency for information scientists to utilize the same data set to publish multiple papers
  - The HAR-HAR takes a work or a corpus of work by a single author and reduces it to a single paragraph (or in some cases, a single phrase)

Elements of the Solution: part 2
• HAH Seuss-O-Mapper (HAH-SOMMore)
  - Our research uncovered a fundamental linguistic key* that underlies all scholarly communication/publication patterns worldwide
  - The HAH-SOMMore uses concept mapping algorithms against the output from the HAR-HAR redundancy reducer to generate a comprehensible, natural language alternative to the original text.
* From the seminal work by Dr. Seuss entitled One Fish, Two Fish, Red Fish, Blue Fish.

Demonstrations of the System
• Academic paper to natural language
• Corporate memo to academic paper

Academic paper to natural language
• “The synthesis of specialty narratives from co-citation clusters. Part 1: Utilization of a real-time self organizing approach to term co-occurrence and word frequency analysis through collaborative filtering of multidimensional databases.” (Babs, Trix, and Bala)
• After reduction: synthesis self-ego to group visual word and free ISI science data through from grant of no-tenure wine damn damn damn
• After mapping to natural language...
Academic paper to natural language
• “A pretty picture we drew by putting ISI data (which we got for free) into visualization software to show that medicine can be considered a sub-category of life sciences (who’da thunk?): We would have done more but we blew our grant money on Merlot and DVDs.”

Corporate memo to academic paper
• “Subject: Unauthorized use of telephone, fax, and email for personal reasons.”
• After reverse translation: “Policy analysis for topical consensus on the roles, rights, and responsibilities of individuals toward digital materials and communication protocols within the corporate learning organization: Optimization of transactional analysis to benchmark performance measures in a networked environment.”

Results
• Although our translation engine has a 93% success rate, it does not solve the problem initially identified by the research team
• Original hypothesis: If readers could understand JASIST articles within a shorter time period, then citations to these articles would increase
• Actual outcome: Once fully comprehended in a reasonable time frame, JASIST articles are even less frequently cited because no worthwhile data, methodologies, or conclusions are discernible

The HAH Axiom: Comprehension works against citedness.

Conclusion
• Do not try to be clear -- just keep doing what you’re doing.

Thank you!
ISyracuseHigh contact information:
Helen Szigeti helen.szigeti@isinet.com
Abby Goodrum aagoodru@syracuse.edu
Helen Atkins something@highwire.org?