<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Citation parsing</title>
	<atom:link href="http://jakoblog.de/2008/01/24/citation-parsing/feed/" rel="self" type="application/rss+xml" />
	<link>http://jakoblog.de/2008/01/24/citation-parsing/</link>
	<description>The weblog of Jakob Voß</description>
	<lastBuildDate>Mon, 06 Feb 2012 18:28:02 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
	<item>
		<title>By: erik senst</title>
		<link>http://jakoblog.de/2008/01/24/citation-parsing/comment-page-1/#comment-131936</link>
		<dc:creator>erik senst</dc:creator>
		<pubDate>Thu, 11 Dec 2008 11:26:26 +0000</pubDate>
		<guid isPermaLink="false">http://jakoblog.de/2008/01/24/citation-parsing/#comment-131936</guid>
		<description>Half a year ago I developed a fuzzy-based concept for extracting citation elements (article/chapter title, journal/book title, author, vol., no., pages). Citations can be checked title by title in Google Scholar/Books and SFX, for example, and can be converted to EndNote format (for import into Zotero or Citavi). Some parts of the concept can be tested in the free Flash/PHP web tool:

&lt;a href=&quot;http://www.erik-senst.de/literaturlistenpruefer&quot; title=&quot;Free Citation Parser&quot;&gt;You can paste your own list instead of the example list (after refreshing the page)&lt;/a&gt;</description>
		<content:encoded><![CDATA[<p>Half a year ago I developed a fuzzy-based concept for extracting citation elements (article/chapter title, journal/book title, author, vol., no., pages). Citations can be checked title by title in Google Scholar/Books and SFX, for example, and can be converted to EndNote format (for import into Zotero or Citavi). Some parts of the concept can be tested in the free Flash/PHP web tool:</p>
<p><a href="http://www.erik-senst.de/literaturlistenpruefer" title="Free Citation Parser">You can paste your own list instead of the example list (after refreshing the page)</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jakob</title>
		<link>http://jakoblog.de/2008/01/24/citation-parsing/comment-page-1/#comment-51146</link>
		<dc:creator>Jakob</dc:creator>
		<pubDate>Mon, 12 May 2008 19:27:57 +0000</pubDate>
		<guid isPermaLink="false">http://jakoblog.de/2008/01/24/citation-parsing/#comment-51146</guid>
		<description>A project that I have been meaning to do for a long time is parsing citations in Wikipedia articles. There will be a paper on citation analysis and Wikipedia at &lt;a href=&quot;http://wikimania2008.wikimedia.org&quot;&gt;this year&#039;s Wikimania&lt;/a&gt;, but it is still pretty limited. Most citations in Wikipedia are in free text with some formatting, as in any other paper. I will try ParsCit, but not before autumn, I think.</description>
		<content:encoded><![CDATA[<p>A project that I have been meaning to do for a long time is parsing citations in Wikipedia articles. There will be a paper on citation analysis and Wikipedia at <a href="http://wikimania2008.wikimedia.org">this year&#8217;s Wikimania</a>, but it is still pretty limited. Most citations in Wikipedia are in free text with some formatting, as in any other paper. I will try ParsCit, but not before autumn, I think.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Min-Yen Kan</title>
		<link>http://jakoblog.de/2008/01/24/citation-parsing/comment-page-1/#comment-50972</link>
		<dc:creator>Min-Yen Kan</dc:creator>
		<pubDate>Mon, 12 May 2008 08:01:26 +0000</pubDate>
		<guid isPermaLink="false">http://jakoblog.de/2008/01/24/citation-parsing/#comment-50972</guid>
		<description>ParsCit (mentioned in your P.P.S.) is now used to power CiteSeer^x. I (and past colleagues) have been collaborating closely with the PSU team on extensive field testing of the system.

ParsCit has been successfully used to parse over 20 million citations within the CiteSeer^x framework (I&#039;m quoting Isaac Councill on this figure).

We (especially I) would be very happy to get further feedback on how the ParsCit package can help others in the community.

Cheers,

Min</description>
		<content:encoded><![CDATA[<p>ParsCit (mentioned in your P.P.S.) is now used to power CiteSeer^x. I (and past colleagues) have been collaborating closely with the PSU team on extensive field testing of the system.</p>
<p>ParsCit has been successfully used to parse over 20 million citations within the CiteSeer^x framework (I&#8217;m quoting Isaac Councill on this figure).</p>
<p>We (especially I) would be very happy to get further feedback on how the ParsCit package can help others in the community.</p>
<p>Cheers,</p>
<p>Min</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://jakoblog.de/2008/01/24/citation-parsing/comment-page-1/#comment-50464</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Thu, 08 May 2008 13:48:36 +0000</pubDate>
		<guid isPermaLink="false">http://jakoblog.de/2008/01/24/citation-parsing/#comment-50464</guid>
		<description>ParsCit seems pretty exciting to me.

My interest in this is with the Umlaut open-source link resolver--I&#039;d like to allow users to paste in natural-language citations and then have Umlaut services trigger. Right now, in every link resolver I know of, users need to enter the fields themselves.</description>
		<content:encoded><![CDATA[<p>ParsCit seems pretty exciting to me.</p>
<p>My interest in this is with the Umlaut open-source link resolver&#8211;I&#8217;d like to allow users to paste in natural-language citations and then have Umlaut services trigger. Right now, in every link resolver I know of, users need to enter the fields themselves.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter</title>
		<link>http://jakoblog.de/2008/01/24/citation-parsing/comment-page-1/#comment-28279</link>
		<dc:creator>Peter</dc:creator>
		<pubDate>Fri, 25 Jan 2008 11:26:34 +0000</pubDate>
		<guid isPermaLink="false">http://jakoblog.de/2008/01/24/citation-parsing/#comment-28279</guid>
		<description>Funny, some weeks ago I did some experimentation with the Citebase module. The quality of the parsed results varied widely: regular journal articles (in APA style) don&#039;t seem to be problematic, but the results get significantly worse for proceedings or chapters in edited books. I&#039;d like to know what algorithms Google, CiteSeer (or ISI) are using...

Peter</description>
		<content:encoded><![CDATA[<p>Funny, some weeks ago I did some experimentation with the Citebase module. The quality of the parsed results varied widely: regular journal articles (in APA style) don&#8217;t seem to be problematic, but the results get significantly worse for proceedings or chapters in edited books. I&#8217;d like to know what algorithms Google, CiteSeer (or ISI) are using&#8230;</p>
<p>Peter</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: till</title>
		<link>http://jakoblog.de/2008/01/24/citation-parsing/comment-page-1/#comment-28262</link>
		<dc:creator>till</dc:creator>
		<pubDate>Fri, 25 Jan 2008 07:46:51 +0000</pubDate>
		<guid isPermaLink="false">http://jakoblog.de/2008/01/24/citation-parsing/#comment-28262</guid>
		<description>How is &lt;a href=&quot;http://en.wikipedia.org/wiki/Institute_for_Scientific_Information&quot;&gt;ISI&lt;/a&gt; doing that nowadays? Their business is built on citations... When I visited them about 9 years ago, they processed mostly paper articles (that must have changed?!?). They scanned them and marked citations on the images manually, then ran OCR on those parts and (if I remember correctly) parsed the OCR results automatically, with manual postprocessing afterwards to check the results. I was quite impressed by the highly ergonomic software they had developed for the whole process.
What are they doing today to get citations? I think they must have good parsers...</description>
		<content:encoded><![CDATA[<p>How is <a href="http://en.wikipedia.org/wiki/Institute_for_Scientific_Information">ISI</a> doing that nowadays? Their business is built on citations&#8230; When I visited them about 9 years ago, they processed mostly paper articles (that must have changed?!?). They scanned them and marked citations on the images manually, then ran OCR on those parts and (if I remember correctly) parsed the OCR results automatically, with manual postprocessing afterwards to check the results. I was quite impressed by the highly ergonomic software they had developed for the whole process.<br />
What are they doing today to get citations? I think they must have good parsers&#8230;</p>
]]></content:encoded>
	</item>
</channel>
</rss>

