Ground truth

From DigitWiki

Impact Bulgarian Demonstrator Dataset

The Bulgarian ground truth, produced by the National Library of Bulgaria (NLB) within the EU-funded Impact project, consists of 1,276 pages in PAGE XML format with an accuracy of 99.95%, that is, at most 5 wrong characters per 10,000 characters.

Impact Czech Demonstrator Dataset

The Czech ground truth, produced by Národní knihovna České republiky (National Library of the Czech Republic, NKC) within the EU-funded Impact project, consists of 5,049 pages in PAGE XML format with an accuracy of 99.95%, that is, at most 5 wrong characters per 10,000 characters.

Impact Dutch Demonstrator Dataset

The Dutch ground truth, produced by the Koninklijke Bibliotheek (KB) within the EU-funded Impact project, consists of 3,439 pages in PAGE XML format with an accuracy of 99.95%, that is, at most 5 wrong characters per 10,000 characters.

Impact Polish Demonstrator Dataset

The Polish ground truth, produced by the Poznań Supercomputing and Networking Center (PSNC) within the EU-funded Impact project, consists of 4,693 pages in PAGE XML format with an accuracy of 99.95%, that is, at most 5 wrong characters per 10,000 characters.

Impact Slovene Demonstrator Dataset

The Slovene ground truth, produced by the National and University Library of Slovenia (NUK) within the EU-funded Impact project, consists of 4,937 pages in PAGE XML format with an accuracy of 99.95%, that is, at most 5 wrong characters per 10,000 characters.

Impact Spanish Demonstrator Dataset

The Spanish ground truth, produced by the Universidad de Alicante (UA) within the EU-funded Impact project, consists of 11,444 pages in PAGE XML format with an accuracy of 99.95%, that is, at most 5 wrong characters per 10,000 characters.
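
The 99.95% accuracy quoted for each dataset can be turned into an error bound with simple arithmetic. The following is a minimal sketch, not part of the Impact tooling; the function name max_errors and its parameters are hypothetical, and it assumes accuracy is measured per character.

```python
from fractions import Fraction


def max_errors(accuracy_percent: str, n_chars: int) -> Fraction:
    """Upper bound on wrong characters among n_chars at the given accuracy.

    accuracy_percent is passed as a string (e.g. "99.95") so that the
    arithmetic is exact rather than subject to floating-point rounding.
    Assumption: accuracy is a per-character measure.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return error_rate * n_chars


# 99.95% accuracy leaves an error rate of 0.05%, i.e. 5 in 10,000.
print(max_errors("99.95", 10_000))  # 5
```

Under the same assumption, a page of 2,000 characters would contain at most 1 wrong character.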