Post on 23-Mar-2023
Global Dynamics
of Social Policy CRC 1342
Gabriella Skitalinskaya Nils Düpont
OCR Report
Bremen, February 2021
No.
7W
eSI
S —
Tec
hnic
al p
aper
s
Gabriella Skitalinskaya, Nils DüpontOCR ReportSFB 1342 Technical Paper Series, 7Bremen: SFB 1342, 2021
SFB 1342 Globale Entwicklungsdynamiken von Sozialpolitik / CRC 1342 Global Dynamics of Social Policy
Postadresse / Postaddress: Postfach 33 04 40, D - 28334 Bremen
Website: https://www.socialpolicydynamics.de
[DOI https://doi.org/10.26092/elib/1517][ISSN 2700-0389]
Gefördert durch die Deutsche Forschungsgemeinschaft (DFG) Projektnummer 374666841 – SFB 1342
in alphabetical orderNils Düpont 0000-0002-4766-9540Gabriella Skitalinskaya 0000-0001-5006-6196
[3]SFB 1342 WeSIS – Technical Papers No. 7
OCR RepORtOCR RepORt
Gabriella Skitalinskaya, Nils Düpont
Index
1.IntROduCtIOn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5
2.tOOls OveRvIew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5
3.expeRImental setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8
3.1. Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1. Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2. Output files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4.COmpaRatIve analysIs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11
4.1. Plain text extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4.2. hOCR extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.3. Table extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
5.COnClusIOn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .14
6. pRepROCessIng COnsIdeRatIOns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .15
7.OtheR tOOls and lInks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .15
[4]
OveRvIew and ReCOmmendatIOnsOveRvIew and ReCOmmendatIOns
Based on sample files of the first co-creation workshop on “Optical Character Recognition(OCR)” on October 30th, 2018, we compare the results of different tools that are avail-able for OCR as of April 2019. Our comparison of optical character recognition software includes:
» OCR engines that do the actual character identification » Layout analysis software that divides scanned documents into zones suitable for OCR » Graphical interfaces to one or more OCR engines
Depending on the capability of each solution, the tools had to perform three tasks:
» Extract plain text » Recognize the text style and its structure (hOCR) » Extract tables incl. the proper layout and corresponding cell data
In general, OCR remains a challenge in computer science and requires a huge amount of training algorithms to achieve sufficient accuracy. While some packages claim to achieve between 97% and 99% accuracy, note that these rates are based on character errors, not word errors.
The tools that were tested are Google Docs OCR, Tesseract, ABBYY FineReader, PDF OCR X, and Adobe Acrobat. They differ widely ranging from command-line tools to ones with advanced graphical user interfaces; some include the option to preprocess images, others not; likewise, some are free and open-source tools, some are commercial software. Depending on the task at hand, different tools achieve satisfiable results: for plain text ex-traction Adobe Acrobat performed quite well. While it is easy to apply, the advantage is that Acrobat is already installed on many CRC member’s working machines. Regarding hOCR, ABBYY FineReader outperforms other tools. The same is true for extracting tables. Being more than 20 years on the market, ABBYY is the most sophisticated and advanced OCR tool offering the most flexibility. Yet, it is also the most expensive one.
Tesseract as a free and open-source tool may perform sufficiently well if the projects are able to control the workflow right from the beginning. If projects plan on digitizing sources and even can control the scanning thereby ensuring a high quality of the source material, Tesseract may be a choice as well.
As a general rule, the better the quality of the source material the better the OCR results. Projects are therefore well advised to acquire the best source material as possible and to clearly define the objective and desired output of their OCR task. Depending on the goal, the amount of material, the standardization and repeatability of the task, projects are also well advised to see if OCR is worth the effort at all – or if the same goal could be achieved by human codings as well. As OCR requires a huge amount of resources for training the al-gorithms to achieve accuracy and quality, A01 cannot contribute to the development of new OCR engines or algorithms per se but can help in finding a proper solution for a project’s tasks and assist in setting up a workflow.
[5]SFB 1342 WeSIS – Technical Papers No. 7
1.1.IntROduCtIOn IntROduCtIOn
On October 30, 2018, the first co-creation session on “Optical Character Recognition” (OCR) took place. During the workshop three points became clear:
1. Vanilla text OCR is not considered a major problem. Projects already have achieved satisfying results for plain text extraction with the tools they have available (i.e. Adobe Acrobat).
2. Extracting figures from PDFs is of interest.3. Extracting tables is the main issue with which the projects are struggling.
OCR in general is an emerging field in computer science and numerous tools are already available. Most of them are commercial tools especially geared towards OCR (e.g. ABBY FineReader), others are “in-built” functions of broader applications (e.g. Adobe Acrobat). Recently, there has been increased research and interest in open-source solutions (e.g. tesseract). All tools differ widely regarding their main objective, their flexibility and their user-friendliness ranging from simple but powerful command-line tools to advanced graphical user-interfaces (GUI).
As OCR requires a large amount of training of the algorithm(s) with many images of each character, simple text recognition has matured to a high degree of accuracy. The difficulty, however, still lies in identifying the structure of a text (font size, captions, two-column layouts etc.) or recognizing tables. Furthermore, the success of an OCR heavily depends on the quality of the source material. For this reason, some tools apply preprocessing techniques to increase picture quality, some can be used to build a workflow around them. Yet, they cannot add what is already lost when having distorted images and blurry scans.
Finally, each approach comes at costs: tools with an advanced GUI are very flexible to fit different images and adjust what is to be recognized. This, however, crosses with recognizing many images at once in a batch-mode. Batch-mode, on the other hand, increases the risk of erroneous OCR if not all images are of same quality. In sum, the “best tool” for OCR heavily depends on the project’s task at hand and the given source material.
For this reason, we will focus our report on comparing tools based on the examples pro-vided by the projects during the workshop by the participants. We will highlight the pros and cons of each tool providing examples of text/table recognition for the files provided.
We have selected the following tools for analysis: Google Docs OCR, Tesseract, ABBYY FineReader, PDF OCR X, Adobe Acrobat. To validate the accuracy, reliability, and perfor-mance of the chosen packages several experiments on real data collected from the projects were performed. In Chapter 4 we discuss the experimental setup and the obtained dataset. In Chapter 5 the qualitative OCR results achieved by the OCR packages/services are re-ported. Subsequently, in Chapter 6 we discuss the results. Chapter 7 contains recommenda-tions based on the outcomes of the analysis. Finally, in Chapter 8 we provide useful links to all the mentioned packages and software tools. The report, thus, gives sufficient information, so that each project may look for the best solution based on their needs.
2.2.tOOls OveRvIewtOOls OveRvIew
Performance of OCR software packages depend on many factors e.g. language, script, imaging conditions, camera orientation, lighting, quality of the printing, handwritten versus types, cursive versus non-cursive. Also, one must take into account the main goals of the
[6]
OCR, whether it is pure OCR (converting documents/images with typed words into machine-readable text) or extracting structured data (layout or tables).
Different OCR engines have different strengths; in this report we therefore explore the ca-pabilities of the following solutions: Google Docs OCR, Tesseract, ABBYY FineReader, PDF OCR X, and Adobe Acrobat. They include free and commercial products and represent high quality solutions in different tasks.
Table 1. Considered OCR packages overview
OS System License Languages Output Formats Table recognition Notes
Tesseract1 WindowsMacLinux
Apache 100+ Text, hOCR, PDF, others
No
Google Drive OCR
Browser Free 200+ Text No Files have to be less than 2 Mb. Output as Google Document.
ABBYY FineReader2
WindowsMacLinux
Proprietary 192,+Fraktur
DOC, DOCX, XLS, XLSX, PPTX, RTF, PDF, HTML, CSV, TXT, ODT, DjVu, EPUB, FB2
Yes
Adobe Acrobat WindowsMac
Proprietary 46 TXT, PDF, hOCR, XLS, PPTX, XML
Yes
gImageReader WindowsMacLinux
GPL 100+ TXT, hOCR, PDF No Front End to Tesseract OCR engine
PDF OCR X Community Edition
WindowsMac
Proprietary 100++ Fraktur
TXT,PDF No Drag and drop UI. Works only for one-pagers.
OCRopus MacLinux
Apache All languages using Latin script,+ Fraktur
TXT, hOCR, PDF No Pluggable framework under active development
Tabula WindowsMacLinux
MIT - CSV Yes Requires JavaConverts tables to csv only for previously OCRed files
pdfsandwich3 MacLinux
GPL 100+ TXT, PDF No Wrapper over tesseract, unpaper
Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License, Version 2.0, and development has been spon-sored by Google since 2006. Tesseract is considered one of the most accurate open-source OCR engines.
Version 4 adds a recurrent neural networks-based OCR engine and models for many ad-ditional languages and scripts, bringing the total to 116 languages. Additionally, scripts for 37 languages are supported making it possible to recognize a language by using the script it is written in.
1 https://opensource.google.com/projects/tesseract2 https://www.abbyy.com/en-eu/finereader/3 http://www.tobias-elze.de/pdfsandwich/
[7]SFB 1342 WeSIS – Technical Papers No. 7
Tesseract is executed from the command-line interface. While Tesseract is not supplied with a GUI, there are many separate projects which provide a GUI for it. Common examples are OCRFeeder or gImageReader. In addition, R and Python wrapper libraries exist.
Google Docs OCR is an easyto-use and highly available OCR service offered by Google within the Google Drive service. One can convert different types of image data into editable text data using Google Drive. Once we upload an image or a PDF file to the Google Drive, we can start the OCR conversion by right-clicking on the file to select “Open with Google Docs” item, then the image is inside a Google Doc document and the extracted text is right below the image.
One can convert .JPEG, .PNG, .GIF, or PDF (multipage documents) files, though the file size should be 2 MB or less. For best results the text should be at least 10 pixels high. Docu-ments must be right-side up. If the image is facing the wrong way, rotating it before upload-ing it to Google Drive is necessary.
Google Drive automatically detects the language of the document.
ABBYY FineReader as an advanced OCR software system considered to be the most diverse and functional commercial option available. It has been improving the main functionalities of optical character recognition for many years, providing promising results in text retrieval from digital images. The underlying algorithms of ABBYY FineReader have not yet been il-lustrated to the research community, and the package is not available as open-source code. Researchers and developers can access the ABBYY FineReader OCR by two different ways:
» The ABBYY FineReader SDK which is available at https://www.abbyy.com/resp/promo/ocr-sdk/, and
» employing a web browser to try it over the Internet at https://finereaderonline.com/en-us/Tasks/Create
Adobe Acrobat Pro is a software suite to foremost work on PDFs but including an OCR system as well. It is used to convert scanned documents, PDF documents, and image docu-ments into editable/searchable documents. It comes in different versions (the most recent version is Acrobat XI Pro). Though it has fewer language options than ABBYY FineReader, Adobe Acrobat Pro is a more pervasive software, partially because it is less academic, and more business-oriented. In addition, on many computers in the CRC it is pre-installed as part of the university’s “Grundaustattung”.
gImageReader is a simple front-end to Tesseract.On top of the features provided by Tesseract gImageReader includes:
» manual recognition area definition; » recognized text displayed directly next to the image; » post-processing the recognized text including spellchecking.
PDF OCR X Community Edition is a simple drag-and-drop utility that converts single-page PDFs and images into text documents or searchable PDF files. It uses advanced OCR tech-nology to extract the text of the PDF even if that text is contained in an image. The Commu-nity Edition supports single-page PDFs and images. For multi-page PDFs and batch conver-sion features one needs to upgrade to the Enterprise Edition.
OCRopus is a free document analysis and optical character recognition system released under the Apache License v2.0 with a very modular design using command-line interfaces. The main components of OCRopus are:
» analysis of the document layout;
[8]
» optical character recognition; » use of statistical language models.
By default, OCRopus comes with a model for English texts and a model for text in Fraktur. These models refer to the script and are largely independent of the actual language. New characters or language variants can be trained either new or in addition. Recent text recog-nition is based on recurrent neural networks (LSTM) and does not require a language model.
Tabula is a drop and drop tool which allows to extract data in CSV-format, through a simple web interface. Windows & Linux users will need a copy of Java installed. Though, Tabula has some limitations:
» Scanned PDFs: Tabula only works on text-based PDFs, not scanned documents. » Multi-line rows: PDFs with multi-line rows (word wrapped text) are often mis-detected,
particularly in tables without graphic row separators. » Automatic table detection is not available. For now, the user has to do a manual
rectangular selection around the candidate tables.
pdfsandwich is a command line tool which is useful for OCR scanned books or journals. It is able to recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script which calls the following binaries: unpaper, convert, gs, hocr2pdf, and tesseract. It is known to run on Unix systems and has been tested on Linux and MacOS X. It supports parallel processing on multiprocessor systems.
pdfsandwich provides a number of preprocessing procedures to enhance the quality of the scanned pages before text recognition. Most OCR specific preprocessing options are provided via the program unpaper, such as layout optimization, removal of dark edges, and straightening of skewed scans (deskewing). The layout specification, which is switched off by default, allows the options single for each pdf page containing a single scanned page, and double. Note that in certain circumstances rigid preprocessing options can substantially deteriorate results. For instance, if the wrong layout specifications are provided, whole sub-pages might get filtered out, or figures might be considered visual noise and disappear. If the scan quality is good, it is advisable to completely switch off preprocessing. Simple deskewing of a rotated page (without sub-pages) can be obtained by convert, without any involvement of unpaper.
3.3.expeRImental setupexpeRImental setup
For the experiments we have used a dataset consisting of files which each project shared with the group during the workshop. The provided examples are available upon request. Since most of the documents contained multiple pages, we have selected up to four representative pages from each document. This was done to simplify the comparison of the output of each tool.
3.1. Dataset
The following table summarizes the data files used for the comparison and gives additional information about characteristics that affect the results of the OCR.
[9]SFB 1342 WeSIS – Technical Papers No. 7
Table 2. Data characteristics
Incl. Tables DPI Quality Language
A01-1 world-book-1928.pdf - 100 scan English(main) + other
A01-2 world-book-1950.pdf - scan English(main) + other
A01-3 world-book-1968.pdf - 100 scan English(main) + other
A01-4 world-book-1988.pdf - 150 scan English(main) + other
A02-1 ISSA1949.pdf textual 300 scan English
A02-2 ISSA1981.pdf textual 600 scan English
A02-3 USSSA_Replacement rates pensions 1965-75.pdf
numeric 200 scan English
A02-4 USSSA_SSC Expenditures_sel.countries_1990.pdf
numeric 300 scan English
A02-5 USSSA_World Trends SSB 1935-55.pdf
numeric 300 scan English
A02-6 USSSA_World Trends SSC 1976.pdf
numeric 200 scan English
A04-1 Grafiken.pdf numeric 200 scan German
A04-2 Foto.jpg numeric Handheld photo (blur, skew)
English
A04-3 Seiten aus Reichstagspro-tokolle _1882-83 kopie-4.pdf
- 72 scan German Fraktur
A04-4 The-Logic-2_Prezworski.pdf - 300 scan English
A04-5 trendshealthinsur-ance1968_2015.pdf
numeric 300 Text-based pdf English
A06-1 Auto_Outline of Foreign Social Insurance.pdf
textual 200 scan English
A06-2 Outline of Foreign Social In-surance_1940.pdf
textual 200 scan English
A06-3 Pre_Prim_Geneva_1961.pdf - 200 scan English
The provided PDF files include noisy images, blurred images, and multilingual text images. As it can be seen, many files contain tables. Parts of them contain only textual information, others contain numeric data. The majority of the examples are in English, with the exception for a few documents in German and Spanish.
The next table gives an overview about the output formats that were chosen. Plain text focuses on extracting only the characters, whereas hOCR encodes not only the text, but also style properties such as font family, bold or italic, font size, layout information, recognition confidence metrics and other information using Extensible Markup Language (XML) in the form of Hypertext Markup Language (HTML) or XHTML. Tables try to reproduce the table layout and the corresponding cell content.
Table 3. Recommended outputs for each data file
Text hOCR Tables
A01-1 world-book-1928.pdf + + -
A01-2 world-book-1950.pdf + + -
A01-3 world-book-1968.pdf + + -
A01-4 world-book-1988.pdf + + -
A02-1 ISSA1949.pdf + - +
[10]
Text hOCR Tables
A02-2 ISSA1981.pdf + - +
A02-3 USSSA_Relplacement rates pensions 1965-75.pdf + - +
A02-4 USSSA_SSC Expenditures_sel.countries_1990.pdf + - +
A02-5 USSSA_World Trends SSB 1935-55.pdf + - +
A02-6 USSSA_World Trends SSC 1976.pdf + - +
A04-1 Grafiken.pdf + - +
A04-2 Foto.jpg - - +
A04-3 Seiten aus Reichstagsprotokolle _1882-83 kopie-4.pdf + + -
A04-4 The-Logic-2_Prezworski.pdf + - -
A04-5 trendshealthinsurance1968_2015.pdf - - +
A06-1 Auto_Outline of Foreign Social Insurance.pdf + - +
A06-2 Outline of Foreign Social Insurance_1940.pdf + - +
A06-3 Pre_Prim_Geneva_1961.pdf + + -
3.1. Preprocessing
We did not perform preprocessing of the provided data (such as layout optimization, re-moval of dark edges, and deskewing of scans), unless the package tools support them (for example ABBYY FineReader automatically preprocesses files).
Note: For those interested in building their own workflow, the unpaper package is an open source post-processing tool for scanned sheets of paper, especially for book pages that have been scanned from previously created photocopies. The main purpose is to make scanned book pages better read-able on screen after conversion to PDF. Additionally, unpaper might be useful to enhance the quality of scanned pages before performing optical character recognition (OCR). Simple deskewing may also be performed using the convert package.
3.2. Output files
All files are stored in subfolders named after each project and are available upon request. In each folder, the naming convention of the files is as follows:
» The original files (.pdf) shared by each project are listed “as is”. » For each file there may be several processed files with the following extensions: .txt
(for plain text), .xlsx (for table extraction), and/or .html (for hOCR). If the output for one file consists of multiple files, they were added to a subfolder, and the folder’s name indicates the output type.
» To distinguish which tool produced the output, the following prefixes have been added to the original file names: › ABBY – ABBYY FineReader › TESS – Tesseract › ADOBE – Adobe Acrobat › PDF_OCR_X – PDF OCR X Community Edition › TABULA – Tabula
[11]SFB 1342 WeSIS – Technical Papers No. 7
4.4.COmpaRatIve analysIsCOmpaRatIve analysIs
We further analyzed and compared the accuracy and reliability of the Google Drive OCR, Tesseract, ABBYY FineReader, and PDF OCR X using the dataset described in Table 2.
We have divided the experiments into three groups based on the output format:
1. Plain text (TXT, Searchable PDF)2. hOCR (HTML, RTF)3. Table extraction (CSV, XLS, XLSX)
Plain text focuses on extracting only the characters, whereas hOCR encodes not only the text, but also style properties, such as font family, whether the character is bold, italic, font size, layout information, recognition confidence metrics and other information using Extensible Markup Language (XML) in the form of Hypertext Markup Language (HTML) or XHTML. In table extraction we are interested not only the extraction of characters, but also in the extrac-tion of the table structure.
In the table below it is shown which type of output format is supported by each tool.
Table 4. Supported output format by considered packages
Plain text hOCR Tables
Tesseract X X -
ABBYY X X X
Google Drive OCR X - -
Adobe - - X
PDF OCR X CE X - -
Tabula - - X
4.1. Plain text extraction
In this section we will summarize the findings described in the Plain_text_comparison.docx file. The file contains a detailed comparison of the OCR outcomes for each of the selected files in the dataset.
Text extraction. Adobe, Tesseract and PDF OCR X require some post-processing, including deleting unnecessary line splits and merging words with dashes. This is not an issue, since it can be easily fixed with simple scripts in post-processing. On the other hand, Google OCR and ABBYY do not require such post-processing, though in the case of Google, the words which for example contained dashes are not properly merged, the dashes are replaced with spaces (con-solidation becomes con solidation), making it harder to fix such errors. Such limitations are neglectable if one only needs a searchable PDF as output, not a plain .txt document.
Being the most flexible open-source and free tool, Tesseract has been more closely exam-ined. It was found though, that the accuracy was quite low and the most common mistakes are recognizing short words as digits (for example, is – 18 or 15, in – 1n, Minister – IVIinister).
When analyzing the plain text extraction performance, the software tools can be ranked as follows: ABBYY, Google Drive OCR, Adobe, PDF OCR X, Tesseract. For specific tasks/scripts/settings this ranking may slightly change.
[12]
Multilingual files support. Another complication is the ability to select the language of the document and dealing with multi-lingual documents.
Multilingual support is provided only by ABBYY and Tesseract. In Adobe and PDF OCR X only one main language is supported. Google Drive OCR automatically selects the lan-guage of the document, so it is unclear, whether it supports multilingual documents. Judging by the outcome for Spanish documents, where the words with accents were not properly recognized, we conclude that only one main language is supported. No multilingual support poses a problem for documents where important information is present in several languages.
Structured layout. Documents containing more complex layouts, for example documents containing tables or multicolumn layouts appeared to be problematic for different tools with Google Drive OCR struggling the most. Though Google Drive OCR was able to extract text segments and properly capture a two-column layout, it outputs a document in which each paragraph was repeated for 2-4 times. This seems to be a bug. Other tools have performed better, with subtle differences in the order of the extracted text segments.
Examples A02-1, A02-2, A02-5, A02-6 represent documents which consist of tables containing textual data. Such structure poses a huge problem for OCR tools in general. Adobe, PDF OCR X and Google Drive OCR approach the document line-by-line; thus, the structure of the table is unrecognized and the results are non-informative and useless. On the other hand, Tesseract and ABBYY are able to extract the structure and retrieve the data from segments, unlike the line-by-line approach. Documents containing tables with textual data will be analyzed in more detail in section 4.3.
Historic fonts. Another challenge faced by the OCR tools was historic font recognition, such as Fraktum or Gothic fonts used in old German documents. Only Google Drive OCR was able to recognize the characters at an adequate level. The second best is PDF OCR X which recognized the font, but with many errors. ABBYY has an extension4 (not included in the version we have used for testing) which supports historic fonts, and is widely used for this purpose. Adobe and Tesseract completely failed the task, probably due to the models not being trained on such characters.
Fractions extraction. Even though fractions are not common for the majority of the files, it might be useful for some of the projects. In our dataset the file “A04-1 -Grafiken.pdf” con-tains a table with fractions. Looking at the .xlsx output files using ABBYY and Adobe it can be clearly seen that only Adobe was able to capture the fractions properly. Tesseract has also failed to recognize them, and PDF OCR completely ignores tables, so no results are available.
4.2. hOCR extraction
Within the task of hOCR, we would like to analyze the extraction of not only textual informa-tion but also style and layout information. Such information might be useful for further pro-cessing of the data, for example splitting the document by country using the country name or other keywords. Splitting just by the country name may result in unnecessary splits if the country is mentioned somewhere else in the text. Splitting by font size or style (if the headers are visibly different, i.e. a greater font size or bold) may also reduce or completely eliminate unwanted splits. Another example would be simply saving the existing segmentation which
4 https://www.frakturschrift.com/en:start
[13]SFB 1342 WeSIS – Technical Papers No. 7
is lost when converting to plain text files. The next table gives an overview for the documents selected for hOCR.
For hOCR, we considered ABBYY, Adobe and Tesseract, since only they support hOCR.
Table 5. Selected documents for plain text extraction
Comments
A01-1 world-book-1928.pdf -
A01-2 world-book-1950.pdf -
A01-3 world-book-1968.pdf Handwriting or stamp on page messes up output
A01-4 world-book-1988.pdf -
A06-3 Pre_Prim_Geneva_1961.pdf -
Capturing fonts styles. Even though Adobe is better at capturing the font styles, Tesseract is better at extracting the paragraphs and layout structure, and quality of the texts per se is bet-ter. Yet, ABBYY performs best in terms of both capturing font styles and text quality.
Capturing layout segments ordering. Sometimes the ordering of the text segments is mixed up, for example a table will be added lastly to the page, not in the order as it was in the original text. In Adobe and Tesseract the user cannot influence the ordering, but in ABBYY the user can assign the order of each text fragment for every page of the input file using the user interface.
4.3. Table extraction
The difference in the quality of the text extracted by the OCR tools has been discussed in section 4.1. Here, we will mainly focus on the ability of the considered tools to capture the structure of tables. We considered ABBYY and Adobe, since only they support the extraction of tables. Each tool supports the following outputs:
» the whole file as a single xls file; » each page as a separate xls file; » each table as a separate xls file.
To minimize the number of generated files we have selected only the first two available options.In Table 6. it is shown whether the tool managed to find the tabular structure and properly
extract it. It can be seen that in general ABBBYY is able to extract the tables in all the con-sidered examples, even in challenging files such as “A04-2 Foto.jpg”, where the file itself represents a handheld photo of a skewed page with blurry edges.
Another important point to be highlighted is that when using Adobe, the user cannot tune the output. Thus, if the software fails to capture the structure properly nothing can be done. This limitation is overcome in ABBYY. Even though ABBYY can detect rows automatically by re-lying on black separators and white gaps, the interface also allows the user to edit the extracted structure and specify how the table should be divided into rows, for example by adding new separators or merging existing cells. The flip side of this flexibility is the manual adjustment it requires for every table. Depending on the number of tables to be extracted it then becomes a question of weighing the available resources against the efforts the OCR requires.
A final tool we mentioned in the overview is Tabula. Tabula only works on text-based PDFs, not scanned documents. Thus, it would work for examples like A04-5, but not for the rest of the examples unless they have already been OCRed. The quality of the extracted
[14]
structures for such documents is limited though and comparable to the output produced by Adobe. The next table summarizes our result whereby “+” means that the tool has achieved meaningful results for the document, while “-“ means that the outcome was of poor quality.
Table 6. Performance of selected tools for table extraction by file
ABBYY ADOBE
A02-1 ISSA1949.pdf + -
A02-2 ISSA1981.pdf + -
A02-3 USSSA_Relplacement rates pensions 1965-75.pdf + +
A02-4 USSSA_SSC Expenditures_sel.countries_1990.pdf + +
A02-5 USSSA_World Trends SSB 1935-55.pdf + +
A02-6 USSSA_World Trends SSC 1976.pdf + -
A04-1 Grafiken.pdf + +
A04-2 Foto.jpg + -
A04-5 trendshealthinsurance1968_2015.pdf + +
A06-1 Auto_Outline of Foreign Social Insurance.pdf + -
A06-2 Outline of Foreign Social Insurance_1940.pdf + -
5.5.COnClusIOnCOnClusIOn
We performed a qualitative comparative analysis of four OCR solutions, including Google Docs OCR, Tesseract, ABBYY FineReader, and PDF OCR X using a dataset containing typical cases of interest of the workshop participants.
Based on our experimental evaluations using the mentioned dataset without employing advanced image processing procedures (e.g. denoising, image registration), the Google Docs OCR and ABBYY FineReader produced the most promising results in plain text OCR. Though one of the main limitations of Google Drive OCR is the 2Mb file size limit.
In the task of hOCR, ABBYY is considered the best solution in terms of text and layout quality, and it is more sensitive to the font styles than Adobe or Tesseract. The second-best solution is Adobe Acrobat: though the text quality is lower, the extracted layout and styles are close to the results produced by ABBYY. Neither Adobe Acrobat nor Tesseract have the option to set the order of the extracted text fragments of pages manually.
Looking at the results of table extraction, it can be seen that the accuracy of ABBYY is higher than that of Adobe. Most importantly, ABBYY supports a user-interface which allows to easily correct the automatically extracted table structure before outputting the result. This allows more flexibility when dealing with challenging files, though at the cost of preventing batch-processing a large number of tables at once.
In sum, the performance of OCR solutions depends on many factors ranging from the quality of the source material to the objective of OCR. Thus said, there is no “one fits all” solution. Even the most advanced software packages may not be able to perform well in different tasks and settings.
Looking at the results it can be seen that the ABBYY FineReader either outperforms other solutions or provides more flexibility to tune the results to the user’s needs. Though it is quite advanced it is one of the most expensive software solutions available. Its quality is underlined by the fact that it is also used by the Staats- und Universitätsbibliothek Bremen. Thus, in case
[15]SFB 1342 WeSIS – Technical Papers No. 7
projects decide to choose ABBYY for their work, they may contact and discuss their task with the library.
If the quality provided by Adobe Acrobat is sufficient, the CRC projects should be able to use it as CRC members should already have a copy installed on their working machines.
When comparing commercial products to open-source solutions it can be seen that there is still a gap in text recognition and flexibility. If the projects decide to proceed with Tesseract, A01 may help in establishing workflows based on short scripts for R or Python and guidelines to simplify the OCR processing and preprocessing of images or recommend already existing user interfaces.
6.6. pRepROCessIng COnsIdeRatIOns pRepROCessIng COnsIdeRatIOns
As we have seen in the experiments, the quality of input images has a great impact on the OCR outputs. All of the examined OCR tools have performed worse when dealing with skewed, blurred, noisy images and texts with fonts smaller than 10pt. Sharp images with even lighting and clear contrasts work best. A resolution of 300 DPI is also recommended. Hence, consider the two most important factors affecting the accuracy of OCR:
» Textual Considerations Special fonts (typewriter, fraktum), super small fonts (6pt), and low contrast text can all decrease the accuracy of the OCR software. Sometimes, OCR software will not be helpful to use at all.
» Scanning Considerations Getting a quality image is the first step in having the best and most accurate OCR experience. Consider such things as resolution, brightness, straightness, and discol-oration before you scan your text. Rather refrain from performing OCR on handheld photos and low quality images.
Remember that software packages may boast between 97% and 99% accuracy. However, these rates are based on character errors, not word errors!
7.7.OtheR tOOls and lInksOtheR tOOls and lInks
Europeana Newspapers - http://www.europeana-newspapers.eu/public-materials/tools/ - The Europeana Newspapers project has developed a number of free and open source software tools, which help digitize historical newspapers. Projects interested in the OCR of old documents and manual document segmentation should take a look at the tools and interfaces they offer.
Online OCR - https://ocr.space/ - The OCR.space Online OCR service converts scans or (smartphone) images of text documents into editable ones. They claim that the recognition quality is comparable to commercial OCR SDK software. You can test some files online for free, though file size limitations may apply. The engine supports creating searchable PDF, plain text files and json files. Via testing we found out that it struggles with files that have multiple columns and contain tables at the same time. In this case the tool analyses the file line by line, creating useless content.
[16]
OCR examplesOCR examplesFi
le A
06 -
Pre_
Prim
_Gen
eva_
1961
Ad
obe
TESS
PD
F OC
R X
ABBY
Y Go
ogle
Driv
e OC
R Th
is fil
e ex
ampl
e co
ntai
ns o
nly
text
. Th
e pa
ges a
re b
adly
crop
ped,
thus
the
last
lin
e is
som
etim
es ch
oppe
d in
hal
f or
miss
ing.
It
can
be se
en th
at T
ESS
and
OCR
X re
quire
som
e po
st-p
roce
ssin
g, in
cludi
ng
dele
ting
unne
cess
ary
line
split
s and
mer
ging
w
ords
with
das
hes.
On th
e ot
her A
dobe
, Go
ogle
OCR
and
ABB
YY d
o no
t req
uire
such
po
st-p
roce
ssin
g.
In th
e ot
her 1
7 co
untr
ies t
here
are
eith
er
only
pub
lic p
re-p
rimar
y es
tab-
lishm
ents
(in
10
coun
trie
s, na
mel
y Al
bani
a,
Bulg
aria
, Bye
loru
ssia
, Cze
cho-
slova
kia,
Hu
ngar
y, P
olan
d, R
uman
ia, U
krai
ne, U
SSR
and
Yugo
slavi
a) o
r onl
y pr
ivat
e es
tabl
ishm
ents
(in
7 co
untr
ies,
nam
ely
Ceyl
on, I
cela
nd, L
eban
on, F
eder
atio
n of
M
alay
a, N
icara
gua,
Tur
key,
Uni
on o
f Bu
rma)
. Inc
iden
tally
, in
Aust
ralia
and
Ca
nada
alm
ost a
ll, a
nd in
New
Zea
land
all
the
esta
blish
-men
ts fo
r chi
ldre
n un
der 5
ye
ars o
f age
bel
ong
to th
e pr
ivat
e ca
tego
ry.
As re
gard
s the
diff
eren
t kin
ds o
f es
tabl
ishm
ent,
it is
mor
e di
fficu
lt to
cla
ssify
the
coun
trie
s sin
ce m
ost o
f the
m
have
diff
eren
t typ
es o
f ins
titut
ion
vary
ing
in n
ame
but w
hich
are
muc
h th
e sa
me
in
all t
he co
untr
ies.
If on
ly e
duca
tiona
l es
tabl
ishm
ents
be
cons
ider
ed (t
hat i
s, ex
cludi
ng cr
eche
s, da
y nu
rser
ies a
nd th
e ho
mes
for c
hild
ren)
, the
mos
t usu
al
term
s em
ploy
ed a
re k
inde
rgar
tens
and
nu
rser
y sc
hool
s. In
a fe
w ca
ses t
he te
rms
infa
nt sc
hool
s or i
nfan
t cla
sses
are
ad
opte
d.
Of th
e 52
coun
trie
s whi
ch sp
eak
of
kind
erga
rten
s in
thei
r rep
lies,
two-
th
irds a
pply
this
term
to a
ll th
eir p
re-
prim
ary
esta
blish
men
ts. T
he o
ther
r.0
11nt
riPc:
11
111
lv m
lrP
rlic:
:.tln
rtiA
n hP
tu,P
Pn th
P lri
nrlP
ro!lr
tPn
-:ln
rl th
P n1
1rP
r"
In th
e ot
her 1
7 co
untr
ies t
here
are
eith
er
only
pub
lic p
re-p
rimar
y es
tab-
lis
hmen
ts (i
n 10
coun
trie
s, na
mel
y Al
bani
a, B
ulga
ria, B
yelo
russ
ia, C
zech
o-
slova
kia,
Hun
gary
, Pol
and,
Rum
ania
, Uk
rain
e, U
SSR
and
Yugo
slavi
a) o
r on
ly p
rivat
e es
tabl
ishm
ents
(in
7 co
untr
ies,
nam
ely
Ceyl
on, I
cela
nd,
Leba
non,
Fe
dera
tion
of M
alay
a, N
icara
gua,
Tur
key,
Un
ion
of B
urm
a). I
ncid
enta
lly,
in A
ustr
alia
and
Can
ada
alm
ost a
ll, a
nd in
Ne
w Z
eala
nd a
ll th
e es
tabl
ish-
men
ts fo
r chi
ldre
n un
der 5
yea
rs o
f age
be
long
to th
e pr
ivat
e ca
tego
ry.
As re
gard
s the
diff
eren
t kin
ds o
f es
tabl
ishm
ent,
it is
mor
e di
flicu
lt to
cla
ssify
the
coun
trie
s sin
ce m
ost o
f the
m
have
diff
eren
t typ
es o
f ins
titut
ion
vary
ing
in n
ame
but w
hich
are
muc
h th
e sa
me
in a
ll th
e co
untr
ies.
If on
ly'
educ
atio
nal e
stab
lishm
ents
be
cons
ider
ed (t
hat i
s, ex
cludi
ng cr
éche
s, da
y nu
rser
ies a
nd th
e ho
mes
for c
hild
ren)
, th
e m
ost u
sual
term
s em
ploy
ed a
re
kind
erga
rten
s and
nur
sery
scho
ols.
In a
fe
w ca
ses t
he te
rms i
nfan
t sch
ools
or in
fant
clas
ses a
re a
dopt
ed.
Of th
e 52
coun
trie
s whi
ch sp
eak
of
kind
erga
rten
s in
thei
r rep
lies,
two-
th
irds a
pply
this
term
to a
ll th
eir p
re-
prim
ary
esta
blish
men
ts. T
he o
ther
(‘n
nnh‘
iPc n
mm
llv m
ain:
2 d
ictin
r‘fin
n lw
hvr-m
n H1
9 lzi
nrlp
rnar
fpn
and
flu:
nnrc
m-v
In th
e ot
her 1
7 co
untr
ies t
here
are
ei
ther
onl
y pu
blic
pre-
prim
ary
esta
b-
lishm
ents
(in
10 co
untr
ies,
nam
ely
Alba
nia,
Bul
garia
, Bye
loru
ssia
, Cz
echo
- slo
vaki
a, H
unga
ry, P
olan
d, R
uman
ia,
Ukra
ine,
USS
R an
d Yu
gosla
via)
or
only
priv
ate
esta
blish
men
ts (i
n 7
coun
trie
s, na
mel
y Ce
ylon
, Ice
land
, Le
bano
n,
Fede
ratio
n of
Mal
aya,
Nica
ragu
a,
Turk
ey, U
nion
of B
urm
a).
Incid
enta
lly,
in A
ustr
alia
and
Can
ada
alm
ost a
ll,
and
in N
ew Z
eala
nd a
ll th
e es
tabl
ish-
men
ts fo
r chi
ldre
n un
der 5
yea
rs o
f ag
e be
long
to th
e pr
ivat
e ca
tego
ry.
As re
gard
s the
diff
eren
t kin
ds o
f es
tabl
ishm
ent,
it is
mor
e di
fficu
lt to
cla
ssify
the
coun
trie
s sin
ce m
ost o
f th
em h
ave
diffe
rent
type
s of
inst
itutio
n va
ryin
g in
nam
e bu
t whi
ch a
re m
uch
the
sam
e in
all
the
coun
trie
s. If
only
ed
ucat
iona
l est
ablis
hmen
ts b
e co
nsid
ered
(tha
t is,
exclu
ding
cr
éche
s, da
y nu
rser
ies a
nd th
e ho
mes
for
child
ren)
, the
mos
t usu
al te
rms
empl
oyed
are
ki
nder
gart
ens a
nd n
urse
ry sc
hool
s. In
a
few
case
s the
term
s inf
ant s
choo
ls or
infa
nt cl
asse
s are
ado
pted
. Of
the
52 co
untr
ies w
hich
spea
k of
ki
nder
gart
ens i
n th
eir r
eplie
s, tw
o-
third
s app
ly th
is te
rm to
all
thei
r pre
-pr
imar
y es
tabl
ishm
ents
. The
oth
er
coun
trie
s nen
ally
mak
e a
dict
inet
inn
hetw
een
the
Linde
roar
ten
and
the
nurc
erw
In th
e ot
her 1
7 co
untr
ies t
here
are
ei
ther
onl
y pu
blic
pre-
prim
ary
esta
blish
men
ts (i
n 10
coun
trie
s, na
mel
y Al
bani
a, B
ulga
ria,
Byel
orus
sia, C
zech
oslo
vaki
a,
Hung
ary,
Pol
and,
Rum
ania
, Ukr
aine
, US
SR a
nd Y
ugos
lavi
a) o
r onl
y pr
ivat
e es
tabl
ishm
ents
(in
7 co
untr
ies,
nam
ely
Ceyl
on, I
cela
nd, L
eban
on,
Fede
ratio
n of
Mal
aya,
Nica
ragu
a,
Turk
ey, U
nion
of B
urm
a).
Incid
enta
lly, i
n Au
stra
lia a
nd C
anad
a al
mos
t all,
and
in N
ew Z
eala
nd a
ll th
e es
tabl
ishm
ents
for c
hild
ren
unde
r 5
year
s of a
ge b
elon
g to
the
priv
ate
cate
gory
. As
rega
rds t
he d
iffer
ent k
inds
of
esta
blish
men
t, it
is m
ore
diffi
cult
to
class
ify th
e co
untr
ies s
ince
mos
t of
them
hav
e di
ffere
nt ty
pes o
f in
stitu
tion
vary
ing
in n
ame
but w
hich
ar
e m
uch
the
sam
e in
all
the
coun
trie
s. If
only
edu
catio
nal
esta
blish
men
ts b
e co
nsid
ered
(tha
t is,
exc
ludi
ng cr
eche
s, da
y nu
rser
ies
and
the
hom
es fo
r chi
ldre
n), t
he
mos
t usu
al te
rms e
mpl
oyed
are
ki
nder
gart
ens a
nd n
urse
ry sc
hool
s. In
a
few
case
s the
term
s inf
ant s
choo
ls or
infa
nt cl
asse
s are
ado
pted
. Of
the
52 co
untr
ies w
hich
spea
k of
ki
nder
gart
ens i
n th
eir r
eplie
s, tw
o-th
irds a
pply
this
term
to a
ll th
eir p
re-
prim
ary
esta
blish
men
ts. T
he o
ther
m
nntr
ipc l
isnal
lv m
at?
n rli
ctin
r-tin
n V\
pt\x
7f»F
»r>
tin*
Vinr
lf»rr
»art
£>n
or>A
tlif>
tinr
cpm
In th
e ot
her 1
7 co
untr
ies t
here
are
ei
ther
onl
y pu
blic
pre-
prim
ary
esta
b lis
hmen
ts (i
n 10
coun
trie
s, na
mel
y Al
bani
a, B
ulga
ria, B
yelo
russ
ia, C
zech
o slo
vaki
a, H
unga
ry, P
olan
d, R
uman
ia,
Ukra
ine,
USS
R an
d Yu
gosla
via)
or
only
priv
ate
esta
blish
men
ts (i
n 7
coun
trie
s, na
mel
y Ce
ylon
, Ice
land
, Le
bano
n, F
eder
atio
n of
Mal
aya,
Ni
cara
gua,
Tur
key,
Uni
on o
f Bur
ma)
. In
ciden
tally
, in
Aust
ralia
and
Can
ada
alm
ost a
ll, a
nd in
New
Zea
land
all
the
esta
blish
men
ts fo
r chi
ldre
n un
der 5
ye
ars o
f age
bel
ong
to th
e pr
ivat
e ca
tego
ry.
As re
gard
s the
diff
eren
t kin
ds o
f es
tabl
ishm
ent,
it is
mor
e di
fficu
lt to
cla
ssify
the
coun
trie
s sin
ce m
ost o
f th
em h
ave
diffe
rent
type
s of
inst
itutio
n va
ryin
g in
nam
e bu
t whi
ch
are
muc
h th
e sa
me
in a
ll th
e co
untr
ies.
If on
ly e
duca
tiona
l es
tabl
ishm
ents
be
cons
ider
ed th
at is
, ex
cludi
ng cr
èche
s, da
y nu
rser
ies a
nd
the
hom
es fo
r chi
ldre
n), t
he m
ost
usua
l ter
ms e
mpl
oyed
are
ki
nder
gart
ens a
nd n
urse
ry sc
hool
s. In
a
few
case
s the
term
s inf
ant s
choo
ls or
infa
nt cl
asse
s are
ado
pted
. Of
the
52 co
untr
ies w
hich
spea
k of
ki
nder
gart
ens i
n th
eir r
eplie
s, tw
o th
irds a
pply
this
term
to a
ll th
eir p
re-
prim
ary
esta
blish
men
ts. T
he o
ther
co
untr
ies u
sual
ly m
ake
a di
stin
ctio
n be
twee
n th
e ki
nder
gart
en a
nd th
e nu
rser
y
[17]SFB 1342 WeSIS – Technical Papers No. 7
File
A02
-
Ado
be
TESS
PD
F O
CR X
A
BBYY
G
oogl
e D
rive
OCR
Th
is r
epre
sent
s a
tabl
e co
ntai
ning
text
ual
data
, thu
s po
ses
a hu
ge p
robl
em to
OCR
to
ols.
It c
an b
e se
en th
at th
e A
dobe
, PD
F O
CR X
and
Goo
gle
appr
oach
the
docu
men
t lin
e by
line
, thu
s th
e st
ruct
ure
of th
e ta
ble
is u
nrec
ogni
zed
and
the
resu
lts
are
non-
info
rmat
ive
and
usel
ess.
On
the
othe
r ha
nd, T
esse
ract
and
ABB
YY a
re a
ble
to
extr
act t
he s
truc
ture
and
ret
riev
e th
e da
ta
from
cel
l, un
like
the
line
by li
ne a
ppro
ach.
Th
ough
the
orde
r of
the
cells
may
not
be
corr
ect,
thou
gh in
the
case
of A
BBYY
this
ca
n be
man
ually
adj
uste
d, if
nec
essa
ry, a
s w
ell a
s ad
just
ing
the
tabl
e st
ruct
ure(
i.e.
addi
ng/r
emov
ing
colu
mn/
row
spl
itter
s,
mer
ging
/spl
ittin
g ce
lls)
It c
an b
e se
en th
at T
ESS
and
OCR
X
requ
ire
som
e po
st-p
roce
ssin
g, in
clud
ing
dele
ting
unne
cess
ary
line
split
s an
d m
ergi
ng w
ords
with
das
hes.
On
the
othe
r A
dobe
, Goo
gle
OCR
and
ABB
YY d
o no
t re
quir
e su
ch p
ost-
proc
essi
ng.
Bene
fit A
mou
nt Q
ualif
ying
con
ditio
ns
Adm
inis
trat
ion
Dur
atio
n Pa
rtia
l un
empl
oym
ent Q
ualif
ying
per
iod
Wai
ting
perio
d D
isqu
alifi
catio
ns
Bene
fits
for t
otal
une
m- N
one
spec
ified
___
··-··
·-·-··
Non
e re
quir
ed
for
wor
ker
Non
e. N
o be
nefit
is p
aid
Tem
pora
ry: R
efus
al o
f Min
iste
r of
Labo
r an
d pl
oym
ent a
re p
aid
on a
cu
rren
tly c
over
ed in
for
1 da
y's
unem
ploy
- sui
tabl
e w
o r
k, 4
-13
Soci
al
Wel
fare
, gen
eral
dai
ly b
asis
. gen
eral
so
cial
sec
urity
men
t in
wee
k, u
nles
s it
wee
ks (l
onge
r if
re- s
uper
visi
on.
Nat
iona
l Onl
y fu
ll da
ys•
of u
nem
- pr
ogra
m. i
s th
e fir
st o
r la
st d
ay
peat
ed);
volu
ntar
y qu
it- S
ocia
l Sec
urity
O
ffic
e, p
loym
ent a
re n
orm
ally
Per
sons
no
t cur
rent
ly c
on- o
f a 3
-day
per
iod
of
un- t
ing
with
out g
ood
rea-
pub
lic-la
w
agen
cy in
con
side
red
days
of u
n-
trib
utin
g ar
e en
title
d to
em
ploy
men
t,
eon,
4-1
3 w
eeks
(lon
ger M
inis
try,
co
llect
s co
nem
ploy
men
t, b
ut h
alf-
be
nefit
if a
ble
to w
ork
if re
peat
ed);
in
com
- tri
butio
ns fo
r al
l pro
days
can
be
adde
d to
and
if th
ey w
ere
mem
- ple
te,
in c
orre
ct,
or g
ram
s an
d di
sbur
ses
form
full
days
if p
art o
f her
s of
an
unem
ploy
- lat
e de
clar
atio
n by
in•
them
to n
atio
nal a
da fi
xed
patt
ern
of
un- m
ent f
und
betw
een
Jan-
sur
ed, 1
-1
3 w
eeks
min
istr
ativ
e bo
dies
. em
ploy
men
t due
"to
in- u
ary
1, 1
938,
an
d M
ay (l
onge
r if r
epea
ted)
; ob-
Fun
d fo
r Mai
nten
ance
term
itten
t wor
k or
ro
ta- 1
0, 1
940,
or w
ho p
aid
at ta
inin
g fa
lse
stam
ping
of t
he U
nem
ploy
ed,
tion
of w
ork�
Bene
fits
for
tota
l une
m—
pl
oyrn
ent a
re p
aid
on a
da
ily b
asis
. ‘
Onl
y fu
ll da
ys/o
f une
m-
ploy
men
t are
nor
mal
ly
cons
ider
ed d
ays
of u
n-
empl
oym
ent,
but
hal
f-
tion
of w
ork;
For
por
t—
wor
kers
, hal
f—da
ys m
ay
be c
onsi
dere
d fo
r be
nefit
. W
hen
part
ial \
mem
ploy
- m
ent o
f 1 d
ay a
wee
k oc
curs
, thi
s is
con
side
red
a da
y of
une
mpl
oym
ent
whe
n ac
com
pani
ed b
y at
leas
t 2 a
dditi
onal
da
ys in
sam
e w
eek.
Sp
ecia
l reg
ulat
ions
ove
rn
seas
onal
and
, om
e w
orke
rs. T
he la
tter
m
ust h
ave
‘1 fu
ll w
eek
of _
une
mpl
oym
ent
obta
in b
enefi
t. ‘
Bene
fit
Am
ount
Qua
lifyi
ng c
ondi
tions
A
dmin
istr
atio
n oo
Dur
atio
n te
nt P
artia
l une
mpl
oym
ent
Qua
lifyi
ng p
erio
d W
aitin
g pe
riod
D
isqu
alifi
catio
ns
BS a
s Be
nefit
s fo
r to
tel u
nem
-|N
one
spec
ified
.__.
_....
. Non
e re
quir
ed fo
r w
orke
r|N
one.
No
bene
fit is
pa
id|T
empo
rary
: Ref
usal
of|
Min
iste
r of
Lab
or -a
nd
r In
is- p
loym
ent a
re p
aid
on a
cu
rren
tly c
over
ed in
| fo
r 1
day’
s un
empl
oy-|
sui
tabl
e w
ork,
4-1
3]
Soci
al W
elfa
re, g
ener
al
ais
daily
bas
is: :
gen
eral
soc
ial
secu
rity
] men
t in
wee
k, u
nles
s it}
w
eeks
“(lo
nger
if .
re-|
sup
ervi
sion
. N
atio
nal
“al y
| O
nly
-ful
l -da
ys:o
f une
m-|
pr
ogra
m. i
s th
e fir
st o
r la
st d
ay|
peat
ed);
volu
ntar
y qu
it-|
Soci
al
Secu
rity
Off
ice,
P
Ow
-| p
loym
ent a
re n
orm
ally
, Pe
rson
s no
t cur
rent
ly c
on-|
of a
3-
day
peri
od o
f un-
|. ti
ng w
ithou
t goo
d re
a-|
publ
ic-la
w a
genc
y in
- c
onsi
dere
d :d
ays_
of u
n- .
trib
utin
g ar
e en
title
d to
] em
ploy
men
t. s
on, 4
-13
wee
ks: (
long
er in
istr
y, c
olle
cts
con-
it:
2 e
mpl
oym
ent,
.-bu
t®-h
aif-
| be
nefit
if a
ble
to w
ork’
if
repe
ated
);.in
com
-| tr
ibut
ions
for
‘all
pro-
da
ys“c
an -b
e ad
ded
to a
nd ‘i
f the
y w
ere
mem
i- pl
ete,
inoo
rrec
t, o
r).
gram
s ©
and
« d
isbu
rses
au
ary
form
fu d
ays
if Pa
rt o
f] b
ers
of
an u
nem
ploy
- ate
dec
lara
tion
by in
- th
em to
nat
iona
l ad-
ns
uran
ce, J
anua
ry 1
949-
Cont
inue
d
Bene
fit
Am
ount
Qua
lifyi
ng c
ondi
tions
A
dmin
istr
atio
n
Dur
atio
n Pa
rtia
l une
mpl
oym
ent
Qua
lifyi
ng p
erio
d W
aitin
g pe
riod
D
isqu
alifi
catio
ns
Bene
fits
for
tota
l une
m- N
one
spec
ified
___
··-··
·-·-··
Non
e re
quir
ed
for
wor
ker
Non
e. N
o be
nefit
is p
aid
Tem
pora
ry: R
efus
al o
f Min
iste
r of
La
bor
and
ploy
men
t are
pai
d on
a
curr
ently
cov
ered
in fo
r 1
day'
s un
empl
oy- s
uita
ble
w o
r k,
4-1
3 So
cial
Wel
fare
, gen
eral
dai
ly b
asis
. ge
nera
l soc
ial s
ecur
ity m
ent i
n w
eek,
un
less
it w
eeks
(lon
ger
if re
- su
perv
isio
n. N
atio
nal O
nly
full
days
• of
une
m- p
rogr
am. i
s th
e fir
st o
r la
st
day
peat
ed);
volu
ntar
y qu
it- S
ocia
l Se
curi
ty O
ffic
e, p
loym
ent a
re
norm
ally
Per
sons
not
cur
rent
ly c
on-
of a
3-d
ay p
erio
d of
un-
ting
with
out
good
rea
- pub
lic-la
w a
genc
y in
co
nsid
ered
day
s of
un-
trib
utin
g ar
e en
title
d to
em
ploy
men
t, e
on, 4
-13
wee
ks (l
onge
r M
inis
try,
col
lect
s co
n-
empl
oym
ent,
but
hal
f- b
enef
it if
abl
e to
wor
k if
repe
ated
); in
com
- tr
ibut
ions
for
all p
ro- d
ays
can
be
adde
d to
and
if th
ey w
ere
mem
- pl
ete,
in c
orre
ct,
o r
gra
ms
and
disb
urse
s fo
rm fu
ll da
ys if
par
t of
hers
of a
n un
empl
oy- l
ate
decl
arat
ion
by in
• th
em to
nat
iona
l ad
- a fi
xed
patt
ern
of u
n- m
ent f
und
betw
een
Jan-
s u
r e
d ,
1-1
3 w
e e
k
s m
inis
trat
ive
bodi
es. e
mpl
oym
ent
due
"to
in- u
ary
1, 1
938,
and
May
(lo
nger
if r
epea
ted)
; ob-
Fun
d fo
r M
aint
enan
ce te
rmitt
ent w
ork
or
rota
- 10,
194
0, o
r w
ho p
aid
at ta
inin
g fa
lse
stam
ping
of t
he U
n em
ploy
ed ,
tion
of w
ork�
For
por
t- le
ast
3 m
onth
s of
con
- of c
ontr
ol c
ard,
etc
., un
der
Min
iste
r, m
ain-
wor
kers
, hal
f-da
ys m
ay tr
ibut
ions
for
old-
age
1-13
w
eeks
(lon
ger
if ta
ins
publ
ic e
mpl
oy-
be c
onsi
dere
d fo
r be
nefit
. pen
sion
s be
twee
n Ja
n- r
epea
ted)
; beg
ging
, 13
men
t off
ices
, con
duct
s
[18]
File
A04
- Gr
afik
en
Adob
e TE
SS
PDF O
CR X
AB
BYY
Goog
le D
rive
OCR
OCR
X ig
nore
d th
e ta
ble
on th
e pa
ge
It ca
n be
seen
that
TES
S an
d PD
F OC
R X
requ
ire so
me
post
-pro
cess
ing,
inclu
ding
de
letin
g un
nece
ssar
y lin
e sp
lits a
nd
mer
ging
wor
ds w
ith d
ashe
s. On
the
othe
r Ad
obe,
Goo
gle
OCR
and
ABBY
Y do
not
re
quire
such
pos
t-pro
cess
ing.
Dass
der
Bun
desk
anzle
r zum
Beg
inn
der
Legi
slatu
rper
iode
ein
e Re
gier
ungs
erkl
ärun
g vo
r dem
Deu
tsch
en
Bund
esta
g ab
gibt
, ist
ein
e se
lbst
vers
tänd
liche
par
lam
enta
risch
e Tr
aditi
on. S
elbs
tver
stän
dlich
e Pr
axis
ist
über
dies
gew
orde
n, d
ass d
ie
Regi
erun
gser
klä-
ru
ng st
ets d
en le
tzte
n Ak
t der
Re
gier
ungs
bild
ung
dars
tellt
. Dem
nac
er
folg
t zun
ächs
t die
Wah
l Ern
ennu
ng u
nd
Vere
idig
ung
des B
unde
skan
zlers
, dan
n di
e Er
nenn
ung
und
V er
eidi
gng
der
Bu
ndes
min
ister
, und
ers
t im
Ans
chlu
ss
dara
n gi
bt d
er B
unde
skan
zler s
eine
ers
te
Regi
erun
gser
klär
ung
ab. B
isher
hab
en si
ch
jede
nfal
ls al
le B
unde
skan
zler a
n di
ese
zeitl
iche
Abfo
lge
geha
lten.
Ke
ine
allg
emei
ne P
raxis
hat
sich
alle
rdin
gs
bezü
glich
der
Fra
ge h
erau
sgeb
ildet
,. in
nerh
alb
wel
cher
Fris
t der
Kan
zler s
eine
Re
gier
ungs
erkl
ärun
g ab
gebe
n so
ll. K
onra
d Ad
enau
er in
form
iert
e de
n Bu
ndes
tag
im
Jahr
194
9 nu
r fün
f Tag
e na
ch se
iner
Wah
l üb
er se
in R
egie
rung
spro
gram
m, w
ähre
nd
sich
Helm
ut K
ohl 1
983
nach
sein
er W
ahl
ganz
e 37
Tag
e Ze
it lie
ß.
Dass
der
Bun
desk
anzle
r zum
Beg
inn
der
Legi
slatu
rper
iode
ein
e Re
gier
ungs
erkl
'a'ru
ng
vor d
em D
euts
chen
Bun
dest
ag a
bgib
t, ist
ei
ne se
lbst
vers
tänd
liche
par
lam
enta
risch
e Tr
aditi
on. S
elbs
tver
stän
dlich
e Pr
axis
ist
über
dies
gew
orde
n, d
ass d
ie
Regi
erun
gser
klä-
ru
ng st
ets d
en le
tzte
n Ak
t der
Re
gier
ungs
bild
ung
dars
tellt
. Dem
nach
er
folg
t zun
ächs
t di
e W
ahl,
Erne
nnun
g un
d Ve
reid
igun
g de
s Bu
ndes
kanz
lers
, dan
n di
e Er
nenn
ung
und
Vere
idig
ung
der B
unde
smin
ister
, und
ers
t im
Ans
chlu
ss d
aran
gib
t der
Bun
desk
anzle
r se
ine
erst
e Re
gier
ungs
erkl
ärun
g ab
. Bish
er
habe
n sic
h je
denf
alls
alle
Bun
desk
anzle
r an
di
ese
zeitl
iche
Abfo
lge
geha
lten.
Ke
ine
allg
emei
ne P
raxis
hat
sich
alle
rdin
gs
bezü
glich
der
Fra
ge h
erau
sgeb
ildet
, in
nerh
alb
wel
cher
Fris
t der
Kan
zler s
eine
Re
gier
ungs
erkl
ärun
g ab
gebe
n so
ll. K
onra
d Ad
enau
er in
form
iert
e de
n Bu
ndes
tag
im
Jahr
194
9 nu
r fün
f Tag
e na
ch se
iner
Wah
l üb
er
sein
Reg
ieru
ngsp
rogr
amm
, wäh
rend
sich
He
lmut
Koh
l 198
3 na
ch se
iner
Wah
l gan
ze
37
Tage
Zei
t lie
ß.
Dass
der
Bun
desk
anzle
r zum
Beg
inn
der L
egisl
atur
perio
de e
ine
Regi
erun
gser
klär
ung
vor d
em D
euts
chen
Bun
dest
ag
abgi
bt, i
st e
ine
selb
stve
rstä
ndlic
he
parla
men
taris
che
Trad
ition
. Sel
bstv
erst
ändl
iche
Prax
is ist
übe
rdie
s gew
orde
n, d
ass d
ie
Regi
erun
gser
klä-
ru
ng st
ets d
en le
tzte
n Ak
t der
Re
gier
ungs
bild
ung
dars
tellt
. De
mna
ch e
rfolg
t zun
ächs
t di
e W
ahl,
Erne
nnun
g un
d Ve
reid
igun
g de
s Bun
desk
anzle
rs,
dann
die
Ern
ennu
ng u
nd
Vere
idig
ung
der B
unde
smin
ister
, und
er
st im
Ans
chlu
ss d
aran
gib
t der
Bu
ndes
kanz
ler
sein
e er
ste
Regi
erun
gser
klär
ung
ab.
Bish
er h
aben
sich
jede
nfal
ls al
le
Bund
eska
nzle
r an
dies
e ze
itlich
e Ab
folg
e ge
halte
n.
Kein
e al
lgem
eine
Pra
xis h
at si
ch
alle
rdin
gs b
ezüg
lich
der F
rage
he
raus
gebi
ldet
, in
nerh
alb
wel
cher
Fris
t der
Kan
zler
sein
e Re
gier
ungs
erkl
ärun
g ab
gebe
n so
ll. K
onra
d Ad
enau
er in
form
iert
e de
n Bu
ndes
tag
im Ja
hr 1
949
nur f
ünf T
age
nach
se
iner
Wah
l übe
r se
in R
egie
rung
spro
gram
m, w
ähre
nd
sich
Helm
ut K
ohl 1
983
nach
sein
er
Wah
l gan
ze 3
7 Ta
ge Z
eit l
ieß.
Dass
der
Bun
desk
anzle
r zum
Beg
inn
der L
egisl
atur
perio
de e
ine
Regi
erun
gser
klär
ung
vor d
em
Deut
sche
n Bu
ndes
tag
abgi
bt, i
st e
ine
selb
stve
rstä
ndlic
he p
arla
men
taris
che
Trad
ition
. Sel
bstv
erst
ändl
iche
Prax
is ist
übe
rdie
s gew
orde
n, d
ass d
ie
Regi
erun
gser
klär
ung
stet
s den
le
tzte
n Ak
t der
Reg
ieru
ngsb
ildun
g da
rste
llt. D
emna
ch e
rfolg
t zun
ächs
t di
e W
ahl,
Erne
nnun
g un
d Ve
reid
igun
g de
s Bun
desk
anzle
rs,
dann
die
Ern
ennu
ng u
nd V
erei
digu
ng
der B
unde
smin
ister
, und
ers
t im
An
schl
uss d
aran
gib
t der
Bu
ndes
kanz
ler s
eine
ers
te
Regi
erun
gser
klär
ung
ab. B
isher
ha
ben
sich
jede
nfal
ls al
le
Bund
eska
nzle
r an
dies
e ze
itlich
e Ab
folg
e ge
halte
n.
K
eine
allg
emei
ne P
raxis
hat
sich
al
lerd
ings
bez
üglic
h de
r Fra
ge
hera
usge
bild
et, i
nner
halb
wel
cher
Fr
ist d
er K
anzle
r sei
ne
Regi
erun
gser
klär
ung
abge
ben
soll.
Ko
nrad
Ade
naue
r inf
orm
iert
e de
n Bu
ndes
tag
im Ja
hr 1
949
nur f
ünf
Tage
nac
h se
iner
Wah
l übe
r sei
n Re
gier
ungs
prog
ram
m, w
ähre
nd si
ch
Helm
ut K
ohl 1
983
nach
sein
er W
ahl
ganz
e 37
Tag
e Ze
it lie
ß.
Dass
der
Bun
desk
anzle
r zum
Beg
inn
der L
egisl
atur
perio
de e
ine
Regi
erun
gser
klär
ung
vor d
em
Deut
sche
n Bu
ndes
tag
abgi
bt, i
st e
ine
selb
stve
rstä
ndlic
he p
arla
men
taris
che
Trad
ition
. Sel
bstv
erst
ändl
iche
Prax
is ist
übe
rdie
s gew
orde
n, d
ass d
ie
Regi
erun
gser
klä
rung
stet
s den
le
tzte
n Ak
t der
Reg
ieru
ngsb
ildun
g da
rste
llt. D
emna
ch e
rfolg
t zun
ächs
t di
e W
ahl,
Erne
nnun
g un
d Ve
reid
igun
g de
s Bun
desk
anzle
rs,
dann
die
Ern
ennu
ng u
nd V
erei
digu
ng
der B
unde
smin
ister
, und
ers
t im
An
schl
uss d
aran
gib
t der
Bu
ndes
kanz
ler s
eine
ers
te
Regi
erun
gser
klär
ung
ab. B
isher
ha
ben
sich
jede
nfal
ls al
le
Bund
eska
nzle
r an
dies
e ze
itlich
e Ab
folg
e ge
halte
n.
Kein
e al
lgem
eine
Pra
xis h
at si
ch
alle
rdin
gs b
ezüg
lich
der F
rage
he
raus
gebi
ldet
, inn
erha
lb w
elch
er
Frist
der
Kan
zler s
eine
Re
gier
ungs
erkl
ärun
g ab
gebe
n so
ll.
Konr
ad A
dena
uer i
nfor
mie
rte
den
Bund
esta
g im
Jahr
194
9 nu
r fün
f Ta
ge n
ach
sein
er W
ahl ü
ber s
ein
Regi
erun
gspr
ogra
mm
, wäh
rend
sich
He
lmut
Koh
l 198
3 na
ch se
iner
Wah
l ga
nze
37 T
age
Zeit
ließ.
[19]SFB 1342 WeSIS – Technical Papers No. 7
File
A04
- Se
iten
aus
Reic
hsta
gspr
otok
olle
_188
2-83
kop
ie-4
Ad
obe
TESS
PD
F O
CR X
AB
BYY
Goo
gle
Driv
e O
CR
This
file
pro
ved
to b
e ch
alle
ngin
g du
e to
the
old
font
use
d in
the
docu
men
t. G
oogl
e go
es li
ne b
y lin
e bu
t is
the
only
one
w
ho re
cogn
izes
the
char
acte
rs a
t an
adeq
uate
leve
l. PD
F O
CR X
als
o re
cogn
ized
the
font
and
was
ab
le to
pro
perly
seg
men
t the
pag
e in
to
colu
mns
. Ad
obe,
ABB
YY a
nd T
esse
ract
com
plet
ely
faile
d th
e ta
sk. P
roba
bly
due
to th
e m
odel
s no
t bei
ng tr
aine
d on
suc
h ch
arac
ters
. It
can
be s
een
that
TES
S an
d PD
F O
CR X
re
quire
som
e po
st-p
roce
ssin
g, in
clud
ing
dele
ting
unne
cess
ary
line
split
s an
d m
ergi
ng
wor
ds w
ith d
ashe
s. O
n th
e ot
her A
dobe
, G
oogl
e O
CR a
nd A
BBYY
do
not r
equi
re s
uch
post
-pro
cess
ing.
trau
rige·
n2ag
e ge
rauß
gi{f
t. tt
m
ogitc
(}er�
roei
fe b
urc'
() b
ief e
@in
f dj
n(tu
ng in
ben
§ 2
�ud
j b�
tt §
1
u1tb
bas
.Suf
ta11
befo
mm
en b
es @
ef
eter
5, tu��
ttt !
,llet
nen
... m
:uge
n 1u
e1tig
ften
s itt
bem
§ 1
fegt
. af
5eµt
abel
. t)
t, _g
ef_a
grbe
n fo
l'tnt
en.
�en
tt b
ie ir
age
f o lt
egt,
f o tt
Jttb
f1e
1ebe
nfaU
s bu
rdj u
nf e
ren �
ef d
Jfuf
3,
falle
er 1
uie
er ro
oUe,
nid
jt be
eint
rä�
tigt m
erbe
tt ·
benn
es
f dj
eint
fidj
ja in
ber
Stf
)at i
1n 1
uef
entl1
djen
· u1
n ei
t�e
Stra
nsµo
rtft
age
Du
f)an
bel1
t; u1
tb b
ief e
5!
::ran
sµor
t� fr
age
fönn
e1t m
ir m
it u1
1f e
re1n
9�
uti_
gen
. �ef
�ft
tf3
tti�
t _itt
eilt
e an
bete
2ag
e br
inne
1t,
als
b1e1
en1n
e 1f
t, tn
bet
fte
f1dj
be
finbe
t utt
b bi
e 1n
ei1t
_ue
regr
ter
%re
u1tb
Dr.
SDog
rn g
e� f
djilb
ert g
at.
tree
rigen
ßage
ber
euäb
iift @
ci3i
ebrt
tt @
ie
bief
enflt
ereg
reeb
entt
ber
bie
@d3
[enn
nire
ibe
in b
ee @
efet
5, u
nb e
e ge
bt b
een
bere
it ni
eht,
fe u
erfc
berg
en @
ie e
iie ü
brig
en
23er
ti;ei
ie, b
ie w
ir un
e ne
n be
nz ©
efet
; err
fpre
cben
. fi)a
ä tf
t ber
pr
eitif
däe
@ru
nb,
une
wei
ci3e
m ii
i) he
ute
gege
n be
n ‘-
‘Jint
reg
ftim
men
wer
be: .
er
fienä
wei
i iii}
fein
e 3e
it ge
habt
hab
e,
mid
; ge.
infe
rmire
n,
unb
5mei
tenä
bee
beei
b, w
eit i
ci) g
laub
e,
baf3
wir
Iltfig
fiéifi
ü w
eife
bur
nb b
iefe
@in
fcija
itung
in b
en %
2
audi
; ben
@ 1
unb
ba
se 8
ufte
nbrie
mm
en b
ee ©
efeß
rä, w
ee
in m
eine
n bi
egen
w
enig
ften
ä in
bem
@ 1
fein
aig
enta
bei i
ft,
gefü
brbe
n iii
1'tt
iten.
fit
enn
bie
%?r
ege
fe li
egt,
fe w
irb fi
e ie
benf
aiiä
bun
t; ‚
unfe
ren
E3ef
rbin
fi, fe
iie e
r wie
er w
eiie
, ni
cht b
eein
trtic
ijtig
t w
erbe
n; b
een
ee ft
ijein
t fiti
) in
in b
er ii
bet
im w
efen
tiicb
en‘
trau
rigen
Lag
e he
raus
hilft
. Sch
iebe
n Si
e di
efen
Par
agra
phen
übe
r di
e ES
hle
mm
frei
de in
das
Gej
eg,
und
es g
eht d
ann
dam
it ni
cht,
jo v
erfc
herz
en S
ie a
lle ü
brig
en
Vort
heile
, die
wir
uns
von
dem
Gej
et v
erjp
rech
en. D
as if
t de
r pra
ktis
che
Gru
nd,
aus
wel
chem
ich
heut
e ge
gen
den
Antr
ag fl
imm
en w
erde
: er
ften
s w
eil i
ch fe
ine
Zeit
geha
bt
habe
, mic
h zu
info
rmire
n,
und
zwei
tens
Des
halb
, wei
l ich
gl
aube
, daß
wir
mög
liche
r:
wei
le d
urch
die
fe E
injc
haltu
ng in
de
n S
2 au
ch d
en S
1 u
md
das
AZuf
tand
efom
men
des
Gej
eßes
, w
as in
mei
nen
Auge
n w
enig
ften
s in
dem
$ 1
jehr
ak
zept
abel
it, g
efäh
rden
fünn
ten.
W
enn
die
Frag
e jo
lieg
t, jo
wird
fte
jede
nfal
ls d
urch
) un
sere
n Be
ihlu
ß, fa
lle e
r wie
er
wol
le, n
icht
bee
intr
ächt
igt
wer
den;
den
n es
jche
int f
ih ja
in d
er
That
im w
ejen
tlich
en '
um e
ine
Tran
spor
tfra
ge z
u ha
ndel
n;
und
dief
e Tr
ansp
ort:
fr
age
fünn
en w
ir m
it un
jere
m
heut
igen
Bej
chlu
ß ni
cht i
n ei
ne a
nder
e La
ge b
ringe
n, a
ls
diej
enig
e it,
in d
er ft
e jic
h be
finde
t und
die
mei
n ve
rehr
ter
Freu
nd D
r.
trau
rigei
tSag
e Ije
rauö
tjilft
. Sch
iebe
n Si
e bi
efen
^ara
grap
heni
tber
bie
Sd
)letn
mfr
eibe
in b
aö ©
efeh
, unb
eö
geh
t ban
n ba
mit
nid)
t, fo
ue
rfdj
erge
tt S
ie a
lle ü
brig
en
33or
fljei
le, b
ie m
ir un
ö no
n be
rn
©ef
efj o
crfp
redj
eit.
Saö
ift b
er
praf
tifdj
e ©
runb
, auö
mei
nem
ich
I;eut
e ge
gen
bett
Sln
trag
ftim
nten
in
erbe
: crf
teitö
raei
l idj
fein
e Se
it ge
habt
hoh
e, m
idj j
u in
form
iren,
nn
b gr
aeite
nö b
eölja
lb, m
cil i
ch
glau
be, b
af} m
ir in
ög(ic
f)cr=
mei
fe
burd
j bie
fe (E
infc
haltu
ng in
bei
t § 2
au
dj b
eit §
1 u
nb b
aö
Suft
anbe
foin
tnen
beö
©cf
egeö
, ra
aö in
mei
nen
Slug
ett m
enig
ften
ö in
bem
§ 1
febj
r afg
epta
bel i
ft,
gefä
hrbe
it fö
nnte
n.
SSe
nit b
ie $
rage
f° li
eflt,
fo w
irb
fie je
benf
allö
bur
cf) u
nfer
en
23cf
djlu
fj, fa
lle e
r mie
er r
aollc
, ni
cht b
eein
träd
jtigt
rcer
ben;
bei
m
eö fd
jein
t fid
j ja
in b
er S
Ctja
t im
ro
efen
tlidj
en u
m e
ine
Sran
öpor
tfra
ge g
u fja
nbel
it; u
nb
bief
e Tr
ansp
ort«
frag
e fö
mte
it m
ir m
it un
fere
m h
eutig
en S
Befd
jlufj
nidj
t in
eine
anb
ere
Sage
brin
gen,
al
ö bi
ejei
tige
ift, i
n be
r fie
fid;
bc
fiitb
ct u
nb b
ie m
ein
uere
fjrte
r Sr
eunb
Dr.
Söhn
t ge=
fdjil
bert
l)at
.
(E
nblid
j abe
r, m
eine
Her
ren,
ift
noch
ein
©ru
nb, b
er m
idj b
eftim
mt,
gur 3
cd g
egen
ben
Ein
trag
gu
ftim
men
. Sie
fabe
lt ge
hört
auö
bem
df
tunb
e be
ö Fe
rrit
Sire
ftor
ö im
Si
eidj
öfdj
ajga
mte
, baf
f gm
ifdje
n be
ut b
eutf
djen
Ste
idje
unb
ber
Sd
jmei
g ci
tt S
Oer
trag
öner
ljältn
ifj
befie
lt, m
oita
cfj m
ir bi
e fd
jraci
gcrif
chc
Sd) e
inem
fold
jen
Soll
bc; l
egen
, fo
ift b
aö
r
oeni
gftc
uö
ber b
egin
n ei
nes
9iüd
=
trau
rigen
Lag
e he
raus
hilft
. Sch
iebe
n Si
e di
esen
Par
agra
phen
übe
r Ab
saßg
ebie
t zu
habe
n, w
ähre
nd
unse
re F
abrik
ante
n na
ch d
ie
Schl
emm
krei
de in
das
Gef
eß, u
nd e
s ge
ht d
ann
dam
it D
änem
ark
und
Schw
eden
nic
ht k
omm
en k
önne
n un
d nu
r ihr
nic
ht, s
o ve
rsch
erze
n Si
e al
le ü
brig
en V
orth
eile
, die
wir
uns
eige
nes
Absa
ßgeb
iet h
aben
, wo
sie
noch
die
Kon
kurr
enz
von
von
dem
G
efeß
ver
spre
chen
. Das
ist d
er
prak
tisch
e G
rund
, Sch
wed
en u
nd
Dän
emar
k au
szuh
alte
n ha
ben.
Das
is
t es,
aus
wel
chem
ich
heut
e ge
gen
den
Antr
ag s
timm
en w
erde
: w
orüb
er d
ie R
ügen
sche
n Kr
eide
bruc
hbes
ißer
sic
h be
schw
eren
ers
tens
wei
l ich
kei
ne
Zeit
geha
bt h
abe,
mic
h zu
in
form
iren,
und
— w
ie ic
h m
eine
- fic
h m
it Re
cht b
esch
wer
en. W
enn
und
zwei
tens
des
halb
, wei
l ich
gl
aube
, daß
wir
mög
liche
r in
dies
er
Bezi
ehun
g et
was
ges
cheh
en k
önnt
e,
so w
ürde
dam
it w
eise
dur
ch d
iese
Ei
nsch
altu
ng in
den
§ 2
auc
h de
n §
1 un
d de
r dor
tigen
Indu
strie
ein
D
iens
t gel
eist
et. I
ch k
ann
mic
h da
s Zu
stan
deko
mm
en d
es G
esek
es, w
as
in m
eine
n Au
gen
auf d
iese
Wor
te
besc
hrän
ken
und
bem
erke
, daß
der
An
trag
wen
igst
ens
in d
em §
1 s
ehr
akze
ptab
el is
t, ge
fähr
den
fönn
ten.
vo
rläuf
ig z
urüc
kgez
ogen
wird
, und
w
ir un
s vo
rbeh
alte
n, z
ur
[20]
File
A01
- wor
ld-b
ook-
1988
Ad
obe
TESS
PD
F OCR
X
ABBY
Y Go
ogle
Driv
e OC
R Go
ogle
coul
dn’t
hand
le th
is fil
e pr
oper
ly. It
wa
s abl
e to
ext
ract
the
text
ual s
egm
ents
, ca
ptur
ed tw
o co
lum
ns, b
ut fo
r som
e re
ason
en
ded
up re
peat
ing e
ach
para
grap
h 2o
r 4
times
in a
row.
It
can
be se
en th
at TE
SS an
d PD
F OCR
X
requ
ire so
me
post
-pro
cess
ing,
inclu
ding
de
letin
g unn
eces
sary
line
split
s and
m
ergin
g wor
ds w
ith d
ashe
s. On
the
othe
r Ad
obe,
Goo
gle O
CR an
d AB
BYY d
o no
t re
quire
such
pos
t-pro
cess
ing.
Stra
tegic
ally l
ocat
ed b
etwe
en th
e M
iddl
e Ea
st, C
entra
l Asia
, and
the
Indi
an
subc
ontin
ent,
Afgh
anist
an is
a la
nd
mar
ked
by p
hysic
al an
d so
cial d
ivers
ity.
Phys
ically
, the
land
lock
ed co
untry
rang
es
from
the
high
mou
ntain
s of t
he H
indu
Ku
sh in
the
north
east
to lo
w-lyi
ng d
eser
ts
along
the
west
ern
bord
er. P
usht
uns
(alte
rnat
ively
Pash
tuns
or P
a-th
ans)
com
prise
abou
t 55
perc
ent o
f the
po
pulat
ion,
whi
le Ta
jiks,
who
spea
k Dar
i (a
n Af
ghan
varia
nt o
f Per
sian)
, com
prise
ab
out 3
0 pe
rcen
t; ot
her g
roup
s inc
lude
Uz
beks
, Haz
aras
, Balu
chis,
and
Turk
om.a
ns. T
ribal
dist
inct
ions
(exc
ept
amon
g the
Tajik
s) m
ay cu
t acr
oss e
thni
c cle
avag
es, w
hile
relig
ion
is a m
ajor
un
ifyin
g fac
tor:
90 p
erce
nt o
f the
peo
ple
prof
ess I
slam
(80
perc
ent S
unni
and
the
rem
ainde
r, m
ostly
Haz
ara,
Shi'it
e).
In 1
980,
wom
en co
nstit
uted
ap
prox
imat
ely 1
9 pe
rcen
t of t
he p
aid
work
forc
e, w
ith a
high
er p
erce
ntag
e en
gage
d in
unp
aid ag
ricul
tura
l labo
r. Fe
mal
e pa
rticip
atio
n in
gov-
ernm
ent i
s m
inim
al, al
thou
gh th
ere
is a
wom
en's
bran
ch o
f the
rulin
g par
ty an
d on
e fe
mal
e po
litbu
ro m
embe
r; am
ong t
he va
rious
m
ujah
eddi
n re
sista
nce
grou
ps, f
emal
e pa
rticip
atio
n is
none
xi.st
ent .
.
Stra
tegic
ally l
ocat
ed b
etwe
en th
e M
iddl
e Ea
st, C
entra
l As
ia, an
d th
e In
dian
subc
ontin
ent,
Afgh
anist
an is
a lan
d m
arke
d by
phy
sical
and
socia
l dive
rsity
. Ph
ysica
lly, t
he la
ndlo
cked
coun
try ra
nges
fro
m th
e hi
gh m
ount
ains o
f the
Hi
ndu
Kush
in th
e no
rthea
st to
low-
lying
de
serts
alon
g the
we
Ster
n bo
rder
. PUS
hIUI
IS (a
ltern
ative
ly Pa
shtu
ns o
r Pa-
th
ans)
com
prise
abou
t 55
perc
ent o
f the
po
pulat
ion,
whi
le Ta
jiks,
who
Spea
k Dar
i (a
n Af
ghan
varia
nt o
f Per
sian)
, co
mpr
ise ab
out 3
0 pe
rcen
t; ot
her g
roup
s in
clude
Uzb
eks,
Haza
ras,
Balu
chis,
and
Turk
oman
s. Tr
ibal
dist
inct
ions
(e
xcep
t am
ong t
he Ta
jiks)
may
cut a
cros
s et
hnic
cleav
ages
, wh
ile re
ligio
n is
a m
ajor
uni
fyin
g fac
tor:
90 p
erce
nt o
f the
pe
ople
pro
fess
Islam
(30
perc
ent S
unni
an
d th
e re
main
der,
mos
tly H
azar
a, Sh
i‘ite)
. In
198
0, w
omen
cons
titut
ed
appr
oxim
atel
y 19
perc
ent o
f the
paid
wo
rk fo
rce,
with
a hi
gher
per
cent
age
enga
ged
in u
npaid
agric
ultu
ral la
bor.
Fem
ale
parti
cipat
ion
in go
v-
ernm
ent i
s min
imal,
alth
ough
ther
e is
a wo
men
’s br
anch
of t
he ru
ling p
arty
and
one
fem
ale p
olitb
uro
mem
ber;
amon
g the
vario
us m
ujah
eddr
‘n
resis
tanc
e gr
oups
, fem
ale
parti
cipat
ion
is no
nexis
tent
._
Stra
tegic
ally l
ocat
ed b
etwe
en th
e M
iddl
e Ea
st, C
entra
l As
ia, an
d th
e In
dian
subc
ontin
ent,
Afgh
anist
an is
a lan
d m
arke
d by
phy
sical
and
socia
l di
vers
ity. P
hysic
ally,
the
landl
ocke
d co
untry
rang
es fr
om th
ehigh
m
ount
ains o
fthe
Hind
uKus
h in
the
north
east
to lo
w-Iyi
ng d
eser
ts al
ong
the
west
ern
bord
er. P
usht
uns
(alte
rnat
ively
Pash
tuns
or P
a-
than
s) co
mpr
iseab
out 5
5 pe
rcen
t of
the
popu
latio
n, w
hile
Ta
jiks,
who
spea
k Dar
i (an
Afg
han
varia
nt o
f Per
sian)
, co
mpr
ise ab
out 3
0 pe
rcen
t; ot
her
grou
ps in
clude
Uzb
eks,
Haza
ras,
Balu
chis,
and
Turk
oman
s. Tr
ibal
dist
inct
ions
(e
xcep
t am
ong t
he Ta
jiks)
may
cu
tacr
oss e
thm
ic cle
avag
es,
while
relig
ion
isa m
ajor
uni
fyin
g fa
ctor
: 90
perc
ent o
fthe
peop
le p
rofe
ss Is
lam (8
0 pe
rcen
t Su
nnian
dthe
rem
ainde
r, m
ostly
Haz
ara,
Shi‘it
e).
In 1
980,
wom
en co
nstit
uted
ap
prox
imat
ely 1
9 pe
rcen
t of t
he p
aid
work
forc
e, w
ith a
high
er p
erce
ntag
e en
gage
d in
unp
aid ag
ricul
tura
llabo
r. Fe
male
pa
rticip
atio
n in
gov-
er
nmen
t is m
inim
al, al
thou
gh th
ere
is aw
omen
’s br
anch
of t
he ru
ling p
arty
an
d on
e fe
male
pol
itbur
o m
embe
r; am
ong t
he va
rious
muj
ahed
din
resis
tanc
e gr
oups
, fem
ale
parti
cipat
ion
is no
nexis
tent
.
Stra
tegic
ally l
ocat
ed b
etwe
en th
e M
iddl
e Ea
st, C
entra
l Asia
, and
the
Indi
an su
bcon
tinen
t, Af
ghan
istan
is a
land
mar
ked
by p
hysic
al an
d so
cial
dive
rsity
. Phy
sicall
y, th
e lan
dloc
ked
coun
try ra
nges
from
the
high
m
ount
ains o
f the
Hin
du K
ush
in th
e no
rthea
st to
low-
lying
des
erts
alon
g th
e we
ster
n bo
rder
. Fus
htun
s (a
ltern
ative
ly Pa
shtu
ns o
r Pa-
than
s) co
mpr
ise ab
out 5
5 pe
rcen
t of t
he
popu
latio
n, w
hile
Tajik
s, wh
o sp
eak
Dari
(an
Afgh
an va
riant
of P
ersia
n),
com
prise
abou
t 30
perc
ent ;
oth
er
grou
ps in
clude
Uzb
eks,
Haza
ras,
Balu
chis,
and
Turk
oman
s. Tr
ibal
dist
inct
ions
(exc
ept a
mon
g the
Tajik
s) m
ay cu
t acr
oss e
thni
c cle
avag
es,
while
relig
ion
is a
maj
or u
nify
ing
fact
or: 9
0 pe
rcen
t of t
he p
eopl
e pr
ofes
s Isla
m (8
0 pe
rcen
t Sun
ni an
d th
e re
main
der,
mos
tly H
azar
a, Sh
iite)
. I
n 19
80, w
omen
cons
titut
ed
appr
oxim
atel
y 19
perc
ent o
f the
paid
wo
rkfo
rce,
with
a hi
gher
per
cent
age
enga
ged
in u
npaid
agric
ultu
ral la
bor.
Fem
ale
parti
cipat
ion
in go
vern
men
t is
min
imal,
alth
ough
ther
e is
a wo
men
’s br
anch
of t
he ru
ling p
arty
and
one
fem
ale
polit
buro
mem
ber;
amon
g the
va
rious
muj
ahed
din
resis
tanc
e gr
oups
, fem
ale p
artic
ipat
ion
is no
nexis
tent
.
Stra
tegic
ally l
ocat
ed b
etwe
en th
e M
iddl
e Ea
st, C
entra
l Asia
, and
the
Indi
an su
bcon
tinen
t, Af
ghan
istan
is a
land
mar
ked
by p
hysic
al an
d so
cial
dive
rsity
. Phy
sicall
y, th
e lan
dloc
ked
coun
try ra
nges
from
the
high
m
ount
ains o
fthe
Hind
u Ku
sh in
the
north
east
to lo
w-lyi
ng d
eser
ts al
ong
the
west
ern
bord
er. P
usht
uns
(alte
rnat
ively
Pash
tuns
or P
a- th
ans)
com
prise
abou
t 55
perc
ent o
f the
po
pulat
ion,
whi
le Ta
jiks,
who
spea
k Da
ri (a
n Af
ghan
varia
nt o
f Per
sian)
, co
mpr
ise ab
out 3
0 pe
rcen
t; ot
her
grou
ps in
clude
Uzb
eks,
Haza
ras,
Balu
chis,
and
Turk
oman
s. Tr
ibal
dist
inct
ions
( ex
.cept
amon
g the
Ta
jiks)
may
cut a
cros
s eth
nic
cleav
ages
, whi
le re
ligio
n is
a maj
or
unify
ing f
acto
r: 90
per
cent
of t
he
peop
le p
rofe
ss Is
lam (8
0 pe
rcen
t Su
nni a
nd th
e re
rnain
der,
mos
tJy
Haza
ra, S
hi'it
e).
In 1
980,
wom
en co
nstit
uted
ap
prox
imat
ely 1
9 pe
rcen
t of t
he p
aid
work
forc
e, w
ith a
high
er p
erce
ntag
e en
gage
d in
unp
aid ag
ricul
tura
l labo
r. Fe
mal
e pa
rticip
atio
n in
gov-
ern
men
t is
min
imal,
alth
ough
ther
e is
a wo
men
's br
anch
of t
he ru
ling p
arty
an
d on
e fe
male
pol
itbur
o m
embe
r; am
ong t
he va
rious
muj
ahed
din
resis
tanc
e gr
oups
, fem
ale
parti
cipat
ion
is no
nexis
tent
..
[21]SFB 1342 WeSIS – Technical Papers No. 7
File
A01
- w
orld
-boo
k-19
68
Adob
e TE
SS
OCR
X
ABBY
Y Go
ogle
Driv
e O
CR
ABBY
Y di
d no
t hav
e Sp
anish
pre
sele
cted
so
all s
peci
al sy
mbo
ls ar
e no
t rec
ogni
zed.
In
ado
be y
ou c
an h
ave
only
one
mai
n la
ngua
ge fo
r the
doc
umen
t. It
can
be se
en th
at T
ESS
and
OCR
X
requ
ire so
me
post
-pro
cess
ing,
incl
udin
g de
letin
g un
nece
ssar
y lin
e sp
lits a
nd
mer
ging
wor
ds w
ith d
ashe
s. O
n th
e ot
her
Adob
e, G
oogl
e O
CR a
nd A
BBYY
do
not
requ
ire su
ch p
ost-
proc
essin
g.
Pres
iden
t Art
uro
Fron
dizi,
who
ass
umed
of
fice
on M
ay 1
, 195
8, w
as o
uste
d on
M
arch
29,
196
2, b
y a
mili
tary
cou
p.
Alth
ough
he
had
been
subj
ect t
o m
ilita
ry
pres
sure
for s
ome
time
befo
re th
at, h
is ov
erth
row
stem
med
spec
ifica
lly fr
om th
e re
sults
of t
he M
arch
18,
196
2, e
lect
ions
w
hich
con
stitu
ted
a Pe
roni
st v
icto
ry.
Fron
dizi,
aga
inst
mili
tary
obj
ectio
ns, h
ad
perm
itted
the
Pero
nist
s to
take
par
t in
thos
e el
ectio
ns th
roug
h ca
ndid
ates
of
neo-
Pero
nist
par
ties.
Follo
win
g Fr
on-d
izi's
over
thro
w, J
ose
Mar
ia G
uido
, Pr
ovisi
onal
Pre
siden
t of t
he N
atio
nal
Sena
te, w
as in
stal
led
as P
resid
ent o
f the
na
tion
unde
r the
law
of s
ucce
ssio
n, th
ere
bein
g no
vic
e-pr
esid
ent.
A
gene
ral e
lect
ion
was
hel
d on
July
7,
1963
, for
the
first
tim
e in
Arg
entin
a by
pr
opor
tiona
l rep
rese
ntat
ion.
Can
dida
tes
wer
e pr
esen
ted
by U
DELP
A, w
ith P
. E.
Aram
buru
as c
andi
date
; by
the
Unio
n Ci
vica
Rad
ical
Intr
ansig
ente
, led
by
form
er P
resid
ent F
rond
izi (D
r. O
scar
Al
ende
was
the
cand
idat
e); t
he U
nion
Ci
vica
Rad
ical
del
Pue
blo,
led
by D
r. Ar
turo
Illia
; and
seve
ral o
ther
smal
ler
part
ies.
The
part
y of
Dr.
Illia
won
the
elec
tion,
and
his
succ
ess w
as h
aile
d as
a
vict
ory
for m
oder
atio
n an
d a
defe
at fo
r th
e fo
rces
of f
orm
er d
icta
tor J
uan
D.
Pero
n. T
he U
CRI s
plit
befo
re th
e 19
63
elec
tions
, hal
f rem
aini
ng lo
yal t
o Al
ende
, an
d th
e ot
her h
alf,
unde
r Dr.
Fron
dizi,
ta
king
the
nam
e M
ovim
ient
o de
In
te-g
raci
on y
Des
arro
llo (M
ID).
Pres
iden
t Art
uro
Fron
dizi,
who
ass
umed
ofl
ice
on M
ay 1
, 195
8, w
as o
uste
d on
M
arch
29,
196
2, b
y a
mili
tary
cou
p.
Alth
ough
he
had
been
subj
ect t
o m
ilita
ry
pres
sure
for s
ome
time
befo
re th
at, h
is ov
erth
row
stem
med
spec
ifica
lly fr
om th
e re
sults
of t
he M
arch
18,
196
2, e
lect
ions
w
hich
con
stitu
ted
a Pe
roni
st v
icto
ry.
Fron
dizi,
aga
inst
mili
tary
obj
ectio
ns, h
ad
perm
itted
the
Pero
nist
s to
take
par
t in
thos
e el
ectio
ns th
roug
h ca
ndid
ates
of
neo-
Pero
nist
par
ties.
Follo
win
g Fr
on-
dizi’
s ove
rthr
ow, J
osé
Mar
ia G
uido
, Pr
ovisi
onal
Pre
siden
t of t
he N
atio
nal
Sena
te, w
as in
stal
led
as P
resid
ent o
f the
na
tion
unde
r the
law
of s
ucce
ssio
n,
ther
e be
ing
no V
ice-
pres
iden
t. A
gene
ral e
lect
ion
was
hel
d on
July
7,
1963
, for
the
first
tim
e in
Arg
entin
a by
pro
port
iona
l rep
rese
ntat
ion.
Ca
ndid
ates
wer
e pr
esen
ted
by U
DELP
A,
with
P.
E. A
ram
buru
as c
andi
date
; by
the
Unié
n Ci
vica
Rad
ical
Intr
ansig
ente
, led
by
form
er P
resid
ent F
rond
izi (D
r. O
scar
Al
ende
was
the
cand
idat
e); t
he U
nién
Ci
vica
Rad
ical
del
Pue
blo,
led
by D
r. Ar
turo
Illia
; and
seve
ral o
ther
smal
ler
part
ies.
The
part
y of
Dr.
Illia
won
the
elec
tion,
and
his
succ
ess w
as h
aile
d as
a
Vict
ory
for m
oder
atio
n an
d a
defe
at fo
r th
e fo
rces
of f
orm
er d
icta
tor J
uan
D.
Peré
n. T
he U
CRI s
plit
befo
re th
e 19
63
elec
tions
, hal
f rem
aini
ng lo
yal t
o Al
ende
, an
d th
e ot
her h
alf,
unde
r Dr.
Fron
dizi,
ta
king
the
nam
e M
ovim
ient
o de
Inte
—
grac
ién
y De
sarr
ollo
(MID
).
Faile
d Pr
esid
ent A
rtur
o Fr
ondi
zi, w
ho
assu
med
offi
ce o
n M
ay 1
, 195
8, w
as
oust
ed o
n M
arch
29,
196
2, b
y a
mili
tary
cou
p. A
lthou
gh h
e ha
d be
en
subj
ect t
o m
ilita
ry p
ress
ure
for s
ome
time
befo
re th
at, h
is ov
erth
row
st
emm
ed sp
ecifi
cally
from
the
resu
lts
of th
e M
arch
18,
196
2, e
lect
ions
w
hich
con
stitu
ted
a Pe
roni
st v
icto
ry.
Fron
dizi,
aga
inst
mili
tary
obj
ectio
ns,
had
perm
itted
the
Pero
nist
s to
take
pa
rt in
thos
e el
ectio
ns th
roug
h ca
ndid
ates
of n
eo-P
eron
ist p
artie
s. Fo
llow
ing
Fron
-dizi
’s ov
erth
row
, Jos
e M
aria
Gui
do, P
rovi
siona
l Pre
siden
t of
the
Nat
iona
l Sen
ate,
was
inst
alle
d as
Pr
esid
ent o
f the
nat
ion
unde
r the
law
of
succ
essio
n, th
ere
bein
g no
vic
e-pr
esid
ent.
A ge
nera
l ele
ctio
n w
as h
eld
on Ju
ly 7
, 19
63, f
or th
e fir
st ti
me
in A
rgen
tina
by p
ropo
rtio
nal r
epre
sent
atio
n.
Cand
idat
es w
ere
pres
ente
d by
UD
ELPA
, with
P. E
. Ara
mbu
ru a
s ca
ndid
ate;
by
the
Unio
n Ci
vica
Ra
dica
l Int
rans
igen
te, l
ed b
y fo
rmer
Pr
esid
ent F
rond
izi (D
r. O
scar
Ale
nde
was
the
cand
idat
e) ;
the
Unio
n Ci
vica
Ra
dica
l del
Pue
blo,
led
by D
r. Ar
turo
Ill
ia; a
nd se
vera
l oth
er sm
alle
r pa
rtie
s. Th
e pa
rty
of D
r. Ill
ia w
on th
e el
ectio
n, a
nd h
is su
cces
s was
hai
led
as a
vic
tory
for m
oder
atio
n an
d a
defe
at fo
r the
forc
es o
f for
mer
di
ctat
or Ju
an D
. Per
on. T
he U
CRI s
plit
befo
re th
e 19
63 e
lect
ions
, hal
f re
mai
ning
loya
l to
Alen
de, a
nd th
e ot
her h
alf,
unde
r Dr.
Fron
dizi,
taki
ng
the
nam
e M
ovim
ient
o de
Inte
grat
ion
y De
sarr
ollo
(MID
).
Pres
iden
t Art
uro
Fron
dizi,
who
as
sum
ed o
ffice
on
May
I, 1
958,
was
ou
sted
on
Mar
ch 2
9, 1
962,
by
a m
ilita
ry c
oup.
Alth
ough
he
had
been
su
bjec
t to
mili
tary
pre
ssur
e fo
r som
e tim
e be
fore
that
, his
over
thro
w
stem
med
spec
ifica
lly fr
om th
e re
sults
of
the
Mar
ch 1
8, 1
962,
ele
ctio
ns
whi
ch c
onst
itute
d a
Pero
nist
vic
tory
. Fr
ondi
zi, a
gain
st m
ilita
ry o
bjec
tions
, ha
d pe
rmitt
ed th
e Pe
roni
sts t
o ta
ke
part
in th
ose
elec
tions
thro
ugh
cand
idat
es o
f neo
-Per
onist
par
ties.
Follo
win
g Fr
on- d
izi's
over
thro
w, J
os
e M
aria
Gui
do, P
rovi
siona
l Pr
esid
ent o
f the
Nat
iona
l Sen
ate,
was
in
stal
led
as P
resid
ent o
f the
nat
ion
unde
r the
law
of s
ucce
ssio
n, th
ere
bein
g no
vic
e-pr
esid
ent.
A
gene
ral e
lect
ion
was
hel
d on
July
7,
1963
, for
the
first
tim
e in
Arg
entin
a by
pro
port
iona
l rep
rese
ntat
ion.
Ca
ndid
ates
wer
e pr
esen
ted
by
UDEL
PA, w
ith P
. E. A
ram
buru
as
cand
idat
e; b
y th
e U
ni6n
Civ
ica
Radi
cal I
ntra
nsig
ente
, led
by
form
er
Pres
iden
t Fro
ndizi
(Dr.
Osc
ar A
lend
e w
as th
e ca
ndid
ate)
; the
Uni
on C
ivic
a Ra
dica
l del
Pue
blo,
led
by D
r. Ar
turo
Ill
ia; a
nd se
vera
l oth
er sm
alle
r pa
rtie
s. Th
e pa
rty
of D
r. Ill
ia w
on th
e el
ectio
n, a
nd h
is su
cces
s was
hai
led
as a
vic
tory
for m
oder
a tio
n an
d a
defe
at fo
r the
forc
es o
f for
m e
r dic
ta
tor J
uan
D. P
er6n
. The
UCR
I spl
it be
fore
the
1963
ele
ctio
ns, h
alf
rem
aini
ng lo
yal t
o Al
ende
, and
the
othe
r hal
f, un
der D
r. Fr
ondi
zi, ta
king
th
e na
me
Mov
imie
nto
de In
te-
grac
i6n
y De
sarr
ollo
( M
ID).
[22]
Table nested in text. ABBYY outputs the result as a xls file. The file includes the text split line by line as in the original file but the tables are format-ted properly if the software was able to recognize them.
Some dots and very small characters were unrecog-nized, and represented by random symbols. The over-all structure of the table is re-flected adequately, though.
OCR tables OCR tables
File A02 – USSSA_Relplacement rates pensions 1965-75
OrIgInal
[23]SFB 1342 WeSIS – Technical Papers No. 7
OCr result (aBBYY FInereader)Ta
ble
1 —
Repl
acem
ent r
ate
of s
ocia
l sec
urity
old
-age
pen
sion
s fo
r m
en w
ith a
vera
ge e
arni
ngs
m m
anuf
actu
ring,
sel
ecte
d co
untri
es 1
[Ret
irem
ent a
s of
Jan
1 o
f yea
r in
dica
ted]
Pens
ion
as p
erce
nt o
f ear
ning
s in
yea
r be
fore
ret
irem
ent
Cou
ntry
Year
s w
orke
d
Si
ngle
wor
ker
Aged
cou
ple
1965
1969
1972
1973
1974
1975
1965
1969
1972
1973
1974
1975
Aust
ria _
___
. ....
. . -
4067
6563
6261
5467
65~
03
6261
54
Can
ada
*...
4021
2227
‚ 30
3139
4239
- 42
4648
57
Den
mar
k....
......
403o
2930
3030
2951
4244
4443
43
Fran
ce ..
.....
37 5
4942
44*
4744
4665
5660
6260
6o
Fede
ral R
epub
lic o
f Ger
man
y _
_ ...
4048
5649
1 49
4950
4856
4949
4950
Italy.
. . .
......
.40
CO
6765
6764
6760
6765
6764
67
Net
herla
nds.
.....
5035
3635
3837
3850
5150
5353
54
Nor
way
......
.....
4025
3437
3940
4138
4951
5354
55
Swed
en ..
... ..
...£0
3139
454o
5059
4452
58o7
6276
Switz
erla
nd...
......
.....
Sinc
e 19
4828
2631
3935
3645
4246
5853
53
Uni
ted
King
dom
3 ....
.Si
nce
1961
Si
nce
1951
2321
2222
2226
3633
3433
3339
Uni
ted
Stat
es...
... ..
......
.29
2934
3836
3344
4450
5754
57
[24]
File A02 – USSSA_SSC Expenditures_sel.countries_1990.pdf
Table nested in text. ABBYY outputs the result as a xls file. The file in-cludes the text split line by line as in the original file but the tables are formatted properly if the soft-ware was able to recognize them.
Some dots and very small char-acters were unrecognized, and represented by random symbols. The overall structure of the table is reflected adequately, though. A row has also been split in two but it does not ruin the overall structure. Such minor issues can be changed in the interface e.g. adding an ad-ditional line separator in the table.
OrIgInal
[25]SFB 1342 WeSIS – Technical Papers No. 7
OCr result (aBBY FInereader)
File
A02
– U
SSSA
_SSC
Exp
endi
ture
s_se
l.cou
ntrie
s_19
90.p
df
Tabl
e ne
sted
in te
xt. A
BBYY
out
puts
the
resu
lt as
a xl
s file
. The
file
inclu
des t
he te
xt sp
lit li
ne b
y lin
e as
in th
e or
igin
al fi
le b
ut th
e ta
bles
are
form
atte
d pr
oper
ly if
the
softw
are
was
abl
e to
reco
gnize
them
. The
full
file
can
be fo
und
in th
e ap
prop
riate
pro
ject
fold
er.
Som
e do
ts a
nd v
ery
smal
l cha
ract
ers w
ere
unre
cogn
ized,
and
repr
esen
ted
by ra
ndom
sym
bols.
The
ove
rall
stru
ctur
e of
the
tabl
e is
refle
cted
ade
quat
ely,
thou
gh. A
row
has
also
bee
n sp
lit in
two
but i
t doe
s not
ruin
the
over
all
stru
ctur
e. S
uch
min
or is
sues
can
be ch
ange
d in
the
inte
rface
e.g
. add
ing
an a
dditi
onal
line
sepa
rato
r in
the
tabl
e.
Orig
inal
OCR
Res
ult
Tabl
e 1.
—Ex
pend
iture
s fo
r soc
ial s
ecur
ity a
nd h
ealth
car
e,' b
y co
untr
y,
1980
and
198
3
Shar
e of
GN
P sp
ent o
n so
cial
sec
urity
Pe
rcen
tage
ch
ange
Soci
al s
ecur
ity
heal
th
com
pone
nt a
s sh
are
of G
NP
Shar
e of
G
NP
spen
t on
to
tal
heal
th c
are
Shar
e of
GN
P sp
ent o
n so
cial
se
curit
y pl
us
tota
l hea
lth c
are
Perc
enta
ge
chan
ge
Cou
ntry
(1
) 198
0 (2
) 198
3 (3
) (4
)2
1980
(5
)2
1983
(6
) 19
80
(7)
1983
3 (8
) 19
80
(9)
1983
(1
0)
Can
ada.
......
......
....
14.8
16
.6
12.2
5.
3 4.
8 7.
4 8.
7 16
.9
20.5
21
.3
Uni
ted
Stat
es...
......
...
12.4
13
.6
9.7
4.1
3.0
9.1
10.8
17
.4
21.4
23
.0
Net
herla
nds.
......
......
29
.4
33.3
13
.3
5.7
6.0
9.1
9.3
32.8
36
.5
11.3
U
nite
d K
ingd
om...
......
. Fe
dera
l Rep
ublic
of
16.5
19
.9
20.6
4.
2 4.
7 5.
8 5.
9 18
.1
21.1
16
.6
Ger
man
y....
......
.....
24.1
27
.4
13.7
5.
9 5.
9 9.
6 9.
5 27
.8
31.0
11
.5
Fran
ce ..
......
......
....
27.5
30
.4
10.5
5.
5 6.
4 8.
8 9.
2 30
.8
33.2
7.
8 Sw
eden
......
......
.....
32.1
34
.1
6.2
7.5
8.4
9.4
9.8
34.0
35
.5
4.4
Japa
n....
......
......
...
10.8
12
.0
11.1
4.
4 4.
7 6.
0 6.
7 12
.4
14.0
12
.9
[27]SFB 1342 WeSIS – Technical Papers No. 7
OCr result (aBBY FInereader)O
CR R
esul
t
Be
nefit
A
mou
nt
Dur
atio
n Q
ualif
ying
con
ditio
ns
Adm
inist
ratio
n Pa
rtial
une
mpl
oym
ent
Qua
lifyi
ng p
erio
d W
aitin
g pe
riod
Disq
ualif
icat
ions
Bene
fits f
or to
tal u
nem
ploy
men
t are
pai
d on
a
daily
ba
sis.
Onl
y fu
ll da
ys
of
unem
ploy
men
t ar
e no
rmal
ly c
onsid
ered
da
ys o
f une
mpl
oym
ent,
but h
alfd
ays
can
be a
dded
to
form
ful
l da
ys i
f pa
rt of
a
fixed
pat
tern
of
unem
ploy
men
t du
etto
in
term
itten
t wor
k or
rota
tion
of w
ork.
For
po
rt-w
orke
rs,
half-
days
m
ay
be
cons
ider
ed
for
bene
fit.
Whe
n pa
Hia
l un
empl
oym
ent o
f 11
day
a w
eek
occu
rs,
this
is co
nsid
ered
a d
ay o
f une
mpl
oym
ent
whe
ii ac
com
pani
ed
by
at
leas
t 2
addi
tiona
l da
ys i
n sa
me
wee
k. S
peci
al
regu
latio
ns g
over
n se
ason
al a
nd h
ome
wor
kers
. The
latte
r mus
t hav
e 1
full
wee
k of
une
mpl
oym
ent t
o ob
tain
ben
efit.
Non
e sp
ecifi
ed__
____
____
____
_ N
one
requ
ired
for
wor
ker
] cu
rren
tly
cove
red
in
gene
ral
soci
al
secu
rity
prog
ram
. Pe
rson
s no
t cu
rren
tly
cont
ribut
ing
are
entit
led
to b
enef
it if
able
to
wor
k an
d if
they
wer
e m
embe
rs o
f an
unem
ploy
men
t fu
nd
betw
een
Janu
ary
1,19
38, a
nd M
ay 1
0, 1
940,
or w
ho p
aid
at
leas
t 3
mon
ths
of c
ontri
butio
ns f
or o
ld-
age
pens
ions
bet
wee
n Ja
nuar
y 1,
193
5 an
d M
ay 1
0,19
40.
'Ton
e. N
o be
nefit
is p
aid
for 1
day
’s
unem
ploy
men
t in
wee
k, u
nles
s it
is th
e fir
st o
r las
t day
of a
3-d
ay p
erio
d of
une
mpl
oym
ent.
Tem
pora
ry: R
efus
al o
f sui
tabl
e w
ork,
4-
13 w
eeks
(lon
ger i
f rep
eate
d) ;
volu
ntar
y qu
ittin
g w
ithou
t goo
d re
ason
, 4-1
3 w
eeks
(lo
nger
if
repe
ated
); in
com
plet
e,
inoo
rrec
t, or
fate
dec
lara
tion
by in
sure
d,
1-13
w
eeks
(lo
nger
if
repe
ated
); ob
tain
ing
false
sta
mpi
ng o
f con
trol c
ard,
et
c.;
1-13
wee
ks (
long
er i
f re
peat
ed);
begg
ing,
13
w
eeks
; no
torio
us
misc
ondu
ct,
13
wee
ks,
(long
er
if re
peat
ed).
Abs
olut
e:
Dur
ing
labo
r di
sput
e; a
s res
ult o
f stri
ke, i
f stri
ke c
alle
d w
ith a
ssen
t of w
orke
r.
Min
ister
of
La
bor
and
Soci
al
Wel
fare
, ge
nera
l su
perv
ision
. N
atio
nal
Soci
al
Secu
rity
Offi
ce,
publ
ic-la
w
agen
cy
in
Min
istry
, co
llect
s co
ntrib
utio
ns
for
all
prog
ram
s an
d di
sbur
ses
them
to
natio
nal
adm
inist
rativ
e bo
dies
. Fu
nd
for
Mai
nten
ance
of
the
Une
mpl
oyed
, un
der
Min
ister
, m
aint
ains
pub
lic e
mpl
oym
ent
a of
fices
, co
nduc
ts vo
catio
nal
train
ing,
re
ceiv
es
fund
s fro
m
Nat
iona
l So
cial
Se
curit
y O
ffice
an
d di
sbur
ses
them
to
pa
ying
age
ncie
s. Re
gion
al o
ffice
s of
Fun
d op
erat
e in
are
as n
amed
, by
Min
ister
. The
O
ffice
an
d th
e .
Fund
ha
ve
,eqU
al
; re
pres
enta
tion
by w
orke
rs a
nd E
mpl
oyer
s on
go
vern
ing
bodi
es,
with
a
neut
ral
chai
rman
ap
poin
ted
by
the
Min
ister
. Be
nefit
s are
pai
d ei
ther
by
com
mun
es o
r by
appr
oved
wor
kers
* or
gani
zatio
ns w
ith a
t le
ast
50,0
00
mem
bers
. A
ppea
ls ar
e ad
min
ister
ed b
y lo
cal C
laim
s Com
miss
ions
, iir
ith
equa
l w
orke
r an
d:
empl
oyer
re
pres
enta
tion,
and
by
• na
tiona
l A
ppea
ls
Boar
d na
med
by
Min
ister
and
hav
ing
equa
l, w
orke
r and
em
ploy
er re
pres
enta
tion.
No
prov
ision
in la
w...
......
j =
12 w
eeks
in 1
yea
r :#W
. : .
lijii
liiiL
\ ■
I" .
!..•■
:,h .!«
!;? it
:" ?
■ 'i-
Not
spec
ified
in la
w...
......
... ■
: 8
days
......
......
......
......
......
...
Not
spec
ified
in la
w„.
......
. M
inist
ry
of
Labo
r an
d So
cial
W
elfa
re,
supe
rsta
te
Soci
al
Insu
ranc
e In
stitu
te,
adim
nist
ratio
n. G
over
ning
bod
y in
clud
es
repr
esen
tativ
es o
f 3 m
inist
ries
and
of tr
ade
unio
ns.
[29]SFB 1342 WeSIS – Technical Papers No. 7
OCr result (aBBYY FInereader)
OCR
Res
ult
ALB
ANIA
Cas
h B
enef
its fo
r Ins
ured
Wor
kers
(exc
ept
perm
anen
t dis
abili
ty)
Perm
anen
t Dis
abili
ty a
nd M
edic
al B
enef
its fo
r In
sure
d W
orke
rs
Surv
ivor
Ben
efits
and
Med
ical
Ben
efits
for
Dep
ende
nts
Adm
inis
trat
ive
Org
aniz
atio
n O
id^g
e pe
nsio
n: 7
0% o
f ave
rage
ear
ning
s dur
ing
last
yea
r or 3
ye
an o
r hi
ghes
t 3
of l
ast
10 y
ean,
inv
erse
ly a
ccor
ding
to
earn
ings
. Min
imum
and
max
imum
pen
sions
: 330
and
900
leks
a
mon
th.
Dep
ende
nts’
sup
plem
ents
Of
12 y
ean’
con
tinuo
us
wor
k):
20%
of
pens
ion
for
1 de
pend
ent.
Redu
ced
pens
ion:
Pr
opor
tiona
te to
yea
n of
wor
k.
Inva
lidity
pen
sion:
70%
of a
vera
ge e
arni
ngs d
urin
g la
st y
ear o
r 3
year
s M
inim
um a
nd m
axim
um p
ensio
n, 3
80 a
nd 7
00 le
ks a
m
onth
. Inc
rem
ent o
f 10%
to 2
0% fo
r 5-1
5 ye
ars
of c
ontin
uous
w
ork
for
1 em
ploy
er;
max
imum
pen
sion,
100
% o
f ea
rnin
gs.
Cons
tant
-atte
ndan
ce
supp
lem
ent:
15%
of
ea
rnin
gs
Parti
al
inva
lidity
pen
sion:
40%
of e
arni
ngs
or u
p to
60%
if c
hang
e in
oc
cupa
tion
nece
ssar
y w
hich
redu
ces e
arni
ngs.
Redu
ced
pens
ion
if le
ss th
an fu
ll ye
ar o
f cov
erag
e.
Sarti
var
pens
ion:
40%
of
earn
ings
of
insu
red
for
1 su
rviv
or;
30%
for
2 s
urvi
vors
; 63
% f
or 3
or
mor
e su
rviv
ors
Elig
ible
su
rviv
ors:
Spou
se a
nd p
aren
ts i
f ag
ed,
inva
lid,
or c
arin
g fo
r or
phan
; and
chi
ldre
n, b
roth
ers
and
siste
rs u
nder
age
16
(20
if st
uden
t, no
lim
it if
inva
lid).
Pens
ion
divi
ded
equa
lly if
surv
ivor
s liv
e ap
art.
Fune
ral g
rant
Lum
p su
m o
f 300
leks
Min
istry
of
Fi
nanc
e,
gene
ral
supe
rvisi
on.
Stat
e So
cial
Ins
uran
ce O
ffice
, adm
inist
ratio
n of
pro
gram
th
roug
h di
stric
t offi
ces
Sktn
ess
bene
fit:
70%
of
earn
ings
, or
83%
if
10 y
ean
of
cont
inuo
us w
ork
for 1
empl
oyer
. Pay
able
from
1st
day
of i
llnes
s fo
r dur
atio
n of
sic
knes
s. M
smuk
y ke
nsfit
: 73%
of e
arni
ngs,
or
93%
if 3
yea
n of
con
tinuo
us w
ork
for 1
em
ploy
er. P
ayab
le fo
r 3
wee
ks b
efor
e and
7 w
eeks
afte
r con
finem
ent;
may
be e
xten
ded
to to
tal o
f 13
wee
ks. B
irth
gran
t of 2
80 le
ks fo
r lay
ette
and
food
.
Med
ical
ben
efits
: Med
ical
serv
ices
pro
vide
d di
rect
ly to
pat
ient
s th
roug
h fa
cilit
ies
of M
inist
ry o
f H
ealth
. In
clud
es m
edic
al
treat
men
t, ac
com
mod
atio
n, an
d m
edic
ines
in h
ospi
tal;
treat
men
t in
hom
e if
urge
nt; o
ffice
trea
tmen
t, de
ntal
car
e, a
nd a
pplia
nces
as
aut
horiz
ed b
y re
gula
tion.
Med
ical
ben
efits
for
dep
ende
nts:
Sam
e m
edic
al b
enef
its a
s in
sure
d, e
xcep
t no
cos
t sh
arin
g fo
r ho
spita
lizat
ion
in c
ase
of
mat
erni
ty c
are
or fo
r chi
ldre
n un
der a
ge 6
. Wife
rece
ives
sam
e bi
rth g
rant
as w
orki
ng w
oman
.
Min
istry
of
Fi
nanc
e,
gene
ral
supe
rvisi
on.
Stat
e So
cial
In
sura
nce
Offi
ce,
adm
inist
ratio
n of
ca
sh
bene
fits
Min
istry
of
Hea
lth,
prov
ision
of
med
ical
be
nefit
s w
ith a
ssist
ance
of
Dist
rict
Hea
lth S
ervi
ces
thro
ugh
clin
ics,
hosp
itals
mat
erni
ty h
omes
and
oth
er
faci
litie
s ope
rate
d by
them
.
Teoy
oear
j di
soki
lity
bene
fit (
wor
k in
jury
): 93
% o
f ea
rnin
gs
Paya
ble
from
1st
day
of d
isabi
lity
until
reco
very
or c
ertif
icat
ion
of p
erm
anen
t disa
bilit
y.
of av
erag
e ear
ning
s dur
ing
last
yea
r or 3
yea
rs if
tota
lly d
isabl
ed.
Cons
tant
-atte
ndan
ce
supp
lem
ent
15%
of
ea
rnin
gs
Parti
al
disa
bilit
y pe
nsio
n: 3
0% o
f ea
rnin
gs M
edic
al b
enef
its (
wor
k in
jury
): Sa
me
as fo
r ord
inar
y sic
knes
s abo
ve; a
lso a
pplia
nces
.
Surv
ivor
pen
sion
(wor
k in
jury
): 45
% o
f ear
ning
s of i
nsur
ed fo
r 1
surv
ivor
, 60%
for
2 su
rviv
ors;
80%
for
3 or
mor
e su
rviv
ors
Elig
ible
surv
ivor
s Spo
use
and
pare
nts i
f age
d, in
valid
, or c
arin
g fo
r orp
han;
and
chi
ldre
n, b
roth
ers
and
siste
rs u
nder
age
16
(20
if st
uden
t, no
lim
it if
inva
lid).
Pens
ion
divi
ded
equa
lly i
f su
rviv
ors l
ive
apar
t Fun
eral
gra
nt L
ump
sum
of 3
00 le
ks
Min
istry
of
Fi
nanc
e,
gene
ral
supe
rvisi
on.
Stat
e So
cial
In
sura
nce
Offi
ce,
adm
inist
ratio
n of
ca
sh
bene
fits
Min
istry
of
Hea
lth,
prov
ision
of
med
ical
be
nefit
s
[30]
File A04 – Grafiken
File
A04
– G
rafik
en
Orig
inal
O
CR R
esul
t
Bund
eska
nzle
r D
atum
der
Wah
l des
Bu
ndes
kanz
lers
D
atum
der
R
egie
rung
serk
läru
ng
Ver
zöge
rung
(in
Tag
en)
Adenauer
15.9.1949
20.0.1949
5 9.10.1953
20.10.1953
11
22.10.1957
29.10.1957
7
7.11.1961
29.11.1961
22
Erhard
16.10.1963
18.10.1963
2
20.10.1965
10.11.1965
21
Kiesinger
1.12.1966
13.12.1966
12
Brandt
21.10.1969
28.10.1969
7
14.12.1972
18.1.1973
35
Schm
idt
16.5.1974
17.5.1974
1 15.12.1976
16.12.1976
1
5.11.1980
24.11.1980
19
Kohl
1.10.1982
13.10.1982
12
29.3.1983
4.5.1983
37
11.3.1987
18.3.1987
7 17.1.1991
30.1.1991
13
15.11.1994
23.11.1994
8
Schröder
27.10.1998
10.11.1998
14
22.10.2002
29.10.2002
7
O
rIg
Inal
OC
r re
sult
(aBB
YY F
Iner
ead
er)
[32]
OCr result (aBBYY FInereader)
OCR
Res
ult
I2£
Kra
nken
kass
en
insg
esam
t2 ’
Oits
- kr
ankc
nkas
sen
Bet
riebs
- kr
anke
nkas
sen
Inoa
ngs-
kr
anke
nkas
sen
See-
K
rank
enka
sse
Brm
des-
kn
apps
chaf
t
Kas
sen
Mit-
gl
iede
r K
asse
n M
it-
glie
der
Kas
sen
Mit-
gl
iede
r [K
asse
n M
it-;
giie
der
Kas
sen
Mit-
gl
iede r
| Kas
sen
Mit-
gl
iede
r
1
68
1 68
1-
ii 1
857
—-
—
t 85
7
1
«■—
—
—
—
i
1 10
0 —
—
1
100
_
_ —
* —
—
I 0
,5
3 2S
8 —
—
3
258
•a
m
—
—
—
I 1
ido
3 97
4 —
3 97
4
-
_ —
—
*v '
F ift
2 7
7506
—
—
7
7506
—
—
_ _
—
• -*
—
Jfti
1 55
2 —
1 55
2
_ _
—
—
—
{0,4
5
8806
—
—
5
8806
_
—-
—
—
/0,5
7
3456
—
—
7
3456
_ _
—
—
—
70,*
11
11
918
—
11
11
918
_
_ —
—
—
70
,«
5 18
67
—
5
1867
__
_
_ —
—
—
10
,9
6 23
68
—
—
6 23
68
_,
_ _
—
—
—
11.0
16
23
2024
—
15
3080
8 __
■
_ _
—
—
11,i
2 14
911
—
—-
1 10
0 _
_ —
—
—
—
11
.8
24
1725
01
—
—
17
3472
9 7
1377
72
—
—
—
—
11.9
15
38
286
_ —
12
12
421
1 12
446
i 80
37
—
—
12.0
3
1643
—
—
* 3
1643
_
_ _
—
—
—
12,1
2
2754
6 —
—
1
25
1 27
521
—
—
—
—
12,2
8
4955
09
—
—
4 12
8060
3
3971
7 _
—
—
—
12,3
14
20
1436
7 —
—
* 2
8630
9
1128
76
—
—
—
—
12,6
2
2644
1 —
—
1
1148
0 1
1496
1 —
—
—
—
12
,7
1 55
045
—
1
5504
5 —
—
—
_
—
—
12,8
24
34
4920
4 12
32
3226
2 3
875
7 65
535
—
—
1 15
0108
12
,9
1 64
8 —
‘ -
1
648
—
—
—
—
—
—
13,3
1
2176
—
—
1
2176
—
—
—
—
—
—
13
,5
1 9
—
—
1 9
—
—
—
—
—
—
Insg
es.
214
7526
792
13
3839
135
157
6084
77
31
4368
15
i 80
37
1 1
150
108
Insg
es.
12,5
4 12
,96
Dur
chsc
hn 1
1,73
ttl
iche
r B
eitra
gssa
tz 1
2,20
[33]SFB 1342 WeSIS – Technical Papers No. 7
File A04 – trendshealthinsurance1968_2015.pdf
OrIgInalFi
le A
04 –
tren
dshe
alth
insu
ranc
e196
8_20
15.p
df
Orig
inal
OCR
Res
ult
Tabl
e 1.
Per
cent
ages
(and
sta
ndar
d er
rors
) of p
erso
ns u
nder
65
year
s of
age
with
hea
lth in
sura
nce
cove
rage
, by
cove
rage
type
, and
with
out h
ealth
insu
ranc
e:
Uni
ted
Stat
es, s
elec
ted
year
s 19
68-2
015
Year
Sa
mpl
e si
ze
Priv
ate
cove
rage
(any
)1 Pr
ivat
e co
vera
ge (e
mpl
oyer
)2 Pr
ivat
e co
vera
ge (o
ther
)3 19
68
120,
670
79.3
(0.3
9)
___
___
1970
44
,373
78
.7 (0
.53)
68
.6 (0
.60)
10
.0 (0
.37)
19
72
119,
939
77.3
(0.3
9)
69.4
(0.4
3)
7.8
(0.1
8)
1974
10
4,72
7 79
.7 (0
.31)
70
.5 (0
.35)
9.
6 (0
.18)
19
76
101,
594
78.9
(0.3
1)
68.5
(0.3
2)
10.3
(0.1
9)
1978
98
,465
79
.3 (0
.34)
70
.2 (0
.35)
9.
2 (0
.19)
19
80
91,4
25
79.4
(0.3
8)
71.4
(0.4
0)
8.0
(0.2
0)
1982
92
,489
78
.1 (0
.53)
70
.3 (0
.55)
7.
9 (0
.21)
19
84
46,7
29
76.9
(0.6
4)
68.4
(0.6
7)
8.7
(0.2
7)
1986
93
,396
76
.7 (0
.62)
69
.1 (0
.62)
7.
7 (0
.21)
19
89
54,8
60
76.8
(0.7
1)
69.3
(0.7
6)
7.6
(0.3
3)
1990
10
2,68
4 75
.9 (0
.51)
68
.3 (0
.51)
7.
6 (0
.19)
19
91
105,
053
74.2
(0.4
3)
66.4
(0.4
7)
7.8
(0.2
8)
1992
10
5,31
6 73
.6 (0
.48)
62
.8 (0
.52)
10
.8 (0
.31)
19
93
113,
042
72.0
(0.4
6)
64.9
(0.4
5)
7.1
(0.1
8)
1994
10
1,60
8 69
.9 (0
.50)
64
.0 (0
.48)
5.
9 (0
.17)
19
95
90,5
12
71.3
(0.4
2)
65.6
(0.4
3)
5.7
(0.1
6)
1996
56
,268
71
.2 (0
.55)
65
.1 (0
.57)
6.
1 (0
.22)
19
97
91,2
75
70.7
(0.3
6)
66.4
(0.3
6)
4.2
(0.1
3)
1998
87
,020
72
.1 (0
.36)
67
.5 (0
.37)
4.
6 (0
.14)
[34]
OCr result (aBBYY FInereader)
File
A04
– tr
ends
heal
thin
sura
nce1
968_
2015
Orig
inal
OCR
Res
ult
Tabl
e 1.
Per
cent
ages
(and
sta
ndar
d er
rors
) of p
erso
ns u
nder
65
year
s of
age
with
hea
lth in
sura
nce
cove
rage
, by
cove
rage
type
, and
with
out h
ealth
insu
ranc
e:
Uni
ted
Stat
es, s
elec
ted
year
s 19
68-2
015
Year
Sa
mpl
e si
ze
Priv
ate
cove
rage
(any
)1 Pr
ivat
e co
vera
ge (e
mpl
oyer
)2 Pr
ivat
e co
vera
ge (o
ther
)3 19
68
120,
670
79.3
(0.3
9)
___
___
1970
44
,373
78
.7 (0
.53)
68
.6 (0
.60)
10
.0 (0
.37)
19
72
119,
939
77.3
(0.3
9)
69.4
(0.4
3)
7.8
(0.1
8)
1974
10
4,72
7 79
.7 (0
.31)
70
.5 (0
.35)
9.
6 (0
.18)
19
76
101,
594
78.9
(0.3
1)
68.5
(0.3
2)
10.3
(0.1
9)
1978
98
,465
79
.3 (0
.34)
70
.2 (0
.35)
9.
2 (0
.19)
19
80
91,4
25
79.4
(0.3
8)
71.4
(0.4
0)
8.0
(0.2
0)
1982
92
,489
78
.1 (0
.53)
70
.3 (0
.55)
7.
9 (0
.21)
19
84
46,7
29
76.9
(0.6
4)
68.4
(0.6
7)
8.7
(0.2
7)
1986
93
,396
76
.7 (0
.62)
69
.1 (0
.62)
7.
7 (0
.21)
19
89
54,8
60
76.8
(0.7
1)
69.3
(0.7
6)
7.6
(0.3
3)
1990
10
2,68
4 75
.9 (0
.51)
68
.3 (0
.51)
7.
6 (0
.19)
19
91
105,
053
74.2
(0.4
3)
66.4
(0.4
7)
7.8
(0.2
8)
1992
10
5,31
6 73
.6 (0
.48)
62
.8 (0
.52)
10
.8 (0
.31)
19
93
113,
042
72.0
(0.4
6)
64.9
(0.4
5)
7.1
(0.1
8)
1994
10
1,60
8 69
.9 (0
.50)
64
.0 (0
.48)
5.
9 (0
.17)
19
95
90,5
12
71.3
(0.4
2)
65.6
(0.4
3)
5.7
(0.1
6)
1996
56
,268
71
.2 (0
.55)
65
.1 (0
.57)
6.
1 (0
.22)
19
97
91,2
75
70.7
(0.3
6)
66.4
(0.3
6)
4.2
(0.1
3)
1998
87
,020
72
.1 (0
.36)
67
.5 (0
.37)
4.
6 (0
.14)
[35]SFB 1342 WeSIS – Technical Papers No. 7
File A06 – Auto_Outline of Foreign Social Insurance.pdf
File
A06
– A
uto_
Out
line
of F
orei
gn S
ocia
l Ins
uran
ce.p
df
Orig
inal
OrIgInal
[36]
OCr result (aBBYY FInereader)
OCR
Res
ult
Con
trib
utio
ns
Old
-age
pen
sion
C
ount
ry a
nd
year
law
en
acte
d C
over
age
Insu
red
Em
ploy
er
Gov
ernm
ent
Con
ditio
ns fo
r re
ceip
t A
mou
nt
Aus
tral
ia:
1938
(n
ot
yet
effe
ctiv
e).
Em
ploy
ed
pers
ons
aged
16-
65 in
the
case
of
men
, 16
-60
in t
he
case
of
w
omen
. Im
porta
nt
ex
elus
ion.
—N
onm
anua
l em
ploy
ees a
t a r
ate
of
rem
uner
atio
n ex
ceed
ing
£365
a
year
.
Shar
ed
equa
lly
by
insu
red
and
empl
oyer
. W
eekl
y ra
te
for
sickn
ess
(cha
rt II
) and
inva
lidity
»—
men
, Is.
3d.;
wom
en, I
s. 2d
.
Subs
idy
for
sickn
ess
and
inva
lidity
in
sura
nce—
£ 1
00,
(MX
) a
ye>a
r fo
r ad
min
istra
tion;
pl
us
10s.
per
insu
red
per
year
un
til
defic
it of
in
itial
gen
erat
ion
is pa
id fo
r.
Bel
gium
: 192
4 ...
W
age
earn
ers
over
ag
e 14
. V
aryi
ng w
ith w
age
clas
ses
and
shar
ed
equa
lly b
y in
sure
d an
d em
ploy
er.
Subs
idy
for
each
pe
nsio
n;
fund
s re
quir
ed
for
orph
ans’
pen
sions
an
d tr
ansit
iona
l bo
nuse
s.
Age
65;
may
be
clai
med
by
men
at
age
60 a
nd b
y w
omen
at
age
55.
No
qual
ifyin
g pe
riod
.
Bas
ic a
mou
nt e
quiv
alen
t to
co
ntri
butio
ns.
Plus
G
over
nmen
t su
bsid
y of
50
perc
ent
of
pens
ion,
in
crea
sed
for
pers
ons
born
be
fore
188
4 an
d re
duce
d if
pens
ion
is cl
aim
ed
befo
re
age
65.
Max
imum
G
over
nmen
t su
bsid
y 1,
200
fran
cs
a ye
ar.
Bon
us
for
tran
sitio
nal g
ener
atio
n.
1925
....
Sala
ried
em
ploy
ees .
.. 3
perc
ent o
f sal
ary
up to
18,
000
fran
cs
a ye
ar.
4 pe
rcen
t of
sa
lary
up
to
18
,000
fr
ancs
a
year
un
til
1960
; in
crea
sing
th
erea
fter
and
reac
hing
5
perc
ent i
n 19
91.
Subs
idy
for
each
pe
nsio
n;
fund
s re
quir
ed
for
tran
sitio
nal
bonu
ses.
Age
65
for
men
, 60
for
wom
en; m
ay b
e cl
aim
ed 1
0 ye
ars
earl
ier.
No
qual
ifyin
g pe
riod
.
Bas
ic a
mou
nt e
quiv
alen
t to
co
ntri
butio
ns.
Plus
G
over
nmen
t su
bsid
y of
50
perc
ent
of
pens
ion,
in
crea
sed
for
pers
ons
born
be
fore
188
4 an
d re
duce
d if
pens
ion
is cl
aim
ed
befo
re
age
65.
Max
imum
G
over
nmen
t su
bsid
y 1,
200
fran
cs
a ye
ar.
Bon
us
for
tran
sitio
nal g
ener
atio
n.