
Installation

Source Code 
As mentioned in chapter 4, the prototype is divided into two separate
implementations: the extraction tool and the web application. The source code is
available in a GitHub repository.
To test the service, two drawings, which are also part of the evaluation, are provided in the
repository as well.
 
Requirements: 
• Python, version: 3.7, https://www.python.org/downloads/ 
• pip3 install redis flask PyPDF2 bs4 numpy pandas scikit-learn (scikit-learn is the package that is imported as sklearn) 
• start the Redis service 
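
Before the first run, it can help to confirm that the packages listed above are importable. The following is a minimal sketch and not part of the repository; the helper name missing_modules is hypothetical, and the list uses the import names (for example bs4 and sklearn), which can differ from the names used for installation.

```python
from importlib.util import find_spec

# Import names of the packages listed in the requirements.
REQUIRED = ["redis", "flask", "PyPDF2", "bs4", "numpy", "pandas", "sklearn"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

The check only looks the modules up and does not import them, so it does not verify the installed versions or whether the Redis service is running.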
 
 
Extraction Tool 
Link to repository: https://github.com/bscheibel/masterthesis_extraction.git 
 
Requirements in more detail: 
• beautifulsoup4 (bs4), version: 4.8.0 (downloaded via pip3) 
• numpy, version: 1.17.4 (downloaded via pip3) 
• pandas, version: 0.25.3 (downloaded via pip3) 
• scikit-learn (imported as sklearn), version: 0.25.3 (downloaded via pip3) 
 
 
Adapting the path in main.py: 
• config_path = path to project directory 
 
Input parameters if the tool is used as a standalone command-line tool: 
• uuid = unique identifier consisting of up to 4 digits; it can be chosen at random 
• path to the PDF file 
• Redis parameters, e.g. "localhost" 
• (optional) a custom EPS value 
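
The four inputs above can be pictured as three required positional arguments followed by one optional argument. The sketch below illustrates this with argparse; it is an assumption about the interface, not the actual parsing code in main.py. The argument names (uuid, pdf_path, redis_host, eps), their order, and the file name drawing.pdf are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the four inputs listed above."""
    parser = argparse.ArgumentParser(description="Extraction tool inputs (sketch)")
    parser.add_argument("uuid", help="identifier for this run, e.g. 123")
    parser.add_argument("pdf_path", help="path to the PDF file")
    parser.add_argument("redis_host", help="Redis parameters, e.g. localhost")
    # EPS is optional; None stands for "no custom value given".
    parser.add_argument("eps", nargs="?", type=float, default=None,
                        help="optional custom EPS value")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["123", "drawing.pdf", "localhost"])
print(args.uuid, args.pdf_path, args.redis_host, args.eps)
```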
 
 
The tool can be run from the command line or called by another application 
by using the command "python3 main.py" with the needed input parameters, e.g. python3 main.py 123  
OR "python -m flask run" 
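
When the tool is called by another application, one possible approach is to start main.py as a child process. The following is a sketch under stated assumptions: the function names build_command and run_extraction are hypothetical, and the order of the arguments after the uuid is assumed from the list of input parameters rather than confirmed by the repository.

```python
import subprocess
import sys


def build_command(main_py, uuid, pdf_path, redis_host, eps=None):
    """Assemble the argument list; the argument order is an assumption."""
    command = [sys.executable, main_py, str(uuid), pdf_path, redis_host]
    if eps is not None:
        command.append(str(eps))
    return command


def run_extraction(main_py, uuid, pdf_path, redis_host, eps=None):
    """Run the tool as a child process and raise if it exits with an error."""
    command = build_command(main_py, uuid, pdf_path, redis_host, eps)
    return subprocess.run(command, check=True)
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces, and check=True turns a non-zero exit code into a CalledProcessError.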
 
 
Web Service 
Link to repository: https://github.com/bscheibel/masterthesis_extraction.git 
 
Requirements in more detail: 
• flask, version: 1.1.1 (downloaded via pip3) 
• PyPDF2, version: 1.26.0 (downloaded via pip3) 
 
 
Adapting the paths in views.py: 
• path = path to project directory 
• db_params = redis configuration (e.g. localhost) 
• path_extraction = path to extraction tool directory  
• path_image = directory where uploaded images are stored 
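
A quick way to catch a mistyped path before starting the web service is to check that each configured directory exists. This is a minimal sketch, not code from views.py; the helper missing_directories is hypothetical and the path values are placeholders that have to be replaced with the real locations.

```python
from pathlib import Path


def missing_directories(settings):
    """Return the names of settings whose directory does not exist."""
    return [name for name, value in settings.items() if not Path(value).is_dir()]


# Placeholder values; replace them with the real locations on your machine.
settings = {
    "path": "/path/to/web_application",
    "path_extraction": "/path/to/extraction_tool",
    "path_image": "/path/to/uploaded_images",
}

for name in missing_directories(settings):
    print(f"{name} does not point to an existing directory: {settings[name]}")
```

db_params is left out of the check because it holds the Redis configuration rather than a directory.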
 
 
Adapting the paths and setting the parameters 
For the web application, four variables have to be set.  
The first is called "path" and refers to the project directory. The second is 
called "path_extraction" and refers to the directory of the extraction tool, as this tool is 
called by the web application. The directory that holds the image used for visualization 
in the browser is set in the variable "path_image". Lastly, the configuration for Redis 
has to be set in "db_params".  
 
 
Additionally, the regulatory documents that can be 
provided have to be put into the folder "app/static/isos". 
Otherwise, no additional information can be provided, as regulatory documents are 
copyright protected and cannot be published in the repository.  
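
To see which regulatory documents the web application would find, the folder can be listed with a few lines of Python. This is only a sketch; the function name regulatory_documents is hypothetical, and the project directory has to be passed in by the caller.

```python
from pathlib import Path


def regulatory_documents(project_dir):
    """Return the sorted file names found in app/static/isos, if any."""
    folder = Path(project_dir) / "app" / "static" / "isos"
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())
```

An empty list means that the folder is missing or empty, in which case no additional regulatory information can be shown.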
 
No input parameters for the application itself are needed, as the user will input the PDF file using the browser. 
 
If all these requirements are met, the web application can be run by entering "python -m flask run" 
at the command line while being in the project directory. 
Last modified: 14.09.2020, 13:48